FINK NLP: A Natural Language Processing Toolkit for Structured Analysis of Multilingual Interview Data
| Title: | FINK NLP: A Natural Language Processing Toolkit for Structured Analysis of Multilingual Interview Data |
|---|---|
| Authors: | Spitale, Giovanni, orcid:0000-0002-6812- |
| Contributors: | Germani, Federico |
| Publisher information: | Zenodo |
| Publication year: | 2025 |
| Collection: | Zenodo |
| Subjects: | nlp, Natural Language Processing |
| Description: | FINK NLP is a modular Jupyter-based pipeline designed for the structured extraction, organization, and analysis of multilingual interview transcripts stored as .docx files. It performs metadata parsing from filenames, text ingestion using textract, and corpus structuring into a DataFrame. The notebook supports selective subsetting by language, module, category, or expression. It integrates spaCy for lemmatization, gensim for topic modeling (LDA), and multiple Python visualization libraries (matplotlib, seaborn, wordcloud, pyLDAvis) to facilitate qualitative and quantitative content analysis. This repository includes the output tabular data (redacted for data protection) and the visualization outputs. |
| Document type: | other/unknown material |
| Language: | unknown |
| Relation: | https://zenodo.org/records/15394889; oai:zenodo.org:15394889; https://doi.org/10.5281/zenodo.15394889 |
| DOI: | 10.5281/zenodo.15394889 |
| Availability: | https://doi.org/10.5281/zenodo.15394889 https://zenodo.org/records/15394889 |
| Rights: | Creative Commons Attribution 4.0 International ; cc-by-4.0 ; https://creativecommons.org/licenses/by/4.0/legalcode |
| Accession number: | edsbas.BC14446F |
| Database: | BASE |
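
The description mentions metadata parsing from filenames and selective subsetting by language, module, or category. The record does not state the actual filename convention, so the following is only a minimal sketch under an assumed, hypothetical pattern (`<language>_<module>_<category>_<id>.docx`); the function names `parse_filename`, `build_corpus_index`, and `subset` are illustrative and not taken from the FINK NLP notebook. It uses only the Python standard library; the resulting list of dictionaries could be passed to `pandas.DataFrame` to obtain the tabular corpus the description refers to.

```python
import re
from typing import Optional

# ASSUMPTION: hypothetical filename convention, not documented in the record:
#   <language>_<module>_<category>_<id>.docx   e.g. "IT_m1_ethics_002.docx"
FILENAME_PATTERN = re.compile(
    r"^(?P<language>[A-Za-z]{2})_"
    r"(?P<module>[^_]+)_"
    r"(?P<category>[^_]+)_"
    r"(?P<interview_id>[^_.]+)\.docx$"
)


def parse_filename(filename: str) -> Optional[dict]:
    """Return the metadata encoded in a filename, or None if it does not match."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        return None
    record = match.groupdict()
    record["language"] = record["language"].lower()  # normalise the language code
    record["filename"] = filename
    return record


def build_corpus_index(filenames: list) -> list:
    """Parse every filename and keep only those that follow the convention."""
    records = []
    for name in filenames:
        record = parse_filename(name)
        if record is not None:
            records.append(record)
    return records


def subset(records: list, **criteria) -> list:
    """Keep records whose fields equal all given criteria (e.g. language='it')."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in criteria.items())
    ]


# Illustrative filenames only; they are not part of the published dataset.
filenames = [
    "EN_m1_ethics_001.docx",
    "IT_m1_ethics_002.docx",
    "DE_m2_trust_003.docx",
    "notes.txt",  # does not follow the assumed convention, so it is skipped
]
index = build_corpus_index(filenames)
italian = subset(index, language="it")
module_one = subset(index, module="m1")
```

Keeping the parsed metadata as plain dictionaries means each subsetting criterion maps directly onto a column once the records are loaded into a DataFrame.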