Dynamic Topic Models of 'Dynamic Topic Modelling for Exploring the Scientific Literature on Coronavirus: An Unsupervised Labelling Technique'

Bibliographic details
Title: Dynamic Topic Models of 'Dynamic Topic Modelling for Exploring the Scientific Literature on Coronavirus: An Unsupervised Labelling Technique'
Authors: Guillén-Pacho, Ibai, orcid:0000-0001-7801-
Other authors: Badenes-Olmedo, Carlos, Corcho, Oscar
Publisher information: Zenodo
Publication year: 2024
Collection: Zenodo
Keywords: Topic Models, Dynamic Topic Models, Dynamic Topic Labelling, Topic Labelling
Description: This resource contains the models generated for the work "Dynamic Topic Modelling for Exploring the Scientific Literature on Coronavirus: An Unsupervised Labelling Technique". Each zip file holds the models of one type in their different configurations (number of topics). An evaluation script (bench.py) and the files it needs (localizer, timestamps, CORPUS, etc.) are also included.
To reuse these models:
1. Unzip all files and install the required packages listed in "requirements.txt".
2. Download the precompiled DTM implementation from https://github.com/magsilva/dtm/tree/master/bin, or compile the original implementation from https://github.com/blei-lab/dtm manually.
3. Download the DTM wrapper from https://github.com/piskvorky/gensim/releases/tag/3.8.3 (file "gensim-3.8.3/gensim/models/wrappers/dtmmodel.py").
4. Download the DETM Python implementation from https://github.com/quynhneo/detm.
To run the model evaluation, modify the imports in "bench.py" so that they match the location and full path of the DETM and DTM models (instructions are in the file's documentation). To repeat our topic study, follow the instructions in "notebook.ipynb".
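The "unzip all files" step can be scripted. The following is a minimal sketch using only the Python standard library; the function name unzip_all and both directory arguments are illustrative assumptions, not part of the published resource.

```python
import zipfile
from pathlib import Path


def unzip_all(archive_dir, target_dir):
    """Extract every .zip file found directly in archive_dir into target_dir.

    Both paths are placeholders chosen by the caller; the record does not
    prescribe where the archives should be stored or unpacked.
    """
    archive_dir = Path(archive_dir)
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    extracted = []
    # Sort for a deterministic extraction order.
    for archive in sorted(archive_dir.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target_dir)
        extracted.append(archive.name)
    return extracted
```

Calling unzip_all("downloads", "RESOURCES"), with both folder names adjusted to the local setup, returns the names of the archives that were unpacked. Installing the packages from "requirements.txt" remains a separate step.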
The overview of this resource is:
RESOURCES
├── BERTopic
│   ├── BERTopic_100
│   ├── BERTopic_100_probabilities.npy
│   ├── BERTopic_100_topics
│   ├── BERTopic_100_topic_words
│   ├── BERTopic_200
│   ├── BERTopic_200_probabilities.npy
│   ├── BERTopic_200_topics
│   ├── BERTopic_200_topic_words
│   ├── BERTopic_300
│   ├── BERTopic_300_probabilities.npy
│   ├── BERTopic_300_topics
│   ├── BERTopic_300_topic_words
│   ├── BERTopic_400
│   ├── BERTopic_400_probabilities.npy
│   ├── BERTopic_400_topics
│   └── BERTopic_400_topic_words
│
├── DETM
│   ├── detm_deberta_model1            # 100 topics model
│   ├── detm_deberta_model1_beta.mat
│   ├── detm_deberta_model2            # 200 topics model
│   ├── detm_deberta_model2_beta.mat
│   ├── detm_word2vec_model1           # 100 topics model
│   ├── detm_word2vec_model1_beta.mat
│   ├── detm_word2vec_model2           # 200 topics model
│   ├── detm_word2vec_model2_beta.mat
│   └── min_df_3333
│       └── .
│
├── DTM_ALL
│   ├── ...
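After unpacking, the listed layout can be checked before running the evaluation. The sketch below builds the expected relative paths from the overview and reports those that are absent. It assumes the archives unpack into a single root folder mirroring the listing; EXPECTED and missing_entries are hypothetical names, and because the listing is truncated after DTM_ALL, only the BERTopic and DETM entries are covered.

```python
from pathlib import Path

# Relative paths copied from the overview in this record.
# Assumption: the unpacked archives reproduce this layout under one root.
EXPECTED = [
    f"BERTopic/BERTopic_{k}{suffix}"
    for k in (100, 200, 300, 400)
    for suffix in ("", "_probabilities.npy", "_topics", "_topic_words")
]
EXPECTED += [
    f"DETM/detm_{embedding}_model{i}{suffix}"
    for embedding in ("deberta", "word2vec")
    for i in (1, 2)  # per the listing: model1 = 100 topics, model2 = 200 topics
    for suffix in ("", "_beta.mat")
]


def missing_entries(root):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

An empty list from missing_entries("RESOURCES") suggests that the BERTopic and DETM parts were unpacked as listed; the check says nothing about the DTM folders or about the contents of the files.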
Publication type: other/unknown material
Language: English
ISSN: 2364-4168
Relation: https://zenodo.org/records/12750327; oai:zenodo.org:12750327; https://doi.org/10.5281/zenodo.12750327
DOI: 10.5281/zenodo.12750327
Availability: https://doi.org/10.5281/zenodo.12750327
https://zenodo.org/records/12750327
Rights: Creative Commons Attribution 4.0 International ; cc-by-4.0 ; https://creativecommons.org/licenses/by/4.0/legalcode
Document code: edsbas.B9B268CD
Database: BASE