A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement
| Title: | A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement |
|---|---|
| Language: | English |
| Authors: | Jordan M. Wheeler, Allan S. Cohen, Shiyu Wang |
| Source: | Journal of Educational and Behavioral Statistics. 2024 49(5):848-874. |
| Availability: | SAGE Publications. 2455 Teller Road, Thousand Oaks, CA 91320. Tel: 800-818-7243; Tel: 805-499-9774; Fax: 800-583-2665; e-mail: journals@sagepub.com; Web site: https://sagepub.com |
| Peer Reviewed: | Y |
| Page Count: | 27 |
| Publication Date: | 2024 |
| Document Type: | Journal Articles; Reports - Research |
| Descriptors: | Semantics, Educational Assessment, Evaluators, Reliability, Responses, Mathematical Models, Correlation, Language Usage, Item Analysis, Test Items, Measurement Techniques, Algorithms, Scoring, Thinking Skills, Simulation, Comparative Analysis |
| DOI: | 10.3102/10769986231209446 |
| ISSN: | 1076-9986; 1935-1054 |
| Abstract: | Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming more common in educational measurement research as a method for analyzing students' responses to constructed-response items. Two popular topic models are latent semantic analysis (LSA) and latent Dirichlet allocation (LDA). LSA uses linear algebra techniques, whereas LDA uses an assumed statistical model and generative process. In educational measurement, LSA is often used in algorithmic scoring of essays due to its high reliability and agreement with human raters. LDA is often used as a supplemental analysis to gain additional information about students, such as their thinking and reasoning. This article reviews and compares the LSA and LDA topic models. This article also introduces a methodology for comparing the semantic spaces obtained by the two models and uses a simulation study to investigate their similarities. |
| Abstractor: | As Provided |
| Entry Date: | 2024 |
| Accession Number: | EJ1442196 |
| Database: | ERIC |
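
The abstract's contrast between the two models (LSA as linear algebra, LDA as an assumed generative process) can be illustrated with a minimal sketch. This is not the article's code or its comparison methodology: the function names `lsa` and `lda_generate`, the toy topic-word matrix, and all parameter values are assumptions made for illustration, and only NumPy is used. The sketch simulates word counts from an LDA-style generative process and then projects them into a low-dimensional space with a truncated SVD, which is the core step of LSA.

```python
import numpy as np


def lsa(doc_term, k):
    """Rank-k LSA: truncated SVD of a document-term count matrix."""
    U, s, Vt = np.linalg.svd(doc_term, full_matrices=False)
    doc_coords = U[:, :k] * s[:k]   # documents in the k-dimensional space
    term_coords = Vt[:k, :].T       # words in the same space
    return doc_coords, term_coords


def lda_generate(n_docs, doc_len, topic_word, alpha, rng):
    """Simulate document-term counts from the LDA generative process."""
    n_topics, vocab_size = topic_word.shape
    counts = np.zeros((n_docs, vocab_size), dtype=int)
    # Per-document topic proportions drawn from a Dirichlet prior.
    thetas = rng.dirichlet(alpha, size=n_docs)
    for d in range(n_docs):
        # Draw a topic for each word position, then a word from that topic.
        topics = rng.choice(n_topics, size=doc_len, p=thetas[d])
        for z in topics:
            w = rng.choice(vocab_size, p=topic_word[z])
            counts[d, w] += 1
    return counts, thetas


# Hypothetical toy setup: 2 topics over a 6-word vocabulary.
rng = np.random.default_rng(0)
topic_word = np.array([
    [0.40, 0.40, 0.10, 0.05, 0.05, 0.00],
    [0.00, 0.05, 0.05, 0.10, 0.40, 0.40],
])
counts, thetas = lda_generate(8, 50, topic_word, alpha=[0.5, 0.5], rng=rng)
doc_coords, term_coords = lsa(counts.astype(float), k=2)
```

In this sketch, `thetas` holds the probabilistic document representation assumed by LDA, while `doc_coords` holds the geometric one produced by LSA from the same counts. Comparing the two kinds of representation is the general question the article addresses.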