Evaluating the Construct Validity of an Automated Writing Evaluation System with a Randomization Algorithm
| Title: | Evaluating the Construct Validity of an Automated Writing Evaluation System with a Randomization Algorithm |
|---|---|
| Language: | English |
| Authors: | Myers, Matthew C. |
| Source: | International Journal of Artificial Intelligence in Education. Sep 2023 33(3):609-634. |
| Availability: | Springer. Available from: Springer Nature. One New York Plaza, Suite 4600, New York, NY 10004. Tel: 800-777-4643; Tel: 212-460-1500; Fax: 212-460-1700; e-mail: customerservice@springernature.com; Web site: https://link.springer.com/ |
| Peer Reviewed: | Y |
| Page Count: | 26 |
| Publication Date: | 2023 |
| Document Type: | Journal Articles; Reports - Research |
| Education Level: | Junior High Schools; Middle Schools; Secondary Education; Elementary Education; Grade 7; Grade 8 |
| Descriptors: | Construct Validity, Automation, Writing Evaluation, Algorithms, Scoring, Persuasive Discourse, Essays, Middle School Students, Grade 7, Grade 8, Programming Languages, Scores, Sentences, Concept Formation, Text Structure, Formative Evaluation, Feedback (Response), Computer Assisted Testing |
| DOI: | 10.1007/s40593-022-00301-6 |
| ISSN: | 1560-4292; 1560-4306 |
| Abstract: | This study evaluated the construct validity of six scoring traits of an automated writing evaluation (AWE) system called "MI Write." Persuasive essays (N = 100) written by students in grades 7 and 8 were randomized at the sentence level using a script written with Python's NLTK module. Each persuasive essay was randomized 30 times (n = 3000 total randomizations), and the mean trait scores for each set of randomized iterations were compared to those of the control text across all traits. We were specifically interested in evaluating the effects of randomization on the high-level traits of "idea development" and "organization." Given the rubrics and qualitative feedback provided by MI Write, we hypothesized that these high-level traits ought to be sensitive to sentence-level randomization (i.e., scores should decrease). Overall, complete randomization did not consistently or significantly affect trait scoring for these high-level writing traits. In fact, more than a third of the essays saw significant increases in one or both high-level traits despite randomization, indicating a disconnect between MI Write's formative feedback and its underlying constructs. Findings have implications for consumers and developers of AWE. |
| Abstractor: | As Provided |
| Entry Date: | 2023 |
| Accession Number: | EJ1388568 |
| Database: | ERIC |
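The sentence-level randomization procedure described in the abstract can be sketched as follows. This is a minimal illustration, not the study's actual script: the paper reports using Python's NLTK module for sentence handling, which is substituted here with a naive regex-based splitter so the example is dependency-free; the function names `randomize_sentences` and `randomized_iterations` are hypothetical.

```python
import random
import re

def randomize_sentences(essay: str, seed=None) -> str:
    """Return the essay with its sentences in shuffled order."""
    # Naive split on terminal punctuation followed by whitespace.
    # The study used NLTK's sentence tokenizer instead (assumption:
    # nltk.sent_tokenize), swapped out here to avoid the dependency.
    sentences = re.split(r'(?<=[.!?])\s+', essay.strip())
    rng = random.Random(seed)  # seeded for reproducibility
    rng.shuffle(sentences)
    return ' '.join(sentences)

def randomized_iterations(essay: str, k: int = 30) -> list:
    """Produce k shuffled versions of one essay (the study used k=30
    per essay, for 3000 total across 100 essays)."""
    return [randomize_sentences(essay, seed=i) for i in range(k)]
```

Scoring each shuffled iteration and comparing mean trait scores against the unshuffled control text would then test whether "idea development" and "organization" scores drop when sentence order is destroyed, as the construct-validity hypothesis predicts.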