Evaluation of Cohen's kappa and other measures of inter-rater agreement for genre analysis and other nominal data
| Published in: | Journal of English for Academic Purposes, Vol. 53, Article 101026 |
|---|---|
| Main authors: | , |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Amsterdam: Elsevier Ltd, 01.09.2021 |
| ISSN: | 1475-1585, 1878-1497 |
| Summary: | Cohen's kappa (κ) is often recommended for nominal data as a measure of inter-rater (inter-coder) agreement or reliability. In this paper we ask which term is appropriate in genre analysis, what statistical measures are valid to measure it, and how much the choice of units affects the values obtained. We find that although both agreement and reliability may be of interest, only agreement can be measured with nominal data. Moreover, while kappa may be appropriate for macrostructure or corpus analysis, it is inappropriate for move or component analysis, because κ requires that the units be predetermined, fixed, and independent. κ further assumes that all disagreements in category assignment are equally likely, which may not be true. We also describe other measures, including correlation, chi-square, and percent agreement, and demonstrate that, despite its limitations, percent agreement is the only valid measure in many situations. Finally, we demonstrate why the choice of unit has a large effect on the value calculated. These findings also apply to other studies in applied linguistics using nominal data. We conclude that the methodology used needs to be clearly explained to ensure that the requirements have been met, as in any other statistical testing. |
| Highlights: | • For nominal data, only inter-rater agreement is valid; correlation is invalid. • Kappa may be suitable for macrostructure or corpus analysis, but not move analysis. • Rater-determined move boundaries preclude predetermined, fixed coding units. • Neither sequential sentences nor semi-ordered categories are independent. • Details must be reported to ensure statistical-testing requirements are met. |
| DOI: | 10.1016/j.jeap.2021.101026 |
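
To make the two measures compared in the abstract concrete, here is a minimal Python sketch of percent agreement and Cohen's kappa for two coders assigning nominal categories. The formula κ = (p_o − p_e) / (1 − p_e) is the standard definition of Cohen's kappa; the move codes, label lists, and function names below are invented for illustration and are not taken from the paper.

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Proportion of units on which the two coders assign the same category."""
    assert len(coder_a) == len(coder_b), "coders must rate the same units"
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: chance-corrected agreement for two coders, nominal data.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e is the agreement expected by chance, computed from each coder's
    marginal category frequencies. Note the assumptions the paper stresses:
    the units must be predetermined, fixed, and independent.
    """
    n = len(coder_a)
    p_o = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: sum over categories of the product of marginal proportions.
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a.keys() & freq_b.keys())
    return (p_o - p_e) / (1 - p_e)

# Hypothetical move codes assigned by two coders to ten predetermined units.
a = ["M1", "M1", "M2", "M3", "M2", "M1", "M3", "M2", "M1", "M2"]
b = ["M1", "M2", "M2", "M3", "M2", "M1", "M3", "M1", "M1", "M2"]

print(f"percent agreement = {percent_agreement(a, b):.2f}")  # 0.80
print(f"Cohen's kappa     = {cohens_kappa(a, b):.2f}")       # 0.69
```

In this invented example, κ (0.69) is noticeably lower than percent agreement (0.80) because chance agreement is high when coders use few categories with similar marginal frequencies. This is exactly the chance correction that makes kappa attractive when its unit assumptions hold, and irrelevant when, as in move analysis, the units themselves are rater-determined.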