Weighting schemes and incomplete data: A generalized Bayesian framework for chance-corrected interrater agreement

Bibliographic details
Published in: Psychological Methods, Vol. 27, No. 6, p. 1069
Main authors: van Oest, Rutger; Girard, Jeffrey M
Format: Journal Article
Language: English
Published: United States, 1 December 2022
ISSN: 1939-1463
Abstract Van Oest (2019) developed a framework to assess interrater agreement for nominal categories and complete data. We generalize this framework to all four situations of nominal or ordinal categories and complete or incomplete data. The mathematical solution yields a chance-corrected agreement coefficient that accommodates any weighting scheme for penalizing rater disagreements and any number of raters and categories. By incorporating Bayesian estimates of the category proportions, the generalized coefficient also captures situations in which raters classify only subsets of items; that is, incomplete data. Furthermore, this coefficient encompasses existing chance-corrected agreement coefficients: the S-coefficient, Scott's pi, Fleiss' kappa, and Van Oest's uniform prior coefficient, all augmented with a weighting scheme and the option of incomplete data. We use simulation to compare these nested coefficients. The uniform prior coefficient tends to perform best, in particular, if one category has a much larger proportion than others. The gap with Scott's pi and Fleiss' kappa widens if the weighting scheme becomes more lenient to small disagreements and often if more item classifications are missing; missingness biases play a moderating role. The uniform prior coefficient often performs much better than the S-coefficient, but the S-coefficient sometimes performs best for small samples, missing data, and lenient weighting schemes. The generalized framework implies a new interpretation of chance-corrected weighted agreement coefficients: These coefficients estimate the probability that both raters in a pair assign an item to its correct category without guessing. Whereas Van Oest showed this interpretation for unweighted agreement, we generalize to weighted agreement. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
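The abstract describes a family of nested coefficients that differ only in how the category proportions behind the chance term are estimated. The following Python sketch illustrates that idea; it is not the authors' implementation. Several details are assumptions made here for illustration: the function names `linear_weights` and `weighted_agreement`, the pooling of all rater pairs across items, the use of `None` to mark a missing classification, and the posterior-mean estimate `(count + 1) / (n + K)` for the uniform prior variant. The published estimator may differ in these respects.

```python
from itertools import combinations


def linear_weights(k):
    """Hypothetical helper: linear agreement weights for k ordered categories."""
    return [[1.0 - abs(a - b) / (k - 1) for b in range(k)] for a in range(k)]


def weighted_agreement(ratings, k, weights=None, proportions="uniform"):
    """Chance-corrected weighted agreement (illustrative sketch only).

    ratings: one list per item holding a category index 0..k-1 per rater,
             or None where a rater did not classify the item.
    proportions: "s" (equal proportions, S-coefficient style),
                 "observed" (Scott's pi / Fleiss' kappa style), or
                 "uniform" (assumed posterior mean under a uniform prior).
    """
    if weights is None:
        # Identity weights: only exact matches count (nominal categories).
        weights = [[1.0 if a == b else 0.0 for b in range(k)] for a in range(k)]

    counts = [0] * k
    agreement_sum, n_pairs = 0.0, 0
    for item in ratings:
        present = [r for r in item if r is not None]
        for r in present:
            counts[r] += 1
        # Assumption: every rater pair with two available ratings counts equally.
        for a, b in combinations(present, 2):
            agreement_sum += weights[a][b]
            n_pairs += 1
    p_obs = agreement_sum / n_pairs

    n = sum(counts)
    if proportions == "s":
        p = [1.0 / k] * k
    elif proportions == "observed":
        p = [c / n for c in counts]
    elif proportions == "uniform":
        p = [(c + 1) / (n + k) for c in counts]
    else:
        raise ValueError("unknown proportions option: " + proportions)

    # Expected weighted agreement if two raters classified independently.
    p_exp = sum(weights[a][b] * p[a] * p[b] for a in range(k) for b in range(k))
    return (p_obs - p_exp) / (1.0 - p_exp)


# Toy data: 4 items, 3 raters, 3 categories, two missing classifications.
example = [[0, 0, 0], [1, 1, None], [2, 1, 2], [0, None, 0]]
for option in ("s", "observed", "uniform"):
    print(option, round(weighted_agreement(example, 3, proportions=option), 4))
print("s, linear weights",
      round(weighted_agreement(example, 3, linear_weights(3), "s"), 4))
```

Passing a more lenient weight matrix raises both the observed and the expected agreement, which is one way to see why the choice of proportion estimate matters more under lenient weighting. The sketch does not guard against data with no rater pair at all, and it does not model the missingness mechanism.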
Author Girard, Jeffrey M
van Oest, Rutger
Author_xml – sequence: 1
  givenname: Rutger
  orcidid: 0000-0003-0693-0156
  surname: van Oest
  fullname: van Oest, Rutger
  organization: Department of Marketing, BI Norwegian Business School
– sequence: 2
  givenname: Jeffrey M
  orcidid: 0000-0002-7359-3746
  surname: Girard
  fullname: Girard, Jeffrey M
  organization: Department of Psychology, University of Kansas
BackLink https://www.ncbi.nlm.nih.gov/pubmed/34766799$$D View this record in MEDLINE/PubMed
CitedBy_id crossref_primary_10_1007_s11336_023_09919_4
crossref_primary_10_1007_s11336_022_09881_7
ContentType Journal Article
DOI 10.1037/met0000412
Discipline Psychology
EISSN 1939-1463
ExternalDocumentID 34766799
Genre Journal Article
ISICitedReferencesCount 3
ISSN 1939-1463
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 6
Language English
ORCID 0000-0002-7359-3746
0000-0003-0693-0156
OpenAccessLink https://hdl.handle.net/11250/2995263
PMID 34766799
PublicationCentury 2000
PublicationDate 2022-12-01
PublicationDateYYYYMMDD 2022-12-01
PublicationDate_xml – month: 12
  year: 2022
  text: 2022-12-01
  day: 01
PublicationDecade 2020
PublicationPlace United States
PublicationPlace_xml – name: United States
PublicationTitle Psychological methods
PublicationTitleAlternate Psychol Methods
PublicationYear 2022
StartPage 1069
SubjectTerms Bayes Theorem
Humans
Observer Variation
Reproducibility of Results
Title Weighting schemes and incomplete data: A generalized Bayesian framework for chance-corrected interrater agreement
URI https://www.ncbi.nlm.nih.gov/pubmed/34766799
https://www.proquest.com/docview/2597484179
Volume 27