Valid sequential inference on probability forecast performance
| Published in: | Biometrika, Volume 109, Issue 3, pp. 647-663 |
|---|---|
| Main authors: | Henzi, Alexander; Ziegel, Johanna F |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Oxford University Press, 01.09.2022 |
| Subject: | |
| ISSN: | 0006-3444, 1464-3510 |
| Online access: | Get full text |
| Abstract | Summary
Probability forecasts for binary events play a central role in many applications. Their quality is commonly assessed with proper scoring rules, which assign forecasts numerical scores such that a correct forecast achieves a minimal expected score. In this paper, we construct e-values for testing the statistical significance of score differences of competing forecasts in sequential settings. E-values have been proposed as an alternative to $p$-values for hypothesis testing, and they can easily be transformed into conservative $p$-values by taking the multiplicative inverse. The e-values proposed in this article are valid in finite samples without any assumptions on the data-generating processes. They also allow optional stopping, so a forecast user may decide to interrupt evaluation, taking into account the available data at any time, and still draw statistically valid inference, which is generally not true for classical $p$-value-based tests. In a case study on post-processing of precipitation forecasts, state-of-the-art forecast dominance tests and e-values lead to the same conclusions. |
|---|---|
| Author | Henzi, Alexander; Ziegel, Johanna F |
| Author_xml | – sequence: 1 givenname: Alexander surname: Henzi fullname: Henzi, Alexander email: alexander.henzi@stat.unibe.ch – sequence: 2 givenname: Johanna F orcidid: 0000-0002-5916-9746 surname: Ziegel fullname: Ziegel, Johanna F email: johanna.ziegel@stat.unibe.ch |
| CitedBy_id | crossref_primary_10_1093_biomet_asac043 crossref_primary_10_1109_TIT_2024_3444458 crossref_primary_10_12688_f1000research_74223_2 crossref_primary_10_1093_biomet_asae049 crossref_primary_10_1093_jrsssb_qkae011 crossref_primary_10_1287_opre_2021_0792 crossref_primary_10_1214_23_STS894 crossref_primary_10_1073_pnas_2302098121 crossref_primary_10_1287_mnsc_2023_01659 crossref_primary_10_1109_ACCESS_2025_3605681 crossref_primary_10_1016_j_spl_2025_110515 crossref_primary_10_1002_qj_4647 crossref_primary_10_1016_j_ijforecast_2024_11_003 |
| ContentType | Journal Article |
| Copyright | 2021 Biometrika Trust 2021 |
| Copyright_xml | – notice: 2021 Biometrika Trust 2021 |
| DBID | TOX |
| DOI | 10.1093/biomet/asab047 |
| DatabaseName | Oxford Journals Open Access Collection |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: TOX name: Oxford Journals Open Access Collection url: https://academic.oup.com/journals/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Statistics Biology |
| EISSN | 1464-3510 |
| EndPage | 663 |
| ExternalDocumentID | 10.1093/biomet/asab047 |
| IEDL.DBID | TOX |
| ISICitedReferencesCount | 20 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000844406300008&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 0006-3444 |
| IngestDate | Wed Apr 02 07:05:48 EDT 2025 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 3 |
| Keywords | Consistent scoring function Forecast dominance Probability forecast Sequential inference Proper scoring rule E-value Optional stopping |
| Language | English |
| License | This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
| LinkModel | DirectLink |
| ORCID | 0000-0002-5916-9746 |
| OpenAccessLink | https://dx.doi.org/10.1093/biomet/asab047 |
| PageCount | 17 |
| ParticipantIDs | oup_primary_10_1093_biomet_asab047 |
| PublicationCentury | 2000 |
| PublicationDate | 2022-09-01 |
| PublicationDateYYYYMMDD | 2022-09-01 |
| PublicationDate_xml | – month: 09 year: 2022 text: 2022-09-01 day: 01 |
| PublicationDecade | 2020 |
| PublicationTitle | Biometrika |
| PublicationYear | 2022 |
| Publisher | Oxford University Press |
| Publisher_xml | – name: Oxford University Press |
| SSID | ssj0006656 |
| Score | 2.5362628 |
| SourceID | oup |
| SourceType | Publisher |
| StartPage | 647 |
| Title | Valid sequential inference on probability forecast performance |
| Volume | 109 |
| WOSCitedRecordID | wos000844406300008&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| link | http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV1NSwMxEA1SFHrxoyp-E8Rr6O4m2WwugojFU_VQpbdlkkxAkG1pV6H_3uxmrSIe9JZDQmDyMW9I3nuEXAkNnksrWZpZYCLDnGmlJZMWOWpvPReuNZtQ43ExnerHTix6-csTvubDlodeD2EJJhENbzyVRbOjJw_T9Z2b561Pa9NiXAixlmf8OTzy2L6lkNHOPybfJdsdTqQ3cWH3yAZWA7IVnSNXA9JvQGLUWN4n188BTDsaf0WHE_tKXz5pfHRW0cYzJqpxr2iAqGhhWdP5F2HggDyN7ia396zzRWA25JyaFUJayDOB2pjCO0zAyBSVBeGEVibFPAWv0AhMsIAAn2XuTaKMVcp46ZAfkl41q_CI0EJpF2oo1xifC5UAhGojYDauU0AOLjsmlyFc5TwqX5TxxZqXMSplF5WTv3Q6Jf2sYQ6037POSK9evOE52bTvIViLi3Y1PwAHyaAy |
| linkProvider | Oxford University Press |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Valid+sequential+inference+on+probability+forecast+performance&rft.jtitle=Biometrika&rft.au=Henzi%2C+Alexander&rft.au=Ziegel%2C+Johanna+F&rft.date=2022-09-01&rft.pub=Oxford+University+Press&rft.issn=0006-3444&rft.eissn=1464-3510&rft.volume=109&rft.issue=3&rft.spage=647&rft.epage=663&rft_id=info:doi/10.1093%2Fbiomet%2Fasab047&rft.externalDocID=10.1093%2Fbiomet%2Fasab047 |
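
The abstract states that e-values can be converted into conservative $p$-values by taking the multiplicative inverse and that they remain valid under optional stopping. The following is a minimal illustrative sketch of that idea for two probability forecasts `p` and `q` of a binary outcome `y`, scored with the Brier score. It is not taken from the record, and it is not claimed to be the article's exact construction: the function names, the fixed betting fraction `lam`, and the normalisation by the largest possible score advantage of `p` are assumptions chosen so that each factor is non-negative and has expectation at most 1 whenever `p` has the smaller or equal expected score.

```python
def brier(p, y):
    """Brier score of probability forecast p for binary outcome y (lower is better)."""
    return (p - y) ** 2


def e_factor(p, q, y, lam=0.5):
    """One-step e-value for the null 'p scores at least as well as q in expectation'.

    Assumption: lam is a fixed betting fraction in [0, 1).
    The score difference is divided by its most negative possible value,
    so the factor is always >= 1 - lam > 0.
    """
    if p == q:
        return 1.0  # identical forecasts carry no evidence
    best_for_p = 1 if p > q else 0  # outcome under which p gains most over q
    scale = abs(brier(p, best_for_p) - brier(q, best_for_p))
    return 1.0 + lam * (brier(p, y) - brier(q, y)) / scale


def e_process(ps, qs, ys, lam=0.5):
    """Running product of one-step e-values; large values are evidence against the null."""
    value, path = 1.0, []
    for p, q, y in zip(ps, qs, ys):
        value *= e_factor(p, q, y, lam)
        path.append(value)
    return path


def anytime_p_value(path):
    """Conservative p-value: reciprocal of the running maximum, capped at 1."""
    return min(1.0, 1.0 / max(path))


if __name__ == "__main__":
    ps = [0.5, 0.5, 0.5, 0.5]   # hypothetical forecasts
    qs = [0.9, 0.9, 0.9, 0.9]
    ys = [1, 1, 1, 1]           # hypothetical outcomes
    path = e_process(ps, qs, ys)
    print(path, anytime_p_value(path))
```

Because the running product is a non-negative supermartingale under the null, a user may stop monitoring at any time and reject at level $\alpha$ once the product reaches $1/\alpha$; this is the optional-stopping property the abstract refers to.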