Estimating Epistemic and Aleatoric Uncertainty with a Single Model

Bibliographic Details
Title: Estimating Epistemic and Aleatoric Uncertainty with a Single Model
Authors: Chan, Matthew A., Molina, Maria J., Metzler, Christopher A.
Publisher Information: 2024-02-05 2024-11-06
Document Type: Electronic Resource
Abstract: Estimating and disentangling epistemic uncertainty, uncertainty that is reducible with more training data, and aleatoric uncertainty, uncertainty that is inherent to the task at hand, is critically important when applying machine learning to high-stakes applications such as medical imaging and weather forecasting. Conditional diffusion models' breakthrough ability to accurately and efficiently sample from the posterior distribution of a dataset now makes uncertainty estimation conceptually straightforward: One need only train and sample from a large ensemble of diffusion models. Unfortunately, training such an ensemble becomes computationally intractable as the complexity of the model architecture grows. In this work we introduce a new approach to ensembling, hyper-diffusion models (HyperDM), which allows one to accurately estimate both epistemic and aleatoric uncertainty with a single model. Unlike existing single-model uncertainty methods like Monte-Carlo dropout and Bayesian neural networks, HyperDM offers prediction accuracy on par with, and in some cases superior to, multi-model ensembles. Furthermore, our proposed approach scales to modern network architectures such as Attention U-Net and yields more accurate uncertainty estimates compared to existing methods. We validate our method on two distinct real-world tasks: x-ray computed tomography reconstruction and weather temperature forecasting.
Comment: 19 pages, 11 figures. To be published in Conference on Neural Information Processing Systems (NeurIPS) 2024
Index Terms: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, text
URL: http://arxiv.org/abs/2402.03478
Availability: Open access content.
Other Numbers: COO oai:arXiv.org:2402.03478; 1438523032
Contributing Source: CORNELL UNIV
From OAIster®, provided by the OCLC Cooperative.
Accession Number: edsoai.on1438523032
Database: OAIster
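Illustrative note: the abstract describes estimating both kinds of uncertainty by sampling from an ensemble of models. A minimal sketch of one common way to split ensemble samples into the two components, using the law of total variance, is given below. It is not taken from the paper and is not claimed to be the authors' implementation; the function name `decompose_uncertainty`, the array layout (ensemble members on the first axis, samples per member on the second), and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def decompose_uncertainty(samples):
    """Split the variance of ensemble samples into two parts.

    samples: array of shape (M, N, ...), where M is the number of ensemble
    members and N is the number of samples drawn from each member
    (hypothetical layout chosen for this sketch).
    """
    samples = np.asarray(samples, dtype=float)
    if samples.ndim < 2:
        raise ValueError("expected an array of shape (M, N, ...)")

    member_means = samples.mean(axis=1)  # mean prediction of each member
    member_vars = samples.var(axis=1)    # spread of samples within each member

    prediction = member_means.mean(axis=0)  # overall point estimate
    aleatoric = member_vars.mean(axis=0)    # average within-member variance
    epistemic = member_means.var(axis=0)    # disagreement between members
    return prediction, aleatoric, epistemic


# Synthetic stand-in for model outputs (assumed values, not real data):
# each member has its own offset, and every sample carries unit-variance noise.
rng = np.random.default_rng(0)
M, N = 8, 500
member_offsets = rng.normal(0.0, 0.5, size=(M, 1))
samples = 2.0 + member_offsets + rng.normal(0.0, 1.0, size=(M, N))

prediction, aleatoric, epistemic = decompose_uncertainty(samples)
```

With equal numbers of samples per member and population variances (NumPy's default), the two components add up exactly to the variance of all samples pooled together, which gives a quick consistency check on the split.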