Bayesian Nonparametric Causal Inference: Information Rates and Learning Algorithms
| Published in: | IEEE Journal of Selected Topics in Signal Processing, Volume 12, Issue 5, pp. 1031–1046 |
|---|---|
| Main authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.10.2018 |
| Topics: | |
| ISSN: | 1932-4553, 1941-0484 |
| Online access: | Get full text |
| Summary: | We investigate the problem of estimating the causal effect of a treatment on individual subjects from observational data; this is a central problem in various application domains, including healthcare, social sciences, and online advertising. Within the Neyman-Rubin potential outcomes model, we use the Kullback-Leibler (KL) divergence between the estimated and true distributions as a measure of accuracy of the estimate, and we define the information rate of the Bayesian causal inference procedure as the (asymptotic equivalence class of the) expected value of the KL divergence between the estimated and true distributions as a function of the number of samples. Using Fano's method, we establish a fundamental limit on the information rate that can be achieved by any Bayesian estimator, and show that this fundamental limit is independent of the selection bias in the observational data. We characterize the Bayesian priors on the potential (factual and counterfactual) outcomes that achieve the optimal information rate. We go on to propose a prior adaptation procedure (which we call the information-based empirical Bayes procedure) that optimizes the Bayesian prior by maximizing an information-theoretic criterion on the recovered causal effects rather than maximizing the marginal likelihood of the observed (factual) data. Building on our analysis, we construct an information-optimal Bayesian causal inference algorithm. This algorithm embeds the potential outcomes in a vector-valued reproducing kernel Hilbert space, and uses a multitask Gaussian process prior over that space to infer the individualized causal effects. We show that for such a prior, the proposed information-based empirical Bayes method adapts the smoothness of the multitask Gaussian process to the true smoothness of the causal effect function by balancing a tradeoff between the factual bias and the counterfactual variance. We conduct experiments on a well-known real-world dataset and show that our model significantly outperforms the state-of-the-art causal inference models. |
|---|---|
| DOI: | 10.1109/JSTSP.2018.2848230 |
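
For concreteness, the information-rate definition quoted in the summary can be written out as follows. The symbols below ($P$, $\hat{P}_n$, $\mathcal{I}_n$) are illustrative notation chosen here, not necessarily the paper's own.

```latex
% Illustrative notation (not necessarily the paper's):
%   P          : true distribution of the potential outcomes,
%   \hat{P}_n  : Bayesian estimate of that distribution after n samples.
\mathcal{I}_n \;\triangleq\; \mathbb{E}\!\left[\, D_{\mathrm{KL}}\big( P \,\|\, \hat{P}_n \big) \,\right],
\qquad
D_{\mathrm{KL}}\big( P \,\|\, \hat{P}_n \big) \;=\; \int \log\!\frac{\mathrm{d}P}{\mathrm{d}\hat{P}_n}\, \mathrm{d}P .
% The "information rate" is the asymptotic equivalence class of
% \mathcal{I}_n as a function of n, i.e., the class \Theta(\mathcal{I}_n).
```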
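The contrast between standard and information-based empirical Bayes described in the summary can be sketched schematically. Here $\mathcal{C}$ stands for the paper's information-theoretic criterion, whose exact form is not given in this record.

```latex
% Schematic only; \theta denotes the prior hyperparameters.
% Standard empirical Bayes fits the prior to the factual data alone:
\hat{\theta}_{\mathrm{EB}} \;=\; \arg\max_{\theta}\; p\big( y_{1:n} \mid \theta \big).
% Information-based empirical Bayes maximizes an information-theoretic
% criterion \mathcal{C} on the recovered causal effects rather than the
% marginal likelihood of the observed (factual) data:
\hat{\theta}_{\mathrm{IB}} \;=\; \arg\max_{\theta}\; \mathcal{C}\big( \widehat{\mathrm{ITE}}_{\theta} \big).
```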
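The summary describes the algorithm's model family: a multitask Gaussian process prior over the two potential outcomes, from which the individualized treatment effect (ITE) is inferred. Below is a minimal, self-contained sketch of that idea, not the authors' implementation; the RBF kernel, the coregionalization matrix `B`, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch (assumptions throughout): ITE inference under a multitask
# GP prior, where the potential outcomes Y^(0) and Y^(1) are the two tasks.
import numpy as np

def rbf(X1, X2, ls=1.0):
    """Squared-exponential kernel on the feature space."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def multitask_posterior_ite(X, w, y, Xs, B, ls=1.0, noise=0.1):
    """Posterior mean of ITE(x) = E[Y^(1) - Y^(0) | x] under an
    intrinsic-coregionalization multitask GP prior.

    X : (n, d) covariates;  w : (n,) treatment indicators in {0, 1}
    y : (n,) factual outcomes;  Xs : (m, d) test points
    B : (2, 2) PSD task-coregionalization matrix (an assumption here)
    """
    # Train covariance: K[(x, w), (x', w')] = B[w, w'] * k(x, x').
    K = B[np.ix_(w, w)] * rbf(X, X, ls) + noise**2 * np.eye(len(X))
    alpha = np.linalg.solve(K, y)
    # Cross-covariance of each test point, under either task, with the
    # factual training observations.
    t0 = np.zeros(len(Xs), dtype=int)
    t1 = np.ones(len(Xs), dtype=int)
    k0 = B[np.ix_(t0, w)] * rbf(Xs, X, ls)  # task 0: control outcome
    k1 = B[np.ix_(t1, w)] * rbf(Xs, X, ls)  # task 1: treated outcome
    return k1 @ alpha - k0 @ alpha          # pointwise ITE estimate

# Toy usage with synthetic data and selection bias in the assignment.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(100, 1))
w = (rng.uniform(size=100) < 1 / (1 + np.exp(-2 * X[:, 0]))).astype(int)
y = np.sin(X[:, 0]) + 1.0 * w + 0.1 * rng.standard_normal(100)
B = np.array([[1.0, 0.8], [0.8, 1.0]])  # correlated tasks (assumption)
print(np.round(multitask_posterior_ite(X, w, y, X[:5], B), 2))
# True ITE is 1.0 at every x in this toy model.
```

In this sketch the length-scale `ls` plays the role of the smoothness hyperparameter: the paper's information-based empirical Bayes procedure would adapt such hyperparameters to the smoothness of the causal effect function, trading off factual bias against counterfactual variance, rather than fixing them or fitting them by marginal likelihood as done here.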