Bayesian Nonparametric Causal Inference: Information Rates and Learning Algorithms


Detailed Bibliography
Published in: IEEE Journal of Selected Topics in Signal Processing, Volume 12, Issue 5, pp. 1031–1046
Main Authors: Alaa, Ahmed M.; van der Schaar, Mihaela
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 1 October 2018
ISSN: 1932-4553, 1941-0484
Description
Summary: We investigate the problem of estimating the causal effect of a treatment on individual subjects from observational data; this is a central problem in various application domains, including healthcare, social sciences, and online advertising. Within the Neyman-Rubin potential outcomes model, we use the Kullback-Leibler (KL) divergence between the estimated and true distributions as a measure of accuracy of the estimate, and we define the information rate of the Bayesian causal inference procedure as the (asymptotic equivalence class of the) expected value of the KL divergence between the estimated and true distributions as a function of the number of samples. Using Fano's method, we establish a fundamental limit on the information rate that can be achieved by any Bayesian estimator, and show that this fundamental limit is independent of the selection bias in the observational data. We characterize the Bayesian priors on the potential (factual and counterfactual) outcomes that achieve the optimal information rate. We go on to propose a prior adaptation procedure (which we call the information-based empirical Bayes procedure) that optimizes the Bayesian prior by maximizing an information-theoretic criterion on the recovered causal effects rather than maximizing the marginal likelihood of the observed (factual) data. Building on our analysis, we construct an information-optimal Bayesian causal inference algorithm. This algorithm embeds the potential outcomes in a vector-valued reproducing kernel Hilbert space, and uses a multitask Gaussian process prior over that space to infer the individualized causal effects. We show that for such a prior, the proposed information-based empirical Bayes method adapts the smoothness of the multitask Gaussian process to the true smoothness of the causal effect function by balancing a tradeoff between the factual bias and the counterfactual variance.
We conduct experiments on a well-known real-world dataset and show that our model significantly outperforms the state-of-the-art causal inference models.
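The multitask Gaussian process construction summarized above can be illustrated with a minimal NumPy sketch (not the authors' implementation): the two potential outcomes Y(0) and Y(1) are treated as correlated tasks via an intrinsic coregionalization kernel K((x, w), (x', w')) = B[w, w'] · k(x, x'), and the posterior means of both outcomes on a test grid yield an estimate of the individualized causal effect. The synthetic data, RBF length scale, noise level, and task-correlation matrix B are all hypothetical choices for illustration; the paper additionally adapts these hyperparameters via its information-based empirical Bayes criterion, which is not reproduced here.

```python
import numpy as np

def rbf(X1, X2, length_scale=1.0):
    """Squared-exponential kernel on scalar covariates."""
    d = X1[:, None] - X2[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

# Synthetic observational data: covariate x, treatment w in {0, 1},
# factual outcome y. True individualized effect is tau(x) = x (hypothetical).
rng = np.random.default_rng(0)
n = 100
x = rng.uniform(-2, 2, n)
# Selection bias: treatment probability depends on x (logistic propensity).
w = (rng.uniform(size=n) < 1 / (1 + np.exp(-x))).astype(int)
y0 = np.sin(x)            # untreated potential outcome
y1 = np.sin(x) + x        # treated potential outcome
y = np.where(w == 1, y1, y0) + 0.1 * rng.normal(size=n)  # factual outcomes only

# Intrinsic coregionalization model: K((x, w), (x', w')) = B[w, w'] * k(x, x').
B = np.array([[1.0, 0.8],
              [0.8, 1.0]])          # assumed task-correlation matrix
noise = 0.1 ** 2

K = B[w[:, None], w[None, :]] * rbf(x, x) + noise * np.eye(n)
alpha = np.linalg.solve(K, y)       # GP posterior weights

# Posterior means of both potential outcomes on a test grid.
xs = np.linspace(-2, 2, 50)
Ks = rbf(xs, x)                     # covariate cross-covariance
mu0 = (B[0][w][None, :] * Ks) @ alpha   # E[Y(0) | x]
mu1 = (B[1][w][None, :] * Ks) @ alpha   # E[Y(1) | x]
cate = mu1 - mu0                    # estimated individualized effect
```

Because the two tasks share a prior (through B), factual observations from each arm inform the counterfactual predictions of the other, which is what makes the estimate usable despite the treatment assignment depending on x.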
DOI: 10.1109/JSTSP.2018.2848230