Directed Acyclic Graphs and Bayesian Networks for Causal Identification and Estimation
The objective of this paper is to provide you with a practical framework for causal effect estimation in policy assessment and impact analysis when experimental data are unavailable. We will present a range of methods, along with their limitations, including Directed Acyclic Graphs and Bayesian networks. These techniques are intended to help you distinguish causation from association when working with data from observational studies.
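To make the association-versus-causation distinction concrete before we begin, consider a minimal simulated sketch (not from the paper; the variable names and data-generating process are hypothetical): a common cause Z drives both a "policy" X and an "outcome" Y, so X and Y are strongly associated even though X has no causal effect on Y. Adjusting for Z removes the spurious association.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process: a confounder Z drives both
# the "policy" X and the "outcome" Y; X has NO causal effect on Y.
z = rng.normal(size=n)
x = 2.0 * z + rng.normal(size=n)
y = 3.0 * z + rng.normal(size=n)

# Naive association: regress Y on X alone -> biased, nonzero slope
# (theoretical value here is cov(X,Y)/var(X) = 6/5 = 1.2).
naive_slope = np.polyfit(x, y, 1)[0]

# Adjusting for the confounder: regress Y on both X and Z.
# The coefficient on X is now close to its true causal value, 0.
design = np.column_stack([x, z, np.ones(n)])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
adjusted_slope = coef[0]

print(f"naive slope:    {naive_slope:+.3f}")
print(f"adjusted slope: {adjusted_slope:+.3f}")
```

The naive regression reports a substantial positive effect of X on Y; the adjusted regression correctly reports an effect near zero. The rest of the paper is about knowing *which* variables, like Z here, must be adjusted for, and when no such adjustment set exists.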
This paper is structured as a tutorial that revolves around a single, seemingly simple example. On the basis of this example, we will illustrate numerous techniques for causal identification and estimation.
Major government or business initiatives generally involve extensive studies to anticipate consequences of actions not yet taken. Such studies are often referred to as “policy analysis” or “impact assessment.”
- “Impact assessment, simply defined, is the process of identifying the future consequences of a current or proposed action.” (IAIA, 2009)
- “Policy assessment seeks to inform decision-makers by predicting and evaluating the potential impacts of policy options.” (Adelle and Weiland, 2012)
What could be the source of such predictive power? A policy analysis must uncover a mechanism that links an action or policy to a consequence or impact, yet experiments are typically out of the question in this context. Rather, impact assessments must determine the existence and the size of a causal effect from non-experimental observations alone.
Given the sheer number of impact analyses performed, and their tremendous weight in decision making, one would like to believe that there has been a long-established scientific foundation with regard to (non-experimental) causal effect identification, estimation and inference. Quite naturally, as decision makers quote statistics in support of policies, the field of statistics comes to mind as the discipline that studies such causal questions.
However, casual observers may be surprised to hear that causality has been anathema to statisticians for the longest time. “Considerations of causality should be treated as they always have been treated in statistics, preferably not at all…” (Speed, 1990).
The repercussions of this chasm between statistics and causality can still be felt today. Judea Pearl highlights this unfortunate state of affairs in the preface of his book Causality: “… I see no greater impediment to scientific progress than the prevailing practice of focusing all our mathematical resources on probabilistic and statistical inferences while leaving causal considerations to the mercy of intuition and good judgment.” (Pearl, 2000)
Rubin (1974) and Holland (1986), who introduced the counterfactual (potential outcomes) approach to causal inference into statistics, can be credited with overcoming statisticians’ traditional reluctance to engage with causality. However, it will take many years for this fairly recent academic consensus to fully reach the world of practitioners, which is the motivation for this paper. We wish to make the important advances in causality accessible to analysts, whose work ultimately drives the policies that shape our world.