Child pages
  • Knowledge Modeling and Probabilistic Reasoning with Bayesian Networks

Contents

Motivation

Complexity & Cognitive Challenges

It is presumably fair to state that reasoning in complex environments creates cognitive challenges for hu- mans. Adding uncertainty to our observations of the problem domain, or even considering uncertainty regard- ing the structure of the domain itself, makes matters worse. When discussing problems in a group, it can be very challenging to find common ground when uncertainty blurs all assumptions.

No Data, No Analytics

So, let’s go ahead and apply our analytics skills, right? If we had observations from our domain in the form of data, yes, by all means, bring in the data scientists and have them build a model. However, what about if we have no data whatsoever? Assume that all we have is the knowledge of people who are familiar with the problem domain.

To an analyst with Excel, every problem looks like arithmetic

At this point, business analysts will typically bring in Excel as a trusted tool to model the relationships be- tween the variables that are in play in our problem domain. In the absence of data, we can certainly use ex- perts’ judgments to define relationships between variables of interest and then encode them as values and formulas in the spreadsheet.

Unfortunately, this approach inevitably discards critical parts of the problem domain, i.e. the uncertain/ probabilistic and the omnidirectional nature of variable interactions.

Taking no chances!

By default, individual cells and formulas in spreadsheets are deterministic and work with single-point values only. This is ideally suited for encoding “hard” logic, but not at all for “soft” probabilistic knowledge that re- flects uncertainty. As a result, any uncertainty has to be addressed with workarounds, often in the form of try- ing out multiple scenarios or by working with simulation add-ons.

It’s a one-way street!

The lack if omnidirectional inference, however, is the bigger issue in spreadsheets. As soon as we create a formula linking two cells in a spreadsheet, e.g. B1=function(A1), we preclude any evaluation in the other di- rection, from B1 to A1. You may object and say that you know that A1 causes B1, so it does not matter. Even if we were right about the causal direction, it still matters. Let’s assume A1 is indeed the cause, and B1 is the effect. In this case, we can use the spreadsheet for simulation. However, what if we can only observe the ef- fect B1 and want to perform diagnosis about the cause A1? The one-way nature of spreadsheet computations prevents that. Even if we wanted to limit ourselves to reason exclusively in one direction in our spreadsheet model, the omnidirectional interactions in the real world are not subject to such a constraint.

Bayesian Networks to the Rescue!

Invented by Judea Pearl in the 1980s at UCLA, Bayesian networks are a mathematical formalism that can si- multaneously represent a multitude of relationships between variables in a system. The graph of a Bayesian Network contains nodes (representing variables of interest) and directed arcs that link the nodes. The arcs represent probabilistic relationships between the nodes. The probabilistic nature of Bayesian Networks han- dles uncertainty “natively,” i.e. such a model can work directly with probabilistic inputs, probabilistic relation- ships, and deliver correctly-computed, probabilistic outputs.

Also, whereas traditional models are of the form y=f(x), Bayesian Networks do not have to distinguish be- tween independent and dependent variables. Rather, a Bayesian Network represents the entire joint probabil- ity distribution of the system under study. This representation facilitates omnidirectional inference, which is what we need for reasoning about a complex problem domain. 

Objective of Tutorial

The objective of this tutorial is to introduce you to knowledge modeling and omnidirectional probabilistic inference with Bayesian Networks, using the BayesiaLab software platform. In this context, you will learn about several basic properties of Bayesian Networks and how to apply them in practice.

We recommend this tutorial as starting point for learning about Bayesian Networks in general. Given the small size of the problem, you can quickly move back and forth between knowledge modeling and perform- ing inference. This should help you to develop an intuitive understanding of the benefits of reasoning with Bayesian Networks. Hopefully, you will find that Bayesian Networks make formal reasoning as straightforward as doing arithmetic with a spreadsheet.

As opposed to previous BayesiaLab tutorials, which involved machine learning from data for generating Bayesian Networks, we will now exclusively assemble a network using domain knowledge, including a num- ber of causal assumptions, and uncertainty. More formally, we will encode propositional, associational, and causal knowledge in a Bayesian Network.

The focus on knowledge modeling in this paper is meant to highlight that Bayesian Networks can work with the entire spectrum of knowledge sources, from no data to big data. 

Download the white paper here (2.3 MB)