Suppose I have two variables in my dataset. How does BayesiaLab determine causality? The MDL score will be the same for both directions of the link, right? So how exactly does it determine the direction of the arrow?
Indeed, the MDL score will be the same for A → B and B → A. Therefore, BayesiaLab cannot determine any causal direction from your dataset alone. To determine causality, you have to use your expert knowledge (to set the causal direction manually) or some temporal information (e.g. B occurs before A). If you have this temporal information, you just have to define temporal indexes in BayesiaLab (by double-clicking on the node to open the Node Editor).
These indexes are taken into account while learning the structure to prevent links from the “future” to the “past”.
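To make the score equivalence concrete, here is a minimal sketch in Python. It assumes a simplified, BIC-style MDL (negative log-likelihood in bits plus (log2 N)/2 bits per free parameter), which is not necessarily the exact formula BayesiaLab uses. The function name `mdl_score` and the toy counts are made up for illustration.

```python
import math
from collections import Counter

def mdl_score(data, parent, child):
    """Simplified (BIC-style) MDL score in bits for the network parent -> child."""
    n = len(data)
    parent_counts = Counter(row[parent] for row in data)
    joint_counts = Counter((row[parent], row[child]) for row in data)
    child_states = {row[child] for row in data}

    # Data cost: log-likelihood under maximum-likelihood estimates,
    # i.e. P(parent) * P(child | parent) for every row.
    log_lik = 0.0
    for count in parent_counts.values():
        log_lik += count * math.log2(count / n)
    for (p_val, _), count in joint_counts.items():
        log_lik += count * math.log2(count / parent_counts[p_val])

    # Model cost: (log2 n) / 2 bits per free parameter.
    n_params = (len(parent_counts) - 1) + len(parent_counts) * (len(child_states) - 1)
    return -log_lik + 0.5 * math.log2(n) * n_params

# Hypothetical toy dataset: counts of each (A, B) combination.
counts = {(0, 0): 30, (0, 1): 10, (1, 0): 5, (1, 1): 25}
data = [{"A": a, "B": b} for (a, b), k in counts.items() for _ in range(k)]

score_ab = mdl_score(data, "A", "B")  # A -> B
score_ba = mdl_score(data, "B", "A")  # B -> A
print(score_ab, score_ba)
print(math.isclose(score_ab, score_ba))
```

Both orientations encode the same joint distribution with the same number of free parameters, so under this simplified score the two values agree up to floating-point rounding.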
What if I do not have any prior specification about the relationship between two variables? BayesiaLab still determines the causality. So what rule does BayesiaLab follow?
BayesiaLab never determines causality automatically. It uses either the temporal indexes or the expert's manual orientation of the arcs. At the end of the day, it is always the analyst who qualifies the BBN as a Causal BBN.
However, a BBN is an oriented graph. This is why an orientation has to be selected, even when both orientations are equivalent for representing the joint probability distribution and have the same MDL score. Therefore, after structural learning, you should not directly interpret these oriented arcs as causal relations.
Note that we do have a tool in the Analysis – Visual – Graph – Equivalent Classes menu that transforms the BBN into an Essential Graph, where the arcs for which both orientations are equivalent are displayed as simple (undirected) edges. The remaining arcs are those belonging to V-Structures and those that cannot be inverted without introducing spurious V-Structures or directed cycles.
The arcs of the Essential Graph should not be directly interpreted as causal either. They indicate that changing their orientation would change the represented joint probability distribution.
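As an illustration of what an Essential Graph contains, here is a brute-force sketch in Python. It is not BayesiaLab's algorithm: it simply enumerates every orientation of the skeleton, keeps the acyclic ones with the same V-Structures as the input DAG, and reports an arc as directed only if it has the same orientation in all of them. The function names and the example network are hypothetical, and the approach is only practical for very small graphs.

```python
from itertools import product

def v_structures(arcs):
    """Return all V-Structures (a, c, b): a -> c <- b with a and b not adjacent."""
    adjacent = {frozenset(arc) for arc in arcs}
    parents = {}
    for parent, child in arcs:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for a in pars:
            for b in pars:
                if a < b and frozenset((a, b)) not in adjacent:
                    found.add((a, child, b))
    return found

def is_acyclic(arcs):
    """Check acyclicity by repeatedly removing nodes with no incoming arc."""
    nodes = {node for arc in arcs for node in arc}
    remaining = set(arcs)
    while nodes:
        roots = {n for n in nodes if all(child != n for _, child in remaining)}
        if not roots:
            return False  # every remaining node has a parent: there is a cycle
        nodes -= roots
        remaining = {(p, c) for p, c in remaining if p not in roots}
    return True

def essential_graph(arcs):
    """Return (directed arcs, undirected edges) of the equivalence class of a DAG."""
    arcs = list(arcs)
    target = v_structures(arcs)
    equivalent = []
    # Try every orientation of the skeleton and keep the equivalent DAGs.
    for flips in product((False, True), repeat=len(arcs)):
        candidate = [(c, p) if flip else (p, c) for (p, c), flip in zip(arcs, flips)]
        if is_acyclic(candidate) and v_structures(candidate) == target:
            equivalent.append(set(candidate))
    directed = {arc for arc in arcs if all(arc in dag for dag in equivalent)}
    undirected = {frozenset(arc) for arc in arcs if arc not in directed}
    return directed, undirected

# Hypothetical network: E -> A -> C <- B, C -> D
directed, undirected = essential_graph(
    [("E", "A"), ("A", "C"), ("B", "C"), ("C", "D")]
)
print(sorted(directed))
print([sorted(edge) for edge in undirected])
```

In this example A → C and B → C stay directed because they form a V-Structure, and C → D stays directed because reversing it would create a new V-Structure at C. Only the link between E and A can be inverted freely, so it becomes an undirected edge.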
I am still not clear. The output of BayesiaLab is a DAG, but MDL cannot detect direction. So how does it produce a DAG on the basis of the MDL score (by minimizing the MDL score)?
The direction of the arcs that can be oriented either way without changing the represented joint probability distribution is selected based on the rank of the variables in the dataset.
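The tie-breaking idea can be sketched as follows in Python. This is an illustration only: it assumes that the variable appearing earlier in the dataset becomes the parent, which the answer above does not specify, and the helper name `orient_by_rank` is hypothetical. It also does not check whether orienting several edges this way introduces new V-Structures.

```python
def orient_by_rank(undirected_edges, column_order):
    """Orient each reversible edge from the lower-ranked variable to the higher-ranked one.

    Assumption: rank = position of the variable's column in the dataset,
    and the earlier column is chosen as the parent.
    """
    rank = {name: position for position, name in enumerate(column_order)}
    return [tuple(sorted(edge, key=rank.__getitem__)) for edge in undirected_edges]

# The same reversible edge gets a different arrow depending on column order.
print(orient_by_rank([("B", "A")], column_order=["A", "B"]))
print(orient_by_rank([("B", "A")], column_order=["B", "A"])) 
```

The point of the sketch is that the resulting arrow depends only on how the columns happen to be ordered, not on anything causal in the data.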