Is multiple testing adjustment (MTA) necessary for establishing the significance of dependencies or associations in a learned Bayesian network (BN)?
I recently generated a learned BN (a single equivalence class, using the SopLEQ method) from a small dataset (47 instances) to associate viral DNA sequences with therapy outcomes. The BN was then validated with 2 independent datasets. However, a reviewer pointed out that MTA needs to be performed.
BayesiaLab's structural learning algorithms are based on the Minimum Description Length (MDL) score, in which a structural coefficient weights the structure part, DL(B), against the data part, DL(D|B).
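Assuming the score is a weighted sum of the two parts named further down, DL(B) and DL(D|B), with the coefficient weighting the structure part (the symbol α is an assumption made here for readability), it can be sketched as:

```latex
\mathrm{MDL}(B, D) = \alpha \, \mathrm{DL}(B) + \mathrm{DL}(D \mid B)
```

Under this reading, α = 0 removes any cost for adding arcs, which is consistent with the remark that a coefficient of 0 leads to fully connected networks.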
Be careful not to set the structural coefficient too low!
Setting this coefficient to 0 leads to fully connected networks, which quickly become unmanageable when the number of variables exceeds 10.
With the default coefficient value, this score is conservative and returns highly significant relations (classical statistical tests on them will return p-values close to 0).
However, when data is scarce, we usually need to lower the coefficient. Choosing a value that is too low can lead to learning models whose relationships are no longer significant, so this value should be chosen carefully.
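To make the trade-off concrete, here is a minimal sketch that scores a two-variable network with and without an arc. It is not BayesiaLab's implementation: the weighted-sum form of the score, the symbol `alpha`, the structure cost of 0.5·log2(N) bits per free parameter, and the 47-instance toy sample are all assumptions made for illustration.

```python
import math
from collections import Counter


def data_bits(pairs, with_arc):
    """DL(D|B) in bits for two discrete variables X, Y with maximum-likelihood parameters."""
    n = len(pairs)
    x_counts = Counter(x for x, _ in pairs)
    y_counts = Counter(y for _, y in pairs)
    xy_counts = Counter(pairs)
    # Cost of encoding the X column from its marginal distribution.
    bits = -sum(c * math.log2(c / n) for c in x_counts.values())
    if with_arc:
        # Arc X -> Y: encode Y from its distribution conditional on X.
        bits += -sum(c * math.log2(c / x_counts[x]) for (x, _), c in xy_counts.items())
    else:
        # No arc: encode Y from its marginal distribution.
        bits += -sum(c * math.log2(c / n) for c in y_counts.values())
    return bits


def structure_bits(pairs, with_arc):
    """Assumed DL(B): 0.5 * log2(N) bits per free parameter (not BayesiaLab's definition)."""
    n = len(pairs)
    kx = len({x for x, _ in pairs})
    ky = len({y for _, y in pairs})
    params = (kx - 1) + (kx * (ky - 1) if with_arc else (ky - 1))
    return 0.5 * math.log2(n) * params


def mdl(pairs, with_arc, alpha):
    """Assumed score: alpha * DL(B) + DL(D|B)."""
    return alpha * structure_bits(pairs, with_arc) + data_bits(pairs, with_arc)


def prefers_arc(pairs, alpha):
    """True when the network with the arc has the lower (better) score."""
    return mdl(pairs, True, alpha) < mdl(pairs, False, alpha)


# Invented toy sample: 47 instances with only a very weak X-Y association.
pairs = [(0, 0)] * 13 + [(0, 1)] * 11 + [(1, 0)] * 11 + [(1, 1)] * 12

for alpha in (1.0, 0.0):
    print(alpha, prefers_arc(pairs, alpha))
```

On this toy sample the arc is rejected at a coefficient of 1 and accepted at 0, even though the association is negligible, which is the behaviour the warning above describes.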
BayesiaLab comes with a tool that evaluates the Structure/Data ratio, DL(B) / DL(D|B) (the two parts of the MDL score), over a broad range of coefficient values.
This tool generates a graph where the ratio is plotted for each coefficient value. Using the "elbow" method usually helps in choosing the right value.
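For readers who want to locate the elbow numerically rather than by eye, one common heuristic is to pick the point farthest from the straight line joining the two ends of the curve. The sketch below assumes that heuristic; it is not necessarily what BayesiaLab does, and the coefficient and ratio values are invented purely for illustration.

```python
import math


def elbow_index(xs, ys):
    """Index of the point farthest from the chord joining the first and last points.

    Both axes are rescaled to [0, 1] first so that neither scale dominates.
    Assumes the first and last points differ on both axes.
    """
    span_x = xs[-1] - xs[0]
    span_y = ys[-1] - ys[0]
    best_i, best_d = 0, -1.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        u = (x - xs[0]) / span_x
        v = (y - ys[0]) / span_y
        # After rescaling, the chord is the line v = u.
        d = abs(u - v) / math.sqrt(2)
        if d > best_d:
            best_i, best_d = i, d
    return best_i


# Hypothetical values, not produced by BayesiaLab.
coefficients = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
ratios = [0.02, 0.025, 0.03, 0.035, 0.04, 0.05, 0.06, 0.12, 0.25, 0.50]

i = elbow_index(coefficients, ratios)
print("elbow at coefficient", coefficients[i])
```

With these made-up numbers the heuristic selects the last coefficient before the ratio starts to climb steeply; on real output the plot should still be inspected, since the elbow is not always sharp.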
You can also use the cross-validation tools to measure the confidence of the arcs obtained on different subsets of the data, or on perturbed data, for a given coefficient value.
A synthetic graph is returned with black, blue and red links. The thickness of a link is directly proportional to the number of times it has been generated during the cross-validation.
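The counting behind such a graph can be sketched outside BayesiaLab as follows. This is only an illustration of the idea: the function name, the variable names, and the four example structures are hypothetical, and links are compared without direction on the assumption that arc orientation within an equivalence class may be arbitrary.

```python
from collections import Counter


def arc_frequencies(structures):
    """Fraction of learned structures in which each undirected link appears."""
    counts = Counter()
    for arcs in structures:
        # frozenset drops the direction; the set removes duplicates within one structure.
        counts.update({frozenset(arc) for arc in arcs})
    n = len(structures)
    return {tuple(sorted(link)): c / n for link, c in counts.items()}


# Hypothetical structures learned on four resampled datasets.
structures = [
    [("mutation_1", "outcome"), ("mutation_1", "mutation_2")],
    [("outcome", "mutation_1")],
    [("mutation_1", "outcome"), ("mutation_2", "outcome")],
    [("mutation_1", "outcome")],
]

freqs = arc_frequencies(structures)
for link, f in sorted(freqs.items(), key=lambda kv: -kv[1]):
    print(link, f)
```

A link that appears in every resample is a stronger candidate for a stable relationship than one that shows up only once, which is what the link thickness conveys visually.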