Knowledge Modeling, Causal Analysis and Data Mining with Bayesian Networks
Training Overview
 Teaching objectives: Comprehensive understanding of the Bayesian network paradigm plus practical skills for real-world research applications
 Length: 3 days
 Required Level: The course is taught at a beginner level, so no prior knowledge of Bayesian networks is necessary. However, undergraduate-level familiarity with probability theory and statistics is recommended.
 Teaching methods: Tutorials with practical exercises using BayesiaLab plus plenty of one-on-one coaching
 Trainer: Dr. Lionel Jouffe, CEO, Bayesia SAS.
 Training materials: A printed tutorial (approx. 300 slides in two binders), plus a memory stick containing numerous exercises and white papers
 Bayesian Network Software: Bayesia provides all trainees with an unrestricted 60-day license of BayesiaLab Professional Edition, so they can participate in all exercises on their own laptops
Registration is complete upon payment of the fee by bank transfer or credit card. Visit the BayesiaLab Store for the prices corresponding to your type of organization and the number of seats you are interested in.
Training Program
Day 1: Theoretical Introduction
 Bayesian networks for Association analysis and Causal analysis
 Bayesian networks for Knowledge modeling and Data Mining
 How do Bayesian networks fit into the world of research and analytics?
 A timeline of Bayesian networks, from Bayes’ Theorem to Judea Pearl winning the Turing Award.
 Timeline of Bayesia S.A.S.

 Interpreting results of medical tests
 Kahneman & Tversky’s Yellow Cab/White Cab example
 The Monty Hall Problem, solving a vexing puzzle with a Bayesian network
 The Kingborn High School
 Simpson’s Paradox: Observational Inference vs. Causal Inference
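The Monty Hall Problem listed above can be checked by simulation before modeling it as a Bayesian network. A minimal sketch (the trial count and seed are arbitrary choices, not course material):

```python
import random

def monty_hall(trials=100_000, switch=True, seed=0):
    """Simulate the Monty Hall game; return the contestant's win rate."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)    # door hiding the car
        pick = rng.randrange(3)   # contestant's initial pick
        # Host opens a door that is neither the pick nor the car
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            # Switch to the one remaining closed door
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials
```

Switching wins about two thirds of the time, staying about one third, which is exactly what the Bayesian network model of the puzzle predicts.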

 Probabilistic axioms
 Probability interpretation
 Probabilistic inference
 Particle interpretation
 Observation versus intervention
 Observational inference
 Joint probability distribution (JPD)
 JPD: particle interpretation
 Leveraging independence properties
 Product/chain rule for compact representation of JPD
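The product/chain rule topic above can be illustrated with a three-node chain A → B → C, where the independence of C from A given B lets the joint distribution factor into small conditional tables. The CPT numbers below are invented for illustration:

```python
from itertools import product

# Tiny chain network A -> B -> C over binary nodes (illustrative numbers)
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}
p_c_given_b = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}

def joint(a, b, c):
    # Chain rule with the independence C ⟂ A | B:
    # P(a, b, c) = P(a) * P(b | a) * P(c | b)
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

# The factored representation still defines a proper JPD: it sums to 1
total = sum(joint(a, b, c) for a, b, c in product([0, 1], repeat=3))
```

Here the factored form needs 1 + 2 + 2 = 5 free parameters instead of the 7 required by a full joint table over three binary variables.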

 Qualitative part: structure
 Graph terminology
 Dependencies and independencies
 D-separation
 Information flow
 Quantitative part: parameters
 Inference in Bayesian networks
 Exact inference
 Approximate inference
 Example of probabilistic inference: alarm system
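The alarm example lends itself to exact inference by enumeration: sum the joint probability over the unobserved variables and normalize. The sketch below uses the textbook burglary/earthquake parameterization, which may differ from the numbers used in class:

```python
from itertools import product

# Illustrative CPTs for the classic alarm network
P_B = {1: 0.001, 0: 0.999}          # Burglary
P_E = {1: 0.002, 0: 0.998}          # Earthquake
P_A = {(1, 1): 0.95, (1, 0): 0.94,  # P(Alarm=1 | Burglary, Earthquake)
       (0, 1): 0.29, (0, 0): 0.001}
P_J = {1: 0.90, 0: 0.05}            # P(JohnCalls=1 | Alarm)
P_M = {1: 0.70, 0: 0.01}            # P(MaryCalls=1 | Alarm)

def joint(b, e, a, j, m):
    pa = P_A[(b, e)]
    return (P_B[b] * P_E[e]
            * (pa if a else 1 - pa)
            * (P_J[a] if j else 1 - P_J[a])
            * (P_M[a] if m else 1 - P_M[a]))

# P(Burglary=1 | JohnCalls=1, MaryCalls=1) by enumeration
num = sum(joint(1, e, a, 1, 1) for e, a in product([0, 1], repeat=2))
den = sum(joint(b, e, a, 1, 1) for b, e, a in product([0, 1], repeat=3))
posterior = num / den
```

Even with both neighbors calling, the posterior probability of a burglary stays below 30%, a result that often surprises people and motivates exact inference over intuition.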

 Brainstorming workflow
 Structural modeling
 Parametric modeling
 Bayesia Expert Knowledge Elicitation Environment


Day 2: Machine Learning – Part 1
 Maximum Likelihood
 Introduction of Prior Knowledge
 Smooth Probability Estimation (Laplacian correction)
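The difference between maximum-likelihood estimation and smoothed estimation listed above can be shown in a few lines. The state names and counts below are invented for illustration:

```python
from collections import Counter

def estimate(counts, states, alpha=1.0):
    """Probability estimates from counts.

    alpha = 0 gives the maximum-likelihood estimate; alpha > 0 applies a
    Laplacian correction so unseen states keep non-zero probability."""
    total = sum(counts.get(s, 0) for s in states)
    return {s: (counts.get(s, 0) + alpha) / (total + alpha * len(states))
            for s in states}

data = ["yes", "yes", "yes", "no"]
counts = Counter(data)
mle = estimate(counts, ["yes", "no", "maybe"], alpha=0.0)
smooth = estimate(counts, ["yes", "no", "maybe"], alpha=1.0)
```

With `alpha=0` the unobserved state "maybe" gets probability zero; with `alpha=1` it gets 1/7, which prevents zero-probability states from ruling out evidence during inference.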

 Information as a measurable quantity
 Entropy
 Conditional Entropy
 Mutual Information
 Kullback-Leibler Divergence
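Entropy and mutual information as listed above can be computed directly from a joint probability table via the identity I(X;Y) = H(X) + H(Y) - H(X,Y). A minimal sketch:

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X;Y) from a joint table {(x, y): p} via marginal entropies."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return (entropy(px.values()) + entropy(py.values())
            - entropy(joint.values()))
```

For two perfectly correlated binary variables the mutual information is 1 bit; for independent variables it is 0, which is why it serves as an arc-strength score in structural learning.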

 Minimum Description Length (MDL) score
 Structural Coefficient
 Minimum size of dataset
 Search Spaces
 Search Strategies
 Learning algorithms
 Maximum Weight Spanning Tree
 Taboo Search
 EQ
 SopLEQ
 Taboo Order
 Data Perturbation
 Example: Dominick’s Finer Foods
 Data Import (Typing, Discretization)
 Dictionary of node comments
 Exclusion of a node
 Heuristic Search Algorithms
 Data Perturbation (Learning, Bootstrap)
 Choice of the Structural Coefficient
 Symmetric Layout
 Analysis of the model (Arc Force, Node Force, Pearson Coefficient)
 Distance Mapping
 Forbidden Arcs
 Manual Connections
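The Maximum Weight Spanning Tree algorithm listed above can be sketched as a Kruskal-style maximum spanning tree over pairwise scores; in Chow-Liu-style learning the edge weight would be the mutual information between the two variables. Node names and weights below are illustrative:

```python
def max_weight_spanning_tree(nodes, weights):
    """Kruskal-style maximum spanning tree; weights is {(u, v): w}.

    Edges are taken in decreasing weight order and kept only if they
    connect two previously separate components (union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    tree = []
    for (u, v), _w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
    return tree
```

Because a spanning tree over n nodes has only n - 1 arcs, this gives a fast, well-scoring starting structure that the other search strategies can then refine.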

 Context
 Learning Algorithms
 Naive
 Augmented Naive
 Manual Augmented Naive
 Tree-Augmented Naive
 Sons & Spouses
 Markov Blanket
 Augmented Markov Blanket
 Minimal Augmented Markov Blanket
 Example: Microarray Analysis
 Data Import (Transpose, Row Identifier, Data Type, Not Distributed, Decision Tree Discretization)
 Target Node
 Heuristic Search Algorithms
 Targeted Evaluation (In-Sample, Out-of-Sample: K-Fold, Data Perturbation, Test Set)
 Smoothed Probability Estimation
 Feature Selection
 Analysis of the Model (Monitors, Mapping, Target Report, Influence Analysis, Target Sensitivity Analysis, Target Mean Analysis, Target Interpretation Tree)
 Evidence Scenario File
 Interactive Inference
 Adaptive Questionnaire
 Batch Labeling
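The Naive structure listed above corresponds to the familiar naive Bayes classifier: every feature is a child of the Target Node and features are assumed independent given the class. A from-scratch sketch with Laplacian correction (the toy dataset in the test is invented):

```python
from collections import Counter
from math import log

def train_nb(X, y, alpha=1.0):
    """Train a naive Bayes classifier on discrete features.

    X is a list of feature tuples, y the class labels. Returns a
    predict(row) function scoring classes in log space."""
    classes = sorted(set(y))
    n_feats = len(X[0])
    prior = {c: (y.count(c) + alpha) / (len(y) + alpha * len(classes))
             for c in classes}
    # counts[c][j][v] = number of rows of class c with feature j == v
    counts = {c: [Counter() for _ in range(n_feats)] for c in classes}
    vals = [sorted({row[j] for row in X}) for j in range(n_feats)]
    for row, c in zip(X, y):
        for j, v in enumerate(row):
            counts[c][j][v] += 1

    def predict(row):
        def score(c):
            nc = y.count(c)
            s = log(prior[c])
            for j, v in enumerate(row):
                # Laplace-smoothed P(feature j = v | class c)
                s += log((counts[c][j][v] + alpha)
                         / (nc + alpha * len(vals[j])))
            return s
        return max(classes, key=score)

    return predict
```

The augmented variants in the list above relax the conditional-independence assumption by allowing extra arcs among the features.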


Day 3: Machine Learning – Part 2
 Context
 Algorithms
 Example: S&P 500 Analysis
 Variable Clustering
 Changing the number of Clusters
 Dynamic Dendrogram
 Dynamic Mapping
 Manual Modification of Clusters
 Manual Creation of Clusters
 Semi-Supervised Learning
 Search Tool
 Notes

 Context
 Synthesis of a Latent Variable
 Ordered Numerical Values
 Cluster Purity
 Cluster Mapping
 Log-Likelihood
 Contingency Table Fit
 Hypercube Cells Per State
 Example: Dominick’s Finer Foods
 Data Clustering (Algorithm, Numerical States)
 Quality Metrics
 Invert Node Selection
 Quadrants
 Set a Target State
 Sort the Monitors by Target Value Correlation
 Conditional Mean Analysis (means, delta-means, radars)
 Mapping
 Target Dynamic Profile
 Target Optimization Tree
 Projection of the Cluster on other Variables
 Data Association
 Save Internal Dataset
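Cluster Purity, one of the quality metrics listed above, has a simple definition: the fraction of records whose cluster's majority label matches their own. A minimal sketch (the assignments and labels in the test are invented):

```python
from collections import Counter, defaultdict

def cluster_purity(assignments, labels):
    """Fraction of points carrying their cluster's majority label."""
    by_cluster = defaultdict(list)
    for cluster, label in zip(assignments, labels):
        by_cluster[cluster].append(label)
    # For each cluster, count the points with its most common label
    majority = sum(Counter(ls).most_common(1)[0][1]
                   for ls in by_cluster.values())
    return majority / len(labels)
```

A purity of 1.0 means every induced cluster is homogeneous with respect to the reference labels; values near the base rate of the largest class indicate uninformative clusters.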

 Context
 PSEM Workflow
 Unsupervised Structural Learning
 Variable Clustering
 Data Clustering for each Cluster of Manifest Variables
 Connection of the Target Variable to the Factors
 Unsupervised Learning for Discovering the Path
 Example: The French Market of Perfumes
 PSEM Workflow
 Displayed Classes
 Total Effects
 Direct Effects
 Direct Effect Contributions
 Multiple Clustering
 Structure Comparison Tool
 Dictionaries (Arcs, Costs)
 Fixed Arcs
 Taboo and Arc Constraints
 Multi-Quadrants
 Export Variations
 Select Evidence Set

