Knowledge Modeling, Causal Analysis and Data Mining with Bayesian Networks

Training Overview

  • Teaching objectives: Comprehensive understanding of the Bayesian network paradigm plus practical skills for real-world research applications
  • Length: 3 days
  • Required Level: The course is taught at a beginner level, so no prior knowledge of Bayesian networks is necessary. However, undergraduate-level familiarity with probability theory and statistics is recommended.
  • Teaching methods: Tutorials with practical exercises using BayesiaLab plus plenty of one-on-one coaching
  • Trainer: Dr. Lionel Jouffe, CEO, Bayesia SAS.
  • Training materials: A printed tutorial (approx. 300 slides in two binders), plus a memory stick containing numerous exercises and white papers
  • Bayesian Network Software: Bayesia provides all trainees with an unrestricted 60-day license of BayesiaLab Professional Edition, so they can participate in all exercises on their own laptops

Registration

Registration is complete upon payment of the fee by bank transfer or credit card. Visit the BayesiaLab Store for the prices corresponding to your type of organization and the number of seats you are interested in.

Training Program

Day 1: Theoretical Introduction

  • Bayesian networks for Association analysis and Causal analysis
  • Bayesian networks for Knowledge modeling and Data Mining
  • How do Bayesian networks fit into the world of research and analytics?
  • A timeline of Bayesian networks, from Bayes’ Theorem to Judea Pearl winning the Turing Award
  • Timeline of Bayesia S.A.S.
  • Interpreting results of medical tests
  • Kahneman & Tversky’s Yellow Cab/White Cab example
  • The Monty Hall Problem: solving a vexing puzzle with a Bayesian network
  • The Kingborn High School
  • Simpson’s Paradox - Observational Inference vs Causal Inference
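The medical-test topic above boils down to one application of Bayes' theorem. A minimal Python sketch follows; the prevalence, sensitivity, and specificity figures are illustrative assumptions, not the numbers used in the course materials:

```python
# Illustrative numbers: a disease with 1% prevalence, a test with
# 95% sensitivity and 90% specificity (all assumed for this sketch).
prevalence = 0.01      # P(disease)
sensitivity = 0.95     # P(positive | disease)
specificity = 0.90     # P(negative | no disease)

# Total probability of a positive test result.
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes' theorem: P(disease | positive)
p_disease_given_positive = sensitivity * prevalence / p_positive

print(round(p_disease_given_positive, 3))  # ≈ 0.088
```

Despite the seemingly accurate test, fewer than 9% of positive results indicate the disease, which is exactly the kind of counterintuitive result this part of the course examines.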

  • Probabilistic axioms
  • Probability interpretation
  • Probabilistic inference
  • Particle interpretation 
  • Observation versus intervention
  • Observational inference
  • Joint probability distribution (JPD)
  • JPD - particle interpretation
  • Leveraging independence properties
  • Product/chain rule for compact representation of JPD
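The compactness argument behind the product/chain rule can be sketched in a few lines: for a chain A → B → C, the joint factors as P(a, b, c) = P(a) · P(b | a) · P(c | b), so three small tables replace the full eight-entry joint distribution. The probabilities below are made-up illustrative values:

```python
from itertools import product

# Illustrative CPTs for a network A -> B -> C (values assumed).
p_a = {True: 0.3, False: 0.7}
p_b_given_a = {True: {True: 0.8, False: 0.2}, False: {True: 0.1, False: 0.9}}
p_c_given_b = {True: {True: 0.5, False: 0.5}, False: {True: 0.4, False: 0.6}}

def joint(a, b, c):
    # Chain rule: P(a, b, c) = P(a) * P(b | a) * P(c | b)
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

# The factorization still defines a valid JPD: all entries sum to 1.
total = sum(joint(a, b, c) for a, b, c in product([True, False], repeat=3))
print(total)
```

With n binary variables, the savings grow from 2^n − 1 free parameters to a handful per node, which is what makes Bayesian networks tractable.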


  • Qualitative part: structure
  • Graph terminology
  • Dependencies and independencies
  • D-separation
  • Information flow
  • Quantitative part: parameters
  • Inference in Bayesian networks
  • Exact inference
  • Approximate inference
  • Example of probabilistic inference: alarm system
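The alarm example above is the classic Burglary → Alarm ← Earthquake network, and exact inference on it can be done by simple enumeration. This is a sketch only; the CPT numbers below are illustrative assumptions, not necessarily those used in the course:

```python
from itertools import product

# Illustrative priors and CPT (values assumed for this sketch).
p_b = {True: 0.001, False: 0.999}   # P(Burglary)
p_e = {True: 0.002, False: 0.998}   # P(Earthquake)
p_a_given_be = {                    # P(Alarm=True | Burglary, Earthquake)
    (True, True): 0.95, (True, False): 0.94,
    (False, True): 0.29, (False, False): 0.001,
}

def joint(b, e, a):
    # P(b, e, a) = P(b) * P(e) * P(a | b, e)
    p_a = p_a_given_be[(b, e)]
    return p_b[b] * p_e[e] * (p_a if a else 1 - p_a)

# P(Burglary=True | Alarm=True): sum out Earthquake, then normalize.
num = sum(joint(True, e, True) for e in (True, False))
den = sum(joint(b, e, True) for b, e in product((True, False), repeat=2))
print(round(num / den, 4))
```

Enumeration like this is exponential in the number of variables, which motivates the exact and approximate inference algorithms listed above.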


  • Brainstorming workflow
  • Structural modeling
  • Parametrical modeling
  • Bayesia Expert Knowledge Elicitation Environment

Day 2: Machine Learning - Part 1


  • Maximum Likelihood
  • Introduction of Prior Knowledge
  • Smooth Probability Estimation (Laplacian correction)
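The difference between maximum likelihood and the Laplacian correction is easy to see on a toy sample. The data and state names below are made up for illustration:

```python
from collections import Counter

# Made-up observations of a three-state variable; "snow" is never observed.
observations = ["sunny", "sunny", "rain", "sunny"]
states = ["sunny", "rain", "snow"]
counts = Counter(observations)
n = len(observations)

# Maximum likelihood: unseen states get probability exactly 0.
mle = {s: counts[s] / n for s in states}

# Laplacian correction: one virtual count per state keeps every
# probability strictly positive.
smoothed = {s: (counts[s] + 1) / (n + len(states)) for s in states}

print(mle["snow"])       # 0.0
print(smoothed["snow"])  # ≈ 0.143 (1/7)
```

The correction matters in practice because a hard zero in a conditional probability table makes the corresponding configuration impossible forever, regardless of later evidence.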


  • Information as a measurable quantity
  • Entropy
  • Conditional Entropy
  • Mutual Information
  • Kullback-Leibler Divergence
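The quantities listed above can all be computed directly from a joint distribution. A sketch for entropy and mutual information over two binary variables, with an assumed toy joint distribution:

```python
import math

# Made-up joint distribution over two binary variables X and Y.
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yy), p in p_xy.items() if yy == y) for y in (0, 1)}

def entropy(dist):
    # H(X) = -sum p(x) log2 p(x), in bits
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# I(X;Y) = sum p(x,y) log2( p(x,y) / (p(x) p(y)) )
mi = sum(p * math.log2(p / (p_x[x] * p_y[y])) for (x, y), p in p_xy.items())

print(entropy(p_x))  # 1.0 bit (X is uniform)
print(round(mi, 3))
```

Mutual information is the workhorse of the structure-learning material that follows: it quantifies, in bits, how much observing one variable reduces uncertainty about another.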


  • Minimum Description Length (MDL) score
  • Structural Coefficient
  • Minimum size of dataset
  • Search Spaces
  • Search Strategies
  • Learning algorithms
    • Maximum Weight Spanning Tree
    • Taboo Search
    • EQ
    • SopLEQ
    • Taboo Order
  • Data Perturbation
  • Example: Dominick’s Finer Foods
    • Data Import (Typing, Discretization)
    • Dictionary of node comments
    • Exclusion of a node
    • Heuristic Search Algorithms
    • Data Perturbation (Learning, Bootstrap)
    • Choice of the Structural Coefficient
    • Symmetric Layout
    • Analysis of the model (Arc Force, Node Force, Pearson Coefficient)
    • Distance Mapping
    • Forbidden Arcs
    • Manual Connections


  • Context
  • Learning Algorithms
    • Naive
    • Augmented Naive
    • Manual Augmented Naive
    • Tree-Augmented Naive
    • Sons & Spouses
    • Markov Blanket
    • Augmented Markov Blanket
    • Minimal Augmented Markov Blanket
  • Example: Microarray Analysis
    • Data Import (Transpose, Row Identifier, Data Type, Not Distributed, Decision Tree Discretization)
    • Target Node
    • Heuristic Search Algorithms
    • Targeted Evaluation (In-Sample, Out-of-Sample: K-Fold, Data Perturbation, Test Set)
    • Smoothed Probability Estimation
    • Feature Selection
    • Analysis of the Model (Monitors, Mapping, Target Report, Influence Analysis, Target Sensitivity Analysis, Target Mean Analysis, Target Interpretation Tree)
    • Evidence Scenario File
    • Interactive Inference
    • Adaptive Questionnaire
    • Batch Labeling
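The Naive structure at the top of the algorithm list assumes every predictor is conditionally independent given the target. While the course exercises use BayesiaLab, the underlying scoring rule can be sketched in plain Python; the data, states, and smoothing choice (one virtual count, two states per feature) below are all assumptions for illustration:

```python
from collections import Counter, defaultdict
import math

# Made-up training data: (target, feature1, feature2).
data = [
    ("spam", "offer", "short"), ("spam", "offer", "long"),
    ("ham", "meeting", "short"), ("ham", "meeting", "long"),
    ("ham", "offer", "long"),
]

# Count target frequencies and per-target feature frequencies.
target_counts = Counter(t for t, *_ in data)
feat_counts = defaultdict(Counter)
for t, *feats in data:
    for i, f in enumerate(feats):
        feat_counts[(t, i)][f] += 1

def log_score(target, feats):
    # log P(target) + sum_i log P(feat_i | target), Laplace-smoothed
    # (each feature assumed to have 2 states in this toy example).
    score = math.log(target_counts[target] / len(data))
    for i, f in enumerate(feats):
        c = feat_counts[(target, i)]
        score += math.log((c[f] + 1) / (sum(c.values()) + 2))
    return score

best = max(target_counts, key=lambda t: log_score(t, ["offer", "short"]))
print(best)  # "spam"
```

The augmented variants in the list relax exactly this independence assumption by allowing arcs among the predictors.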

Day 3: Machine Learning - Part 2

  • Context
  • Algorithms
  • Example: S&P 500 Analysis
    • Variable Clustering
      • Changing the number of Clusters
      • Dynamic Dendrogram
      • Dynamic Mapping
      • Manual Modification of Clusters
      • Manual Creation of Clusters
    • Semi-Supervised Learning
    • Search Tool
    • Notes


  • Context
  • Synthesis of a Latent Variable
  • Ordered Numerical Values
  • Cluster Purity
  • Cluster Mapping
  • Log-Likelihood
  • Contingency Table Fit
  • Hypercube Cells Per State
  • Example: Dominick’s Finer Foods
    • Data Clustering (Algorithm, Numerical States)
    • Quality Metrics
    • Invert Node Selection
    • Quadrants
    • Set a Target State
    • Sort the Monitors by Target Value Correlation
    • Conditional Mean Analysis (means, delta-means, radars)
    • Mapping
    • Target Dynamic Profile
    • Target Optimization Tree
    • Projection of the Cluster on other Variables
    • Data Association
    • Save Internal Dataset


  • Context
  • PSEM Workflow
    • Unsupervised Structural Learning
    • Variable Clustering
    • Data Clustering for each Cluster of Manifest Variables
    • Connection of the Target Variable to the Factors
    • Unsupervised Learning for Discovering the Path
  • Example: The French Market of Perfumes
    • PSEM Workflow
    • Displayed Classes
    • Total Effects
    • Direct Effects
    • Direct Effect Contributions
    • Multiple Clustering
    • Structure Comparison Tool
    • Dictionaries (Arcs, Costs)
    • Fixed Arcs
    • Taboo and Arc Constraints
    • Multi-Quadrants
    • Export Variations
    • Select Evidence Set