Knowledge Modeling, Causal Analysis and Data Mining with Bayesian Networks

Training Overview

Registration

Registration is complete upon payment of the fee by bank transfer or credit card. Visit the BayesiaLab Store for the prices corresponding to your type of organization and the number of seats you are interested in.

Training Program

Day 1: Theoretical Introduction

  • Bayesian networks for Association analysis and Causal analysis
  • Bayesian networks for Knowledge modeling and Data Mining
  • How do Bayesian networks fit into the world of research and analytics?
  • A timeline of Bayesian networks, from Bayes’ Theorem to Judea Pearl winning the Turing Award
  • Timeline of Bayesia S.A.S.
  • Interpreting results of medical tests (see the sketch after this list)
  • Kahneman & Tversky’s Yellow Cab/White Cab example
  • The Monty Hall Problem, solving a vexing puzzle with a Bayesian network
  • The Kingborn High School
  • Simpson’s Paradox - Observational Inference vs Causal Inference
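
To make the medical-test item above concrete, here is a minimal sketch of Bayes’ theorem applied to a diagnostic test. The prevalence, sensitivity, and specificity values are illustrative assumptions, not the course’s figures.

```python
# A minimal sketch of the medical-test example: Bayes' theorem applied to a
# diagnostic test. The numbers below (1% prevalence, 95% sensitivity,
# 90% specificity) are illustrative assumptions.

prevalence = 0.01   # P(Disease)
sensitivity = 0.95  # P(Positive | Disease)
specificity = 0.90  # P(Negative | No Disease)

# Total probability of a positive test result
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes' theorem: probability of disease given a positive result
p_disease_given_positive = sensitivity * prevalence / p_positive

print(f"P(Disease | Positive) = {p_disease_given_positive:.3f}")  # ~0.088
```

Despite the test’s 95% sensitivity, the posterior stays below 9% because the disease is rare; Kahneman & Tversky’s cab example turns on the same base-rate effect.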

  • Probabilistic axioms
  • Probability interpretation
  • Probabilistic inference
  • Particle interpretation 
  • Observation versus intervention
  • Observational inference
  • Joint probability distribution (JPD)
  • JPD - particle interpretation
  • Leveraging independence properties
  • Product/chain rule for compact representation of JPD
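
The last item above rewards a quick back-of-the-envelope illustration. A minimal sketch, assuming 20 binary variables and at most 3 parents per variable, of how the chain rule plus independence assumptions compresses a joint probability distribution:

```python
# Why the product/chain rule plus independence saves space:
# P(X1, ..., Xn) = product of P(Xi | parents(Xi)). Numbers are illustrative.

n = 20

# Full joint probability distribution: one free parameter per configuration
full_jpd_params = 2 ** n - 1

# Factored representation, each variable with at most k binary parents:
# each CPT has up to 2^k rows of one free parameter each
k = 3
factored_params = n * (2 ** k)

print(full_jpd_params)  # 1,048,575 free parameters
print(factored_params)  # 160 parameters
```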

Bayesian Networks
  • Qualitative part: structure
  • Graph terminology
  • Dependencies and independencies
  • D-separation
  • Information flow
  • Quantitative part: parameters
  • Inference in Bayesian networks
  • Exact inference
  • Approximate inference
  • Example of probabilistic inference: alarm system
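
A minimal sketch of the alarm example, assuming the standard textbook parameterization of the burglary-alarm network (the course’s version may use different figures); it answers a query by brute-force enumeration of the joint distribution:

```python
# Exact inference by enumeration on the classic burglary-alarm network.
# Parameters follow the standard textbook version (e.g., Russell & Norvig).
from itertools import product

P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(A=true | B, E)
P_J = {True: 0.90, False: 0.05}                       # P(J=true | A)
P_M = {True: 0.70, False: 0.01}                       # P(M=true | A)

def joint(b, e, a, j, m):
    """P(B=b, E=e, A=a, J=j, M=m) via the chain rule."""
    p = P_B[b] * P_E[e]
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p

# P(Burglary | JohnCalls=true, MaryCalls=true), summing out E and A
num = sum(joint(True, e, a, True, True) for e, a in product([True, False], repeat=2))
den = sum(joint(b, e, a, True, True) for b, e, a in product([True, False], repeat=3))
print(f"P(B | j, m) = {num / den:.3f}")  # ~0.284
```

Enumeration is exact but exponential in the number of variables, which is why approximate inference is covered alongside it.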

Knowledge Modeling
  • Brainstorming workflow
  • Structural modeling
  • Parametric modeling
  • Bayesia Expert Knowledge Elicitation Environment

Day 2: Machine Learning - Part 1

Parameter Estimation
  • Maximum Likelihood
  • Introduction of Prior Knowledge
  • Smooth Probability Estimation (Laplacian correction)
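
A minimal sketch contrasting these two estimators on illustrative counts; note how the Laplacian correction keeps the unobserved “high” state from being assigned zero probability:

```python
# Maximum-likelihood vs smoothed (Laplace-corrected) estimation of a
# discrete distribution from counts. The counts are illustrative.

counts = {"low": 8, "medium": 2, "high": 0}   # observed state counts
n = sum(counts.values())
k = len(counts)

mle     = {s: c / n for s, c in counts.items()}
laplace = {s: (c + 1) / (n + k) for s, c in counts.items()}

print(mle)      # {'low': 0.8, 'medium': 0.2, 'high': 0.0}
print(laplace)  # {'low': ~0.692, 'medium': ~0.231, 'high': ~0.077}
```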

Information Theory
  • Information as a measurable quantity
  • Entropy
  • Conditional Entropy
  • Mutual Information
  • Kullback-Leibler Divergence
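
These quantities are straightforward to compute directly. A minimal sketch, using a small made-up joint distribution of two binary variables:

```python
# Entropy, mutual information, and KL divergence for small discrete
# distributions. All values are illustrative.
import math

def entropy(p):
    return -sum(x * math.log2(x) for x in p if x > 0)

def kl_divergence(p, q):
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)

# Joint distribution of two binary variables X, Y as a 2x2 table
joint = [[0.4, 0.1],
         [0.1, 0.4]]
px = [sum(row) for row in joint]            # marginal of X
py = [sum(col) for col in zip(*joint)]      # marginal of Y

# Mutual information as KL(joint || product of marginals)
mi = sum(joint[i][j] * math.log2(joint[i][j] / (px[i] * py[j]))
         for i in range(2) for j in range(2))

print(f"H(X) = {entropy(px):.3f} bits")        # 1.000
print(f"I(X;Y) = {mi:.3f} bits")               # ~0.278
print(kl_divergence([0.5, 0.5], [0.9, 0.1]))   # ~0.737 bits
```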

Unsupervised Structural Learning
  • Minimum Description Length (MDL) score
  • Structural Coefficient
  • Minimum size of dataset
  • Search Spaces
  • Search Strategies
  • Learning algorithms
    • Maximum Weight Spanning Tree (see the sketch after this list)
    • Taboo Search
    • EQ
    • SopLEQ
    • Taboo Order
  • Data Perturbation
  • Example: Dominick’s Finer Foods
    • Data Import (Typing, Discretization)
    • Dictionary of node comments
    • Exclusion of a node
    • Heuristic Search Algorithms
    • Data Perturbation (Learning, Bootstrap)
    • Choice of the Structural Coefficient
    • Symmetric Layout
    • Analysis of the model (Arc Force, Node Force, Pearson Coefficient)
    • Distance Mapping
    • Forbidden Arcs
    • Manual Connections
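
To give a flavor of the Maximum Weight Spanning Tree algorithm referenced above (the Chow-Liu approach), the sketch below weights every pair of variables by mutual information and keeps the heaviest spanning tree. The toy dataset and column names are invented, and BayesiaLab’s implementation differs in its scoring details:

```python
# Chow-Liu-style Maximum Weight Spanning Tree: weight variable pairs by
# mutual information, then run Kruskal's algorithm for the heaviest tree.
from collections import Counter
from itertools import combinations
import math

def mutual_information(xs, ys):
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def chow_liu_tree(data):
    """data: dict column_name -> list of discrete values."""
    names = list(data)
    edges = sorted(((mutual_information(data[a], data[b]), a, b)
                    for a, b in combinations(names, 2)), reverse=True)
    parent = {v: v for v in names}        # union-find for Kruskal
    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v
    tree = []
    for w, a, b in edges:                 # greedily add the heaviest edges
        ra, rb = find(a), find(b)
        if ra != rb:                      # skip edges that would form a cycle
            parent[ra] = rb
            tree.append((a, b, w))
    return tree

data = {"A": [0,0,1,1,0,1,0,1], "B": [0,0,1,1,0,1,1,1], "C": [1,0,1,0,0,1,0,1]}
print(chow_liu_tree(data))
```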

Supervised Learning
  • Context
  • Learning Algorithms
    • Naive (see the sketch after this list)
    • Augmented Naive
    • Manual Augmented Naive
    • Tree-Augmented Naive
    • Sons & Spouses
    • Markov Blanket
    • Augmented Markov Blanket
    • Minimal Augmented Markov Blanket
  • Example: Microarray Analysis
    • Data Import (Transpose, Row Identifier, Data Type, Not Distributed, Decision Tree Discretization)
    • Target Node
    • Heuristic Search Algorithms
    • Targeted Evaluation (In-Sample, Out-of-Sample: K-Fold, Data Perturbation, Test Set)
    • Smoothed Probability Estimation
    • Feature Selection
    • Analysis of the Model (Monitors, Mapping, Target Report, Influence Analysis, Target Sensitivity Analysis, Target Mean Analysis, Target Interpretation Tree)
    • Evidence Scenario File
    • Interactive Inference
    • Adaptive Questionnaire
    • Batch Labeling
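
A from-scratch sketch of the Naive structure listed above, in which the target is the sole parent of every predictor so the posterior factorizes; Laplace smoothing stands in for the Smoothed Probability Estimation step. The toy records and feature names are invented, and the state-count estimate is deliberately crude:

```python
# A minimal Naive Bayes classifier over discrete features.
from collections import Counter, defaultdict

def fit_naive_bayes(rows, target):
    """rows: list of dicts {feature: value, ..., target: class}."""
    class_counts = Counter(r[target] for r in rows)
    feat_counts = defaultdict(Counter)   # (feature, class) -> value counts
    for r in rows:
        for f, v in r.items():
            if f != target:
                feat_counts[(f, r[target])][v] += 1
    return class_counts, feat_counts

def predict(model, x, target_states, alpha=1):
    class_counts, feat_counts = model
    n = sum(class_counts.values())
    scores = {}
    for c in target_states:
        score = class_counts[c] / n                  # prior P(class)
        for f, v in x.items():                       # naive factorization
            counts = feat_counts[(f, c)]
            k = len(counts) + 1                      # crude state-count estimate
            score *= (counts[v] + alpha) / (sum(counts.values()) + alpha * k)
        scores[c] = score
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}     # normalized posterior

rows = [{"color": "red",  "size": "big",   "class": "yes"},
        {"color": "red",  "size": "small", "class": "no"},
        {"color": "blue", "size": "big",   "class": "yes"},
        {"color": "blue", "size": "small", "class": "no"}]
model = fit_naive_bayes(rows, "class")
print(predict(model, {"color": "red", "size": "big"}, ["yes", "no"]))
```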

Day 3: Machine Learning - Part 2

Variable Clustering
  • Context
  • Algorithms
  • Example: S&P 500 Analysis
    • Variable Clustering (see the sketch after this list)
      • Changing the number of Clusters
      • Dynamic Dendrogram
      • Dynamic Mapping
      • Manual Modification of Clusters
      • Manual Creation of Clusters
    • Semi-Supervised Learning
    • Search Tool
    • Notes
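
To give a feel for variable clustering, here is a rough analogue using hierarchical clustering on a correlation-based distance between made-up return series, echoing the S&P 500 example. BayesiaLab clusters variables on the learned network itself (via arc force), so treat this only as an illustration of the idea:

```python
# Group variables (hypothetical stock return series) by how strongly they
# move together, using hierarchical clustering on 1 - |correlation|.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
tickers = ["AAA", "BBB", "CCC", "DDD"]
base = rng.normal(size=250)
returns = np.column_stack([
    base + rng.normal(scale=0.3, size=250),   # AAA and BBB co-move
    base + rng.normal(scale=0.3, size=250),
    rng.normal(size=250),                      # CCC, DDD are independent
    rng.normal(size=250),
])

corr = np.corrcoef(returns, rowvar=False)
dist = 1 - np.abs(corr)                        # strong correlation = close
np.fill_diagonal(dist, 0.0)
Z = linkage(squareform(dist, checks=False), method="average")

labels = fcluster(Z, t=2, criterion="maxclust")  # cut into 2 clusters
print(dict(zip(tickers, labels)))   # AAA and BBB land in the same cluster
```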

Data Clustering
  • Context
  • Synthesis of a Latent Variable (see the sketch after this list)
  • Ordered Numerical Values
  • Cluster Purity
  • Cluster Mapping
  • Log-Likelihood
  • Contingency Table Fit
  • Hypercube Cells Per State
  • Example: Dominick’s Finer Foods
    • Data Clustering (Algorithm, Numerical States)
    • Quality Metrics
    • Invert Node Selection
    • Quadrants
    • Set a Target State
    • Sort the Monitors by Target Value Correlation
    • Conditional Mean Analysis (means, delta-means, radars)
    • Mapping
    • Target Dynamic Profile
    • Target Optimization Tree
    • Projection of the Cluster on other Variables
    • Data Association
    • Save Internal Dataset
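
The synthesis of a latent variable can be illustrated with a mixture model: EM induces a discrete hidden node whose states are the clusters, and the model’s log-likelihood serves as a quality metric. The sketch below uses scikit-learn’s GaussianMixture only as a stand-in for BayesiaLab’s Data Clustering; all data is made up:

```python
# A latent "segment" variable synthesized by EM on a Gaussian mixture.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two hidden segments of customers, e.g. light vs heavy buyers
X = np.vstack([rng.normal([1.0, 2.0], 0.4, size=(200, 2)),
               rng.normal([4.0, 5.0], 0.6, size=(200, 2))])

gm = GaussianMixture(n_components=2, random_state=0).fit(X)
states = gm.predict(X)             # the synthesized latent variable
posteriors = gm.predict_proba(X)   # soft assignments, as in EM
print("avg log-likelihood:", gm.score(X))   # model-quality metric
print("cluster sizes:", np.bincount(states))
```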

Probabilistic Structural Equation Models (PSEM)
  • Context
  • PSEM Workflow
    • Unsupervised Structural Learning
    • Variable Clustering
    • Data Clustering for each Cluster of Manifest Variables
    • Connection of the Target Variable to the Factors
    • Unsupervised Learning for Discovering the Path
  • Example: The French Market of Perfumes
    • PSEM Workflow
    • Displayed Classes
    • Total Effects (see the sketch after this list)
    • Direct Effects
    • Direct Effect Contributions
    • Multiple Clustering
    • Structure Comparison Tool
    • Dictionaries (Arcs, Costs)
    • Fixed Arcs
    • Taboo and Arc Constraints
    • Multi-Quadrants
    • Export Variations
    • Select Evidence Set
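
To make the total-versus-direct distinction concrete, the sketch below simulates a linear toy model X -> M -> Y with an additional direct X -> Y path. The coefficients are invented, and the regressions shown are only a stand-in for how BayesiaLab derives these effects from the learned network:

```python
# Total vs direct effects on a linear toy model with a mediator.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
m = 0.8 * x + rng.normal(scale=0.5, size=n)           # mediator
y = 0.5 * x + 0.6 * m + rng.normal(scale=0.5, size=n)

# Total effect of X on Y: regress Y on X alone -> 0.5 + 0.8 * 0.6 = 0.98
total = np.polyfit(x, y, 1)[0]

# Direct effect: coefficient of X with the mediator M held fixed -> 0.5
direct = np.linalg.lstsq(np.column_stack([x, m, np.ones(n)]), y, rcond=None)[0][0]

print(f"total effect  ~ {total:.2f}")   # ~0.98
print(f"direct effect ~ {direct:.2f}")  # ~0.50
```

The gap between the two numbers is exactly the contribution of the mediated path, which is what the Direct Effect Contributions analysis decomposes.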