Dr. Lionel Jouffe, Bayesia S.A.S.
Presented at the 10th Annual BayesiaLab Conference on October 24, 2022.
Knowledge elicitation from domain experts is a key area of interest for the BayesiaLab Team. In this year's technology presentation, Dr. Lionel Jouffe introduces several innovations in structural knowledge elicitation, which are now implemented in the BEKEE (Bayesia Expert Knowledge Elicitation Environment) workflow.
Dr. Lionel Jouffe is co-founder and CEO of France-based Bayesia S.A.S. Lionel holds a Ph.D. in Computer Science from the University of Rennes and has worked in Artificial Intelligence since the early 1990s. While working as a Professor/Researcher at ESIEA, Lionel started exploring the potential of Bayesian networks.
After co-founding Bayesia in 2001, he and his team have been working full-time on the development of BayesiaLab. Since then, BayesiaLab has emerged as the leading software package for knowledge discovery, data mining, and knowledge modeling using Bayesian networks. It enjoys broad acceptance in academic communities, business, and industry.
Dr. Renan Rocha & Dr. Francisco de Assis de Souza Filho, Federal University of Ceará
Presented at the 10th Annual BayesiaLab Conference on Monday, October 24, 2022.
The thesis entitled “Bayesian Networks and Network Science Applied to Water Resources: Streamflow Analysis and Forecast incorporating the Non-Stationarity” aimed to develop methodologies to (1) identify the existence and location of changes in streamflow time series, (2) incorporate this aspect into the streamflow modeling and forecasting framework, and (3) analyze the full extent of its impact. The focus on Bayesian networks as an alternative to classical streamflow modeling methodologies relied upon recent articles indicating Bayesian networks as a promising tool in hydroclimate studies, simultaneously providing good modeling results and allowing causal discovery through analysis of the network structure.
A first attempt to incorporate this non-stationarity was made using Gaussian Bayesian Networks (GBNs). Discrete variables representing the different phases of low-frequency oscillations were included in the networks, allowing different network parameters according to the phases. The results demonstrated the great potential of GBNs to forecast streamflow with lead times from one to eight months, and also unveiled good streamflow forecasting potential via Bayesian inference based on likelihood weighting simulations. The use of the phases improved performance for some stations; however, it did not improve the results of the stations that presented changes in the time series, suggesting significant differences between the network structures of each homogeneous period.
To analyze this aspect, network structures were obtained through different methodologies for each homogeneous period. The results confirmed the initial hypothesis, showing significant differences between the network structures of each homogeneous period, with alterations in the relationships between the variables and their autocorrelation functions. Therefore, using the same set of parents for the complete series may not capture the full extent of the observed changes.
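To illustrate the inference machinery the abstract refers to, the sketch below implements likelihood weighting in a toy phase-switching Gaussian network. All variable names, phases, and parameter values are invented for illustration and are not taken from the thesis: a discrete oscillation phase selects the parameters linking last month's flow Q1 to next month's flow Q2, and the evidence Q1 weights each sample.

```python
import math
import random

def normpdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

random.seed(42)

# Hypothetical phase-switching Gaussian BN: Phase -> Q1, (Phase, Q1) -> Q2
phase_prior = {"wet": 0.5, "dry": 0.5}
q1_params = {"wet": (60.0, 10.0), "dry": (30.0, 10.0)}          # mean, sd of Q1 per phase
q2_params = {"wet": (0.8, 15.0, 8.0), "dry": (0.8, 5.0, 8.0)}   # slope, intercept, sd of Q2

def lw_forecast(q1_obs, n=20000):
    """Likelihood-weighting estimate of E[Q2 | Q1 = q1_obs]."""
    num = den = 0.0
    for _ in range(n):
        # Sample the unobserved phase from its prior
        phase = "wet" if random.random() < phase_prior["wet"] else "dry"
        mu1, sd1 = q1_params[phase]
        w = normpdf(q1_obs, mu1, sd1)           # weight by likelihood of the evidence
        a, b, sd2 = q2_params[phase]
        q2 = random.gauss(a * q1_obs + b, sd2)  # sample the query node given evidence
        num += w * q2
        den += w
    return num / den

print(round(lw_forecast(55.0), 1))
```

Because the evidence value 55.0 is far more likely under the "wet" phase, that phase dominates the weighted average, mimicking how the discrete phase variables switch the network's parameters.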
Dr. Renan Rocha is a civil engineer with master's and doctoral degrees in Water Resources from the Federal University of Ceará (UFC). He currently works as a researcher at FUNCEME, the Institute for Research in Meteorology, Water Resources, and Environment, where he heads the Water Resources Department (GEPEH). He has experience with time series analysis, hydrological modelling, Bayesian networks, complex networks, drought analysis, and the Water-Energy-Food NEXUS. His recent thesis explored the use of Bayesian networks to forecast streamflow while accounting for non-stationarity.
Dr. Francisco de Assis de Souza Filho is a professor in the Hydraulics and Environmental Engineering Department of the Federal University of Ceará and the Head Scientist of Water Resources at the Ceará State Foundation for the Support of Scientific and Technological Development. He holds a doctorate from the University of São Paulo and completed a postdoctoral internship at the International Research Institute for Climate and Society at Columbia University. He has won several awards, including the Engineer Francisco Gonçalves Aguiar Medal, the highest commendation in the water resources field of Ceará. He has headed water resources organizations such as FUNCEME and ABRH.
Vuong Pham, Ph.D., CMCC@Ca’Foscari
Presented at the 10th Annual BayesiaLab Conference on Monday, October 24, 2022.
Extreme weather and climate-related events, from river flooding to droughts and tropical cyclones, are likely to become both more severe and more frequent in the coming decades, and the damages caused by these events will be felt across all sectors of society. In the face of this threat, policy- and decision-makers are increasingly calling for new approaches and tools to support risk management and climate adaptation pathways that can capture the full extent of the impacts. In this context, Bayesian networks (BNs) stand as a novel and powerful approach for capturing and modelling multi-risk against future ‘what-if’ scenarios.
Building on a risk-based conceptual framework, several BN models were developed, trained, and validated using both expert judgment and data-driven approaches to support multi-risk scenario analysis with various aims, such as multi-sectoral flooding damages, marine cumulative impacts, and ecosystem services assessment in different domains (e.g., freshwater, marine and coastal, agriculture, and industry). A major advantage across these applications lies in the possibility of combining heterogeneous data from multiple sources and across different domains, which is vital in environmental risk assessment.
The outcome of these applications represents valuable support for disaster risk management and reduction actions against climate change and extreme events, enabling better-informed decision-making. Furthermore, a more ambitious development could involve the spatialization of the model output through a user-friendly interface, building on the GIS-based structure of the training dataset, to assist policy- and decision-makers in prioritizing plans for Disaster Risk Management and Climate Change Adaptation more efficiently.
Vuong Pham holds a Ph.D. in Science and Management of Climate Change from Ca’Foscari University (Italy) and an MSc. in Environmental and Geomatic Engineering from Politecnico di Milano (Italy). He has been affiliated with the Euro-Mediterranean Center on Climate Change since 2016, collaborating in the research activities within the projects of the Risk Assessment and Adaptation Strategies Division at CMCC@Ca’Foscari (Venice).
Vuong’s research focuses on multi-risk assessment, including issues related to freshwater, coastal areas, and the ecosystem services capacity associated with these domains. His research has produced several BN applications to support multi-risk scenario analysis with various aims, such as multi-sectoral flooding damages, marine cumulative impacts, and ecosystem services assessment.
Edwin Hui, University of St Andrews, Scotland
Presented at the 10th Annual BayesiaLab Conference on Monday, October 24, 2022.
Understanding the dynamics that regulate ecological resilience is becoming increasingly important in today’s world, as ecosystems are facing multiple pressures on global, regional, and local scales. If pressures exceed a threshold, this may trigger a regime shift, where a system undergoes a step change to another state that can last for substantial periods of time. Recent applications of Bayesian networks (BNs) have shown promise in revealing the network structures of complex systems, and such insight holds great promise for understanding the mechanisms underlying the resilience of complex systems. In this talk, we present two case studies to document the potential of Bayesian networks in the study of complex systems:
In recent years, the use of Bayesian networks (BN) has seen successful applications in molecular biology and ecology, where it was able to recover known links in the respective systems it was applied to. While this is invaluable in ecology, an unexplored application of BNs would be utilizing it as a novel variable selection tool in the training of predictive models. To this end, we evaluate the potential usefulness of BNs in two aspects: (1) we apply BN inference on species abundance data from a rocky shore ecosystem, a system with well-documented links, to test the ecological validity of the revealed network; and (2) we evaluate BNs as a novel variable selection method to guide the training of an artificial neural network (ANN).
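One way to use a learned BN structure for variable selection, as proposed in aspect (2), is to feed the ANN only the Markov blanket of the target node: its parents, its children, and the children's other parents, which together render the target independent of the rest of the network. The sketch below extracts a Markov blanket from a hypothetical learned structure; the species names and edges are illustrative placeholders, not the study's actual rocky-shore network.

```python
# Hypothetical learned structure: node -> set of its parents
parents = {
    "limpets":   set(),
    "barnacles": set(),
    "algae":     {"limpets"},
    "mussels":   {"barnacles"},
    "whelks":    {"barnacles", "mussels"},
}

def markov_blanket(target, parents):
    """Parents, children, and the children's other parents of `target`."""
    children = {v for v, ps in parents.items() if target in ps}
    coparents = set().union(*(parents[c] for c in children)) - {target}
    return parents[target] | children | coparents

# Features to pass to the ANN when predicting "barnacles"
print(sorted(markov_blanket("barnacles", parents)))
```

Restricting the ANN's inputs to this set discards variables that carry no additional information about the target once the blanket is known, which is the rationale for BN-guided variable selection.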
To date, two distinct approaches have emerged in the study of ecological resilience. On one hand, network-based approaches have successfully revealed the ecological network structures of complex systems. On the other hand, novel non-additive modelling frameworks have been developed that allow for the direct quantification of ecological resilience. So far, these two approaches have been largely segregated; however, connecting the two fields may offer novel insight into the study of ecological resilience. Here, we propose a novel two-step modelling process to study ecological resilience and regime shifts: (1) we apply the Integrated Resilience Assessment (IRA) framework proposed by Vasilakopoulos et al. (2017) to quantify and approximate the ecological resilience of the ecosystems under study; and (2) we apply a dynamic Bayesian Gaussian mixture (BGMD) network model to reveal the network structure, using a changepoint process to take the temporal structure into account.
Edwin Hui is a Ph.D. student from the University of St Andrews, where his research focuses on developing computational models to study resilience and regime shifts across complex systems. He is interested in applying a variety of statistical and computational tools to address ecological questions and study complex systems theory. Throughout his Ph.D., he aims to develop novel computational approaches to study complex systems across disciplines, ranging from ecological to macroeconomic systems.
A Zoom Virtual Event — October 24–28, 2022
2022 marked a significant milestone for Bayesia. We hosted the 10th Annual BayesiaLab Conference. This special event featured more speakers from more fields than ever! Our exciting lineup of talks reflected the ever-growing relevance of Bayesian networks for research, analytics, and reasoning.
If you missed any of the talks, recordings of all presentations and the corresponding slides are available.
John Carriger, Ph.D., U.S. Environmental Protection Agency
Presented at the 10th Annual BayesiaLab Conference on Monday, October 24, 2022.
Environmental assessments require endpoints representative of ecological communities. These can include summary indicators or indicators for different components of the community. However, summary indicators applied to a complex system can sometimes create mathematical challenges that result in metrics that are ambiguous or uninterpretable. Coral reefs are complex ecosystems, so patterns of ecological interactions were explored by probabilistic clustering of reef monitoring variables with Bayesian networks. In 2010 and 2011, the U.S. Environmental Protection Agency sampled coral reef communities along the coast of Puerto Rico with probabilistic surveys, and the data were examined in a clustering analysis with Bayesian networks. Most of the component variables (gorgonians, sponges, fish, and coral) were found to have stronger associations within than between taxa, but unsupervised structure learning with lowered complexity weights identified two cross-taxa relationships. Survey data were also used in data clustering analyses to identify site clusters for sponge, gorgonian, stony coral, and fish variables. These clusters were constructed using an expectation-maximization algorithm that created a factor node jointly characterizing the density, size, and diversity of individuals in each taxon. The clusters were interpreted in terms of their relationship with the monitoring variables used in their construction and the relationship of the fish clusters to the monitoring variables for other taxa, such as stony coral variables. Each of these factor nodes was then used to create a set of meta-factor clusters that further summarized the aggregate monitoring variables for the four taxa. Once identified, taxon-specific and meta-clusters can be applied on a regional or site-specific basis to better understand reef communities in terms of ecosystem services and risk assessment.
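The clustering step described above can be sketched with a minimal expectation-maximization routine for a two-component, one-dimensional Gaussian mixture, where the per-observation responsibilities play the role of the posterior over the latent factor node. The data and parameters are synthetic; the actual analysis jointly clustered multiple monitoring variables per taxon.

```python
import math
import random

random.seed(0)
# Synthetic site-level index (e.g., a density measure) drawn from two latent clusters
data = [random.gauss(2.0, 0.5) for _ in range(100)] + \
       [random.gauss(6.0, 0.8) for _ in range(100)]

def normpdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def em_gmm(data, iters=50):
    """EM for a two-component 1-D Gaussian mixture; returns (means, sds, weights)."""
    mu = [min(data), max(data)]
    sd = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        r = []
        for x in data:
            p = [pi[k] * normpdf(x, mu[k], sd[k]) for k in (0, 1)]
            s = p[0] + p[1]
            r.append([p[0] / s, p[1] / s])
        # M-step: re-estimate weights, means, and standard deviations
        for k in (0, 1):
            nk = sum(ri[k] for ri in r)
            pi[k] = nk / len(data)
            mu[k] = sum(ri[k] * x for ri, x in zip(r, data)) / nk
            sd[k] = max(1e-6, math.sqrt(
                sum(ri[k] * (x - mu[k]) ** 2 for ri, x in zip(r, data)) / nk))
    return mu, sd, pi

mu, sd, pi = em_gmm(data)
print([round(m, 1) for m in sorted(mu)])
```

The fitted component assignments correspond to the cluster states of the factor node; in the study, those states summarized density, size, and diversity per taxon.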
EPA Disclaimer: The views expressed in this presentation are those of the authors and do not necessarily represent the views or policies of the U.S. Environmental Protection Agency.
John F. Carriger U.S. Environmental Protection Agency, Office of Research and Development, Center for Environmental Solutions and Emergency Response, Cincinnati, Ohio, USA
William S. Fisher U.S. Environmental Protection Agency, Office of Research and Development, Center for Environmental Measurement and Modeling, Gulf Breeze, Florida, USA
John Carriger is a research scientist at the U.S. Environmental Protection Agency’s Office of Research and Development in Cincinnati, Ohio. John has a marine science Ph.D. from the College of William and Mary. John’s research interests include applying risk assessment, decision analysis, and weight of evidence tools to environmental problems.
Hussein Jouni, L’Oréal Research & Innovation
Presented at the 10th Annual BayesiaLab Conference on Tuesday, October 25, 2022, at 13:30 (UTC).
Capitalizing on expert knowledge can be valuable for a company, whether for transmitting know-how in a given field, incorporating technical aspects into decision-making, or building causal models for prediction. This knowledge can be represented through a Bayesian network in order to account for uncertainty in the phenomenon, and, combined with data, its performance can be improved. Elicitation takes place in sessions where experts work together to build models with the help of a facilitator and a modeler. The experts are asked to be available for a given amount of time, which can be substantial (several days), with the risk that, at the end of the sessions, they still will not have a satisfactory tool. In the context of multi-project management, we propose a tool to assess the probability of success of elicitation sessions on a given problem. This tool is itself obtained through the elicitation of a Bayesian network (a meta-model).
Hussein Jouni, Statistical Engineer at L’Oréal Research & Innovation
I studied biomedical engineering at ESIEE PARIS and at the Faculty of Medicine of Paris XII University. After obtaining my degree and my first experience at Danone Nutricia Research, I specialized in clinical data science and clinical research. I’ve been working for L’Oréal (Research and Innovation division) for five years as a Statistical Engineer – Data Scientist.
Ali Fahmi, Ph.D., University of Manchester
Bayesian networks (BNs) have been widely proposed for medical decision support, perhaps because they can be built from both knowledge and data. In my Ph.D., we examined how BNs can be used for the decision-support challenges of chronic diseases, focusing on Rheumatoid Arthritis (RA) as a case study. The three stages of this decision support were diagnosis, self-management, and personalised care, with progressively less available data.
For diagnosis, various criteria have been proposed by clinicians for the early diagnosis of RA, but these criteria are deterministic and cannot deal with diagnostic uncertainty. We built a BN model for diagnosing RA using an available dataset and experts’ knowledge. We obtained promising results (AUROC = 0.84) and compared them with those of an alternative BN model learned entirely from data (AUROC = 0.71). We argued that the clinically meaningful structure of a BN model allows us to explain clinical scenarios in a way that cannot be done with a model learned entirely from data.
For self-management, we aimed to estimate disease activity remotely and frequently (e.g., weekly), instead of the current clinical practice of measuring disease activity once every 3 to 6 months, when an urgent visit and medication review may not be needed. We built two dynamic BN (DBN) models using experts’ knowledge and a set of manipulated data to predict appointment scheduling and medication review. Both models showed acceptable performance: the AUROC of the first DBN was 0.69, and that of the second was 0.66.
The third stage of decision support focused on personalised care for living with RA, since it can have a profound impact on quality of life (QoL). We used experts’ knowledge and the literature to build a BN that predicts QoL and helps personalise recommendations for enhancing QoL. The recommendations obtained for a set of scenarios were comparable with those of the experts.
Ali Fahmi is a post-doctoral researcher in statistics and epidemiology at the University of Manchester, United Kingdom. He holds a Ph.D. in computer science from Queen Mary University of London, an MSc in management engineering from Istanbul Technical University, Turkey, and a BSc in industrial engineering from the University of Tabriz, Iran. His Ph.D. research focused on creating decision support systems with causal Bayesian network models for diagnosis, self-management, and personalised care. Currently, he is doing research within the framework of the BRIT2 project, which aims to develop and evaluate a knowledge-support tool for prescribing antibiotics for common infections in primary care. This project also evaluates the indirect effect of the COVID-19 pandemic on antibiotic prescribing for common infections. His main research interests are decision support systems and their applications in medicine and healthcare. His main extracurricular activity is designing carpet patterns and weaving carpets.
Steven F. Wilson, Ph.D., EcoLogic Research
Significant environmental degradation is rarely the result of a single, acute event but is more often caused by the additive and/or synergistic effects of several stressors. Assessing these “cumulative impacts” is an important component of environmental assessments, but the procedures for calculating such impacts are theoretically weak and could generate misleading estimates of project impacts. Here, I propose a framework for analyzing cumulative environmental impacts that is rooted in causal theory. Specifically, I argue for the application of causal models and the explicit incorporation of “rung three” counterfactual reasoning from Pearl’s causal hierarchy. The important but underused concepts of necessary and sufficient causation are prominent in the proposed framework and lead to surprising assessment results when used to estimate the effects of industrial development on grizzly bear and caribou populations.
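Under Pearl's standard assumptions of exogeneity and monotonicity, the probabilities of necessary and of sufficient causation reduce to closed-form expressions in observable conditional probabilities. The sketch below evaluates them for invented numbers; these are not the talk's actual grizzly bear or caribou estimates.

```python
def pn(p_y_x, p_y_nx):
    """Probability of necessity: P(Y would not occur without X | X and Y occurred),
    under exogeneity and monotonicity (Pearl)."""
    return (p_y_x - p_y_nx) / p_y_x

def ps(p_y_x, p_y_nx):
    """Probability of sufficiency: P(Y would occur with X | neither X nor Y occurred),
    under the same assumptions."""
    return (p_y_x - p_y_nx) / (1.0 - p_y_nx)

# Hypothetical inputs: probability of population decline with / without development
p_decline_dev, p_decline_nodev = 0.30, 0.12

print(round(pn(p_decline_dev, p_decline_nodev), 2),
      round(ps(p_decline_dev, p_decline_nodev), 2))
```

Note how the two quantities diverge: with these numbers the development is moderately necessary for observed declines but only weakly sufficient to produce one, which is the kind of asymmetry the proposed framework surfaces.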
Steven F. Wilson, Ph.D., EcoLogic Research, 302-99 Chapel Street, Nanaimo, BC V9R 5H3, Canada, steven.wilson@ecologicresearch.ca
Steve Wilson has 30 years of experience working at technical and professional levels in strategic and operational planning for wildlife and other ecological values. He specializes in quantitative approaches to decision support and policy analysis. Steve holds a Ph.D. in wildlife ecology from the University of British Columbia in Vancouver.
The Social Graph—Using Bayesian Networks to Identify Spatial Population Structuring Among Caribou in Subarctic Canada (Paris, 2017)
Use of causal modeling with Bayesian networks to inform policy options for sustainable resource management (Nashville, 2016)
Big data, small data: Bayesian networks in environmental policy analysis in Canada’s energy sector (Fairfax, 2015)
Presented at the 10th Annual BayesiaLab Conference on Tuesday, October 25, 2022.
Presented at the 10th Annual BayesiaLab Conference on Monday, October 24, 2022.
Yann Corriou, Charier S.A.S.
In the public works environment, avoiding breakdowns of construction machines is a major challenge. Indeed, this phenomenon can represent a significant economic cost at three different levels. First, we need to pay for the repair of the machine, which is called the direct cost. Then, a breakdown will eventually lead to a delay in the progress of the work or to the need to rent another machine to replace the unavailable one, all of this representing the indirect cost. And finally, a breakdown can also affect the lifetime of a machine, and optimizing this lifetime is a priority when handling a fleet of public works equipment.
To reach our goal, we are developing different Bayesian networks using an expert-driven approach with BEKEE. The presentation will be about the development of our networks, how we use this process to target useful data for our CMMS, and how we plan to include the result of the networks in our CMMS.
Yann Corriou, Data Scientist, Charier S.A.S. yann.corriou@charier.fr
I studied for five years (2015-2020) at INSA Rennes (engineering school) in the Department of Applied Mathematics. After an internship (2019) and a one-year work-study contract (2019-2020), I am now working full-time as a data scientist at CHARIER, a public works company operating mainly in the West of France.
Presented at the 10th Annual BayesiaLab Conference on Tuesday, October 25, 2022.
Olivier Cussenot, MD, Ph.D., Sorbonne University
Presented at the 10th Annual BayesiaLab Conference on Tuesday, October 25, 2022.
Comprehensive tools to drive decision-making are a new challenge in the revolution of precision and preventive medicine. These tools could optimize management, better assess the risk of having or developing diseases, and help practitioners decide whether to perform deep (invasive or costly) diagnostic procedures. Bayesian statistical methods and modeling techniques thus provide a powerful approach to integrating knowledge and new markers to refine outcome probabilities and decision-making in clinical practice. Bayesian networks built with BayesiaLab offer the opportunity to search for better, more informative factors, or combinations of such factors, to enhance endpoint prediction. As examples, we present some applications for risk prediction and personalized management in prostate cancer. In this way, Bayesian approaches using BayesiaLab have been shown to be a powerful strategy for exploring, validating, and translating useful multifactorial predictors for precision medicine into clinical practice.
Professor Olivier Cussenot, MD, Ph.D., is a full professor at Sorbonne University. He is qualified as a urological surgeon, oncologist, and geneticist and has more than two decades of experience in molecular/translational prostate and urological cancer research.
As head of the department, he managed a research unit on predictive oncology and personalized prevention strategies. He was also the principal investigator of many national and European research programs on urological and prostate cancers. His research programs focus mainly on the clinicopathological and molecular heterogeneity of cancers related to the germline genetic background (family history or ancestry) and its interaction with the environment. For the French Institute of Cancer (INCa) and the French Cancer Research League, he led the national programs on prostate cancer genomics (the French part of the ICGC and molecular tumor ID). He also led the first national program linking genetic markers to the national health database to model different life pathways according to different prostate cancer management options, co-morbidities, and individual genetic or mesological factors.
He is the author of more than 480 scientific articles, referenced as “Cussenot O” and available via the PubMed search engine. He has written more than 60 didactic book chapters and provides translational seminars on genomics and artificial intelligence/decision-making in urologic oncology.
Mikael Rubin, Ph.D., Palo Alto University
Depression is a highly heterogeneous mental health concern, making it difficult to determine optimal treatment approaches. Mindfulness-based interventions have been shown in meta-analyses to be moderately effective in reducing symptoms of depression. Research identifying mechanisms governing the efficacy of mindfulness-based interventions might be guided by analytic approaches that can generate hypotheses to test in future research. Network analysis (often utilizing Gaussian Graphical Models) is an item-level approach widely used in recent psychological research to understand interrelations within and between heterogeneous mental health constructs.
To identify the interrelations between symptoms of depression and features of mindfulness, we used Bayesian Network Analysis across three cross-sectional samples (N = 1,135). Bayesian Gaussian Graphical Models allowed us to (1) generate an exploratory network in two samples using different depression assessments: the Patient Health Questionnaire (n = 384) and the Depression Anxiety and Stress Scale (n = 350), with mindfulness being assessed using the Five-Facet Mindfulness Scale and (2) confirm findings from the exploratory network in a third sample (n = 401) with a pre-registered replication.
From the exploratory analyses, we found that the Non-judging facet of mindfulness (reflecting acceptance of thoughts and feelings) was the most central (i.e., interconnected) bridge to symptoms of depression. The pre-registered analysis confirmed our initial findings: after controlling for all other associations, Non-judging represented the most central connection between facets of mindfulness and depression. These results suggest that when considering the use of mindfulness-based interventions for individuals with depression, examination of Non-judging is warranted and may offer a potent target.
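In a Gaussian graphical model, the edge between two items is their partial correlation controlling for all remaining items. The three-variable sketch below shows that computation with invented zero-order correlations; the variable names and values are illustrative, not the study's estimates.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """Partial correlation of X and Y controlling for Z — the edge weight
    between X and Y in a three-variable Gaussian graphical model."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical zero-order correlations among three items:
# X = Non-judging facet, Y = a depression symptom, Z = another symptom
r_xy, r_xz, r_yz = -0.45, -0.40, 0.55

print(round(partial_corr(r_xy, r_xz, r_yz), 2))
```

The partial correlation (about -0.30 here) is weaker than the raw correlation of -0.45 because part of the X-Y association is carried by Z; "bridge" centrality in the study's networks summarizes how many such conditional edges connect a mindfulness facet to the depression symptoms.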
Mikael Rubin is an Assistant Professor at Palo Alto University. He received his Ph.D. in clinical psychology from the University of Texas at Austin. From studying virtual reality in art to conducting virtual reality exposure therapy, he is curious about how what we attend to influences how we make meaning out of the lived experience. He specializes in research and interventions related to anxiety and post-traumatic stress. His research has used a wide range of approaches (including eye tracking, neuroimaging, and network analysis). He directs the Transdiagnostic Attention Intervention (TRAIN) Lab at Palo Alto University and is especially interested in using virtual reality and eye-tracking methods to evaluate, enhance, and widely disseminate mental health interventions.
Presented at the 10th Annual BayesiaLab Conference on Tuesday, October 25, 2022.
Chetan S. Kulkarni, Ph.D., KBR Inc., NASA Ames Research Center
Presented at the 10th Annual BayesiaLab Conference on Tuesday, October 25, 2022.
Aeronautics research is at the forefront with the advent of electric vertical takeoff and landing (eVTOL) vehicles as a mode of transport. These vehicles are expected to travel in low-altitude airspace, from small drones for package delivery to larger, on-demand, urban air mobility (UAM) vehicles. The foreseeable high traffic density suggests that many of these electric propulsion systems will enter the airspace and that they will also operate at high frequency. The reliability of such critical systems is, therefore, key to ensuring high safety standards in low-altitude airspace, especially when moving in dense urban environments. Diagnostic systems, which aim at identifying incipient faults, can mitigate unexpected failures by performing early fault detection in critical systems through monitoring. The proposed approach leverages a combination of failure mode and effect analysis (FMEA) integrated with Bayesian networks, thus introducing dependability structures into a diagnostic framework.
Faults and failure events from the FMEA are mapped within a Bayesian network, where network edges replicate the links embedded within FMEAs. A key element of fault diagnosis is fault detection and isolation (FDI), which increases in complexity with the complexity of the system itself, namely the number of subsystems and components, interactions among sub-systems, and sensor availability. The developed framework enables the fault isolation process by identifying the probability of occurrence of specific faults or root causes given evidence observed through sensor data. This is demonstrated through a case study applied to the electric powertrain system of a small, rotary-wing unmanned aerial vehicle (UAV). The proposed work thus integrates the early design phase of an electric propulsion system, in which the FMEA is derived, with diagnostic tools that are often developed later in the product lifecycle.
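The fault-isolation step, computing the probability of each root cause given observed evidence, can be sketched as a single-fault Bayes computation over an FMEA-style table. All fault names, prior rates, and likelihoods below are invented placeholders, not values from the actual UAV powertrain FMEA.

```python
# Hypothetical FMEA-derived model for a small electric powertrain:
# prior fault rates, and P(symptom | fault) for each (symptom, fault) pair
priors = {"battery_deg": 0.02, "motor_wear": 0.01, "esc_fault": 0.005}
likelihood = {
    "voltage_drop": {"battery_deg": 0.90, "motor_wear": 0.10, "esc_fault": 0.30},
    "temp_rise":    {"battery_deg": 0.20, "motor_wear": 0.85, "esc_fault": 0.40},
}
background = 0.05  # P(symptom | no fault present)

def posterior(symptoms):
    """Single-fault posterior P(fault | observed symptoms) by enumeration."""
    hyps = dict(priors)
    hyps["no_fault"] = 1.0 - sum(priors.values())
    scores = {}
    for h, prior in hyps.items():
        like = 1.0
        for s in symptoms:
            like *= background if h == "no_fault" else likelihood[s][h]
        scores[h] = prior * like  # unnormalized posterior
    z = sum(scores.values())
    return {h: v / z for h, v in scores.items()}

post = posterior(["voltage_drop", "temp_rise"])
print(max(post, key=post.get))
```

In the actual framework the network structure mirrors the FMEA's cause-effect links rather than assuming a single fault, but the isolation principle is the same: condition on the sensor-derived evidence and rank root causes by posterior probability.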
Chetan S. Kulkarni is a staff researcher with the Prognostics Center of Excellence and the Diagnostics and Prognostics Group in the Intelligent Systems Division at NASA Ames Research Center. His current research interests are in systems diagnostics, prognostics, and health management, with a specific focus on developing physics-based models and on the prognostics of electronic systems, energy systems, exploration ground systems, and hybrid systems.
He completed his MS ('09) and Ph.D. ('13) at Vanderbilt University, TN, where he was a Graduate Research Assistant with the Institute for Software Integrated Systems and the Department of Electrical Engineering and Computer Science. He completed his BE ('02) at the University of Pune, India. Before joining Vanderbilt, he was a Research Fellow in the Department of Electrical Engineering at IIT-Bombay, where his research focused on developing low-cost substation automation system monitoring and control devices and on partial discharge in high-voltage transformers. Earlier, he was a member of the technical team of the Power Automation group at Honeywell, India, where he was involved in turnkey power automation projects and product development in substation automation.
He is a KBR Technical Fellow and an AIAA Associate Fellow, and serves as an Associate Editor for IEEE, SAE, and IJPHM journals on topics related to prognostics and systems health management. He has been the Technical Program Committee co-chair at PHME18 and PHM20-22, and co-chairs the Professional Development and Education Outreach subcommittee of the AIAA Intelligent Systems Technical Committee.
Anand Wilson & Chiranjiv Roy, Ph.D., Course5 Intelligence
Presented at the 10th Annual BayesiaLab Conference on Wednesday, October 26, 2022.
Primary research surveys are conducted at different frequencies across segments to determine satisfaction metrics and brand perception for the brand and its competitors. Consumer telemetry is also measured: the number of issues, their severity, the time required to resolve them, and satisfaction by category. Modeling begins with exploratory factor analysis and probabilistic structural equations. The results show how these factors affect consumers’ brand trust and are visualized in a network graph of associations. The relationship network structure is derived using both controllable and latent factors, including service history and surveys, as well as socioeconomic and market factors.
As part of my talk, I will elaborate on the importance of brand trust in commercial consumer behavior and marketing management, particularly for cloud services. The purpose of this research is to investigate the effects of these factors on consumers’ brand loyalty in product and service businesses, leading to trust and love, and to recommend constraint-driven changes for improvement.
Anand has 10+ years of experience in applied artificial intelligence and data science. He has worked for marquee clients such as Lenovo, Intel, Microsoft, Novartis, Novo Nordisk, GE, Mars Wrigley, and PepsiCo, enabling digital transformation using A.I.
In his current role, Anand focuses on developing and marketing solutions based on Bayesian network theory, which makes it possible to quantify causality in observational studies. His major areas of work and research include Knowledge Modeling, Machine Learning with BayesiaLab, and Inference.
Anand has a master's in statistics and a background in applied statistics. He is acutely interested in machine reasoning, causal inference, experimental designs, machine learning, and data science.
Chiranjiv has spent 20+ years in the analytics industry, along with a Ph.D. in Applied Data Sciences, incubating, leading, and driving Data Analytics, Engineering, Science & AI Product Development across organizations such as Nissan Motors, Mercedes-Benz, Hewlett Packard, and HSBC Data Analytics. Chiranjiv has filed patents and developed products and solutions by applying AI to data for connected mobility, shared SaaS, autonomous/smart systems, AI-IoT, and electric. He is an official member of the Forbes Technology Council and the International Group of Artificial Intelligence and contributes to solving global clients' problems around Industry 4.0, Digital Manufacturing, Mobility, Advanced Analytics, Applied AI, Operations Research, and Sustainability.
Hengyi Hu, Ph.D., George Mason University
Presented at the 10th Annual BayesiaLab Conference on Tuesday, October 25, 2022.
A Bayesian Network is a popular framework for causal studies and for representing causal relationships among multiple variables in a network. Causal relationships and their associated conditional probabilities can be represented in the structure of a Bayesian Network as nodes and edges, creating a Causal Bayesian Network. However, establishing causality extends beyond learning conditional probabilities from a dataset.
This presentation provides a crash course on the history of establishing causation in epidemiology, current viewpoints on defining causality, and a demonstration of how Bayesian Networks can be used to infer causation. We will examine the criteria for establishing a causal relationship, learning a Bayesian Network from a sample dataset, and augmenting (and improving) a Bayesian Network with informed prior knowledge from an ontology such as ICD-10.
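As a minimal illustration of what such a network encodes (with hypothetical numbers, not taken from the talk), the classic rain/sprinkler example below attaches a conditional probability table to each node and answers a query by enumerating the joint distribution:

```python
# Three-node causal Bayesian network: Rain -> WetGrass <- Sprinkler.
# All probabilities are illustrative, not from any real dataset.
P_rain = 0.2
P_sprinkler = 0.1
# Conditional probability table: P(WetGrass = True | Rain, Sprinkler)
P_wet = {(True, True): 0.99, (True, False): 0.9,
         (False, True): 0.8,  (False, False): 0.0}

def joint(r, s, w):
    """Joint probability from the network's factorization."""
    pr = P_rain if r else 1 - P_rain
    ps = P_sprinkler if s else 1 - P_sprinkler
    pw = P_wet[(r, s)] if w else 1 - P_wet[(r, s)]
    return pr * ps * pw

# Diagnostic query P(Rain | WetGrass = True) by enumeration
num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(r, s, True) for r in (True, False) for s in (True, False))
print(f"P(Rain | WetGrass) = {num / den:.3f}")
```

Observing wet grass raises the probability of rain well above its 0.2 prior, which is the kind of probabilistic (not yet causal) inference the crash course builds on.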
Dr. Hengyi Hu is a data scientist and subject matter expert specializing in advanced analytics, performance analysis, process improvement, and program management. He has over 15 years of experience leading data-driven projects for the Department of Homeland Security, the National Science Foundation, and the Department of Justice. His current role in the Strategic Solutions Office at DHS HQ involves leading in-depth analysis of DHS-wide and government-wide category management spending, revamping specialized procurement reports and procurement reporting systems, and leading cross-agency collaborations for data-driven decision-making in category management.
Hengyi holds a B.S. degree in Information Sciences and Technology from Penn State University and M.S. and Ph.D. degrees in Information Technology from George Mason University. His research interests focus on causality, causal modeling, causal inference, and substantiating Bayesian networks learned from large datasets using causal mechanisms from authoritative ontologies. Hengyi holds certifications for PMP, CSSGB, FAC-COR II, and Strategy & Performance Management. Hengyi is also a graduate of the Key Executive Leadership Program at American University.
Dave Barry & John Krech, Optum
Presented at the 10th Annual BayesiaLab Conference on Wednesday, October 26, 2022.
Do you know where the biggest return on investment will be when designing customer experience improvements? Where in the Customer Journey do you focus? BayesiaLab helps connect disparate data sets to find nodes with the highest probability of raising satisfaction if improvements are made.
Dave Barry is a Director and team leader at Optum in the Enterprise Reporting & Analytics department. Part of UnitedHealth Group, his team at Optum aims to support UHG’s mission of helping the health system work better for everyone. The team developed an approach to analyzing customer feedback, and that method is now being used to build roadmaps for improving the Consumer, Provider, and Client Customer Experience (CX). Past roles have included global leadership roles in IT operations, application development, and program management at General Electric. Dave has a degree in Management Information Systems and is a certified Master Black Belt in Six Sigma Quality.
John Krech is a Principal Data Scientist at Optum, a part of UnitedHealth Group. John has undergraduate degrees in Chemical Engineering, Materials Science & Engineering, and a Master of Business Administration. John’s current focus is developing an approach to find leading indicators of future customer satisfaction. The end goal is to help improve the Customer Experience for Consumers, Providers, and Clients. Prior to Optum, John held research and development roles in Supply Chain, Manufacturing, and New Business Development at 3M. John is also a certified Master Black Belt in Six Sigma Quality.
Alexandra Chirilov, James Pitcher, and Andrzej Surma, GfK
Presented at the 10th Annual BayesiaLab Conference on Wednesday, October 26, 2022.
In this second presentation, the authors used real survey data from a syndicated study covering two categories, three countries, and more than 60 brands to assess the differences between the two methods at any point in time (cross-section view) and over time (longitudinal view).
Please also see the first presentation by GfK at the BayesiaLab Conference.
Alexandra Chirilov is leading GfK’s Global Product Development Practice for consumer and brand intelligence. Her insights and research have been featured in publications such as Esomar, Journal of Marketing Research, Sawtooth, and more. She is a winner of the ESOMAR Corporate Young Professional Award, among other industry awards.
James Pitcher leads GfK’s Marketing Sciences team in the UK, which designs and delivers sophisticated analytical solutions to solve clients' high-value problems. He has spent the last 15 years providing statistical advice and consultancy within the market research industry, working with clients across many different sectors and regions. James is an expert in conjoint analysis, brand research, pricing, consumer segmentation, and a wide range of multivariate techniques, including Bayesian Networks analysis, contributing to the development of innovative techniques and regularly presenting at international conferences.
Andrzej Surma works as a methodological lead for the Global Product Development Team. He has more than ten years of experience in data analysis. With a background in mathematics, Andrzej loves to solve problems, such as recognizing the Greek letters contained within formulas that describe mathematical models! Recently, he co-created a Bayesian Networks approach to running Key Drivers Analysis on brand tracking data. Spatial data analysis is another particular interest of his. Andrzej likes to be active in his free time, playing football and riding bikes and is inspired daily by his wife and their three children.
Yong Zhang, Ph.D., Procter & Gamble
Presented at the 10th Annual BayesiaLab Conference on Wednesday, October 26, 2022.
We investigated how to conduct driver analysis based on topics derived from unstructured textual data, including online consumer reviews, ratings, complaints, comments, and survey verbatims. The major challenge is the high missing rate of topics in each individual document: each consumer review may mention only a few topics, which leads to a high overall missing rate in the data. Without knowing the explicit missing mechanism, BayesiaLab recommended using Approximate Dynamic Imputation (ADI) to impute the missing values. We performed simulations to study different methods of processing missing data and performing driver and impact analysis. With complete and missing simulated data (a mixed missing mechanism), Filtered State and ADI tend to learn the same or very similar model structures, drivers, and impacts. At a low missing rate (~10%), structures, drivers, and impacts are the same as those from the simulated Ground Truth (GT) BBN model; at a medium missing rate (40-60%), they also tend to be the same as or very similar to the GT BBN model through equivalent model structures; at a high missing rate (80%), they tend to recover most of the correct structure, drivers, and impacts.
Dr. Yong Zhang leverages Bayesian data and modeling science to develop strategies for product design, manufacturing, storage, and transportation across P&G, improving consumers’ quality of life and driving a positive influence on the environment and society under different climate change scenarios. He develops first-principle, data science/machine learning, and modeling and simulation methods and tools through Front-End Innovation projects to enable and promote the capability across P&G for breakthrough consumer understanding and product innovation. These methods and tools can be used to extract and integrate information from a variety of data sources to find a “Body of Evidence” for consumer and product research, based on nonparametric Bayesian statistics and deep learning algorithms.
Alexandra Chirilov, James Pitcher, and Andrzej Surma, GfK
The topic of the presentation is a comparison of different ways of calculating feature importance: Shapley Value Regression in R, Bayesian Networks in the BayesiaLab software, Random Forest in R, and the shap library in Python. The paper aims to show the similarities and differences between the considered approaches.
The entire process is based on data simulation using copulas, in which different scenarios are tested to account for the limitations of survey data, e.g., data skewness.
In the paper, the author tests different strengths of relationships between the independent variables, different numbers of predictors, and different measurement scales (binary, Likert scale).
The author uses a model-agnostic approach called permutation feature importance as a comparison benchmark.
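Permutation feature importance is simple to sketch by hand. The following hypothetical Python example (not the author's code, and with made-up coefficients) scores each predictor by the increase in mean squared error after its values are shuffled, which breaks the feature-target association while leaving the feature's marginal distribution intact:

```python
import random
import statistics

random.seed(1)
n = 2000

# Synthetic data: two predictors with different true strengths (b1=2.0, b2=0.5)
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y  = [2.0 * a + 0.5 * b + random.gauss(0, 0.5) for a, b in zip(x1, x2)]

# A fitted model would stand in here; we use the known generating function
def model(a, b):
    return 2.0 * a + 0.5 * b

def mse(xs1, xs2):
    return statistics.fmean((model(a, b) - yi) ** 2
                            for a, b, yi in zip(xs1, xs2, y))

base = mse(x1, x2)
for name in ("x1", "x2"):
    col = x1[:] if name == "x1" else x2[:]
    random.shuffle(col)                 # break the feature-target association
    permuted = mse(col, x2) if name == "x1" else mse(x1, col)
    print(f"permutation importance of {name}: {permuted - base:.2f}")
```

The strong predictor's error increase dwarfs the weak one's, which is why the method serves as a model-agnostic benchmark against which Shapley, Bayesian network, and Random Forest importances can be compared.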
Please also see the second presentation by GfK at the BayesiaLab Conference.
Alexandra Chirilov is leading GfK’s Global Product Development Practice for consumer and brand intelligence. Her insights and research have been featured in publications such as Esomar, Journal of Marketing Research, Sawtooth, and more. She is a winner of the ESOMAR Corporate Young Professional Award, among other industry awards.
James Pitcher leads GfK’s Marketing Sciences team in the UK, which designs and delivers sophisticated analytical solutions to solve clients' high-value problems. He has spent the last 15 years providing statistical advice and consultancy within the market research industry, working with clients across many different sectors and regions. James is an expert in conjoint analysis, brand research, pricing, consumer segmentation, and a wide range of multivariate techniques, including Bayesian Networks analysis, contributing to the development of innovative techniques and regularly presenting at international conferences.
Andrzej Surma works as a methodological lead for the Global Product Development Team. He has more than ten years of experience in data analysis. With a background in mathematics, Andrzej loves to solve problems, such as recognizing the Greek letters contained within formulas that describe mathematical models! Recently, he co-created a Bayesian Networks approach to running Key Drivers Analysis on brand tracking data. Spatial data analysis is another particular interest of his. Andrzej likes to be active in his free time, playing football and riding bikes and is inspired daily by his wife and their three children.
Presented at the 10th Annual BayesiaLab Conference on Wednesday, October 26, 2022.
Presented at the 10th Annual BayesiaLab Conference on Thursday, October 27, 2022.
Nuclear data evaluation is concerned with the collection and joint uncertainty quantification of data from nuclear physics experiments, with the goal of producing precise estimates of nuclear quantities for applications ranging from astrophysics and nuclear medicine to nuclear energy. Data stemming from different experiments are often not directly comparable due to experimental aspects, such as the finite energy resolution of detectors, but are nevertheless related to each other as they link back to the same fundamental nuclear quantities. The Bayesian network framework is particularly well suited to model these relationships, promising to accelerate the production of high-quality nuclear data evaluations in the future and to facilitate the consideration of physical constraints that are often not explicitly modeled.
Dr. Georg Schnabel works in the Nuclear Data Section in the Division of Physical and Chemical Sciences of the International Atomic Energy Agency. His responsibilities include the development of scientific codes for data analysis and management and the organization and coordination of technical meetings and workshops on topics ranging from nuclear data libraries to machine learning, with the objective of improving the availability, comprehensiveness, and quality of nuclear data. Prior to joining the IAEA, and after graduating from the Technical University of Vienna in Austria, Dr. Schnabel worked as a researcher at Uppsala University in Sweden and at the French Alternative Energies and Atomic Energy Commission (CEA), specializing in the development and application of Bayesian methods for uncertainty quantification in the domain of nuclear physics.
Presented at the 10th Annual BayesiaLab Conference on Thursday, October 27, 2022.
Recent changes to U.S. accounting rules require estimation, in a continuous-variable sense, of current expected credit losses (CECL) on "financial instruments held at the reporting date, based on historical experience, current conditions, and reasonable and supportable forecasts." This new, continuous treatment of credit losses contrasts with the prior binary accounting rule under which losses were recorded only if "probable." At the same time, the standard for auditing such estimates, known as SAS 143, has also changed, placing emphasis on the effects of uncertainty, subjectivity and judgment, negligent or intentional management bias, complexity, and change. Both the accounting and auditing rules require probabilistic and causal reasoning, for which Bayesian networks are an effective tool. This presentation explores the application of Bayesian networks to the audit of current expected credit losses under the new standards, treating as a target variable the risk of material misstatement (RMM) of the continuous CECL estimate.
Kurt Schulzke, JD, CPA, CFE, teaches accounting information systems, auditing, forensic accounting, risk management, and leadership at the University of North Georgia. His teaching, research, and consulting integrate data science, accounting, and law. He has published on business valuation, economic damages, and Bayesian networks in accounting and auditing in the Columbia Journal of Transnational Law, Vanderbilt Journal of Transnational Law, Tennessee Journal of Business Law, Journal of Forensic Accounting Research, and The Value Examiner. As an attorney, Kurt focuses on business entities, estates, and trusts. MAcc (Brigham Young University), J.D. (Georgia State University), M.S. Applied Statistics (Kennesaw State University).
Presented at the 10th Annual BayesiaLab Conference on Wednesday, October 26, 2022.
Risk assessment is challenging when data is unavailable, hard to obtain, or costly to process. Organizations often request estimates from experts instead. This talk demonstrates how to integrate cybersecurity data with expert estimates using Bayesian Networks. Cybersecurity analysts, resource managers, and executives can use Bayesian Network models to perform risk assessments, select security controls, and prioritize which suspicious events to investigate first. System administrators can configure autonomous sources of data, including vulnerability scanners and cybersecurity event monitoring systems, to automatically update these hybrid network models alongside inputs from risk analysts and executives.
Corey Neskey, Vice President, Quantitative Risk, Hive Systems corey.neskey@hivesystems.io
Corey has been providing analyses, architecting secure environments, and leading security program implementations in IT security and risk since 2011. His career started with informing executive decision-making using algebraic data analyses for explanation, simulation, attribution (e.g., intelligence analysis, forensics, SOC, CIRT), and optimization. His toolset expanded to more descriptive and predictive methods (e.g., machine learning/AI for risk assessment, vulnerability prioritization, and event correlation). He is now integrating these analytical areas and expanding beyond algebraic methods and static probability calculus to Bayesian network models.
Cross-examination is a method for testing the evidence presented at trial by asking probing questions. It is an integral part of the right to confront one’s accusers, as enshrined in the constitutions of many countries. But there is little scholarly work that analyzes cross-examination, its scope, and its function at trial. Until we know what cross-examination consists in, the substance of the fundamental right to confrontation remains elusive. This talk makes a first attempt at clearing the ground by articulating an analysis of cross-examination using Bayesian networks. This conceptual ground clearing will provide a framework to identify when cross-examination may go wrong and hinder the search for an accurate determination of the facts.
Explanation in Artificial Intelligence is often focused on providing reasons why a model under consideration and its outcome are correct. Recently, research in explainable machine learning has initiated a shift in focus toward including so-called counterfactual explanations. In this presentation, we present our recent proposal to combine both types of explanation in the context of explaining Bayesian networks. To this end, we introduce “persuasive contrastive explanations,” which aim to answer the question “Why outcome X instead of Y?” posed by a user. In addition, we discuss an algorithm for computing persuasive contrastive explanations and suggest how these explanations could be used in an interactive session with the user.
Silja Renooij (Utrecht University) is a member of the Intelligent Systems group and is interested in Probabilistic Graphical Models. Her research focuses on understanding the effects of various precision-complexity tradeoffs in the specification of such models on model output, for the purpose of facilitating the construction and explanation of Bayesian networks.
We present a digital humanities application of Bayesian networks to investigate changes in science fiction portrayals of exoplanets (planets outside our solar system) since the 1990s discovery by astronomers of real exoplanets. Bayesian network analysis is applied to a representative database of fictional exoplanets to determine if the publication date influences fictional exoplanet characteristics. Networks are generated using Banjo, search methodology decisions are confirmed with BayesPiles analysis, and the results are visualised using GraphViz. Results show fictional exoplanets from media created after the discovery of real exoplanets are moderately less likely to host established human populations and slightly less likely to host intelligent native life. This change in response to the scientific discovery of thousands of exoplanets, many of which are not hospitable to humans, provides genre-wide evidence indicating that science fiction does communicate rapidly evolving scientific results. This research demonstrates the potential for Bayesian network analysis as a promising data science methodology in interdisciplinary academic practice.
Emma Johanna Puranen, University of St Andrews (Presenter)
V Anne Smith, University of St Andrews
Emily Finer, University of St Andrews
Christiane Helling, The Space Research Institute (Institut für Weltraumforschung, IWF)
Presented at the 10th Annual BayesiaLab Conference on Thursday, October 27, 2022.
Marcello Di Bello is an Assistant Professor of Philosophy at Arizona State University. He holds a Ph.D. in Philosophy from Stanford University and an MSc in Logic from the University of Amsterdam. His research interests include evidence and probability, risk and decision-making, statistics in the law, and algorithmic fairness. He is currently working on a book on "legal probabilism" with Rafal Urbaniak of Gdansk University.
Presented at the 10th Annual BayesiaLab Conference on Thursday, October 27, 2022.
Presented at the 10th Annual BayesiaLab Conference on Thursday, October 27, 2022.
Emma Johanna Puranen is a St Leonards’ Interdisciplinary Doctoral Scholar at the University of St Andrews, combining astronomy, data science, and media studies in her research on exoplanets in science fiction. A member of the St Andrews Centre for Exoplanet Science, she is very interested in questions of astrobiology and space ethics, including how humans portray speculative space travel in science fiction. Learn more at her website.
Presented at the 10th Annual BayesiaLab Conference on Thursday, October 27, 2022.
Radiation dose in nuclear power plant reactors is known to be dominated by the presence of radioisotopes in the primary loop of the reactor. To control it strictly in normal operation (e.g., cleaning and reloading of nuclear fuel), established chemical theories exist that explain the amount of radioisotopes present in the reactor water circuits with respect to known control variables in the plant (e.g., thermal power of the reactor, pH, hydrogen, etc.). However, the high complexity and uncertainty of the process make an accurate estimation of the measured radioisotope values difficult. To address this problem, this work introduces a dynamic Bayesian network (DBN) probabilistic model that makes it possible to demonstrate experimentally the capability of the control variables to provide information about the radioisotope concentrations and to predict their values in a data-driven way. Our results in 5 different nuclear power plants show that the accuracy and reliability of these predictions are remarkable, enabling strategies for gathering reliable information about the chemical process in the primary loop towards possible operational improvements.
Dr. Daniel Ramos completed his Ph.D. in 2007 at the Universidad Autonoma de Madrid (UAM), Spain. Since 2011, he has been an Associate Professor at the UAM, where he is a staff member of the AUDIAS Group. During his career, he has visited several research laboratories and institutions around the world, including the Institute of Scientific Police at the University of Lausanne (Switzerland), the School of Mathematics at the University of Edinburgh (Scotland), the Electrical Engineering School at the University of Stellenbosch (South Africa), and, more recently, the Netherlands Forensic Institute and the Computational and Biological Learning Lab of the University of Cambridge. He was a visiting professor at the Universidad de Buenos Aires in 2019. His research interests focus on the forensic evaluation of evidence using Bayesian techniques, probabilistic calibration, validation of forensic evaluation methods, speaker and language recognition, and, more generally, signal processing and pattern recognition. Dr. Ramos is actively involved in the research and development of different aspects of forensic science, including the statistical evaluation of speech and chemical evidence (mainly glass). He has been invited by NIST to several workshops, including the OSAC standardization initiative. He is the author of multiple publications in national and international journals and conferences, some of which have received awards. He has also participated in several international competitive evaluations of speaker and language recognition technology since 2003. Recently, he has been working on signal processing and machine learning for industrial applications in the energy sector. Dr. Ramos regularly serves on scientific committees at international conferences and is often invited to give talks at conferences and institutions.
Presented at the 10th Annual BayesiaLab Conference on Friday, October 28, 2022.
Testing human skills and abilities is a task repeated frequently in the modern world. In this talk, we will explore an approach to Computerized Adaptive Testing using Bayesian Networks. The idea is to model a student and measure their skills, which allows us to create a shorter and more precise test, as we can ask questions that suit the particular student better. We will also present the effect of a special condition on the Bayesian Networks used for this task: monotonicity. The monotonicity condition requires a model to satisfy special constraints on its parameters and is especially helpful when the learning dataset is small. We will present a new method for learning monotone parameters; based on our experiments, these models provide better results than non-monotone methods and competing monotone methods. Monotone models are also more likely to be accepted by end users in areas where monotonicity is to be expected.
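The talk's method for learning monotone parameters is not reproduced here. As a minimal, hypothetical sketch of the underlying idea of Bayesian skill estimation in adaptive testing, consider a single binary latent skill updated after each answer, with item success probabilities that increase with skill (the monotonicity condition):

```python
# Minimal sketch of Bayesian skill estimation for adaptive testing.
# One binary latent skill; each item's P(correct | skill) rises with skill
# (monotonicity). All numbers are illustrative.

p_skill = {"low": 0.5, "high": 0.5}               # uniform prior over skill
items = {"easy": {"low": 0.7, "high": 0.95},      # P(correct | skill)
         "hard": {"low": 0.2, "high": 0.80}}

def update(prior, item, correct):
    """Bayes update of the skill posterior after one answered item."""
    post = {}
    for s, p in prior.items():
        lik = items[item][s] if correct else 1 - items[item][s]
        post[s] = p * lik
    z = sum(post.values())
    return {s: v / z for s, v in post.items()}

# A student answers the hard item correctly, then the easy item incorrectly
p_skill = update(p_skill, "hard", True)
p_skill = update(p_skill, "easy", False)
print(p_skill)
```

An adaptive test would choose the next item to be maximally informative about the current posterior; the shrinking uncertainty is what lets such tests stay short.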
Martin Planjer is the Director of the Research and Development department at the consultancy company Logio. The department's goal is to keep the company at the technological edge and to provide new methods and methodologies by seeking out new approaches, prototyping, and defining new products.
Martin is also a junior researcher at the Institute of Information Theory and Automation (UTIA) in the field of decision-making theory, with a mathematical modeling background from his Ph.D. studies at the Czech Technical University.
These two roles provide an opportunity to combine the business and academic worlds and to challenge both theoretical concepts and established practices.
Presented at the 10th Annual BayesiaLab Conference on Friday, October 28, 2022.
The differential diagnosis of respiratory diseases is usually a challenge for medical specialists in the first line of care, a challenge that has grown under the current COVID-19 pandemic. A Clinical Decision Support System (CDSS) is being developed using Bayesian Networks (BNs) to help physicians diagnose respiratory diseases, including those related to COVID-19. The network structure was elicited from expert physicians, and the network parameters (disease prevalence and the conditional probabilities of symptoms, findings, and lab results) were extracted from the relevant bibliography or current standard global information sources. The CDSS is being tested using case studies taken from real situations, provided and validated by physicians. The resulting system demonstrates the suitability and flexibility of BNs for diagnosis support.
Dr. Ernesto Ocampo, Ph.D., is a senior full-time professor at the Catholic University of Uruguay Computer Science Department, where he teaches subjects such as Artificial Intelligence, Machine Learning, and Algorithms. His research focus is on Artificial Intelligence applied to Clinical Decision Support Systems (e.g., Acute Bacterial Meningitis, HIV/AIDS, respiratory diseases, and cancer).
His background is in software engineering, and he holds a Ph.D. in Computer Science from the University of Alcalá, Spain.
An IEEE Senior Member, Dr. Ocampo has worked in the software industry for more than 30 years, currently as a technical consultant for Qualisys Software and Technologies (www.qualisyss.com).
Dr. Silvia Herrera, MD, is a senior pediatric physician with more than 30 years of professional experience. Dr. Herrera worked for 25 years as an internal pediatrician in the Central Armed Forces Hospital of Uruguay, and for several years she was part of the pro-bono health team that focused on children with HIV/AIDS at the National Pediatric Centre of Reference.
She currently works in a pre-hospital pediatric emergency unit and the pediatric emergency room of a private health provider hospital.
Dr. Herrera has helped the UCU Computer Science Department for several years as a field expert in various CDSS research projects.
Juan Francisco Kurucz is an Informatics Engineer who graduated from the Catholic University of Uruguay, where he works as an assistant professor in Artificial Intelligence and other computer science courses. He is an active academic researcher on Artificial Intelligence and is currently focused on the application of Bayesian Networks and Deep Learning to Clinical Decision Support.
In the professional field, Juan Francisco works as a Machine Learning Engineer at an AI Software Company —Tryolabs — where he specializes in Computer Vision. He also participates in the IEEE as a volunteer member.
Lucas Lois is a Software Engineer who graduated from the Catholic University of Uruguay, where he works as an assistant professor in Computer Science courses. His research area is Artificial Intelligence in Health, focused on the application of Natural Language Processing to Named Entities Recognition in Electronic Health Records and Bayesian Networks applied to Clinical Decision Support.
In the professional field, Lucas also has several years of software development experience, working currently as a software team leader at December Labs, a high-touch boutique Design & Development shop.
Presented at the 10th Annual BayesiaLab Conference on Friday, October 28, 2022.
A latent-observational space analytical formalism is applied to a sub-grid modeled turbulent kinetic energy (tke) field emanating from ocean turbulence large eddy simulation (LES) data containing Langmuir cells but no breaking waves. The purpose of the analysis is to illustrate how machine learning modeling can be used to understand the probabilistic structure of observational space and how the allied latent space can be related statistically to it for the purpose of data generation. The Peter and Clark (PC)-algorithm-based Bayesian belief network (BBN) edge-nodal structure for the observational-space tke subdomains demonstrates a distinctive nonlocal connectivity pattern when the multidimensional scaling graph layout is invoked. When the Chow-Liu algorithm is used, tree-based connectivity in the network is revealed. In particular, a dominant parental root node occupies the far upper left region in the observational-space domain with many edge connections flowing toward the right. Hidden Markov model (HMM) parameter estimation, applied to the maximum and minimum values taken from the mean tke feature matrix and to generative topographic mapping (GTM)-based latent space for the same tke feature nodes, enables estimation of the latent-space state transition matrix and tke latent-observational space emission matrices. The latent-space state transition matrix demonstrates how many columns of latent space, associated with different root-mean-square tke values, possess a high probability of transitioning to other distinct latent-space areas. The tke latent-observational space emission matrix provides spatial subdomain locations most strongly tied to specific vertical columns of GTM-based latent space. The maximum value-based emission matrix shows very high probabilities for single columns on the latent-space perimeter being associated with maximum values occurring at specific observational-space subdomain locations. 
These observational-space nodal locations have strong statistical linkages to other nodes when the Chow-Liu-algorithm-based BBN is invoked, suggesting that this model is physically appropriate for the high-energy turbulent flow physics. Bayesian and manifold learning processing methods provide a way to understand the spatial structure of LES-derived tke features and how latent space can provide a constraint for discerning optimal linkages between spatially separate observational-space subdomains.
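The Chow-Liu construction mentioned above builds a tree-structured Bayesian network by computing pairwise mutual information between variables and keeping the maximum-weight spanning tree. The talk's own feature matrices are not reproduced here, so the following is only a minimal sketch on synthetic discrete data (all variable names are illustrative, not from the study):

```python
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information (nats) between two discrete sequences."""
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            px, py = np.mean(x == xv), np.mean(y == yv)
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi

def chow_liu_tree(data):
    """Edges of the maximum-weight spanning tree over pairwise MI.

    data: (n_samples, n_vars) array of discrete values.
    Uses Prim's algorithm with mutual information as edge weight.
    """
    n_vars = data.shape[1]
    mi = np.zeros((n_vars, n_vars))
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            mi[i, j] = mi[j, i] = mutual_information(data[:, i], data[:, j])
    in_tree, edges = {0}, []
    while len(in_tree) < n_vars:
        best = max(((i, j) for i in in_tree
                    for j in range(n_vars) if j not in in_tree),
                   key=lambda e: mi[e])
        edges.append(best)
        in_tree.add(best[1])
    return edges

# Toy data: x1 drives x0 and x2, so the tree should route through x1,
# analogous to a dominant root node with edges flowing outward.
rng = np.random.default_rng(0)
x1 = rng.integers(0, 2, 2000)
x0 = x1 ^ (rng.random(2000) < 0.1)  # noisy copy of x1
x2 = x1 ^ (rng.random(2000) < 0.1)  # another noisy copy of x1
data = np.column_stack([x0.astype(int), x1, x2.astype(int)])
edges = chow_liu_tree(data)
```

Because x0 and x2 are conditionally independent given x1, their direct mutual information is weaker than either link through x1, and the recovered tree connects them via x1 — the same mechanism by which the Chow-Liu BBN in the abstract surfaces a dominant parental node.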
Dr. Nicholas Scott is a modeling scientist and physical oceanographer and has been a member of the professional staff at Riverside Research in Dayton, Ohio, since October 2012. He investigates the applicability of non-traditional signal and image processing techniques to the extraction of information from remotely sensed data, including hyperspectral imagery. His present work includes statistical modeling of computer network system traffic, Bayesian analysis of geo-intelligence system nonlinear dynamics, and time series analysis of environmental data. He also exploits probabilistic graphical modeling algorithms for understanding the structure existing within turbulent flow imagery features and numerically simulated data.
Using Bayesian Networks to Extract Expert Knowledge from a Pre-existing Machine Learning Model Trained Elsewhere
Presented at the 10th Annual BayesiaLab Conference on Wednesday, October 26, 2022.
ML practitioners have traditionally had to choose between the most performant models and those that allow for clear explanations of model decisions. With the recent advent of powerful XAI techniques, this is becoming less of an issue as formerly black-box models become (more) transparent. Applying these tools to large, complex, real-world tabular datasets, however, can lead to disappointing results that fall short of what key stakeholders need in order to use and trust system decisions. Using Bayesian networks, we propose a method that can help build a bridge between machine and expert “reasoning.” We show that, applied to the classification of stock market investments – a field with a notoriously low signal-to-noise ratio – this method can help bring man and machine together to tackle new problems in applied data science.
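The talk's specific method is not detailed in this abstract, but the general idea of distilling a pre-trained model into Bayesian-network-friendly form can be illustrated: probe the black box on representative inputs, discretize the features, and estimate conditional probability tables over its decisions. The classifier, feature meanings, and thresholds below are all hypothetical stand-ins, not Voya's model:

```python
import numpy as np

rng = np.random.default_rng(42)

def black_box_predict(features):
    """Hypothetical pre-trained classifier: 'buy' (1) when a noisy
    linear score of two features crosses zero."""
    score = 0.8 * features[:, 0] - 0.5 * features[:, 1]
    return (score + rng.normal(0, 0.3, len(features)) > 0).astype(int)

# Probe the model on historical-style inputs.
X = rng.normal(size=(5000, 2))
y = black_box_predict(X)

def cpt(feature, decision, n_bins=3):
    """Estimate P(decision = 1 | feature tercile) -- the kind of
    conditional probability table a Bayesian network consumes."""
    edges = np.quantile(feature, np.linspace(0, 1, n_bins + 1))
    bins = np.clip(np.digitize(feature, edges[1:-1]), 0, n_bins - 1)
    return np.array([decision[bins == b].mean() for b in range(n_bins)])

# P('buy' | tercile of feature 0): readable by an expert, learned
# entirely from the black box's own behavior.
cpt_feature0 = cpt(X[:, 0], y)
```

The resulting table rises monotonically across terciles, turning an opaque decision rule into a statement an analyst can inspect and challenge; a full workflow would learn such tables jointly over many features inside a network structure.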
Gabriel Andraos jointly leads Voya’s Machine Intelligence group (part of Voya Financial, a leading health, wealth, and investment company — NYSE ticker: VOYA). As the co-head of VMI, he focuses on research and development in the application of AI and machine learning models for fundamental investing. For more than ten years, the VMI team has been using machine intelligence to run virtual employees – analysts, traders, and portfolio managers with transparent, explainable computer models anchored in fundamentals. Gabriel has approximately 26 years of investment experience. Prior to joining Voya, Gabriel was a managing partner and co-founder of G Squared Capital LLP. Before that, he held senior investment roles in Europe, the U.S., and Asia, combining knowledge and experience in fundamental analysis with the latest tools in computing and data science. Gabriel received an MBA from Harvard Business School and a BA in Economics from Georgetown University. He also has a Certificate in Quantitative Finance and several artificial intelligence, data science, and machine learning accreditations.
Workflow Automation in BayesiaLab with Applications to Time Series Analysis (Paris, 2017)
Presented at the 10th Annual BayesiaLab Conference on Friday, October 28, 2022.
GM’s experience shows that expanded data access must be coupled with tools that can manage volume, techniques that appropriately control for exposure and sample bias, and processes that incorporate SME knowledge. BayesiaLab helps GM address these critical issues in a single tool, in a way that is intuitive and eases communication with the engineers supporting vehicle hazard investigations.
Michelle Michelini is currently a Senior Technical Fellow in the Global Planning Analytics team at General Motors, where she focuses on projects that require the deployment of new analytical methods and tools to develop deeper insights and recommendations from Planning data. Michelle started her career as a statistician with GM Credit Card and then worked for OnStar as a strategist. From OnStar, she went to GM Vehicle Safety, where she developed the Vehicle Safety Analytics team in response to the findings of the ignition switch investigation. She holds a Bachelor of Science in Mathematics and Applied Statistics from the University of Michigan and a Master of Science in Information Technology and Systems from Carnegie Mellon University.