Data-driven design of logic-based models of biological processes

Samuel Pastva (Masaryk Univ. Brno, Czechia)

January 27, 2026

Modern cell biology generates a treasure trove of experimental data, allowing us to measure many biochemical processes in individual cells with single-molecule resolution. However, using this data to generate explainable predictions is often challenging because of the large number of entities, interactions, and environmental factors involved. Systems biology addresses this challenge with logic-based explainable models (e.g., Boolean networks). Historically, many such models were designed manually by domain experts; however, this approach does not scale to the size of modern datasets, and new data-driven approaches are needed.

In this talk, I will provide a state-of-the-art overview of methods for developing and analyzing logic-based models that integrate these large, cutting-edge genomic datasets. First, we will cover the formalization of biological observations into logical constraints: specifically, which formal assumptions can be extracted from biological data, and what the limitations of our current measurement techniques are. Then, I will present methods based on automated reasoning (SAT/SMT/ASP) and symbolic data structures (binary decision diagrams) that allow us to learn formally verified model candidates from these observations. Finally, because complex systems rarely have a single “best” logic-based model, we will conclude by discussing how to analyze and refine large ensembles of logic-based models (so-called partially specified Boolean models) that capture the plausible behaviors of a biological system.
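
As an informal illustration of why observations typically yield an ensemble of models rather than a single one, the sketch below enumerates every Boolean update function for one variable that agrees with a handful of observations. It is only a toy under stated assumptions: the regulators A and B, the target C, and the observed values are hypothetical, and the brute-force enumeration stands in for the SAT/SMT/ASP and BDD-based methods mentioned above, which are what make this feasible at realistic scale.

```python
from itertools import product

# Hypothetical toy setting: variable C is regulated by A and B.
# All possible input combinations (A, B) of the update function of C.
INPUTS = list(product([0, 1], repeat=2))

# Hypothetical observations: for some inputs (A, B) we know the next value of C.
# Inputs that were never observed stay unconstrained.
observations = {(0, 0): 0, (1, 1): 1}


def consistent_functions(observations):
    """Enumerate all truth tables over INPUTS that agree with the observations."""
    candidates = []
    for outputs in product([0, 1], repeat=len(INPUTS)):
        table = dict(zip(INPUTS, outputs))
        if all(table[x] == y for x, y in observations.items()):
            candidates.append(table)
    return candidates


# The resulting list plays the role of a (tiny) ensemble of candidate models.
ensemble = consistent_functions(observations)
print(len(ensemble))
```

Here two of the four truth-table rows are fixed by the data and two remain free, so several candidate functions (including both AND and OR) survive; additional observations or assumptions would be needed to narrow the ensemble further.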