James Nightingale

Gather.town id
SBD01
Poster Title
PyAutoFit: A Classy Probabilistic Programming Language For Cosmology and Cancer
Institution
Durham University
Abstract (short summary)
A major trend in astronomy and healthcare is the rapid adoption of Bayesian statistics for data analysis and modeling. With modern datasets growing by orders of magnitude in size, the focus is now on developing methods capable of applying contemporary inference techniques to extremely large datasets. To this end, I present PyAutoFit (https://github.com/rhayes777/PyAutoFit), an open-source probabilistic programming language for automated Bayesian inference.

I will present PyAutoFit’s multi-level modeling framework, which allows a user to compose and fit hierarchical models to extremely large datasets. In an astronomy setting, I will show how a multi-level model can constrain the dark matter particle by modeling individual images of strong lens galaxies at the lowest level and the Universe’s cosmological parameters at the top. Next, I will describe a multi-level model of cancer treatment we are building with the healthcare company Roche, where the inner components describe how specific genetic or epigenetic profiles of cancers respond to treatments and the higher levels represent tumour dynamics and patient outcomes. Finally, I will discuss how this framework can overcome a major challenge for both astronomy and healthcare datasets: missing data.
Plain text (extended) Summary
A major trend in academia and data science is the rapid adoption of Bayesian statistics for data analysis and modeling, leading to the development of probabilistic programming languages (PPLs). A PPL provides a framework that allows users to easily specify a probabilistic model and perform inference automatically. PyAutoFit is a Python-based PPL which interfaces with every aspect of the modeling process (e.g., the model, data, fitting procedure, visualization, results) and can therefore manage a model fit from start to finish. PyAutoFit is a collaboration between cosmologists seeking to study dark matter and cancer researchers aiming to improve the diagnosis and treatment of cancer.
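
As a rough illustration of what "specify a model, then let the framework fit it" can look like when models are Python classes, here is a minimal sketch in plain Python. It does not use PyAutoFit's API: the `Line` class, the `log_likelihood` function and the `grid_fit` helper are all hypothetical names invented for this example, and the brute-force grid search stands in for the Bayesian samplers a real PPL would call.

```python
import inspect
import itertools
import math


class Line:
    """Hypothetical model component; its constructor arguments are the free parameters."""

    def __init__(self, gradient, intercept):
        self.gradient = gradient
        self.intercept = intercept

    def predict(self, x):
        return self.gradient * x + self.intercept


def log_likelihood(instance, xs, ys, noise=1.0):
    """Gaussian log-likelihood (up to a constant) of the data given a model instance."""
    return -0.5 * sum(((y - instance.predict(x)) / noise) ** 2 for x, y in zip(xs, ys))


def grid_fit(model_cls, grids, xs, ys):
    """Hypothetical helper: try every grid point and return the best-fitting instance.

    The parameter names are read from the class constructor, so the user only
    writes the model class and the allowed values of each parameter.
    """
    names = list(inspect.signature(model_cls).parameters)
    best, best_ll = None, -math.inf
    for values in itertools.product(*(grids[name] for name in names)):
        instance = model_cls(**dict(zip(names, values)))
        ll = log_likelihood(instance, xs, ys)
        if ll > best_ll:
            best, best_ll = instance, ll
    return best


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # noise-free data generated with gradient=2, intercept=1
grids = {"gradient": [0.0, 1.0, 2.0, 3.0], "intercept": [0.0, 1.0, 2.0]}

best = grid_fit(Line, grids, xs, ys)
print(best.gradient, best.intercept)
```

The point of the design is that the fitting routine never mentions `gradient` or `intercept` explicitly; it discovers them from the class, which is what lets one routine fit any model written in this style.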

A core feature of PyAutoFit is its support for graphical models, where a graphical model is a network of model components which expresses the conditional dependencies between the different parameters and components of the model. PyAutoFit constructs graphical models from hierarchies of Python classes, and these can be linked together to compose and fit one overarching model.
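
The idea of linking class-based components into one overarching model can be sketched in plain Python as follows. This is an illustrative assumption, not PyAutoFit's implementation: `Gaussian1D`, `Population` and `HierarchicalModel` are hypothetical names, and the joint log-probability omits any prior on the top-level parameters.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


def gaussian_log_pdf(x: float, mean: float, sigma: float) -> float:
    """Log-density of a normal distribution evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * ((x - mean) / sigma) ** 2


@dataclass
class Gaussian1D:
    """Lowest-level component: the parameters fitted to one individual dataset."""
    centre: float
    sigma: float

    def log_likelihood(self, data: Sequence[float]) -> float:
        return sum(gaussian_log_pdf(x, self.centre, self.sigma) for x in data)


@dataclass
class Population:
    """Top-level component: the distribution every dataset's centre is drawn from."""
    mean: float
    scatter: float

    def log_prior(self, centre: float) -> float:
        return gaussian_log_pdf(centre, self.mean, self.scatter)


@dataclass
class HierarchicalModel:
    """Links the per-dataset components to the shared top-level component."""
    population: Population
    components: List[Gaussian1D]

    def log_joint(self, datasets: Sequence[Sequence[float]]) -> float:
        total = 0.0
        for component, data in zip(self.components, datasets):
            # Conditional dependency: each centre depends on the population.
            total += self.population.log_prior(component.centre)
            # Each dataset depends only on its own component.
            total += component.log_likelihood(data)
        return total


datasets = [[0.9, 1.1, 1.0], [2.1, 1.9, 2.0]]
components = [Gaussian1D(centre=1.0, sigma=0.1), Gaussian1D(centre=2.0, sigma=0.1)]

good = HierarchicalModel(Population(mean=1.5, scatter=0.5), components)
bad = HierarchicalModel(Population(mean=10.0, scatter=0.5), components)

# The population centred near the per-dataset centres is strongly preferred.
print(good.log_joint(datasets) > bad.log_joint(datasets))
```

Because the joint log-probability is a sum of one term per dataset plus the terms tying each dataset to the shared component, the same structure extends to many datasets without changing the individual components.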

For cosmology, graphical models allow us to analyse large samples of strong gravitational lenses, where the inner model components represent the light and mass distributions of galaxies and the higher-level components represent the Universe’s cosmological parameters. For cancer, the inner components represent how the (epi-)genetic profiles of a variety of cancers respond to treatments, and therefore model the evolution of a tumour. Higher levels describe patient outcomes during cancer treatment, allowing us to identify effective cancer treatments.

Check out the following links for a complete overview of how PyAutoFit works:

GitHub: https://github.com/rhayes777/PyAutoFit

Readthedocs: https://pyautofit.readthedocs.io/en/latest/

JOSS Paper: https://joss.theoj.org/papers/10.21105/joss.02550

You can try PyAutoFit in your web browser right now at the following Binder link:

https://mybinder.org/v2/gh/Jammy2211/autofit_workspace/master?filepath=introduction.ipynb
URL
james.w.nightingale@durham.ac.uk