Module 4: PyRate and AI methods to reconstruct biodiversity
Estimation of diversity and diversification using Bayesian and deep learning models
In this part of the course we will go over several models to infer preservation, speciation, and extinction processes from fossil occurrence data. Most of the methods do not require a joint estimation of the underlying phylogenetic tree connecting the lineages (which can be both a good and a bad thing).
The classes will include general explanations of how Bayesian inference and MCMC machinery work, and how supervised and unsupervised deep neural networks can be used in the context of macroevolution.
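To make the MCMC idea concrete before the classes, here is a minimal sketch of a Metropolis-Hastings sampler. It is a toy example and not PyRate's algorithm: it assumes that lineage lifespans are known exactly and follow an exponential distribution with a constant extinction rate, it ignores preservation entirely, and it uses an arbitrary Gamma(1, 1) prior. The function names, the window size, and the simulated data are all illustrative assumptions.

```python
import math
import random


def log_posterior(rate, lifespans, prior_shape=1.0, prior_rate=1.0):
    """Unnormalised log posterior of a constant extinction rate.

    Likelihood: lifespans are exponentially distributed with this rate.
    Prior: Gamma(prior_shape, prior_rate) on the rate (an arbitrary choice).
    """
    if rate <= 0:
        return -math.inf
    log_lik = len(lifespans) * math.log(rate) - rate * sum(lifespans)
    log_prior = (prior_shape - 1.0) * math.log(rate) - prior_rate * rate
    return log_lik + log_prior


def metropolis_hastings(lifespans, n_iter=20000, window=0.1, seed=1):
    """Sample the extinction rate with a symmetric sliding-window proposal."""
    rng = random.Random(seed)
    rate = 1.0  # arbitrary starting value
    current = log_posterior(rate, lifespans)
    samples = []
    for _ in range(n_iter):
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        proposal = rate + rng.uniform(-window, window)
        candidate = log_posterior(proposal, lifespans)
        log_accept = candidate - current
        if log_accept >= 0 or rng.random() < math.exp(log_accept):
            rate, current = proposal, candidate
        samples.append(rate)
    return samples


# Synthetic lifespans (in Myr) simulated with a true rate of 0.5; not real data.
data_rng = random.Random(0)
lifespans = [data_rng.expovariate(0.5) for _ in range(200)]

samples = metropolis_hastings(lifespans)
kept = samples[2000:]  # discard the first 10% as burn-in
print("posterior mean extinction rate:", sum(kept) / len(kept))
```

With this conjugate prior the posterior is known analytically, Gamma(n + 1, sum of lifespans + 1), which makes the toy sampler easy to check against the exact answer.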
Installing Python and dependencies
We will mostly be using software written in Python (3.8 or higher), which can be installed from python.org. In this tutorial we show how to set up a virtual environment (venv) with all dependencies installed. A venv is a local copy of Python that makes sure the libraries installed are compatible with the software we want to run, without interfering with the global Python installed on your machine.
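As a rough sketch of what setting up a venv involves, the snippet below uses the venv module from the Python standard library to create an environment. The function name and the directory name in the usage note are placeholders chosen for illustration, and the course tutorial may well use the equivalent command line call (python -m venv) instead.

```python
import pathlib
import venv


def create_course_env(env_dir, with_pip=True):
    """Create an isolated virtual environment in env_dir and return its path.

    with_pip=True also bootstraps pip inside the environment, so that the
    dependencies can be installed there rather than in the global Python.
    """
    env_path = pathlib.Path(env_dir)
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

Calling create_course_env("pyrate_env") would create a folder named pyrate_env (a hypothetical name) in the current directory. The environment is then activated with "source pyrate_env/bin/activate" on macOS and Linux or "pyrate_env\Scripts\activate" on Windows, after which pip installs packages into the venv only.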
PyRate
We will use the software PyRate and follow some of the available tutorials to run Bayesian analyses of fossil occurrence data and estimate variation of speciation and extinction rates through time and among lineages. This will also allow us to test specific hypotheses on the drivers of diversification and extinction.
The program is written in Python but does not require previous Python knowledge as it is typically run as an executable (just like you don’t need to know C++ to run RevBayes).
rootBBB
We will use the software rootBBB to estimate clade age, following its tutorial and using the available example files. You are of course welcome to try it with your own data, if applicable.
PyRate models for hypothesis testing
We will discuss several models available in PyRate that allow us to link speciation and extinction rates with a time series, e.g. proxies for climate change. These models can be expanded to a multivariate birth-death model to simultaneously assess the impact of multiple time-variable predictors.
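The idea of linking a rate to a time series can be sketched as follows. This assumes an exponential dependence, rate(t) = base_rate * exp(gamma * x(t)), which is one possible functional form chosen here for illustration; the function and parameter names are hypothetical and this is not PyRate's code. With gamma = 0 the rate is constant, while a positive or negative gamma makes the rate rise or fall with the predictor.

```python
import math


def rate_through_time(base_rate, gamma, predictor):
    """Rate in each time bin under an exponential link to one predictor.

    rate_i = base_rate * exp(gamma * x_i)
    """
    return [base_rate * math.exp(gamma * x) for x in predictor]


def multivariate_rate(base_rate, gammas, predictors):
    """Same idea with several predictors, one gamma per time series.

    rate_i = base_rate * exp(sum_k gamma_k * x_k,i)
    """
    n_bins = len(predictors[0])
    rates = []
    for i in range(n_bins):
        exponent = sum(g * series[i] for g, series in zip(gammas, predictors))
        rates.append(base_rate * math.exp(exponent))
    return rates


# Made-up, rescaled predictor values per time bin; not real proxy data.
temperature = [-1.0, -0.5, 0.0, 0.5, 1.0]
sea_level = [0.2, 0.1, 0.0, -0.1, -0.2]

print(rate_through_time(0.3, 0.8, temperature))
print(multivariate_rate(0.3, [0.8, -0.4], [temperature, sea_level]))
```

In a Bayesian analysis the correlation parameters (gamma here) would be estimated from the fossil data together with the baseline rate, and a gamma whose posterior excludes zero suggests that the predictor is associated with rate variation.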
If time allows, we will touch upon age-dependent extinction models and the birth-death chrono-species model that allows the joint analysis of fossil data and a phylogenetic tree of extant species.
Finally, we will talk about a new model that uses a neural network to allow for speciation and extinction rate variation as a function of one or more categorical or continuous traits and multiple time series.
DeepDive
We will present a new model to infer past biodiversity trajectories using deep learning. The DeepDive project is available for download here.