Introduction to divDyn - Phanerozoic diversity

Ádám T. Kocsis

FAU GeoZentrum Nordbayern

Background

What is this?

  • Short for Diversity Dynamics
  • Background in island biogeography (MacArthur and Wilson 1963)
  • Diversity, extinction and origination metrics
  • Variables time series - to be analyzed further!


Alroy (2008) - PNAS

Diversity over time?

  • Overall increase?
  • Saturation?
  • Equilibrial dynamics?


J. John Sepkoski Jr. (1978)

The divDyn project

  • Estimating metrics has been a problem - everybody had different scripts
  • Implementing sampling standardization
  • Wrote basics for my PhD


The project


R hexsticker


https://github.com/divDyn

Getting started

Resources

Installing and loading

Package (v0.8.2) can be installed from CRAN.

install.packages("divDyn")

Attach the package with:

library(divDyn)

Binning

  • divDyn operates with discrete-time
  • Requires temporal/stratigraphic binning
  • One value/time bin
  • A timescale
  • Always the first step of the analysis!

Example time scales

47-bin, 10my time scale

data(tens)
str(tens)
'data.frame':   49 obs. of  9 variables:
 $ X10    : chr  "Cambrian 1" "Cambrian 2" "Cambrian 3" "Cambrian 4" ...
 $ Ocean  : chr  "ar1" "cc1" "cc1" "cc1" ...
 $ Ocean2 : chr  "ar1" "" "" "" ...
 $ Climate: chr  "w" "w" "w" "w" ...
 $ bottom : num  542 532 513 501 490 ...
 $ mid    : num  537 522 507 495 483 ...
 $ top    : num  532 513 501 490 479 ...
 $ dur    : num  10 19 12 11 11.4 ...
 $ ten    : int  1 2 3 4 5 6 7 8 9 10 ...

Example time scales

95-bin stages

data(stages)
str(stages)
'data.frame':   95 obs. of  13 variables:
 $ sys      : chr  "E" "E" "E" "Cm" ...
 $ system   : chr  "Ediacaran" "Ediacaran" "Ediacaran" "Cambrian" ...
 $ series   : chr  "Upper Ediacaran" "Upper Ediacaran" "Upper Ediacaran" "Terreneuvian" ...
 $ stage    : chr  "Avalon Assemblage" "White Sea Assemblage" "Nama Assemblage" "Fortunian" ...
 $ short    : chr  "Ava" "Whs" "Nam" "For" ...
 $ bottom   : num  580 560 550 539 529 ...
 $ mid      : num  570 555 544 534 525 ...
 $ top      : num  560 550 539 529 521 ...
 $ dur      : num  20 10 11.2 9.8 8 6.5 5.5 4.5 4 3.5 ...
 $ stg      : int  1 2 3 4 5 6 7 8 9 10 ...
 $ systemCol: chr  "#fddd88" "#fddd88" "#fddd88" "#94ac72" ...
 $ seriesCol: chr  "#fddd88" "#fddd88" "#fddd88" "#9db989" ...
 $ col      : chr  "#fcd589" "#fdd587" "#fed583" "#a9be93" ...

Required data - Interface conventions

  • x: occurrence data.frame to be analyzed
  • bin: column name of bin numbers (integer)
  • tax: column name of taxon entries (character)
  • coll: column name of sampling units (optional, character)
  • age: column name of numeric ages (optional, numeric)

Phanerozoic Marine Animals

About

The example was used as SOM for the divDyn paper.
Was supposed to have a similar scope as the results of (J. John Sepkoski 1997) and (Alroy 2010).
The analyses are done at the genus level!
- Stratigraphic assignments
- Sampling patterns
- Richness estimators
- Taxonomic rates
- Subsampling: (Next Monday)

More info: https://github.com/divDyn/ddPhanero

Downloading the data

We need almost the whole PBDB.

API call to produce downlad:

https://paleobiodb.org/data1.2/occs/list.csv?datainfo&rowcount&interval=Ediacaran,Holocene&show=class,classext,genus,subgenus,coll,coords,loc,paleoloc,strat,stratext,lith,env,ref,crmod,timebins,timecompare,refattr,entname,attr,ident,img,plant,abund,ecospace,taphonomy,etbasis,pres,prot,lithext,geo,methods,resgroup,refattr,ent
  • takes about 30 minutes…
  • 2GB+ .csv file (read.csv())
  • or…

The chronosphere

  • API for research data acquisition
  • Versioned data entries
  • PBDB from Aug 24, 2023

Install from CRAN (about to be updated!)

install.packages("chronosphere")

Load package

library(chronosphere)

Downloading the PBDB

# Data download with explicit version
dat <- chronosphere::fetch("pbdb", ver="20230824")

Takes about 1 minute, ~130MB download.

After done, check if structure is ok!

str(dat)
'data.frame':   1587958 obs. of  147 variables:
 $ occurrence_no      : int  1 2 3 4 5 7 8 9 10 11 ...
 $ record_type        : chr  "occ" "occ" "occ" "occ" ...
 $ reid_no            : int  NA NA NA NA NA NA NA NA NA NA ...
 $ flags              : chr  "" "" "" "" ...
 $ collection_no      : int  1 1 1 2 3 4 4 1 5 5 ...
 $ identified_name    : chr  "Australosutura llanoensis" "Carbonocoryphe planucauda" "Thigriffides roundyi" "Pudoproetus chappelensis" ...
 $ identified_rank    : chr  "species" "species" "species" "species" ...
 $ identified_no      : int  349412 349526 349420 349411 349411 349411 349420 349411 349411 349668 ...
 $ difference         : chr  "" "recombined as" "" "" ...
 $ accepted_name      : chr  "Australosutura llanoensis" "Phillibole planucauda" "Thigriffides roundyi" "Pudoproetus chappelensis" ...
 $ accepted_attr      : logi  NA NA NA NA NA NA ...
 $ accepted_rank      : chr  "species" "species" "species" "species" ...
 $ accepted_no        : int  349412 349526 349418 349410 349410 349410 349418 349410 349410 349664 ...
 $ early_interval     : chr  "Ivorian" "Ivorian" "Ivorian" "Ivorian" ...
 $ late_interval      : chr  "" "" "" "" ...
 $ max_ma             : num  354 354 354 354 354 ...
 $ min_ma             : num  345 345 345 345 345 ...
 $ ref_author         : chr  "Brezinski" "Brezinski" "Brezinski" "Brezinski" ...
 $ ref_pubyr          : int  1998 1998 1998 1998 1998 1998 1998 1998 1998 1998 ...
 $ reference_no       : num  1 1 1 1 1 1 1 1 1 1 ...
 $ phylum             : chr  "Arthropoda" "Arthropoda" "Arthropoda" "Arthropoda" ...
 $ phylum_no          : chr  "18891" "18891" "18891" "18891" ...
 $ class              : chr  "Trilobita" "Trilobita" "Trilobita" "Trilobita" ...
 $ class_no           : chr  "19100" "19100" "19100" "19100" ...
 $ order              : chr  "Proetida" "Proetida" "Proetida" "Proetida" ...
 $ order_no           : chr  "21062" "21062" "21062" "21062" ...
 $ family             : chr  "Brachymetopidae" "Phillipsiidae" "Phillipsiidae" "Proetidae" ...
 $ family_no          : chr  "56732" "85866" "85866" "56726" ...
 $ genus              : chr  "Australosutura" "Archegonus (Phillibole)" "Thigriffides" "Pudoproetus" ...
 $ genus_no           : chr  "21084" "21075" "21387" "177081" ...
 $ subgenus_no        : int  NA 21306 NA NA NA NA NA NA NA NA ...
 $ collection_name    : chr  "USNM 9047, Jack Sloan Ranch" "USNM 9047, Jack Sloan Ranch" "USNM 9047, Jack Sloan Ranch" "USNM 9048" ...
 $ collection_subset  : int  NA NA NA NA NA NA NA NA NA NA ...
 $ collection_aka     : chr  "" "" "" "" ...
 $ lng                : num  -99 -99 -99 -98.1 -98.1 ...
 $ lat                : num  31.2 31.2 31.2 31 31 ...
 $ cc                 : chr  "US" "US" "US" "US" ...
 $ state              : chr  "Texas" "Texas" "Texas" "Texas" ...
 $ county             : chr  "San Saba" "San Saba" "San Saba" "" ...
 $ latlng_basis       : chr  "based on nearby landmark" "based on nearby landmark" "based on nearby landmark" "" ...
 $ latlng_precision   : chr  "2" "2" "2" "1" ...
 $ geogscale          : chr  "outcrop" "outcrop" "outcrop" "" ...
 $ geogcomments       : chr  "800 to 900 feet SSE of an abandoned well and just SE of a stock pen on the Jack Sloan ranch. The pen and abando"| __truncated__ "800 to 900 feet SSE of an abandoned well and just SE of a stock pen on the Jack Sloan ranch. The pen and abando"| __truncated__ "800 to 900 feet SSE of an abandoned well and just SE of a stock pen on the Jack Sloan ranch. The pen and abando"| __truncated__ "Near San Saba, TX" ...
 $ paleomodel         : chr  "gplates" "gplates" "gplates" "gplates" ...
 $ geoplate           : chr  "101" "101" "101" "101" ...
 $ paleoage           : chr  "mid" "mid" "mid" "mid" ...
 $ paleolng           : num  -65.7 -65.7 -65.7 -65.6 -65.6 ...
 $ paleolat           : num  -26.7 -26.7 -26.7 -27.5 -27.5 ...
 $ formation          : chr  "Chappel Limestone" "Chappel Limestone" "Chappel Limestone" "Chappel Limestone" ...
 $ stratgroup         : chr  "" "" "" "" ...
 $ member             : chr  "" "" "" "" ...
 $ formation.1        : chr  "Chappel Limestone" "Chappel Limestone" "Chappel Limestone" "Chappel Limestone" ...
 $ stratgroup.1       : chr  "" "" "" "" ...
 $ member.1           : chr  "" "" "" "" ...
 $ stratscale         : chr  "bed" "bed" "bed" "" ...
 $ zone               : chr  "" "" "" "S. isostichia - crenulata through S. anchoralis - latus" ...
 $ localsection       : chr  "" "" "" "" ...
 $ localbed           : chr  "" "" "" "" ...
 $ localbedunit       : chr  "" "" "" "" ...
 $ localorder         : chr  "" "" "" "" ...
 $ regionalsection    : chr  "" "" "" "" ...
 $ regionalbed        : chr  "" "" "" "" ...
 $ regionalbedunit    : chr  "" "" "" "" ...
 $ regionalorder      : chr  "" "" "" "" ...
 $ stratcomments      : chr  "Chappel Limestone is a condensed one-meter thick unit spanning the S. isostichia-crenulata through Gnathodus ty"| __truncated__ "Chappel Limestone is a condensed one-meter thick unit spanning the S. isostichia-crenulata through Gnathodus ty"| __truncated__ "Chappel Limestone is a condensed one-meter thick unit spanning the S. isostichia-crenulata through Gnathodus ty"| __truncated__ "Late Kinderhookian and Osagean" ...
 $ lithdescript       : chr  "" "" "" "" ...
 $ lithology1         : chr  "\"limestone\"" "\"limestone\"" "\"limestone\"" "\"carbonate\"" ...
 $ lithadj1           : chr  "" "" "" "" ...
 $ lithification1     : chr  "lithified" "lithified" "lithified" "" ...
 $ minor_lithology1   : chr  "" "" "" "" ...
 $ fossilsfrom1       : chr  "" "" "" "" ...
 $ lithology2         : chr  "" "" "" "" ...
 $ lithadj2           : chr  "" "" "" "" ...
 $ lithification2     : chr  "" "" "" "" ...
 $ minor_lithology2   : chr  "" "" "" "" ...
 $ fossilsfrom2       : chr  "" "" "" "" ...
 $ primary_reference  : chr  "D. K. Brezinski. 1998. Trilobites from the Lower Mississippian starved basin facies of the southern United Stat"| __truncated__ "D. K. Brezinski. 1998. Trilobites from the Lower Mississippian starved basin facies of the southern United Stat"| __truncated__ "D. K. Brezinski. 1998. Trilobites from the Lower Mississippian starved basin facies of the southern United Stat"| __truncated__ "D. K. Brezinski. 1998. Trilobites from the Lower Mississippian starved basin facies of the southern United Stat"| __truncated__ ...
 $ created            : chr  "1998-11-20 07:59:51" "1998-11-20 07:59:51" "1998-11-20 07:59:51" "2001-03-26 11:06:31" ...
 $ modified           : chr  "2018-02-02 16:22:36" "2018-02-02 16:22:36" "2018-02-02 16:22:36" "2000-11-30 12:09:35" ...
 $ time_bins          : chr  "Tournaisian" "Tournaisian" "Tournaisian" "Tournaisian" ...
 $ time_contain       : chr  "-" "-" "-" "-" ...
 $ time_major         : chr  "Tournaisian" "Tournaisian" "Tournaisian" "Tournaisian" ...
 $ time_buffer        : chr  "Tournaisian, Visean" "Tournaisian, Visean" "Tournaisian, Visean" "Tournaisian, Visean" ...
 $ time_overlap       : chr  "Tournaisian, Visean" "Tournaisian, Visean" "Tournaisian, Visean" "Tournaisian, Visean" ...
 $ authorizer         : chr  "J. Alroy" "J. Alroy" "J. Alroy" "W. Kiessling" ...
 $ enterer            : chr  "M. Sommers" "M. Sommers" "M. Sommers" "W. Kiessling" ...
 $ modifier           : chr  "M. Clapham" "M. Clapham" "M. Clapham" "" ...
 $ primary_name       : chr  "Australosutura" "Carbonocoryphe" "Thigriffides" "Pudoproetus" ...
 $ primary_reso       : chr  "" "" "" "" ...
 $ subgenus_name      : chr  "" "" "" "" ...
 $ subgenus_reso      : chr  "" "" "" "" ...
 $ species_name       : chr  "llanoensis" "planucauda" "roundyi" "chappelensis" ...
 $ species_reso       : chr  "" "" "" "" ...
 $ image_no           : int  1746 1746 1746 1746 1746 1746 1746 1746 1746 1746 ...
 $ plant_organ        : chr  "" "" "" "" ...
 $ plant_organ2       : chr  "" "" "" "" ...
 $ abund_value        : chr  "4" "1" "2" "6" ...
 $ abund_unit         : chr  "specimens" "specimens" "specimens" "specimens" ...
 $ taxon_environment  : chr  "marine" "marine" "marine" "marine" ...
  [list output truncated]
 - attr(*, "chronosphere")=List of 13
  ..$ src         : chr "pbdb"
  ..$ ser         : chr "occs3"
  ..$ res         : chr "species"
  ..$ ver         : int 20230824
  ..$ datafile    : chr "pbdb_occs.rds"
  ..$ item        : int 731
  ..$ reference   : chr [1:2] "Please remember to acknowledge the Paleobiology Database (http://paleobiodb.org) in your publication." "Alternatively, you can use: The Paleobiology Database Community. (year). Denormalized occurrence table of the P"| __truncated__
  ..$ bibtex      : chr [1:2] "NA" "NA"
  ..$ downloadDate: POSIXct[1:1], format: "2023-08-25 07:10:12"
  ..$ productDate : chr "2023-08-24"
  ..$ infoURL     : chr "https://chronosphere-portal.github.io/docs/pbdb/"
  ..$ API         : chr "https://paleobiodb.org/data1.2/occs/list.csv?datainfo&rowcount&interval=Ediacaran,Holocene&show=class,classext,"| __truncated__
  ..$ additional  : list()

Filtering - I. Taxonomy: quality

Get rid of unprecise entries!

# filter records not identified at least to genus
dat <-dat[dat$accepted_rank %in% c("genus", "species"),]

# omit non-informative genus entries
dat <- dat[dat$genus!="", ]

Filtering - I. Taxonomy: phyla

marineNoPlant <- c("",
    "Agmata",
    "Annelida",
    "Bilateralomorpha",
    "Brachiopoda",
    "Bryozoa",
    "Calcispongea",
    "Chaetognatha",
    "Cnidaria",
    "Ctenophora",
    "Echinodermata",
    "Entoprocta",
    "Foraminifera",
    "Hemichordata",
    "Hyolitha",
    "Mollusca",
    "Nematoda",
    "Nematomorpha",
    "Nemertina",
    "Onychophora",
    "Petalonamae",
    "Phoronida",
    "Platyhelminthes",
    "Porifera",
    "Problematica",
    "Rhizopodea",
    "Rotifera",
    "Sarcomastigophora",
    "Sipuncula",
    "Uncertain",
    "Vetulicolia",
    ""
)

# which rows?
bByPhyla <- dat$phylum%in% marineNoPlant

Filtering - I. Taxonomy: classes

#B. classes
#   levels(factor(noNeed$class))
needClass <- c(
    "Acanthodii",
    "Actinopteri",
    "Actinopterygii",
    "Agnatha",
    "Cephalaspidomorphi",
    "Chondrichthyes",
    "Cladistia",
    "Coelacanthimorpha",
    "Conodonta",
    "Galeaspida",
    "Myxini",
    "Osteichthyes",
    "Petromyzontida",
    "Plagiostomi",
    "Pteraspidomorphi",
    # here come the Arthropods
    "Artiopoda",
    "Branchiopoda",
    "Cephalocarida",
    "Copepoda",
    "Malacostraca",
    "Maxillopoda",
    "Megacheira",
    "Merostomoidea",
    "Ostracoda",
    "Paratrilobita",
    "Pycnogonida",
    "Remipedia",
    "Thylacocephala",
    "Trilobita",
    "Xiphosura"
)

# which rows?
bNeedClass <- dat$class %in% needClass

Filtering - I. Taxonomy: more

#C.  mammals
#   mammals <- dat[dat$class=="Mammalia",]
#   levels(factor(mammals$order))
needMammalOrd <- c("Cetacea", "Sirenia")

# which rows?
bMammalOrder <- dat$order %in% needMammalOrd

# the carnivores
#   carnivores <- dat[dat$order=="Carnivora",]
#   levels(factor(carnivores$family))
needFam <- c("Otariidae", "Phocidae", "Desmatophocidae")

# which rows?
bNeedMamFam<- dat$family%in%needFam

# D. Reptiles
#   reptiles <- dat[dat$class=="Reptilia",]
#   levels(factor(reptiles$order))
needReptOrd<-c(
    "Eosauropterygia",
    "Hupehsuchia",
    "Ichthyosauria",
    "Placodontia",
    "Sauropterygia",
    "Thalattosauria"
)

# which rows?
bRept <- dat$order%in%needReptOrd

# E. turtles 
#   turtles <- dat[dat$order=="Testudines",]
#   levels(factor(turtles$family))

# Chelonioidea turtles
needTurtleFam <- c(
    "Cheloniidae",
    "Protostegidae",
    "Dermochelyidae",
    "Dermochelyoidae",
    "Toxochelyidae",
    "Pancheloniidae"
)

# which rows?
bTurtle <- dat$family%in%needTurtleFam

Filtering - I. Taxonomy: execute!

Subsetting for the rows we highlighted:

# subset the data
dat <- dat[bByPhyla | bNeedClass | bMammalOrder | bNeedMamFam | bRept | bTurtle , ]

Resolve the potential homonymy problem across taxa:

dat$clgen <- paste(dat$class, dat$genus)

Number of remaining records:

nrow(dat)
[1] 1002951

Filtering - II. Environment

Fossils come from different environments:

levels(factor((dat$environment)))
 [1] ""                                    
 [2] "\"channel\""                         
 [3] "\"floodplain\""                      
 [4] "alluvial fan"                        
 [5] "basin reef"                          
 [6] "basinal (carbonate)"                 
 [7] "basinal (siliceous)"                 
 [8] "basinal (siliciclastic)"             
 [9] "carbonate indet."                    
[10] "cave"                                
[11] "channel lag"                         
[12] "coarse channel fill"                 
[13] "coastal indet."                      
[14] "crater lake"                         
[15] "crevasse splay"                      
[16] "deep subtidal indet."                
[17] "deep subtidal ramp"                  
[18] "deep subtidal shelf"                 
[19] "deep-water indet."                   
[20] "delta front"                         
[21] "delta plain"                         
[22] "deltaic indet."                      
[23] "dry floodplain"                      
[24] "dune"                                
[25] "eolian indet."                       
[26] "estuary/bay"                         
[27] "fine channel fill"                   
[28] "fissure fill"                        
[29] "fluvial indet."                      
[30] "fluvial-deltaic indet."              
[31] "fluvial-lacustrine indet."           
[32] "foreshore"                           
[33] "glacial"                             
[34] "interdistributary bay"               
[35] "interdune"                           
[36] "intrashelf/intraplatform reef"       
[37] "karst indet."                        
[38] "lacustrine - large"                  
[39] "lacustrine - small"                  
[40] "lacustrine delta front"              
[41] "lacustrine delta plain"              
[42] "lacustrine deltaic indet."           
[43] "lacustrine indet."                   
[44] "lacustrine prodelta"                 
[45] "lagoonal"                            
[46] "lagoonal/restricted shallow subtidal"
[47] "levee"                               
[48] "loess"                               
[49] "marginal marine indet."              
[50] "marine indet."                       
[51] "mire/swamp"                          
[52] "offshore"                            
[53] "offshore indet."                     
[54] "offshore ramp"                       
[55] "offshore shelf"                      
[56] "open shallow subtidal"               
[57] "paralic indet."                      
[58] "perireef or subreef"                 
[59] "peritidal"                           
[60] "platform/shelf-margin reef"          
[61] "pond"                                
[62] "prodelta"                            
[63] "reef, buildup or bioherm"            
[64] "sand shoal"                          
[65] "shallow subtidal indet."             
[66] "shoreface"                           
[67] "sinkhole"                            
[68] "slope"                               
[69] "slope/ramp reef"                     
[70] "spring"                              
[71] "submarine fan"                       
[72] "terrestrial indet."                  
[73] "transition zone/lower shoreface"     
[74] "wet floodplain"                      

Filtering - II. Environment

We want go get rid of non-marine ones:

omitEnv <- c(
    "\"floodplain\"",
    "alluvial fan",
    "cave",
    "\"channel\"",
    "channel lag" ,
    "coarse channel fill",
    "crater lake",
    "crevasse splay",
    "dry floodplain",
    "delta plain",
    "dune",
    "eolian indet.",
    "fine channel fill",
    "fissure fill",
    "fluvial indet.",
    "fluvial-lacustrine indet.",
    "fluvial-deltaic indet.",
    "glacial",
    "interdune",
    "karst indet.",
    "lacustrine - large",
    "lacustrine - small",
    "lacustrine delta front",
    "lacustrine delta plain",
    "lacustrine deltaic indet.",
    "lacustrine indet.",
    "lacustrine interdistributary bay",
    "lacustrine prodelta",
    "levee",
    "loess",
    "mire/swamp",
    "pond",
    "sinkhole",
    "spring",
    "tar",
    "terrestrial indet.",
    "wet floodplain"
)

# actual omission
dat<-dat[!dat$environment%in%omitEnv, ]

Omitting unlithified sediments

Remaining:

nrow(dat)
[1] 986576

Unusually good preservation (usually in more recent intervals) might cause temporal biases in sampling. To evade this problem, it can be a good idea to omit records from unlithified sediments.

dat <- dat[dat$lithification1!="unlithified",]
nrow(dat)
[1] 916579

Stratigraphic binning - stages

data(stages)
str(stages)
'data.frame':   95 obs. of  13 variables:
 $ sys      : chr  "E" "E" "E" "Cm" ...
 $ system   : chr  "Ediacaran" "Ediacaran" "Ediacaran" "Cambrian" ...
 $ series   : chr  "Upper Ediacaran" "Upper Ediacaran" "Upper Ediacaran" "Terreneuvian" ...
 $ stage    : chr  "Avalon Assemblage" "White Sea Assemblage" "Nama Assemblage" "Fortunian" ...
 $ short    : chr  "Ava" "Whs" "Nam" "For" ...
 $ bottom   : num  580 560 550 539 529 ...
 $ mid      : num  570 555 544 534 525 ...
 $ top      : num  560 550 539 529 521 ...
 $ dur      : num  20 10 11.2 9.8 8 6.5 5.5 4.5 4 3.5 ...
 $ stg      : int  1 2 3 4 5 6 7 8 9 10 ...
 $ systemCol: chr  "#fddd88" "#fddd88" "#fddd88" "#94ac72" ...
 $ seriesCol: chr  "#fddd88" "#fddd88" "#fddd88" "#9db989" ...
 $ col      : chr  "#fcd589" "#fdd587" "#fed583" "#a9be93" ...

Stratigraphic binning

  • Attributes of collections!
  • Stratigraphic assignment is uncertain
  • ca. 500 names
  • early_interval: obligatory, starting name
  • late_interval: facultative

Look up the stage numbers

Relevant information is in the keys object: ?keys

data(keys)

Built-in categorize() function allows us to find entries in keys$stgInt:

# B. the stg entries (lookup)
stgMin<-categorize(dat[,"early_interval"],keys$stgInt)
stgMax<-categorize(dat[,"late_interval"],keys$stgInt)

Result of the categorization

entry result meaning
"Ypresian" 84 name indicates interval confined to a stage
"Permian" NA Name indicates interval not confined to a stage
"" -1 Inteval entry is not given - late_interval

Lower number - earlier stage.

We need a single bin entry

Make sure that numbers are numbers!

stgMin<-as.numeric(stgMin)
stgMax<-as.numeric(stgMax)

We need just one bin for every entry! Add this to the table, we will fill this in the next step.

# empty container
dat$stg <- rep(NA, nrow(dat))

What are the cases?

stgMin stgMax Meaning stg
84 -1 early_interval resolves to one stage, no late_interval 84
84 84 early_interval resolves to same stage as late_interval 84
84 86 collection can range from stages 84 to 86 NA
84 NA late_interval does not resolve to one stage NA
NA -1 early_interval does not resolve to one stage, the other missing NA
NA NA neither early_interval nor late_interval resolve to one stage NA

Deriving a single stage number

Programming the rules above with logical expressions:

# select entries, where
stgCondition <- c(
# the early and late interval fields indicate the same stg
    which(stgMax==stgMin),
# or the late_intervarl field is empty
    which(stgMax==-1))

In these entries, use the stg indicated by the early_interval:

dat$stg[stgCondition] <- stgMin[stgCondition]

Result

  • integer: record comes from a one stage
  • NA: record does not come from one stage

Extra cases /1

The Cambrian and the Ordovician are problematic.

The Cambrian

Collections were assigned by Na Lin to the 10 stages (Na and Kiessling 2015) .

# load data
load(url("https://github.com/divDyn/ddPhanero/raw/master/data/Stratigraphy/2018-08-31/cambStrat.RData"))

# correct it with this function
source("https://github.com/divDyn/ddPhanero/raw/master/scripts/strat/2018-08-31/cambProcess.R")

Note

This is only necessary if you do Phanerozoic scale analyses!

Extra cases /2

The Cambrian and the Ordovician are problematic.

The Ordovician

Assigments based on formations and biozone by Wolfgang Kiessling.

# load data
load(url("https://github.com/divDyn/ddPhanero/raw/master/data/Stratigraphy/2018-08-31/ordStrat.RData"))

# correct it with this function
source("https://github.com/divDyn/ddPhanero/raw/master/scripts/strat/2019-05-31/ordProcess.R")

Note

This is only necessary if you do Phanerozoic scale analyses!

We are done with the preparation!

Note

  • You do not need to repeat all these stages, when you focus on a shorter interval of time or a specific taxa. Think about whether these are necessary!
  • The smaller subset you look at, the weirder you data behaves.
  • There are other ways of doing this more to come in the future…

Basic sampling patterns

The number of collections and number of records (occurrences) are very much correlated with sampling.

These simple statistics can be calculated with basic functions:

table(dat$stg)

    4     5     6     7     8     9    10    11    12    13    14    15    16 
 1342  1819  5675  2659  3774   921  2361  1203  2577  1804  6358  5819  4568 
   17    18    19    20    21    22    23    24    25    26    27    28    29 
14917 15557 33145  3753  4654  3215  5779  6943  3879  2583  4769  4016  7659 
   30    31    32    33    34    35    36    37    38    39    40    41    42 
 5063  7604  7073 21890 10543  6820  6392 13199  4514  3785  6446  2872  4843 
   43    44    45    46    47    48    49    50    51    52    53    54    55 
 5481  8078  8113 14308 10834 11032  8928 18581 16167  7757 10126 14275  5814 
   56    57    58    59    60    61    62    63    64    65    66    67    68 
10966  8621  5789  3411  6417 17253 15271  3946  5809  6617  9016 10134  6795 
   69    70    71    72    73    74    75    76    77    78    79    80    81 
 6810  3333  6997  5370  3604  7155 10670 13563  8292  2261  4906 14001 35958 
   82    83    84    85    86    87    88    89    90    91    92    93    94 
 7114  6059  7960  9033 12078 10482  8202  7017 14800 14362 20393 21124 23677 
   95 
 3416 

Sampling

To make these even faster and convenient, the binstat() function can calculate them all at once:

sampStg<- binstat(dat, bin="stg", tax="clgen", coll="collection_no", 
        duplicates=FALSE)  

Note

duplicates=FALSE makes sure that multiple occurrences of the same genus in the same collection (i.e. when there are multiple species) are counted as one!

The result of binstat()

str(sampStg)
'data.frame':   95 obs. of  8 variables:
 $ stg     : num  1 2 3 4 5 6 7 8 9 10 ...
 $ occs    : num  NA NA NA 1047 1454 ...
 $ colls   : num  NA NA NA 217 225 755 639 813 257 460 ...
 $ SIBs    : num  NA NA NA 157 223 637 462 513 278 324 ...
 $ occ1    : num  NA NA NA 74 89 224 189 194 136 121 ...
 $ occ2    : num  NA NA NA 14 34 100 73 72 51 58 ...
 $ u       : num  NA NA NA 0.929 0.939 ...
 $ chao1occ: num  NA NA NA 353 339 ...
  • By default, there are as many rows as the highest integer in the bin argument (95).
  • There are missing values in the beginning. This is can help with plotting.
  • Better: plot the values in the context of the timescale.

Unifying the output

Easiest: merge the results with the time scale object.

sampling <- merge(stages, sampStg, by="stg")
str(sampling)
'data.frame':   95 obs. of  20 variables:
 $ stg      : int  1 2 3 4 5 6 7 8 9 10 ...
 $ sys      : chr  "E" "E" "E" "Cm" ...
 $ system   : chr  "Ediacaran" "Ediacaran" "Ediacaran" "Cambrian" ...
 $ series   : chr  "Upper Ediacaran" "Upper Ediacaran" "Upper Ediacaran" "Terreneuvian" ...
 $ stage    : chr  "Avalon Assemblage" "White Sea Assemblage" "Nama Assemblage" "Fortunian" ...
 $ short    : chr  "Ava" "Whs" "Nam" "For" ...
 $ bottom   : num  580 560 550 539 529 ...
 $ mid      : num  570 555 544 534 525 ...
 $ top      : num  560 550 539 529 521 ...
 $ dur      : num  20 10 11.2 9.8 8 6.5 5.5 4.5 4 3.5 ...
 $ systemCol: chr  "#fddd88" "#fddd88" "#fddd88" "#94ac72" ...
 $ seriesCol: chr  "#fddd88" "#fddd88" "#fddd88" "#9db989" ...
 $ col      : chr  "#fcd589" "#fdd587" "#fed583" "#a9be93" ...
 $ occs     : num  NA NA NA 1047 1454 ...
 $ colls    : num  NA NA NA 217 225 755 639 813 257 460 ...
 $ SIBs     : num  NA NA NA 157 223 637 462 513 278 324 ...
 $ occ1     : num  NA NA NA 74 89 224 189 194 136 121 ...
 $ occ2     : num  NA NA NA 14 34 100 73 72 51 58 ...
 $ u        : num  NA NA NA 0.929 0.939 ...
 $ chao1occ : num  NA NA NA 353 339 ...

Plotting

  1. We create an empty canvas to plot on.
tsplot(stages, boxes="sys", shading="sys", xlim=4:95, ylim=c(0,20000), 
    ylab="Number of Records")

The number of records

# occurrences
lines(sampling$mid, sampling$occs, lwd=2)

The number of collections

# collections
lines(sampling$mid, sampling$colls, lwd=2, col="blue")

Adding a legend

legend("topleft", bg="white", legend=c("occurrences", "collections"), 
    col=c("black", "blue"), lwd=2, inset=c(0.05,0.01), cex=1.3)

Sampling is heterogeneous

Strong correlation between collection and record numbers

cor.test(sampling$occs, sampling$colls, method="spearman")

    Spearman's rank correlation rho

data:  sampling$occs and sampling$colls
S = 15127, p-value < 2.2e-16
alternative hypothesis: true rho is not equal to 0
sample estimates:
     rho 
0.883432 

Main function: divDyn()

Calculate basic metrics of diversity dynamics:

dd <- divDyn(dat, bin="stg", tax="clgen")
str(dd)
'data.frame':   95 obs. of  31 variables:
 $ stg      : num  1 2 3 4 5 6 7 8 9 10 ...
 $ t2d      : num  NA NA NA NA 92 123 204 163 137 97 ...
 $ t2u      : num  NA NA NA 92 123 204 163 137 97 79 ...
 $ t3       : num  NA NA NA NA 53 59 67 50 43 24 ...
 $ tPart    : num  NA NA NA NA 14 2 23 13 19 5 ...
 $ tGFd     : num  NA NA NA NA NA 4 4 13 5 6 ...
 $ tGFu     : num  NA NA NA NA 1 10 11 6 9 9 ...
 $ tSing    : num  NA NA NA 49 51 334 147 219 71 116 ...
 $ tOri     : num  NA NA NA 108 80 166 109 104 54 86 ...
 $ tExt     : num  NA NA NA NA 36 65 128 116 86 76 ...
 $ tThrough : num  NA NA NA NA 72 87 125 118 136 114 ...
 $ divSIB   : num  NA NA NA 157 223 637 462 513 278 324 ...
 $ divCSIB  : num  NA NA NA NA 222 ...
 $ divRT    : num  NA NA NA 157 239 652 509 557 347 392 ...
 $ divBC    : num  NA NA NA NA 108 152 253 234 222 190 ...
 $ extProp  : num  NA NA NA 0.312 0.364 ...
 $ oriProp  : num  NA NA NA 1 0.548 ...
 $ extPC    : num  NA NA NA NA 0.405 ...
 $ oriPC    : num  NA NA NA NA 0.747 ...
 $ ext3t    : num  NA NA NA NA 0.551 ...
 $ ori3t    : num  NA NA NA NA 0.842 ...
 $ extC3t   : num  NA NA NA NA 0.518 ...
 $ oriC3t   : num  NA NA NA NA NA ...
 $ extGF    : num  NA NA NA NA 0.444 ...
 $ oriGF    : num  NA NA NA NA NA ...
 $ E2f3     : num  NA NA NA NA 0.264 ...
 $ O2f3     : num  NA NA NA NA NA ...
 $ ext2f3   : num  NA NA NA NA 0.307 ...
 $ ori2f3   : num  NA NA NA NA NA ...
 $ samp3t   : num  NA NA NA NA 0.791 ...
 $ sampRange: num  NA NA NA NA 0.778 ...

The output of divDyn()

Returns a data.frame, row of values for every bin. Prefixes:

  • t: taxon counts
  • div: diversity counts
  • ext: extinction metrics
  • ori: origination metrics
  • samp: sampling metrics

See: ?divDyn for more…

Range-based metrics of diversity dynamics

Range-through diversity

divRT = tSing(FL) + tExt(bL) + tOri(Ft) + tThrough(bt)

Boundary-crosser diversity

divBC = tSing(FL) + tExt(bL) + tOri(Ft) + tThrough(bt)

Per capita rates (M. Foote 1999)

\(oriPC = -log( tOri / ( tOri + tThrough ))\).
\(extPC = -log( tExt / ( tExt + tThrough ))\).

  • Time homogeneous process in bin, Decay rate
  • Originally divided by \(\Delta t\)
  • Assumption about pulsedness (Michael Foote 2005)

Plotting preparation

To make plotting easy:

patterns <- merge(stages, dd, by="stg")
str(patterns)
'data.frame':   95 obs. of  43 variables:
 $ stg      : int  1 2 3 4 5 6 7 8 9 10 ...
 $ sys      : chr  "E" "E" "E" "Cm" ...
 $ system   : chr  "Ediacaran" "Ediacaran" "Ediacaran" "Cambrian" ...
 $ series   : chr  "Upper Ediacaran" "Upper Ediacaran" "Upper Ediacaran" "Terreneuvian" ...
 $ stage    : chr  "Avalon Assemblage" "White Sea Assemblage" "Nama Assemblage" "Fortunian" ...
 $ short    : chr  "Ava" "Whs" "Nam" "For" ...
 $ bottom   : num  580 560 550 539 529 ...
 $ mid      : num  570 555 544 534 525 ...
 $ top      : num  560 550 539 529 521 ...
 $ dur      : num  20 10 11.2 9.8 8 6.5 5.5 4.5 4 3.5 ...
 $ systemCol: chr  "#fddd88" "#fddd88" "#fddd88" "#94ac72" ...
 $ seriesCol: chr  "#fddd88" "#fddd88" "#fddd88" "#9db989" ...
 $ col      : chr  "#fcd589" "#fdd587" "#fed583" "#a9be93" ...
 $ t2d      : num  NA NA NA NA 92 123 204 163 137 97 ...
 $ t2u      : num  NA NA NA 92 123 204 163 137 97 79 ...
 $ t3       : num  NA NA NA NA 53 59 67 50 43 24 ...
 $ tPart    : num  NA NA NA NA 14 2 23 13 19 5 ...
 $ tGFd     : num  NA NA NA NA NA 4 4 13 5 6 ...
 $ tGFu     : num  NA NA NA NA 1 10 11 6 9 9 ...
 $ tSing    : num  NA NA NA 49 51 334 147 219 71 116 ...
 $ tOri     : num  NA NA NA 108 80 166 109 104 54 86 ...
 $ tExt     : num  NA NA NA NA 36 65 128 116 86 76 ...
 $ tThrough : num  NA NA NA NA 72 87 125 118 136 114 ...
 $ divSIB   : num  NA NA NA 157 223 637 462 513 278 324 ...
 $ divCSIB  : num  NA NA NA NA 222 ...
 $ divRT    : num  NA NA NA 157 239 652 509 557 347 392 ...
 $ divBC    : num  NA NA NA NA 108 152 253 234 222 190 ...
 $ extProp  : num  NA NA NA 0.312 0.364 ...
 $ oriProp  : num  NA NA NA 1 0.548 ...
 $ extPC    : num  NA NA NA NA 0.405 ...
 $ oriPC    : num  NA NA NA NA 0.747 ...
 $ ext3t    : num  NA NA NA NA 0.551 ...
 $ ori3t    : num  NA NA NA NA 0.842 ...
 $ extC3t   : num  NA NA NA NA 0.518 ...
 $ oriC3t   : num  NA NA NA NA NA ...
 $ extGF    : num  NA NA NA NA 0.444 ...
 $ oriGF    : num  NA NA NA NA NA ...
 $ E2f3     : num  NA NA NA NA 0.264 ...
 $ O2f3     : num  NA NA NA NA NA ...
 $ ext2f3   : num  NA NA NA NA 0.307 ...
 $ ori2f3   : num  NA NA NA NA NA ...
 $ samp3t   : num  NA NA NA NA 0.791 ...
 $ sampRange: num  NA NA NA NA 0.778 ...

Plotting - empty canvas

tsplot(stages, boxes="sys", shading="sys", xlim=4:95, ylim=c(0,4000), 
    ylab="Richness")

Plotting divRT

lines(patterns$mid, patterns$divRT)

Plotting divBC

lines(patterns$mid, patterns$divBC, col="blue")

Plotting rates (canvas)

tsplot(stages, boxes="sys", shading="sys", xlim=4:95, ylim=c(0,1.5), 
    ylab="Taxonomic rates")

Plotting rates (lines)

lines(patterns$mid, patterns$oriPC, col="green")
lines(patterns$mid, patterns$extPC, col="red")

Plotting rates (legend)

legend("topleft", bg="white", legend=c("origination", "extinction"), 
    col=c("green", "red"), lwd=2, inset=c(0.05,0.01), cex=1.3)

Can you name the main events and patterns?

Rates - main features

  • Decline of rates
  • Mass extinctions + (rebounds)
  • Edge effects

Issues

Observation (occurrence)-based metrics

Sampled-in-Bin diversity

  • divSIB
  • The number of sampled taxa in the focal bin.
  • Very vulnerable to the sampling universe effect

Plotting SIB diversity (previous)

Previous code:

tsplot(stages, boxes="sys", shading="sys", xlim=4:95, ylim=c(0,4000), 
    ylab="Richness")
lines(patterns$mid, patterns$divRT)
lines(patterns$mid, patterns$divBC, col="blue")

Plotting SIB diversity (line)

The SIB diversity:

lines(patterns$mid, patterns$divSIB, col="gray")

Plotting SIB diversity (legend)

The SIB diversity:

legend("topleft", bg="white", legend=c("divRT", "divBC", "divSIB"), 
    col=c("black", "blue", "gray"), lwd=2, inset=c(0.05,0.01), cex=1.3)

Three-timer sampling completeness

  • samp3T = t3 / (t3 + tPart)
  • \([0,1]\) - How well-sampled is the interval?

Plotting samp3t (canvas)

tsplot(stages, boxes="sys", shading="sys", xlim=4:95, ylim=c(0,1), 
    ylab="Sampling completeness")

Plotting samp3t (line)

lines(patterns$mid, patterns$samp3t)

Corrected SIB diversity (divCSIB)

  • Use samp3t to correct divSIB
  • divCSIV = divSIB / samp3t * samp3tTotal

Plotting CSIB diversity (previous)

tsplot(stages, boxes="sys", shading="sys", xlim=4:95, ylim=c(0,4000), 
    ylab="Richness")
lines(patterns$mid, patterns$divRT)
lines(patterns$mid, patterns$divBC, col="blue")
lines(patterns$mid, patterns$divSIB, col="gray")

Plotting CSIB diversity (line)

lines(patterns$mid, patterns$divCSIB, col="purple")

Plotting CSIB diversity (legend)

legend("topleft", bg="white", legend=c("divRT", "divBC", "divSIB", "divCSIB"), 
    col=c("black", "blue", "gray", "purple"), lwd=2, inset=c(0.05,0.01), cex=1.3)

Diversity comparison

  • Range-based values are higher and less volatile (smoother)
  • Face-value results: divRT
  • Truest method: divCSIB (with subsampling)

Observation-based rates

  • Purpose: get rid of the smearing effects
  • Approximate the per capita rates, when sampling is complete
  1. Three-timer rates (Alroy 1996)
  2. Corrected three-timer rates (Alroy 2008)
  3. Gap-filler rates (Alroy 2014)
  4. Second-for-third substitution rates (Alroy 2015)
  • Wider moving windows, more complex patterns

Three-timer rates 3t

Per capita rates in a 3-bin window.

(Alroy 1996)

ext3t = log( t2d / t3)
ori3t = log( t2u / t3)

Corrected Three-timer rates C3t

Improvement: (Alroy 2008)

\(extC3t = log( t2d / t3 ) + log( samp3t[i+1] )\)
\(oriC3t = log( t2u / t3 ) - log( samp3t[i-1] )\)

Gap-filler GF and 2f3 rates

Gap filler equations (Alroy 2014): \(extC3t = log((t2d + tPart)/(t3 + tPart + tGFu))\)
\(oriC3t = log((t2u + tPart)/(t3 + tPart + tGFd))\)

Improvements in (Alroy 2015). These are algorithmic solutions. These are the state-of-the-art rate metrics.

ext2f3
ori2f3

Comparison of extinction rates (canvas)

tsplot(stages, boxes="sys", shading="sys", xlim=4:95, ylim=c(0,1.5), 
    ylab="Extinction rates")

Comparison of extinction rates (lines)

lines(patterns$mid, patterns$ext3t, col="black")
lines(patterns$mid, patterns$extC3t, col="orange")
lines(patterns$mid, patterns$extGF, col="cyan")
lines(patterns$mid, patterns$ext2f3, col="darkred")

Comparison of extinction rates (legend)

legend("topleft", bg="white", legend=c("3t", "C3t", "GF", "2f3"), 
    col=c("black", "orange", "cyan", "darkred"), lwd=2, inset=c(0.05,0.01), cex=1.3)

Rate types - comparison

  • Different metrics converge, but they vary!
  • No edge or smearing effects with the occurrence-based ones
  • They omit lots of data, very unprecise: negative values!

Rate General guidlines

Use

PC: General assessment/raw patterns
2f3: When possible, rate estimation at MEs

Subsampling of samples in divDyn

Why

  • Comparing samples with different quantity of information
  • What would our results look like if we had the same quantity of information?

Miller and Foote (1996)

Two main types

Classical Rarefaction (CR)

  • Analytical and algorithmic approach
  • target: a particular number of records
  • By-list subsampling

Shareholder Quorum Subsampling (SQS)

  • Based on coverage
  • Target is a certain level of quorum (coverage)

Algorithmic solution: SQS (Alroy 2010)
Analytical solution: Coverage-based rarefaction (Chao and Jost 2012)

Implementation in divDyn

  • Subsampling for one sample is the first step
  • Example implementation on Evolv-ED

Subsampling with Phanerozoic Marine Animals

Sampling’s effects (occurrences)

cor.test(sampling$occs, patterns$divCSIB, method="spearman")

    Spearman's rank correlation rho

data:  sampling$occs and patterns$divCSIB
S = 29933, p-value < 2.2e-16
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.7536105 

Massive correlation between sampling and diversity in a bin!

Sampling’s effects (collections)

cor.test(sampling$colls, patterns$divCSIB, method="spearman")

    Spearman's rank correlation rho

data:  sampling$colls and patterns$divCSIB
S = 54684, p-value = 3.46e-08
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.5498704 

Massive correlation between sampling and diversity in a bin!

Subsampling in a time-bin context


Kocsis et al. (2019)

  1. Subsample every bin
  2. Calculate a statistic in mind - run a function on the data, e.g. divDyn()
  3. Store the result
  4. Iterate 1-3 a number of times

CR with Phanerozoic-scale data

crStg<-subsample(dat, bin="stg", tax="clgen", coll="collection_no", q=1000, 
    iter=5, duplicates=FALSE, na.rm=TRUE)

Arguments

arg meaning
q Subsampling quota
iter Number of iterations
duplicates Multiple records of the same genus in a list?
coll What is a list (for duplicates)
na.rm Omit NA bin assignments automatically

CR with Phanerozoic-scale data (canvas)

Prepartion for plotting

cr <- merge(stages, crStg, by="stg")

Timescale plot

tsplot(stages, boxes="sys", shading="sys", xlim=4:95, ylim=c(0,4000), 
    ylab="Richness")

Raw diversity curves (RT an CSIB)

lines(patterns$mid, patterns$divRT)
lines(patterns$mid, patterns$divCSIB, col="purple")

Adding subsampling results

lines(cr$mid, cr$divRT, lty=2)
lines(cr$mid, cr$divCSIB, col="purple", lty=2)

Did it solve the problem?

Not entirely…

No subsampling

cor.test(sampling$colls, patterns$divCSIB, method="spearman")

    Spearman's rank correlation rho

data:  sampling$colls and patterns$divCSIB
S = 54684, p-value = 3.46e-08
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.5498704 

CR subsampling

cor.test(sampling$colls, cr$divCSIB, method="spearman")

    Spearman's rank correlation rho

data:  sampling$colls and cr$divCSIB
S = 74542, p-value = 0.002568
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.3207152 

… but it is better!

Occurrence weighted By-list subsampling

  • Another version of rarefaction
  • Keeps lists (collections) intact
  • People call it \(O^xW\)
  • Largely superseeded by SQS
arg use
type the type, here oxw
xexp=0 q: coll No.
xexp=1 q: occ No.
oxwStg <- subsample(dat, bin="stg", tax="clgen", coll="collection_no", 
    q=200, type="oxw", xexp=0, iter=10, duplicates=FALSE, na.rm=TRUE)

Issues with plain CR and by-list

  • sampling of rare genera is disproportionately low
  • decrease in curve variance as the quota decreases (1000, 800, 600, 400)

Shareholder quorum subsampling

sqsStg<-subsample(dat, bin="stg", tax="clgen", 
    coll="collection_no", q=0.7, iter=10, 
    ref="reference_no",singleton="ref", type="sqs", duplicates=FALSE, 
    excludeDominant=TRUE, largestColl =TRUE, na.rm=TRUE)
arg meaning
q target quorum
type subsamplinng type
ref reference columns
singleton What is a singleton based on?
excludeDominant tiny correction by Alroy (2010)
largestColl tiny correction by Alroy (2010)

Prepare to plot

Merge with timescale

sqs <- merge(stages, sqsStg, by="stg")

Timescale

tsplot(stages, boxes="sys", shading="sys", xlim=4:95, ylim=c(0,4000), 
    ylab="Richness")

SQS vs CR vs raw

lines(patterns$mid, patterns$divCSIB, col="purple")
lines(cr$mid, cr$divCSIB, col="purple", lty=2)
lines(sqs$mid, sqs$divCSIB, col="purple", lty=3)

Guidelines

  • Use subsampling to test the robustness of your results!
  • Use as high iter as necessary to stabilize the results (usually above 100)
  • Can be used to do rates as well! (use CR!)

The coral example

Details

  • Material available online on Evolv-ED
  • Go through this at your own pace
  • Update the chronosphere with
install.packages("chronosphere")

Shout if you have any issues!

References

Alroy, John. 1996. “Constant Extinction, Constrained Diversification, and Uncoordinated Stasis in North American Mammals.” Palaeogeography, Palaeoclimatology, Palaeoecology 127: 285–311.
———. 2008. “Dynamics of Origination and Extinction in the Marine Fossil Record.” Proceedings of the National Academy of Science 105: 11536–42.
———. 2010. “The Shifting Balance of Diversity Among Major Marine Animal Groups.” Science 329: 1191–94. https://doi.org/10.1126/science.1189910.
———. 2014. “Accurate and Precise Estimates of Origination and Extinction Rates.” Paleobiology 40 (3): 374–97. https://doi.org/10.1666/13036.
———. 2015. “A More Precise Speciation and Extinction Rate Estimator.” Paleobiology 41 (04): 633–39. https://doi.org/10.1017/pab.2015.26.
Chao, Anne, and Lou Jost. 2012. “Coverage-Based Rarefaction and Extrapolation: Standardizing Samples by Completeness Rather Than Size.” Ecology 93 (12): 2533–47. https://doi.org/10.1890/11-1952.1.
Foote, M. 1999. “Morphological Diversity In The Evolutionary Radiation Of Paleozoic and Post-Paleozoic Crinoids.” Paleobiology 25 (S2): 1–115. https://doi.org/10.1017/S0094837300020236.
Foote, Michael. 2005. “Pulsed Origination and Extinction in the Marine Realm.” Paleobiology 31 (1): 6–20.
Kocsis, Ádám T., Carl J. Reddin, John Alroy, and Wolfgang Kiessling. 2019. “The R Package divDyn for Quantifying Diversity Dynamics Using Fossil Sampling Data.” Methods in Ecology and Evolution 10 (5): 735–43. https://doi.org/10.1111/2041-210X.13161.
MacArthur, Robert, and Edward O. Wilson. 1963. “An Equilibrium Theroy of Insular Zoogeography.” Evolution 17 (4): 373–87.
Miller, Arnold I., and Michael Foote. 1996. “Calibrating the Ordovician Radiation of Marine Life: Implications for Phanerozoic Diversity Trends.” Paleobiology 22 (2): 304–9.
Na, Lin, and Wolfgang Kiessling. 2015. “Diversity Partitioning During the Cambrian Radiation.” Proceedings of the National Academy of Sciences 112 (15): 4702–6. https://doi.org/10.1073/pnas.1424985112.
Sepkoski, J. John. 1997. “Biodiversity: Past, Present, and Future.” Journal of Paleontology 71 (4): 533–39. https://doi.org/10.1017/S0022336000040026.
Sepkoski, J. John, Jr. 1978. “A Kinetic Model of Phanerozoic Taxonomic Diversity I. Analysis of Marine Orders.” Paleobiology 4: 223–51. https://doi.org/10.2307/2400203.
Signor III, P. W., and J. H. Lipps. 1982. “Sampling Bias, Gradual Extinction Patterns and Catastrophesin the Fossil Record.” Geological Society of America Special Paper 190: 291–96.