1 - A quick intro to machine learning

Analytical Paleobiology Workshop 2022

Who are you?

  • You know a bit about R.

  • You have exposure to basic statistical concepts.

  • You do not need intermediate or expert familiarity with modeling or ML

Who am I?

Asking for help

🎀 “I’m stuck and need help!”

🟨 “I finished the exercise”

Plan for this workshop

  • Your data budget
  • What makes a model
  • Examples of model types
  • Evaluating models
  • Feature engineering
  • Tuning hyperparameters
  • Wrapping up!

What is machine learning?

Your turn

How are statistics and machine learning related?

How are they similar? Different?

Statistics draws population inferences from a sample, and machine learning finds generalisable predictive patterns.

Read more about it here.
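
To make the contrast concrete, here is a minimal sketch in base R. It uses the built-in mtcars data purely for illustration (it is not the workshop's coral data), and the 75/25 split and the seed are arbitrary assumptions. The first half asks an inferential question about a slope; the second half asks how well the same kind of model predicts rows it has not seen.

```r
# Illustrative only: built-in mtcars data, not the workshop data set.
set.seed(2022)  # arbitrary seed so the random split is reproducible

# Statistics: fit to the whole sample and draw an inference about the population.
fit_all <- lm(mpg ~ wt, data = mtcars)
wt_ci <- confint(fit_all)["wt", ]  # 95% confidence interval for the slope of wt

# Machine learning: hold some rows out and judge the model by predictive error.
n <- nrow(mtcars)
train_idx <- sample(n, size = floor(0.75 * n))  # assumed 75/25 split
train <- mtcars[train_idx, ]
test <- mtcars[-train_idx, ]

fit_train <- lm(mpg ~ wt, data = train)
pred <- predict(fit_train, newdata = test)
rmse <- sqrt(mean((test$mpg - pred)^2))  # error on rows the model never saw

print(wt_ci)
print(rmse)
```

The model is the same in both halves; what changes is the question being asked and how the answer is judged.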

The extinction risk of corals worldwide

Raja et al. (2022, Global Ecology and Biogeography)

Predicting the extinction risk of corals worldwide

Our machine learning framework for today

Use H2O directly from R

Copy and paste these commands into R one line at a time:


# The following two commands remove any previously installed H2O packages for R.
if ("package:h2o" %in% search()) { detach("package:h2o", unload=TRUE) }
if ("h2o" %in% rownames(installed.packages())) { remove.packages("h2o") }

# Next, we download packages that H2O depends on.
pkgs <- c("RCurl", "jsonlite")
for (pkg in pkgs) {
  if (!(pkg %in% rownames(installed.packages()))) { install.packages(pkg) }
}

# Now we download and install the H2O package for R.
install.packages("h2o", type="source", repos="https://h2o-release.s3.amazonaws.com/h2o/rel-zumbo/4/R")

# Finally, let's load H2O and start up an H2O cluster
library(h2o)

# NOTE: You may get an error while running the following code asking you to install Java. Follow the link provided.

h2o.init(nthreads = 1,          # number of threads (cores) H2O may use
         max_mem_size = "1G")   # maximum memory to allocate to H2O
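
Once the cluster has started, a quick round trip can confirm that R and H2O are talking to each other. The sketch below is one way to check, not part of the original instructions: it assumes the h2o package installed above and a working Java installation, and it uses the built-in iris data only as a stand-in.

```r
library(h2o)

# Connects to the running cluster, or starts one if none is running.
h2o.init(nthreads = 1, max_mem_size = "1G")

# Print cluster details such as version, memory, and number of cores.
h2o.clusterInfo()

# Copy a small built-in data set into the cluster and read its size back.
iris_h2o <- as.h2o(iris)
n_rows <- nrow(iris_h2o)
print(n_rows)

# Shut the cluster down when you are finished to free the memory.
h2o.shutdown(prompt = FALSE)
```

If the row count matches nrow(iris), the data made it into the cluster intact.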