Analytical Paleobiology Workshop 2022
h2o.performance(mod_glm2)
#> H2ORegressionMetrics: glm
#> ** Reported on training data. **
#>
#> MSE: 4.789876
#> RMSE: 2.188578
#> MAE: 1.578086
#> RMSLE: NaN
#> Mean Residual Deviance : 4.789876
#> R^2 : 0.5362487
#> Null Deviance :30396.91
#> Null D.o.F. :2942
#> Residual Deviance :14096.6
#> Residual D.o.F. :2933
#> AIC :12984.09
h2o.performance(mod_glm2, newdata = ring_test)
#> H2ORegressionMetrics: glm
#>
#> MSE: 4.865688
#> RMSE: 2.20583
#> MAE: 1.591277
#> RMSLE: 0.179791
#> Mean Residual Deviance : 4.865688
#> R^2 : 0.538475
#> Null Deviance :13015.45
#> Null D.o.F. :1233
#> Residual Deviance :6004.259
#> Residual D.o.F. :1224
#> AIC :5476.385
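The summary metrics in these printouts are tied together by simple arithmetic. The base R sketch below recomputes three of them from the printed training values. It assumes a Gaussian GLM with an intercept, so that the row count is the null degrees of freedom plus one, MSE is the residual deviance divided by the row count, and R^2 is one minus the ratio of residual to null deviance. These relationships are assumptions about how the summary was produced, not something stated on the slides.

```r
# Values copied from the training-set printout above
null_dev  <- 30396.91
resid_dev <- 14096.6
null_dof  <- 2942

n    <- null_dof + 1              # rows in the training set (assumes an intercept)
mse  <- resid_dev / n             # mean residual deviance
rmse <- sqrt(mse)                 # root mean squared error
r2   <- 1 - resid_dev / null_dev  # share of null deviance explained

round(c(MSE = mse, RMSE = rmse, R2 = r2), 4)
```

The same arithmetic can be repeated with the test-set deviances to check the second printout.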
⚠️ DANGERS OF OVERFITTING ⚠️
h2o.performance(mod_glm2, newdata = ring_train)
#> H2ORegressionMetrics: glm
#>
#> MSE: 4.789876
#> RMSE: 2.188578
#> MAE: 1.578086
#> RMSLE: NaN
#> Mean Residual Deviance : 4.789876
#> R^2 : 0.5362487
#> Null Deviance :30396.91
#> Null D.o.F. :2942
#> Residual Deviance :14096.6
#> Residual D.o.F. :2933
#> AIC :12984.09
We call this “resubstitution” or “repredicting the training set”.
The resulting values are called “resubstitution estimates”.
h2o.performance(mod_glm2, newdata = ring_train)
#> H2ORegressionMetrics: glm
#>
#> MSE: 4.789876
#> RMSE: 2.188578
#> MAE: 1.578086
#> RMSLE: NaN
#> Mean Residual Deviance : 4.789876
#> R^2 : 0.5362487
#> Null Deviance :30396.91
#> Null D.o.F. :2942
#> Residual Deviance :14096.6
#> Residual D.o.F. :2933
#> AIC :12984.09
h2o.performance(mod_glm2, newdata = ring_test)
#> H2ORegressionMetrics: glm
#>
#> MSE: 4.865688
#> RMSE: 2.20583
#> MAE: 1.591277
#> RMSLE: 0.179791
#> Mean Residual Deviance : 4.865688
#> R^2 : 0.538475
#> Null Deviance :13015.45
#> Null D.o.F. :1233
#> Residual Deviance :6004.259
#> Residual D.o.F. :1224
#> AIC :5476.385
⚠️ Don’t use the test set until the end of your modeling analysis
Compute the metrics for both training and testing data.
Notice the evidence of overfitting, if any! ⚠️
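One simple way to look for overfitting is to put the training and test RMSE side by side and measure the gap. The sketch below uses the two RMSE values printed above; the variable names are made up for illustration. A test error far above the training error would point to overfitting, while a small gap suggests little of it.

```r
# RMSE values copied from the printouts above
rmse_train <- 2.188578
rmse_test  <- 2.20583

gap     <- rmse_test - rmse_train   # absolute difference
rel_gap <- gap / rmse_train         # difference relative to the training error

cat(sprintf("Test RMSE exceeds training RMSE by %.4f (%.2f%%)\n",
            gap, 100 * rel_gap))
```

For this linear model the relative gap is under 1%, so the evidence of overfitting is weak; more flexible models usually show a larger gap.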
What if we want to compare more models?
And/or more model configurations?
And we want to understand if these are important differences?
If we use 10 folds, what percent of the training data is used for fitting, and what percent is held out, in each fold?
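The fold arithmetic can be worked out directly. The sketch below assumes 2943 training rows (the null degrees of freedom printed earlier, plus one) and assigns rows to folds in simple round-robin order; real tools shuffle the rows first, and all names here are illustrative.

```r
n <- 2943   # training rows, taken from Null D.o.F. + 1 in the printouts
k <- 10     # number of folds

# Round-robin fold assignment: 1, 2, ..., 10, 1, 2, ...
fold_id <- rep(seq_len(k), length.out = n)

heldout_n <- as.integer(table(fold_id))  # rows held out in each fold
fit_n     <- n - heldout_n               # rows used for fitting in each fold

data.frame(
  fold        = seq_len(k),
  fit_pct     = round(100 * fit_n / n, 1),
  heldout_pct = round(100 * heldout_n / n, 1)
)
```

With 10 folds, each model is fit on roughly 90% of the training rows and evaluated on the remaining 10% or so.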
What is in this?
h2o.performance(mod_glm3, newdata = ring_train)
#> H2ORegressionMetrics: glm
#>
#> MSE: 4.789876
#> RMSE: 2.188578
#> MAE: 1.578086
#> RMSLE: NaN
#> Mean Residual Deviance : 4.789876
#> R^2 : 0.5362487
#> Null Deviance :30396.91
#> Null D.o.F. :2942
#> Residual Deviance :14096.6
#> Residual D.o.F. :2933
#> AIC :12984.09
We can reliably measure performance using only the training data 🎉
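To make the idea concrete, here is a minimal base R sketch of 10-fold cross-validation done by hand. It uses simulated data and `lm()` as stand-ins, not the workshop's `ring_train` or h2o's own cross-validation, so the numbers it prints are unrelated to the printouts above. The helper `rmse()` and all variable names are hypothetical.

```r
set.seed(2022)

# Simulated stand-in data (not the workshop data)
n   <- 200
dat <- data.frame(x = runif(n))
dat$y <- 3 + 2 * dat$x + rnorm(n)

rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

# Resubstitution estimate: fit and evaluate on the same rows
fit_all    <- lm(y ~ x, data = dat)
rmse_resub <- rmse(dat$y, predict(fit_all, newdata = dat))

# 10-fold cross-validation: every row is held out exactly once
k       <- 10
fold_id <- sample(rep(seq_len(k), length.out = n))
rmse_cv <- vapply(seq_len(k), function(i) {
  held_out <- fold_id == i
  fit_i    <- lm(y ~ x, data = dat[!held_out, ])
  rmse(dat$y[held_out], predict(fit_i, newdata = dat[held_out, ]))
}, numeric(1))

c(resubstitution = rmse_resub, cv_mean = mean(rmse_cv), cv_sd = sd(rmse_cv))
```

Each fold's RMSE comes from rows the model never saw during fitting, which is why the averaged value is a fairer estimate than the resubstitution one, and the spread across folds gives a sense of its uncertainty.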
How do the metrics from resampling compare to the metrics from training and testing?
The RMSE previously was
Remember that:
⚠️ the training set gives you overly optimistic metrics
⚠️ the test set is precious
h2o.performance(mod_glm3, newdata = ring_test)
#> H2ORegressionMetrics: glm
#>
#> MSE: 4.865688
#> RMSE: 2.20583
#> MAE: 1.591277
#> RMSLE: 0.179791
#> Mean Residual Deviance : 4.865688
#> R^2 : 0.538475
#> Null Deviance :13015.45
#> Null D.o.F. :1233
#> Residual Deviance :6004.259
#> Residual D.o.F. :1224
#> AIC :5476.385
Resampling can involve fitting a lot of models!
These models don’t depend on one another and can be run in parallel
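Because each fold's model is independent of the others, the fits can be handed to separate worker processes. The sketch below illustrates this with base R's `parallel` package on simulated data; it is not how h2o schedules its own work, and the helper `fit_one_fold()` and the choice of two workers are assumptions made for illustration.

```r
library(parallel)

set.seed(2022)

# Simulated stand-in data (not the workshop data)
n   <- 200
dat <- data.frame(x = runif(n))
dat$y <- 3 + 2 * dat$x + rnorm(n)

k       <- 5
fold_id <- sample(rep(seq_len(k), length.out = n))

# Fit on every fold except i and return the RMSE on fold i.
# Everything the function needs is passed in as an argument.
fit_one_fold <- function(i, dat, fold_id) {
  held_out <- fold_id == i
  fit      <- lm(y ~ x, data = dat[!held_out, ])
  pred     <- predict(fit, newdata = dat[held_out, ])
  sqrt(mean((dat$y[held_out] - pred)^2))
}

cl <- makeCluster(2)   # two worker processes; an arbitrary choice
rmse_par <- unlist(parLapply(cl, seq_len(k), fit_one_fold,
                             dat = dat, fold_id = fold_id))
stopCluster(cl)

# Running the same folds one after another gives the same values
rmse_seq <- vapply(seq_len(k), fit_one_fold, numeric(1),
                   dat = dat, fold_id = fold_id)
all.equal(rmse_par, rmse_seq)
```

The fold assignment is created once, before the workers start, so the parallel and sequential runs evaluate exactly the same splits.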
Create:
Don’t forget to set a seed!
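Resampling relies on random fold assignment, so a seed is what makes the splits repeatable. A minimal base R sketch follows; the seed value 123 and the fold sizes are arbitrary choices for illustration.

```r
k <- 10
n <- 50

set.seed(123)
folds_a <- sample(rep(seq_len(k), length.out = n))

set.seed(123)                                        # same seed again
folds_b <- sample(rep(seq_len(k), length.out = n))

folds_c <- sample(rep(seq_len(k), length.out = n))   # no reseed: a new shuffle

identical(folds_a, folds_b)   # same seed, same fold assignment
identical(folds_a, folds_c)   # almost certainly a different assignment
```

Without the seed, rerunning the analysis would put different rows in each fold and the resampled metrics would shift slightly from run to run.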
Which model do you think you would decide to use?
What surprised you the most?
What is one thing you are looking forward to next?