Week 12 Tree-Based Models in R
This week, we focus on tree-based models and their implementation in R. For more advanced readers, Prasad, Iverson, and Liaw (2006) and Gries (2021) are recommended resources on tree-based modeling. Strobl, Malley, and Tutz (2009) and Breiman (2001) are very good papers that address many critical issues related to tree-based models. The aim of this week is to show how to implement and perform basic tree-based modeling and classification using R.
Preparation and session set up
For this week, we need to install a number of R packages so that the scripts shown below run without errors. Before turning to the rest of the code, please install the packages by running the code below this paragraph. Installing all of them may take between 1 and 5 minutes, so you do not need to worry if it takes some time.
# install packages
install.packages("Boruta")
install.packages("tree")
install.packages("caret")
install.packages("cowplot")
install.packages("tidyverse")
install.packages("ggparty")
install.packages("Gmisc")
# grid ships with base R and does not need to be installed
install.packages("Hmisc")
install.packages("party")
install.packages("partykit")
install.packages("randomForest")
#install.packages("Rling")
install.packages("pdp")
install.packages("tidyr")
install.packages("RCurl")
install.packages("vip")
install.packages("flextable")
# install klippy for copy-to-clipboard button in code chunks
install.packages("remotes")
remotes::install_github("rlesur/klippy")
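If some of these packages are already on your machine, reinstalling all of them is unnecessary. The following is a minimal sketch, not part of the original tutorial, that installs only the CRAN packages that are still missing. The helper names `missing_packages` and `install_missing` are hypothetical, grid is left out because it is part of base R, and klippy is not covered because the tutorial installs it from GitHub.

```r
# CRAN packages used this week (grid is omitted: it is part of base R)
pkgs <- c("Boruta", "tree", "caret", "cowplot", "tidyverse", "ggparty",
          "Gmisc", "Hmisc", "party", "partykit", "randomForest", "pdp",
          "tidyr", "RCurl", "vip", "flextable", "remotes")

# hypothetical helper: return the packages in `pkgs` that are not yet installed
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# hypothetical helper: install only what is missing;
# the length check avoids calling install.packages() with an empty vector
install_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}
```

Calling `install_missing(pkgs)` once should then install whatever is not yet available and skip the rest.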
Now that we have installed the packages, we can activate them as shown below.
# set options
options(stringsAsFactors = FALSE)
options(scipen = 999)
options(max.print=10000)
# load packages
library(Boruta)
library(tree)
library(caret)
library(cowplot)
library(tidyverse)
library(ggparty)
library(Gmisc)
library(grid)
library(Hmisc)
library(party)
library(partykit)
library(randomForest)
#library(Rling)
library(pdp)
library(RCurl)
library(tidyr)
library(vip)
library(flextable)
# activate klippy for copy-to-clipboard button
klippy::klippy()
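To confirm that the session is set up as intended, it can help to check which packages are actually attached. The snippet below is a small sketch under that assumption; the helper name `not_attached` is hypothetical and not part of any package.

```r
# hypothetical helper: return the packages in `pkgs` that are not attached
# to the current session; .packages() lists the attached packages
not_attached <- function(pkgs) {
  setdiff(pkgs, .packages())
}

# example: check two of the packages loaded above
not_attached(c("party", "randomForest"))
```

An empty result (`character(0)`) means that every package you asked about is attached; any name that is returned still needs a `library()` call.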
NOTE
In some cases, installing the caret package can be a bit more complicated. In my case, it was necessary to execute the code chunk shown below. However, once the caret package is installed, you do not need to go through these steps again and can simply activate it by calling library(caret).
# install caret library
# biocLite() has been retired by Bioconductor; BiocManager is its replacement
install.packages("BiocManager")
BiocManager::install("Biobase")
library(Biobase)
install.packages("dimRed", dependencies = TRUE)
install.packages("caret", dependencies = TRUE)
# activate caret library
library(caret)
Once you have installed R and RStudio and have initiated the session by executing the code shown above, you are good to go.