Introduction to Data Science with R

Author

J Kyle Armstrong, PhD

Published

May 18, 2024

Preface

These notes have been developed with:

  • R version 4.4.0
  • Quarto version 1.4.553

R, RStudio, and Quarto are all free to download and use. You can learn more about R and RStudio, and download both, from https://posit.co/download/rstudio-desktop/.

To learn more about Quarto books, visit https://quarto.org/docs/books.

This work was completed using R version 4.4.0 (R Core Team 2024) with the following R packages: Amelia v. 1.8.2 (Honaker, King, and Blackwell 2011), arsenal v. 3.6.3 (Heinzen et al. 2021), caret v. 6.0.94 (Kuhn and Max 2008), class v. 7.3.22 (Venables and Ripley 2002), clue v. 0.3.65 (Hornik 2005, 2023), compare v. 0.2.6 (Murrell 2015), corrplot v. 0.92 (Wei and Simko 2021), data.table v. 1.15.4 (Barrett et al. 2024), doParallel v. 1.0.17 (Corporation and Weston 2022), e1071 v. 1.7.14 (Meyer, Dimitriadou, et al. 2023), factoextra v. 1.0.7 (Kassambara and Mundt 2020), gcookbook v. 2.0 (Chang 2018), GGally v. 2.2.1 (Schloerke et al. 2024), ggbernie v. 1.0 (CODER 2024), ggbiplot v. 0.6.2 (Vu and Friendly 2024), ggdendro v. 0.2.0 (de Vries and Ripley 2024), glmnet v. 4.1.8 (Friedman, Tibshirani, and Hastie 2010; Simon et al. 2011; Tay, Narasimhan, and Hastie 2023), glue v. 1.7.0 (Hester and Bryan 2024), gplots v. 3.1.3.1 (Warnes et al. 2024), grateful v. 0.2.4 (Francisco Rodriguez-Sanchez and Connor P. Jackson 2023), gtools v. 3.9.5 (Warnes et al. 2023), gtrendsR v. 1.5.1 (Massicotte and Eddelbuettel 2022), here v. 1.0.1 (Müller 2020), Hmisc v. 5.1.2 (Harrell Jr 2024), ISLR v. 1.4 (James et al. 2021), kableExtra v. 1.4.0 (Zhu 2024), klaR v. 1.7.3 (Weihs et al. 2005), knitr v. 1.46 (Xie 2014, 2015, 2024), labelled v. 2.13.0 (Larmarange 2024), lattice v. 0.22.6 (Sarkar 2008), mice v. 3.16.0 (van Buuren and Groothuis-Oudshoorn 2011), networkD3 v. 0.4 (J. J. Allaire et al. 2017), NHANES v. 2.1.0 (Pruim 2015), psych v. 2.4.3 (William Revelle 2024), quarto v. 1.4 (J. Allaire and Dervieux 2024), randomForest v. 4.7.1.1 (Liaw and Wiener 2002), renv v. 1.0.7 (Ushey and Wickham 2024), reprtree v. 0.6 (Dasgupta 2014), rmarkdown v. 2.27 (Xie, Allaire, and Grolemund 2018; Xie, Dervieux, and Riederer 2020; J. Allaire et al. 2024), rpart v. 4.1.23 (Therneau and Atkinson 2023), rpart.plot v. 3.1.2 (Milborrow 2024), rsample v. 1.2.1 (Frick et al. 2024), rsq v. 2.6 (Zhang 2023), scales v. 1.3.0 (Wickham, Pedersen, and Seidel 2023), skimr v. 2.1.5 (Waring et al. 2022), sqldf v. 0.4.11 (Grothendieck 2017), tableone v. 0.13.2 (Yoshida and Bartel 2022), tictoc v. 1.2.1 (Izrailev 2024), tidymodels v. 1.2.0 (Kuhn and Wickham 2020), tidyverse v. 2.0.0 (Wickham et al. 2019), tree v. 1.0.43 (Ripley 2023), vcd v. 1.4.12 (Meyer, Zeileis, and Hornik 2006; Zeileis, Meyer, and Hornik 2007; Meyer, Zeileis, et al. 2023), VIM v. 6.2.2 (Kowarik and Templ 2016), viridis v. 0.6.5 (Garnier et al. 2024), yardstick v. 1.3.1 (Kuhn, Vaughan, and Hvitfeldt 2024).