1
Introduction
Introduction to Data Science with R
Preface
1
Introduction
Part 1 - Review of R
2
Base R Review
3
Examples using ggplot2
Part 2 - Exploratory Data Analysis
4
Exploratory Data Analysis
Part 3 - Naïve Bayes, k-Nearest Neighbors, & Logistic Regression
5
Resampling Methods
6
Naïve Bayes
7
Naïve Bayes with caret
8
k-Nearest Neighbors
9
k-Nearest Neighbors with caret
10
Logistic Regression
11
Logistic Regression with caret
12
Logistic Regression with tidymodels
Part 4 - LASSO, Ridge, & Sampling
13
Linear Regression
14
Linear Model - Ridge & LASSO
15
Selection by Filter
16
Recursive Feature Elimination
17
Resampling Samples
18
Resampling Samples - Classification with caret glm logit
19
Adjusted R² vs. Root Mean Squared Error
20
Adjusted R² vs. Root Mean Squared Error - formula on resamples
Part 5 - Decision Trees, Random Forests, and Nested Cross Validation
21
Read in the Data
22
Random Forest Classification
23
Random Forest Regression
24
RandomForest with caret
25
Random Forest Imputation and Multi-class Classifier
26
Nested Cross-Validation Example
27
Nested Cross-Validation Example - mtry
Part 6 - k-means Clustering and Principal Component Analysis
28
k-means Clustering
29
Modeling with k-means clustering - diab_pop
30
Principal Component Analysis
31
Modeling with PCA - Diab_Pop
Part 7 - Support Vector Machines
32
Random Forest Imputation and SVM Multi-class Classifier
33
Random Forest Imputation and SVM Multi-class Classifier
Appendix
References