install_if_not <- function( list.of.packages ) {
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) { install.packages(new.packages) } else { print(paste0("the package '", list.of.packages , "' is already installed")) }
}27 Nested Cross-Validation Example - mtry
\(~\)
\(~\)
Install if not Function
\(~\)
\(~\)
\(~\)
27.1 Getting Started!
\(~\)
The rsample::initial_split function works similarly to the caret::createDataPartition function.
The rsample::training and rsample::testing functions will return back the training or testing data.
diab_pop <- readRDS('C:/Users/jkyle/Documents/GitHub/Intro_Jeff_Data_Science/DATA/diab_pop.RDS')
library('tidyverse')── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
diab_pop.no_na_vals <- diab_pop %>% na.omit()
install_if_not('rsample')[1] "the package 'rsample' is already installed"
library(rsample)
# this will ensure our results are the same every run, to randomize you may use: set.seed(Sys.time())
set.seed(8675309)
train_test <- initial_split(diab_pop.no_na_vals, prop = .6, strata=diq010)
TRAIN <- training(train_test)\(~\)
\(~\)
\(~\)
\(~\)
27.2 Using nested_cv
A typical scheme for splitting the data when developing a predictive model is to create an initial split of the data into a training and test set.
Then resampling is often done on subests of the training set.
In rsample, the term analysis set is used for subsets of the resampled training set; the assessment sets are the corresponding hold-out sets used to compute performance.
\(~\)

\(~\)
Grid Search is common practiace in resampling:
- Create a tuning grid of tuning parameters s
- Apply combinations of tuning parameters on the grid tuning grid with resampling.
- Each time, the assessment data are used to estimate performance metrics that maybe pooled to tune model performance
\(~\)
Nested resampling does an additional layer of resampling that separates the tuning activities from the process used to estimate the efficacy of the model:
- An outer resampling scheme is used and, for every split in the outer resample, another full set of resampling splits are created on the original analysis set.
- Once the tuning results are complete, a model is fit to each of the outer resampling splits using the best parameter associated with that resample.
- The average of the outer method’s assessment sets are a unbiased estimate of the model.
- For example: To attempt to tune a hyper-parameter with 10 levels, we use a 10-fold cross-validation is used on the outside, and 5-fold cross-validation on the inside. Then:
- The parameter tuning will be conducted 10 times and the best parameters are determined from the average of the 5 assessment sets.
- This process occurs 10 times.
- In total, 500 models would be fit.
\(~\)
\(~\)
27.2.1 Get Started
We will use this example to tune the tree hyper-parameter from the randomForest function in the radomForest package. We will choose from a list of 12 values for which we will vary mtry in seq(1,24,2).
We will use the follow for our nested resampling strategy:
- OUTER: A twice-repeated 5-fold cross validation will be used as the outer resampling method; it will be used to generate estimates of the overall performance.
- INNER: To tune the model, we will use estimates from each of the values of the tuning parameter from the 3 bootstrap iterations.
\(~\)

\(~\)
This means that for each of the 12 values that mtry can take on in seq(1,24,2) we will have 5 * 2 * 3 = 30 models to be fit.
So, in total we will fit 5 * 2 * 3 * 12 = 360 models!
\(~\)
\(~\)
27.3 Create nested_cv object
Create the tibble with the resampling specifications:
TRAIN.nested_cv <- nested_cv(TRAIN,
outside = vfold_cv(v = 5, repeats = 2),
inside = bootstraps(times = 3))
TRAIN.nested_cv# Nested resampling:
# outer: 5-fold cross-validation repeated 2 times
# inner: Bootstrap sampling
# A tibble: 10 × 4
splits id id2 inner_resamples
<list> <chr> <chr> <list>
1 <split [900/225]> Repeat1 Fold1 <boot [3 × 2]>
2 <split [900/225]> Repeat1 Fold2 <boot [3 × 2]>
3 <split [900/225]> Repeat1 Fold3 <boot [3 × 2]>
4 <split [900/225]> Repeat1 Fold4 <boot [3 × 2]>
5 <split [900/225]> Repeat1 Fold5 <boot [3 × 2]>
6 <split [900/225]> Repeat2 Fold1 <boot [3 × 2]>
7 <split [900/225]> Repeat2 Fold2 <boot [3 × 2]>
8 <split [900/225]> Repeat2 Fold3 <boot [3 × 2]>
9 <split [900/225]> Repeat2 Fold4 <boot [3 × 2]>
10 <split [900/225]> Repeat2 Fold5 <boot [3 × 2]>
str(TRAIN.nested_cv,1)nest_cv [10 × 4] (S3: nested_cv/vfold_cv/rset/tbl_df/tbl/data.frame)
- attr(*, "v")= num 5
- attr(*, "repeats")= num 2
- attr(*, "breaks")= num 4
- attr(*, "pool")= num 0.1
- attr(*, "fingerprint")= chr "f039727a2aed6bda912104e921559e5c"
- attr(*, "outside")= language vfold_cv(v = 5, repeats = 2)
- attr(*, "inside")= language bootstraps(times = 3)
\(~\)
\(~\)
\(~\)
We can access all of the resamples of data:
TRAIN.nested_cv$inner_resamples[[1]]
# Bootstrap sampling
# A tibble: 3 × 2
splits id
<list> <chr>
1 <split [900/326]> Bootstrap1
2 <split [900/320]> Bootstrap2
3 <split [900/346]> Bootstrap3
[[2]]
# Bootstrap sampling
# A tibble: 3 × 2
splits id
<list> <chr>
1 <split [900/324]> Bootstrap1
2 <split [900/343]> Bootstrap2
3 <split [900/352]> Bootstrap3
[[3]]
# Bootstrap sampling
# A tibble: 3 × 2
splits id
<list> <chr>
1 <split [900/339]> Bootstrap1
2 <split [900/325]> Bootstrap2
3 <split [900/318]> Bootstrap3
[[4]]
# Bootstrap sampling
# A tibble: 3 × 2
splits id
<list> <chr>
1 <split [900/321]> Bootstrap1
2 <split [900/340]> Bootstrap2
3 <split [900/319]> Bootstrap3
[[5]]
# Bootstrap sampling
# A tibble: 3 × 2
splits id
<list> <chr>
1 <split [900/340]> Bootstrap1
2 <split [900/323]> Bootstrap2
3 <split [900/335]> Bootstrap3
[[6]]
# Bootstrap sampling
# A tibble: 3 × 2
splits id
<list> <chr>
1 <split [900/333]> Bootstrap1
2 <split [900/336]> Bootstrap2
3 <split [900/333]> Bootstrap3
[[7]]
# Bootstrap sampling
# A tibble: 3 × 2
splits id
<list> <chr>
1 <split [900/322]> Bootstrap1
2 <split [900/328]> Bootstrap2
3 <split [900/317]> Bootstrap3
[[8]]
# Bootstrap sampling
# A tibble: 3 × 2
splits id
<list> <chr>
1 <split [900/329]> Bootstrap1
2 <split [900/335]> Bootstrap2
3 <split [900/319]> Bootstrap3
[[9]]
# Bootstrap sampling
# A tibble: 3 × 2
splits id
<list> <chr>
1 <split [900/327]> Bootstrap1
2 <split [900/336]> Bootstrap2
3 <split [900/337]> Bootstrap3
[[10]]
# Bootstrap sampling
# A tibble: 3 × 2
splits id
<list> <chr>
1 <split [900/351]> Bootstrap1
2 <split [900/326]> Bootstrap2
3 <split [900/330]> Bootstrap3
\(~\)
\(~\)
\(~\)
We can also access individual samples or data sets used for training or assessment:
TRAIN.nested_cv$inner_resamples[[5]]# Bootstrap sampling
# A tibble: 3 × 2
splits id
<list> <chr>
1 <split [900/340]> Bootstrap1
2 <split [900/323]> Bootstrap2
3 <split [900/335]> Bootstrap3
TRAIN.nested_cv$inner_resamples[[5]]$splits[[2]]<Analysis/Assess/Total>
<900/323/900>
glimpse( analysis( TRAIN.nested_cv$inner_resamples[[5]]$splits[[2]] ) ) # used to trainRows: 900
Columns: 10
$ seqn <dbl> 85990, 87964, 86528, 85304, 86146, 86284, 87506, 86817, 86360…
$ riagendr <fct> Female, Male, Female, Male, Male, Male, Male, Male, Male, Fem…
$ ridageyr <dbl> 50, 21, 37, 68, 40, 46, 48, 60, 50, 42, 80, 43, 48, 31, 37, 6…
$ ridreth1 <fct> Non-Hispanic White, MexicanAmerican, Non-Hispanic Black, Non-…
$ dmdeduc2 <fct> Some college or AA degrees, Some college or AA degrees, Some …
$ dmdmartl <fct> Married, Never married, Never married, Married, Married, Marr…
$ indhhin2 <fct> "$100,000+", "$15,000-$19,999", "$45,000-$54,999", "$25,000-$…
$ bmxbmi <dbl> 33.6, 28.0, 23.9, 27.0, 26.5, 29.2, 27.7, 30.4, 41.2, 34.5, 2…
$ diq010 <fct> No Diabetes, No Diabetes, No Diabetes, No Diabetes, No Diabet…
$ lbxglu <dbl> 87, 86, 90, 91, 101, 79, 94, 119, 116, 100, 92, 111, 94, 103,…
glimpse( assessment( TRAIN.nested_cv$inner_resamples[[5]]$splits[[2]] ) ) # used to assessRows: 323
Columns: 10
$ seqn <dbl> 84424, 84443, 84627, 84685, 85192, 85363, 85471, 85521, 85928…
$ riagendr <fct> Male, Male, Male, Female, Female, Female, Male, Female, Male,…
$ ridageyr <dbl> 69, 64, 80, 27, 53, 66, 63, 71, 66, 38, 55, 80, 80, 72, 37, 7…
$ ridreth1 <fct> Non-Hispanic White, Other, MexicanAmerican, Non-Hispanic Whit…
$ dmdeduc2 <fct> College grad or above, Some college or AA degrees, Less than …
$ dmdmartl <fct> Divorced, Married, Married, Married, Living with partner, Mar…
$ indhhin2 <fct> "$25,000-$34,999", "$25,000-$34,999", "$10,000-$14,999", "$45…
$ bmxbmi <dbl> 33.5, 33.8, 29.1, 25.4, 22.5, 23.2, 21.7, 31.7, 29.3, 37.5, 3…
$ diq010 <fct> Diabetes, Diabetes, Diabetes, Diabetes, Diabetes, Diabetes, D…
$ lbxglu <dbl> 429, 134, 125, 100, 122, 238, 156, 147, 154, 87, 311, 125, 19…
\(~\)
\(~\)
\(~\)
\(~\)
\(~\)
\(~\)
27.4 Function mapping with purrr
We will frequently make use of map functions within the purrr library.
library(purrr)\(~\)
\(~\)
\(~\)
\(~\)
\(~\)
\(~\)
27.5 Helper Functions
\(~\)
The majority of the work is done by using a sequence of helper-functions and purrr mappers.
We know our goal is to work with the inner_resamples of our TRAIN.nested_cv tibble.
Given a cut of data, and tuning parameters mtry %in% seq(1,24,2) we will write a function to estimate F1:
\(~\)
\(~\)
27.5.1 F1_Tree Helper Function
Here we can borrow from what we already know about a predicting and scoring models along with some helper functions.
We want the first input of our function to be a cut of data, an object from our rsplit::nested_cv function to the TRAIN.nested_cv tibble and our second to be a value of ntree:
This function will:
- take: a cut of data and tuning-parameter
- train a model with: a value of tuning-parameter and the corresponding training set
- score the corresponding testing data
- return the estimate for the f1 metric
f1_mtry <- function(object, my_mtry = 4) { # my_tree = 761
model_rf <- randomForest::randomForest(diq010 ~ riagendr + ridageyr + ridreth1 + dmdeduc2 + dmdmartl + indhhin2 + bmxbmi + lbxglu,
data = analysis(object),
ntree = 637,
mtry = my_mtry)
holdout_pred_class <- predict(model_rf, assessment(object), "class")
dt <- tibble::as_tibble(cbind(assessment(object), holdout_pred_class))
yd_f1 <- yardstick::f_meas(data=dt, truth=diq010, holdout_pred_class)
F1_Score <- yd_f1$.estimate
return(F1_Score)
}
# test the function
f1_mtry(TRAIN.nested_cv$inner_resamples[[5]]$splits[[3]])[1] 0.6868687
\(~\)
\(~\)
27.5.2 F1_Tree_Wrapper Function
If you understood the last function, this one is even easier. This function will allow us to reverse the call order of f1_tree:
f1_mtry_wrapper <- function(my_mtry , object){ f1_mtry(object, my_mtry) }
f1_mtry_wrapper( 4, TRAIN.nested_cv$inner_resamples[[5]]$splits[[3]] )[1] 0.6868687
\(~\)
\(~\)
27.5.3 Tune_Over_Mtry Function
We will develop the function that will allow us to tune over mtry.
Again, this function’s input expects a sample of data as it’s input.
Given a sample of data, for each value of the tuning parameter, we want to make an assessment:
tune_over_mtry <- function(object) {
tune_over_trees_results <- tibble( mtry = seq(1,24,2) )
tune_over_trees_results$F1_Score <- map_dbl(tune_over_trees_results$mtry,
f1_mtry_wrapper,
object = object)
tune_over_trees_results
}
# test the function
tune_over_mtry( TRAIN.nested_cv$inner_resamples[[1]]$splits[[2]] )Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
# A tibble: 12 × 2
mtry F1_Score
<dbl> <dbl>
1 1 0.3
2 3 0.635
3 5 0.635
4 7 0.602
5 9 0.593
6 11 0.593
7 13 0.610
8 15 0.610
9 17 0.602
10 19 0.610
11 21 0.610
12 23 0.610
\(~\)
\(~\)
27.5.4 Summarise_Tune_Results Function
This function will be used to summarise our results from the inner nest into the outer nest.
This function will again take an inner object from rsample::nested_cv
map_dfmaps the sample data to thetune_over_treesfunction which is already waiting for it; and an inner assessment is made.- Return the row-bound tibble
map_df(object$splits, tune_over_trees)is our result from that has 3 bootstrap results - Next we group the assessments by the value of
ntreewe perform a summary step. * Compute the count, max, median, and min of the F1_Score from the inner bootstrap estimate.- Since we have only have 3 bootstraps, the returning max, median, and min values will completely contain all the estimates from the inner nest, so we can pool them into the Outer Layer.
summarize_tune_results <- function(object) {
map_df(object$splits, tune_over_mtry) %>%
group_by(mtry) %>%
summarize(max_F1_Score = max(F1_Score, na.rm = TRUE),
median_F1_Score = median(F1_Score, na.rm = TRUE),
min_F1_Score = min(F1_Score, na.rm = TRUE),
n = length(F1_Score))
}We can’t test this function for a slice of data, as it’s intended to be used with another function:
summarize_tune_results( TRAIN.nested_cv$inner_resamples[[1]]$splits[[2]] )Error in `group_by()`:
! Must group by variables found in `.data`.
✖ Column `mtry` is not found.
#### Summary below\(~\)
\(~\)
\(~\)
\(~\)
\(~\)
27.6 Run Inner Nest
Now to run the inner nest!
toc <- Sys.time() # set timer
tuning_results <- map(TRAIN.nested_cv$inner_resamples,
summarize_tune_results) # run inner samplesWarning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
tic <- Sys.time() # clock
diff.inner <- tic - toc # compute difference
diff.inner # print differenceTime difference of 2.670991 mins
tuning_results[[1]]
# A tibble: 12 × 5
mtry max_F1_Score median_F1_Score min_F1_Score n
<dbl> <dbl> <dbl> <dbl> <int>
1 1 0.493 0.308 0.231 3
2 3 0.635 0.613 0.609 3
3 5 0.630 0.605 0.602 3
4 7 0.610 0.595 0.543 3
5 9 0.610 0.587 0.563 3
6 11 0.610 0.583 0.568 3
7 13 0.610 0.587 0.556 3
8 15 0.595 0.593 0.587 3
9 17 0.610 0.587 0.575 3
10 19 0.610 0.597 0.595 3
11 21 0.610 0.595 0.587 3
12 23 0.610 0.587 0.563 3
[[2]]
# A tibble: 12 × 5
mtry max_F1_Score median_F1_Score min_F1_Score n
<dbl> <dbl> <dbl> <dbl> <int>
1 1 0.299 0.281 0.185 3
2 3 0.660 0.612 0.588 3
3 5 0.673 0.605 0.593 3
4 7 0.636 0.633 0.609 3
5 9 0.636 0.587 0.58 3
6 11 0.636 0.602 0.596 3
7 13 0.636 0.615 0.602 3
8 15 0.626 0.621 0.587 3
9 17 0.621 0.587 0.58 3
10 19 0.621 0.602 0.602 3
11 21 0.636 0.62 0.602 3
12 23 0.636 0.596 0.587 3
[[3]]
# A tibble: 12 × 5
mtry max_F1_Score median_F1_Score min_F1_Score n
<dbl> <dbl> <dbl> <dbl> <int>
1 1 0.494 0.4 0.386 3
2 3 0.681 0.658 0.609 3
3 5 0.674 0.667 0.596 3
4 7 0.680 0.659 0.589 3
5 9 0.646 0.634 0.583 3
6 11 0.673 0.651 0.577 3
7 13 0.667 0.634 0.577 3
8 15 0.651 0.646 0.583 3
9 17 0.660 0.634 0.577 3
10 19 0.660 0.634 0.568 3
11 21 0.651 0.646 0.583 3
12 23 0.667 0.651 0.577 3
[[4]]
# A tibble: 12 × 5
mtry max_F1_Score median_F1_Score min_F1_Score n
<dbl> <dbl> <dbl> <dbl> <int>
1 1 0.5 0.448 0.281 3
2 3 0.722 0.667 0.667 3
3 5 0.686 0.643 0.623 3
4 7 0.642 0.628 0.593 3
5 9 0.654 0.607 0.585 3
6 11 0.660 0.615 0.614 3
7 13 0.654 0.614 0.593 3
8 15 0.654 0.636 0.608 3
9 17 0.654 0.607 0.593 3
10 19 0.648 0.607 0.593 3
11 21 0.644 0.642 0.585 3
12 23 0.660 0.621 0.6 3
[[5]]
# A tibble: 12 × 5
mtry max_F1_Score median_F1_Score min_F1_Score n
<dbl> <dbl> <dbl> <dbl> <int>
1 1 0.441 0.405 0.346 3
2 3 0.673 0.618 0.554 3
3 5 0.687 0.609 0.523 3
4 7 0.66 0.580 0.492 3
5 9 0.653 0.6 0.492 3
6 11 0.647 0.592 0.492 3
7 13 0.66 0.592 0.492 3
8 15 0.667 0.620 0.515 3
9 17 0.641 0.583 0.492 3
10 19 0.667 0.6 0.515 3
11 21 0.647 0.6 0.492 3
12 23 0.653 0.6 0.492 3
[[6]]
# A tibble: 12 × 5
mtry max_F1_Score median_F1_Score min_F1_Score n
<dbl> <dbl> <dbl> <dbl> <int>
1 1 0.3 0.261 0.205 3
2 3 0.658 0.604 0.584 3
3 5 0.625 0.615 0.566 3
4 7 0.615 0.574 0.574 3
5 9 0.615 0.596 0.579 3
6 11 0.615 0.574 0.574 3
7 13 0.615 0.574 0.553 3
8 15 0.615 0.565 0.561 3
9 17 0.608 0.574 0.559 3
10 19 0.615 0.589 0.574 3
11 21 0.615 0.611 0.587 3
12 23 0.615 0.574 0.574 3
[[7]]
# A tibble: 12 × 5
mtry max_F1_Score median_F1_Score min_F1_Score n
<dbl> <dbl> <dbl> <dbl> <int>
1 1 0.375 0.321 0.271 3
2 3 0.682 0.627 0.568 3
3 5 0.681 0.602 0.579 3
4 7 0.674 0.587 0.556 3
5 9 0.659 0.587 0.533 3
6 11 0.630 0.579 0.533 3
7 13 0.630 0.587 0.533 3
8 15 0.638 0.587 0.539 3
9 17 0.674 0.587 0.533 3
10 19 0.667 0.587 0.545 3
11 21 0.645 0.587 0.539 3
12 23 0.674 0.595 0.539 3
[[8]]
# A tibble: 12 × 5
mtry max_F1_Score median_F1_Score min_F1_Score n
<dbl> <dbl> <dbl> <dbl> <int>
1 1 0.542 0.517 0.4 3
2 3 0.694 0.667 0.667 3
3 5 0.703 0.667 0.646 3
4 7 0.684 0.667 0.627 3
5 9 0.667 0.667 0.657 3
6 11 0.684 0.667 0.638 3
7 13 0.667 0.658 0.647 3
8 15 0.667 0.65 0.647 3
9 17 0.667 0.667 0.647 3
10 19 0.675 0.667 0.638 3
11 21 0.684 0.667 0.638 3
12 23 0.684 0.667 0.647 3
[[9]]
# A tibble: 12 × 5
mtry max_F1_Score median_F1_Score min_F1_Score n
<dbl> <dbl> <dbl> <dbl> <int>
1 1 0.471 0.4 0.382 3
2 3 0.7 0.659 0.575 3
3 5 0.7 0.659 0.627 3
4 7 0.651 0.649 0.619 3
5 9 0.644 0.641 0.605 3
6 11 0.649 0.644 0.595 3
7 13 0.636 0.605 0.605 3
8 15 0.641 0.636 0.571 3
9 17 0.636 0.632 0.605 3
10 19 0.649 0.644 0.595 3
11 21 0.644 0.613 0.605 3
12 23 0.632 0.629 0.595 3
[[10]]
# A tibble: 12 × 5
mtry max_F1_Score median_F1_Score min_F1_Score n
<dbl> <dbl> <dbl> <dbl> <int>
1 1 0.423 0.406 0.273 3
2 3 0.667 0.644 0.525 3
3 5 0.696 0.624 0.535 3
4 7 0.667 0.652 0.535 3
5 9 0.637 0.633 0.554 3
6 11 0.667 0.652 0.558 3
7 13 0.667 0.637 0.554 3
8 15 0.66 0.652 0.535 3
9 17 0.673 0.645 0.554 3
10 19 0.653 0.637 0.558 3
11 21 0.653 0.652 0.535 3
12 23 0.639 0.637 0.535 3
\(~\)
\(~\)
27.6.1 Inner Nest Results
head(str(tuning_results))List of 10
$ : tibble [12 × 5] (S3: tbl_df/tbl/data.frame)
..$ mtry : num [1:12] 1 3 5 7 9 11 13 15 17 19 ...
..$ max_F1_Score : num [1:12] 0.493 0.635 0.63 0.61 0.61 ...
..$ median_F1_Score: num [1:12] 0.308 0.613 0.605 0.595 0.587 ...
..$ min_F1_Score : num [1:12] 0.231 0.609 0.602 0.543 0.563 ...
..$ n : int [1:12] 3 3 3 3 3 3 3 3 3 3 ...
$ : tibble [12 × 5] (S3: tbl_df/tbl/data.frame)
..$ mtry : num [1:12] 1 3 5 7 9 11 13 15 17 19 ...
..$ max_F1_Score : num [1:12] 0.299 0.66 0.673 0.636 0.636 ...
..$ median_F1_Score: num [1:12] 0.281 0.612 0.605 0.633 0.587 ...
..$ min_F1_Score : num [1:12] 0.185 0.588 0.593 0.609 0.58 ...
..$ n : int [1:12] 3 3 3 3 3 3 3 3 3 3 ...
$ : tibble [12 × 5] (S3: tbl_df/tbl/data.frame)
..$ mtry : num [1:12] 1 3 5 7 9 11 13 15 17 19 ...
..$ max_F1_Score : num [1:12] 0.494 0.681 0.674 0.68 0.646 ...
..$ median_F1_Score: num [1:12] 0.4 0.658 0.667 0.659 0.634 ...
..$ min_F1_Score : num [1:12] 0.386 0.609 0.596 0.589 0.583 ...
..$ n : int [1:12] 3 3 3 3 3 3 3 3 3 3 ...
$ : tibble [12 × 5] (S3: tbl_df/tbl/data.frame)
..$ mtry : num [1:12] 1 3 5 7 9 11 13 15 17 19 ...
..$ max_F1_Score : num [1:12] 0.5 0.722 0.686 0.642 0.654 ...
..$ median_F1_Score: num [1:12] 0.448 0.667 0.643 0.628 0.607 ...
..$ min_F1_Score : num [1:12] 0.281 0.667 0.623 0.593 0.585 ...
..$ n : int [1:12] 3 3 3 3 3 3 3 3 3 3 ...
$ : tibble [12 × 5] (S3: tbl_df/tbl/data.frame)
..$ mtry : num [1:12] 1 3 5 7 9 11 13 15 17 19 ...
..$ max_F1_Score : num [1:12] 0.441 0.673 0.687 0.66 0.653 ...
..$ median_F1_Score: num [1:12] 0.405 0.618 0.609 0.58 0.6 ...
..$ min_F1_Score : num [1:12] 0.346 0.554 0.523 0.492 0.492 ...
..$ n : int [1:12] 3 3 3 3 3 3 3 3 3 3 ...
$ : tibble [12 × 5] (S3: tbl_df/tbl/data.frame)
..$ mtry : num [1:12] 1 3 5 7 9 11 13 15 17 19 ...
..$ max_F1_Score : num [1:12] 0.3 0.658 0.625 0.615 0.615 ...
..$ median_F1_Score: num [1:12] 0.261 0.604 0.615 0.574 0.596 ...
..$ min_F1_Score : num [1:12] 0.205 0.584 0.566 0.574 0.579 ...
..$ n : int [1:12] 3 3 3 3 3 3 3 3 3 3 ...
$ : tibble [12 × 5] (S3: tbl_df/tbl/data.frame)
..$ mtry : num [1:12] 1 3 5 7 9 11 13 15 17 19 ...
..$ max_F1_Score : num [1:12] 0.375 0.682 0.681 0.674 0.659 ...
..$ median_F1_Score: num [1:12] 0.321 0.627 0.602 0.587 0.587 ...
..$ min_F1_Score : num [1:12] 0.271 0.568 0.579 0.556 0.533 ...
..$ n : int [1:12] 3 3 3 3 3 3 3 3 3 3 ...
$ : tibble [12 × 5] (S3: tbl_df/tbl/data.frame)
..$ mtry : num [1:12] 1 3 5 7 9 11 13 15 17 19 ...
..$ max_F1_Score : num [1:12] 0.542 0.694 0.703 0.684 0.667 ...
..$ median_F1_Score: num [1:12] 0.517 0.667 0.667 0.667 0.667 ...
..$ min_F1_Score : num [1:12] 0.4 0.667 0.646 0.627 0.657 ...
..$ n : int [1:12] 3 3 3 3 3 3 3 3 3 3 ...
$ : tibble [12 × 5] (S3: tbl_df/tbl/data.frame)
..$ mtry : num [1:12] 1 3 5 7 9 11 13 15 17 19 ...
..$ max_F1_Score : num [1:12] 0.471 0.7 0.7 0.651 0.644 ...
..$ median_F1_Score: num [1:12] 0.4 0.659 0.659 0.649 0.641 ...
..$ min_F1_Score : num [1:12] 0.382 0.575 0.627 0.619 0.605 ...
..$ n : int [1:12] 3 3 3 3 3 3 3 3 3 3 ...
$ : tibble [12 × 5] (S3: tbl_df/tbl/data.frame)
..$ mtry : num [1:12] 1 3 5 7 9 11 13 15 17 19 ...
..$ max_F1_Score : num [1:12] 0.423 0.667 0.696 0.667 0.637 ...
..$ median_F1_Score: num [1:12] 0.406 0.644 0.624 0.652 0.633 ...
..$ min_F1_Score : num [1:12] 0.273 0.525 0.535 0.535 0.554 ...
..$ n : int [1:12] 3 3 3 3 3 3 3 3 3 3 ...
NULL
tuning_results[[1]]# A tibble: 12 × 5
mtry max_F1_Score median_F1_Score min_F1_Score n
<dbl> <dbl> <dbl> <dbl> <int>
1 1 0.493 0.308 0.231 3
2 3 0.635 0.613 0.609 3
3 5 0.630 0.605 0.602 3
4 7 0.610 0.595 0.543 3
5 9 0.610 0.587 0.563 3
6 11 0.610 0.583 0.568 3
7 13 0.610 0.587 0.556 3
8 15 0.595 0.593 0.587 3
9 17 0.610 0.587 0.575 3
10 19 0.610 0.597 0.595 3
11 21 0.610 0.595 0.587 3
12 23 0.610 0.587 0.563 3
tuning_results[[6]]# A tibble: 12 × 5
mtry max_F1_Score median_F1_Score min_F1_Score n
<dbl> <dbl> <dbl> <dbl> <int>
1 1 0.3 0.261 0.205 3
2 3 0.658 0.604 0.584 3
3 5 0.625 0.615 0.566 3
4 7 0.615 0.574 0.574 3
5 9 0.615 0.596 0.579 3
6 11 0.615 0.574 0.574 3
7 13 0.615 0.574 0.553 3
8 15 0.615 0.565 0.561 3
9 17 0.608 0.574 0.559 3
10 19 0.615 0.589 0.574 3
11 21 0.615 0.611 0.587 3
12 23 0.615 0.574 0.574 3
label_nested_cv <- TRAIN.nested_cv %>%
mutate(RowNumber = row_number()) %>%
select(RowNumber,id,id2)
label_nested_cv# A tibble: 10 × 3
RowNumber id id2
<int> <chr> <chr>
1 1 Repeat1 Fold1
2 2 Repeat1 Fold2
3 3 Repeat1 Fold3
4 4 Repeat1 Fold4
5 5 Repeat1 Fold5
6 6 Repeat2 Fold1
7 7 Repeat2 Fold2
8 8 Repeat2 Fold3
9 9 Repeat2 Fold4
10 10 Repeat2 Fold5
tuning_results_2 <- NULL
for (i in 1:length(tuning_results)){
tuning_results_2[[i]] <- cbind( tuning_results[[i]], label_nested_cv[i, c('id','id2')] )
}
tuning_results_2[[1]] mtry max_F1_Score median_F1_Score min_F1_Score n id id2
1 1 0.4927536 0.3076923 0.2307692 3 Repeat1 Fold1
2 3 0.6352941 0.6133333 0.6086957 3 Repeat1 Fold1
3 5 0.6301370 0.6052632 0.6024096 3 Repeat1 Fold1
4 7 0.6097561 0.5945946 0.5428571 3 Repeat1 Fold1
5 9 0.6097561 0.5866667 0.5633803 3 Repeat1 Fold1
6 11 0.6097561 0.5833333 0.5675676 3 Repeat1 Fold1
7 13 0.6097561 0.5866667 0.5555556 3 Repeat1 Fold1
8 15 0.5945946 0.5925926 0.5866667 3 Repeat1 Fold1
9 17 0.6097561 0.5866667 0.5753425 3 Repeat1 Fold1
10 19 0.6097561 0.5974026 0.5945946 3 Repeat1 Fold1
11 21 0.6097561 0.5945946 0.5866667 3 Repeat1 Fold1
12 23 0.6097561 0.5866667 0.5633803 3 Repeat1 Fold1
library(ggplot2)
library(scales)
Attaching package: 'scales'
The following object is masked from 'package:purrr':
discard
The following object is masked from 'package:readr':
col_factor
pooled_inner <- tuning_results_2 %>% bind_rows()
str(pooled_inner)'data.frame': 120 obs. of 7 variables:
$ mtry : num 1 3 5 7 9 11 13 15 17 19 ...
$ max_F1_Score : num 0.493 0.635 0.63 0.61 0.61 ...
$ median_F1_Score: num 0.308 0.613 0.605 0.595 0.587 ...
$ min_F1_Score : num 0.231 0.609 0.602 0.543 0.563 ...
$ n : int 3 3 3 3 3 3 3 3 3 3 ...
$ id : chr "Repeat1" "Repeat1" "Repeat1" "Repeat1" ...
$ id2 : chr "Fold1" "Fold1" "Fold1" "Fold1" ...
\(~\)
\(~\)
\(~\)
\(~\)
27.6.2 Inner Nest Results Graphs
Check out the Facet - Fold tab below:
27.6.2.1 Facet - Fold
ggplot(pooled_inner,
aes(x = mtry , y = median_F1_Score, shape = id, color = id )) +
geom_point(aes(size = median_F1_Score)) +
geom_line() +
facet_wrap(.~id2)
27.6.2.2 Facet - Fold with Errors
ggplot(pooled_inner,
aes(x = mtry , y = median_F1_Score, shape = id, color = id )) +
geom_line() +
geom_errorbar(aes(ymin=min_F1_Score, ymax=max_F1_Score)) +
facet_wrap(.~id2)
27.6.2.3 Facet - Repeat
ggplot(pooled_inner,
aes(x = mtry , y = median_F1_Score, shape = id2, color = id2 )) +
geom_point(aes(size = median_F1_Score)) +
geom_line() +
facet_wrap(.~id)
27.6.2.4 Unorganized
ggplot(pooled_inner,
aes(x = mtry , y = median_F1_Score, shape = id2, color = id )) +
geom_point(aes(size = median_F1_Score)) +
geom_line() 
27.7
#{.unnumbered}
\(~\)
\(~\)
\(~\)
\(~\)
\(~\)
27.8 Max Best F1 Value Function
We will use this helper function to select the corresponding row of data that contains the max_F1_Score from our summary in pooled_inner.
max_BEST_F1 <- function(data) { data[ which.max(data$max_F1_Score) , ] }
# test the function
tuning_results[[10]]# A tibble: 12 × 5
mtry max_F1_Score median_F1_Score min_F1_Score n
<dbl> <dbl> <dbl> <dbl> <int>
1 1 0.423 0.406 0.273 3
2 3 0.667 0.644 0.525 3
3 5 0.696 0.624 0.535 3
4 7 0.667 0.652 0.535 3
5 9 0.637 0.633 0.554 3
6 11 0.667 0.652 0.558 3
7 13 0.667 0.637 0.554 3
8 15 0.66 0.652 0.535 3
9 17 0.673 0.645 0.554 3
10 19 0.653 0.637 0.558 3
11 21 0.653 0.652 0.535 3
12 23 0.639 0.637 0.535 3
max_BEST_F1(tuning_results[[10]])# A tibble: 1 × 5
mtry max_F1_Score median_F1_Score min_F1_Score n
<dbl> <dbl> <dbl> <dbl> <int>
1 5 0.696 0.624 0.535 3
\(~\)
\(~\)
\(~\)
\(~\)
\(~\)
\(~\)
27.9 Running Outer Nest From Inner Pooled
We will find the values of ntree that had the highest performing F1 by sample, we will map those and an outer split of the sample to f1_tree to obtain our outer-inner estimates.
tuning_results %>% map_df(max_BEST_F1)# A tibble: 10 × 5
mtry max_F1_Score median_F1_Score min_F1_Score n
<dbl> <dbl> <dbl> <dbl> <int>
1 3 0.635 0.613 0.609 3
2 5 0.673 0.605 0.593 3
3 3 0.681 0.658 0.609 3
4 3 0.722 0.667 0.667 3
5 5 0.687 0.609 0.523 3
6 3 0.658 0.604 0.584 3
7 3 0.682 0.627 0.568 3
8 5 0.703 0.667 0.646 3
9 3 0.7 0.659 0.575 3
10 5 0.696 0.624 0.535 3
# These are the values of `ntree` that had highest perfroming `F1` by sample :
max_BEST_F1_vals <- tuning_results %>%
map_df(max_BEST_F1) %>%
select(mtry)
TRAIN.nested_cv.results <- bind_cols(TRAIN.nested_cv, max_BEST_F1_vals)
glimpse(TRAIN.nested_cv.results)Rows: 10
Columns: 5
$ splits <list> [<vfold_split[900 x 225 x 1125 x 10]>], [<vfold_split…
$ id <chr> "Repeat1", "Repeat1", "Repeat1", "Repeat1", "Repeat1",…
$ id2 <chr> "Fold1", "Fold2", "Fold3", "Fold4", "Fold5", "Fold1", …
$ inner_resamples <list> [<bootstraps[3 x 2]>], [<bootstraps[3 x 2]>], [<boots…
$ mtry <dbl> 3, 5, 3, 3, 5, 3, 3, 5, 3, 5
toc <- Sys.time() # start clock
TRAIN.nested_cv.results$F1_SCORE <- map2_dbl(TRAIN.nested_cv.results$splits,
TRAIN.nested_cv.results$mtry,
f1_mtry)#f1_tree)
tic <- Sys.time() #
diff.inner_then_outer <- tic - toc
diff.inner - diff.inner_then_outerTime difference of 154.9291 secs
TRAIN.nested_cv.results$F1_SCORE [1] 0.6666667 0.7761194 0.5531915 0.5660377 0.6545455 0.6296296 0.7457627
[8] 0.6764706 0.6071429 0.6274510
summary(TRAIN.nested_cv.results$F1_SCORE) Min. 1st Qu. Median Mean 3rd Qu. Max.
0.5532 0.6122 0.6421 0.6503 0.6740 0.7761
TRAIN.nested_cv.results# Nested resampling:
# outer: 5-fold cross-validation repeated 2 times
# inner: Bootstrap sampling
# A tibble: 10 × 6
splits id id2 inner_resamples mtry F1_SCORE
<list> <chr> <chr> <list> <dbl> <dbl>
1 <split [900/225]> Repeat1 Fold1 <boot [3 × 2]> 3 0.667
2 <split [900/225]> Repeat1 Fold2 <boot [3 × 2]> 5 0.776
3 <split [900/225]> Repeat1 Fold3 <boot [3 × 2]> 3 0.553
4 <split [900/225]> Repeat1 Fold4 <boot [3 × 2]> 3 0.566
5 <split [900/225]> Repeat1 Fold5 <boot [3 × 2]> 5 0.655
6 <split [900/225]> Repeat2 Fold1 <boot [3 × 2]> 3 0.630
7 <split [900/225]> Repeat2 Fold2 <boot [3 × 2]> 3 0.746
8 <split [900/225]> Repeat2 Fold3 <boot [3 × 2]> 5 0.676
9 <split [900/225]> Repeat2 Fold4 <boot [3 × 2]> 3 0.607
10 <split [900/225]> Repeat2 Fold5 <boot [3 × 2]> 5 0.627
\(~\)
\(~\)
\(~\)
\(~\)
\(~\)
\(~\)
27.10 Run Outer Nest
toc <- Sys.time() # start clock
outer_only <- map(TRAIN.nested_cv$splits, tune_over_mtry) # run functionWarning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
range
tic <- Sys.time() # time
diff.outer <- tic-toc
diff.outerTime difference of 1.011248 mins
diff.outer - diff.innerTime difference of -1.659744 mins
outer_only[[1]]
# A tibble: 12 × 2
mtry F1_Score
<dbl> <dbl>
1 1 0.423
2 3 0.667
3 5 0.667
4 7 0.688
5 9 0.667
6 11 0.667
7 13 0.688
8 15 0.667
9 17 0.667
10 19 0.667
11 21 0.667
12 23 0.688
[[2]]
# A tibble: 12 × 2
mtry F1_Score
<dbl> <dbl>
1 1 0.356
2 3 0.788
3 5 0.788
4 7 0.758
5 9 0.746
6 11 0.746
7 13 0.735
8 15 0.735
9 17 0.746
10 19 0.746
11 21 0.746
12 23 0.735
[[3]]
# A tibble: 12 × 2
mtry F1_Score
<dbl> <dbl>
1 1 0.278
2 3 0.458
3 5 0.490
4 7 0.52
5 9 0.510
6 11 0.510
7 13 0.510
8 15 0.52
9 17 0.490
10 19 0.52
11 21 0.52
12 23 0.510
[[4]]
# A tibble: 12 × 2
mtry F1_Score
<dbl> <dbl>
1 1 0.293
2 3 0.593
3 5 0.538
4 7 0.538
5 9 0.538
6 11 0.510
7 13 0.510
8 15 0.538
9 17 0.510
10 19 0.510
11 21 0.510
12 23 0.510
[[5]]
# A tibble: 12 × 2
mtry F1_Score
<dbl> <dbl>
1 1 0.596
2 3 0.655
3 5 0.655
4 7 0.655
5 9 0.643
6 11 0.643
7 13 0.655
8 15 0.655
9 17 0.655
10 19 0.618
11 21 0.655
12 23 0.655
[[6]]
# A tibble: 12 × 2
mtry F1_Score
<dbl> <dbl>
1 1 0.326
2 3 0.604
3 5 0.627
4 7 0.56
5 9 0.588
6 11 0.56
7 13 0.588
8 15 0.588
9 17 0.56
10 19 0.588
11 21 0.577
12 23 0.588
[[7]]
# A tibble: 12 × 2
mtry F1_Score
<dbl> <dbl>
1 1 0.553
2 3 0.746
3 5 0.746
4 7 0.746
5 9 0.721
6 11 0.733
7 13 0.721
8 15 0.733
9 17 0.746
10 19 0.710
11 21 0.733
12 23 0.733
[[8]]
# A tibble: 12 × 2
mtry F1_Score
<dbl> <dbl>
1 1 0.5
2 3 0.657
3 5 0.676
4 7 0.667
5 9 0.667
6 11 0.676
7 13 0.676
8 15 0.667
9 17 0.676
10 19 0.676
11 21 0.667
12 23 0.676
[[9]]
# A tibble: 12 × 2
mtry F1_Score
<dbl> <dbl>
1 1 0.391
2 3 0.621
3 5 0.561
4 7 0.519
5 9 0.536
6 11 0.536
7 13 0.571
8 15 0.571
9 17 0.545
10 19 0.536
11 21 0.561
12 23 0.509
[[10]]
# A tibble: 12 × 2
mtry F1_Score
<dbl> <dbl>
1 1 0.167
2 3 0.6
3 5 0.627
4 7 0.6
5 9 0.654
6 11 0.627
7 13 0.667
8 15 0.627
9 17 0.642
10 19 0.642
11 21 0.627
12 23 0.627
str(outer_only,1)List of 10
$ : tibble [12 × 2] (S3: tbl_df/tbl/data.frame)
$ : tibble [12 × 2] (S3: tbl_df/tbl/data.frame)
$ : tibble [12 × 2] (S3: tbl_df/tbl/data.frame)
$ : tibble [12 × 2] (S3: tbl_df/tbl/data.frame)
$ : tibble [12 × 2] (S3: tbl_df/tbl/data.frame)
$ : tibble [12 × 2] (S3: tbl_df/tbl/data.frame)
$ : tibble [12 × 2] (S3: tbl_df/tbl/data.frame)
$ : tibble [12 × 2] (S3: tbl_df/tbl/data.frame)
$ : tibble [12 × 2] (S3: tbl_df/tbl/data.frame)
$ : tibble [12 × 2] (S3: tbl_df/tbl/data.frame)
outer_only_2 <- NULL
for (i in 1:length(outer_only)){
outer_only_2[[i]] <- cbind( outer_only[[i]], label_nested_cv[i, c('id','id2')] )
}
outer_only_3 <- outer_only_2 %>% bind_rows()
outer_summary <- outer_only_3 %>%
group_by(id,id2, mtry) %>%
summarize(outer_F1_Score = mean(F1_Score),
n = length(F1_Score),
.groups = 'keep') %>%
ungroup()
outer_summary# A tibble: 120 × 5
id id2 mtry outer_F1_Score n
<chr> <chr> <dbl> <dbl> <int>
1 Repeat1 Fold1 1 0.423 1
2 Repeat1 Fold1 3 0.667 1
3 Repeat1 Fold1 5 0.667 1
4 Repeat1 Fold1 7 0.688 1
5 Repeat1 Fold1 9 0.667 1
6 Repeat1 Fold1 11 0.667 1
7 Repeat1 Fold1 13 0.688 1
8 Repeat1 Fold1 15 0.667 1
9 Repeat1 Fold1 17 0.667 1
10 Repeat1 Fold1 19 0.667 1
# ℹ 110 more rows
\(~\)
\(~\)
27.10.1 Outer Nest Summary Graphs
Some output summary graphs:
27.10.1.1 Facet - Fold
ggplot(outer_summary,
aes(x=mtry, y=outer_F1_Score, color = id , shape = id ) ) +
geom_point(aes(size = outer_F1_Score)) +
geom_line() +
facet_wrap(.~id2)
27.10.1.2 Unorganized
ggplot(outer_summary,
aes(x=mtry, y=outer_F1_Score, color = id , shape = id2 ) ) +
geom_point(aes(size = outer_F1_Score)) +
geom_line()
27.10.1.3 Facet - Repeat
ggplot(outer_summary,
aes(x=mtry, y=outer_F1_Score, color = id2 , shape = id2 ) ) +
geom_point(aes(size = outer_F1_Score)) +
geom_line() +
facet_wrap(.~id)
\(~\)
\(~\)
\(~\)
\(~\)
\(~\)
\(~\)
27.11 Inner V Outer
Pooled_Inner_L <- pooled_inner %>%
select(id2, id, mtry, max_F1_Score, median_F1_Score, min_F1_Score) %>%
gather(max_F1_Score,
median_F1_Score,
min_F1_Score,
key='Inner_F1_Scores',
value=Inner_F1_Score) %>%
mutate(Inner_Type = ifelse( Inner_F1_Scores == 'max_F1_Score', "MAX",
ifelse( Inner_F1_Scores == 'median_F1_Score', "MED",
ifelse( Inner_F1_Scores == 'min_F1_Score', "MIN", "NA" ) ) ) )
Outer_Summary <- outer_summary %>%
select(id2, id, mtry, outer_F1_Score)
inner_v_outer <- Pooled_Inner_L %>%
left_join(Outer_Summary, by=c('id','id2','mtry')) %>%
select(id2, id, Inner_Type, mtry, Inner_F1_Score, outer_F1_Score) %>%
arrange(id2, id, Inner_Type, mtry, Inner_F1_Score, outer_F1_Score)
glimpse(inner_v_outer)Rows: 360
Columns: 6
$ id2 <chr> "Fold1", "Fold1", "Fold1", "Fold1", "Fold1", "Fold1", "…
$ id <chr> "Repeat1", "Repeat1", "Repeat1", "Repeat1", "Repeat1", …
$ Inner_Type <chr> "MAX", "MAX", "MAX", "MAX", "MAX", "MAX", "MAX", "MAX",…
$ mtry <dbl> 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 1, 3, 5, 7, …
$ Inner_F1_Score <dbl> 0.4927536, 0.6352941, 0.6301370, 0.6097561, 0.6097561, …
$ outer_F1_Score <dbl> 0.4230769, 0.6666667, 0.6666667, 0.6875000, 0.6666667, …
cnt_inner_v_outer_grps <- inner_v_outer %>%
group_by(id2,id, Inner_Type, mtry) %>%
tally()
glimpse(cnt_inner_v_outer_grps)Rows: 360
Columns: 5
Groups: id2, id, Inner_Type [30]
$ id2 <chr> "Fold1", "Fold1", "Fold1", "Fold1", "Fold1", "Fold1", "Fold…
$ id <chr> "Repeat1", "Repeat1", "Repeat1", "Repeat1", "Repeat1", "Rep…
$ Inner_Type <chr> "MAX", "MAX", "MAX", "MAX", "MAX", "MAX", "MAX", "MAX", "MA…
$ mtry <dbl> 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 1, 3, 5, 7, 9, 1…
$ n <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
\(~\)
\(~\)
27.11.1 Inner V Outer Summary Graphs
27.11.1.1 Facet - my_tree
inner_min_max_v_outer <- inner_v_outer %>%
mutate( mtry = as.character(mtry) ) %>%
mutate( Fold_Sample = paste0(id2,'_',id)) %>%
spread( key=Inner_Type, Inner_F1_Score)
ggplot(inner_min_max_v_outer ,
aes(x = outer_F1_Score , y = MED, shape = id , color = id2 )) +
geom_point( ) +
geom_line( ) +
geom_errorbar(aes(ymin=MIN, ymax=MAX)) +
facet_wrap(.~ mtry ) `geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
