Title: | Automatic Machine Learning with 'tidymodels' |
---|---|
Description: | The goal of this package will be to provide a simple interface for automatic machine learning that fits the 'tidymodels' framework. The intention is to work for regression and classification problems with a simple verb framework. |
Authors: | Steven Sanderson [aut, cre, cph] |
Maintainer: | Steven Sanderson <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.0.5.9000 |
Built: | 2025-01-22 04:46:01 UTC |
Source: | https://github.com/spsanderson/tidyAML |
This function checks for duplicate rows in a data frame.
check_duplicate_rows(.data)
.data |
A data frame. |
This function checks for duplicate rows by comparing each row in the data frame to every other row. If a row is identical to another row, it is considered a duplicate.
A logical vector indicating whether each row is a duplicate or not.
Steven P. Sanderson II, MPH
Other Utility: core_packages(), create_splits(), create_workflow_set(), fast_classification_parsnip_spec_tbl(), fast_regression_parsnip_spec_tbl(), full_internal_make_wflw(), install_deps(), load_deps(), match_args(), quantile_normalize()
data <- data.frame(
  x = c(1, 2, 3, 1),
  y = c(2, 3, 4, 2),
  z = c(3, 2, 5, 3)
)
check_duplicate_rows(data)
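The row comparison described for check_duplicate_rows() can be approximated in base R. The sketch below is not the package's implementation: the helper name flag_duplicate_rows is hypothetical, and it assumes that every member of a group of identical rows should be flagged (the package may flag only the later occurrences).

```r
# Hypothetical helper, not part of tidyAML: flags every row that has
# at least one identical twin elsewhere in the data frame.
flag_duplicate_rows <- function(df) {
  # duplicated() marks later occurrences; fromLast = TRUE marks earlier ones
  duplicated(df) | duplicated(df, fromLast = TRUE)
}

df <- data.frame(
  x = c(1, 2, 3, 1),
  y = c(2, 3, 4, 2),
  z = c(3, 2, 5, 3)
)
dup_flags <- flag_duplicate_rows(df)
dup_flags
```

Rows 1 and 4 hold the same values in this toy data, so they are the two rows flagged.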
Lists the core packages necessary to run all potential modeling algorithms.
core_packages()
A character vector
Steven P. Sanderson II, MPH
Other Utility: check_duplicate_rows(), create_splits(), create_workflow_set(), fast_classification_parsnip_spec_tbl(), fast_regression_parsnip_spec_tbl(), full_internal_make_wflw(), install_deps(), load_deps(), match_args(), quantile_normalize()
core_packages()
parsnip
Creates a list/tibble of parsnip model specifications.
create_model_spec(
  .parsnip_eng = list("lm"),
  .mode = list("regression"),
  .parsnip_fns = list("linear_reg"),
  .return_tibble = TRUE
)
.parsnip_eng |
The input must be a list. The default is list("lm"). |
.mode |
The input must be a list. The default is list("regression"). |
.parsnip_fns |
The input must be a list. The default is list("linear_reg"). |
.return_tibble |
The default is TRUE. FALSE will return a list object. |
Creates a list/tibble of parsnip model specifications. With this function you can generate a list/tibble output of any model specification and engine you choose that is supported by the parsnip ecosystem.
A list or a tibble.
Steven P. Sanderson II, MPH
Other Model_Generator: fast_classification(), fast_regression()
create_model_spec(
  .parsnip_eng = list("lm", "glm", "glmnet", "cubist"),
  .parsnip_fns = list("linear_reg", "linear_reg", "linear_reg", "cubist_rules")
)
create_model_spec(
  .parsnip_eng = list("lm", "glm", "glmnet", "cubist"),
  .parsnip_fns = list("linear_reg", "linear_reg", "linear_reg", "cubist_rules"),
  .return_tibble = FALSE
)
Create a splits object.
create_splits(.data, .split_type = "initial_split", .split_args = NULL)
.data |
The data being passed to make a split on. |
.split_type |
The default is "initial_split". You can pass any other split type from the rsample package. |
.split_args |
The default is NULL in order to use the default split arguments. If you want to pass other arguments, you must pass a list with the parameter name and the argument. |
Creates a list object that contains both the splits object itself and the split type. This function supports all split types from the rsample package.
A list object
Steven P. Sanderson II, MPH
Other Utility: check_duplicate_rows(), core_packages(), create_workflow_set(), fast_classification_parsnip_spec_tbl(), fast_regression_parsnip_spec_tbl(), full_internal_make_wflw(), install_deps(), load_deps(), match_args(), quantile_normalize()
create_splits(mtcars, .split_type = "vfold_cv")
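One way to picture how a split type name and an optional argument list could be combined is a lookup followed by do.call(). This is a sketch under assumptions, not the package's code: simple_holdout and make_splits are hypothetical names, and a base-R stand-in replaces the rsample splitter so the example runs without extra packages.

```r
# Stand-in for an rsample splitter so the sketch runs in base R
# (hypothetical; a real splitter would come from rsample).
simple_holdout <- function(data, prop = 0.75) {
  n_train <- floor(nrow(data) * prop)
  idx <- seq_len(n_train)
  list(
    training = data[idx, , drop = FALSE],
    testing  = data[-idx, , drop = FALSE]
  )
}

# Hypothetical wrapper: look up the splitter by name, forward the optional
# argument list, and return both the split and its type.
make_splits <- function(data, split_type = "simple_holdout", split_args = NULL) {
  split_fn <- get(split_type, mode = "function")
  splits <- do.call(split_fn, c(list(data), split_args))
  list(splits = splits, split_type = split_type)
}

res_default <- make_splits(mtcars)
res_custom  <- make_splits(mtcars, split_args = list(prop = 0.5))
nrow(res_default$splits$training)
nrow(res_custom$splits$training)
```

Passing NULL for the argument list leaves the splitter's own defaults in place, which mirrors the documented behaviour of .split_args.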
Create a workflow set object tibble from a model spec tibble.
create_workflow_set(.model_tbl = NULL, .recipe_list = list(), .cross = TRUE)
.model_tbl |
The model table that is generated from a function like
|
.recipe_list |
Provide a list of recipes here that will get added to the workflow set object. |
.cross |
The default is TRUE, can be set to FALSE. This is passed to the cross argument of workflowsets::workflow_set(). |
Create a workflow set object/tibble from a model spec tibble where the object class type is tidyaml_base_tbl. This function takes a list of recipes and grabs the model specifications from the base tibble to create the workflow set object. You can also supply TRUE or FALSE to the .cross parameter, which is passed as the cross argument to the workflowsets::workflow_set() function.
A list object of workflows.
Steven P. Sanderson II, MPH
https://workflowsets.tidymodels.org/
Other Utility: check_duplicate_rows(), core_packages(), create_splits(), fast_classification_parsnip_spec_tbl(), fast_regression_parsnip_spec_tbl(), full_internal_make_wflw(), install_deps(), load_deps(), match_args(), quantile_normalize()
library(recipes)
rec_obj <- recipe(mpg ~ ., data = mtcars)
spec_tbl <- fast_regression_parsnip_spec_tbl(
  .parsnip_fns = "linear_reg",
  .parsnip_eng = c("lm", "glm")
)
create_workflow_set(
  spec_tbl,
  list(rec_obj)
)
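The effect of the .cross flag can be pictured with plain labels. In workflowsets::workflow_set(), cross = TRUE combines every preprocessor with every model, while cross = FALSE pairs them position by position and expects equal lengths. The sketch below only illustrates that counting logic in base R; the labels rec_a and rec_b are made up.

```r
# Hypothetical labels standing in for recipes and model specifications
rec_names <- c("rec_a", "rec_b")
mod_names <- c("lm", "glm")

# cross = TRUE: every recipe is combined with every model
crossed <- expand.grid(
  recipe = rec_names,
  model = mod_names,
  stringsAsFactors = FALSE
)

# cross = FALSE: recipes and models are paired position by position
paired <- data.frame(
  recipe = rec_names,
  model = mod_names,
  stringsAsFactors = FALSE
)

nrow(crossed)
nrow(paired)
```

With two recipes and two models, crossing yields four workflows and pairing yields two.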
Extract a model specification from a tidyAML model tibble.
extract_model_spec(.data, .model_id = NULL)
.data |
The model table, which must have the class "tidyaml_mod_spec_tbl". |
.model_id |
The model number(s) that you want to select. Must be an integer or a sequence of integers. |
This function allows you to get one or more model specifications from a tibble with a class of "tidyaml_mod_spec_tbl". Models are selected by the .model_id column, using either a single integer or a sequence of integers.
A tibble with the chosen model specification(s).
Steven P. Sanderson II, MPH
Other Extractor: extract_regression_residuals(), extract_wflw(), extract_wflw_fit(), extract_wflw_pred(), get_model()
spec_tbl <- fast_regression_parsnip_spec_tbl(
  .parsnip_fns = "linear_reg",
  .parsnip_eng = c("lm", "glm")
)
extract_model_spec(spec_tbl, 1)
extract_model_spec(spec_tbl, 1:2)
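Selecting by .model_id amounts to a row filter on the model table. The following base-R sketch shows the idea on a toy data frame; toy_spec_tbl and pick_models are hypothetical, and the real extractor functions return specific list columns rather than whole rows.

```r
# Toy stand-in for a model spec tibble (contents are illustrative)
toy_spec_tbl <- data.frame(
  .model_id = 1:3,
  .parsnip_engine = c("lm", "glm", "gee"),
  .parsnip_fns = "linear_reg",
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the rows whose .model_id was requested
pick_models <- function(tbl, model_id) {
  stopifnot(is.numeric(model_id), all(model_id %in% tbl$.model_id))
  tbl[tbl$.model_id %in% model_id, , drop = FALSE]
}

one_model  <- pick_models(toy_spec_tbl, 1)
two_models <- pick_models(toy_spec_tbl, 1:2)
two_models
```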
This function extracts residuals from a fast regression model table (fast_regression()).
extract_regression_residuals(.model_tbl, .pivot_long = FALSE)
.model_tbl |
A fast regression model specification table of class 'fst_reg_spec_tbl'. |
.pivot_long |
A logical value indicating if the output should be pivoted. The default is FALSE. |
The function checks if the input model specification table inherits the class 'fst_reg_spec_tbl' and if it contains the column 'pred_wflw'. It then manipulates the data, grouping it by model, and extracts residuals for each model. The result is a list of data frames, each containing residuals, actual values, and predicted values for a specific model.
The function returns a list of data frames, each containing residuals, actual values, and predicted values for a specific model.
Steven P. Sanderson II, MPH
Other Extractor: extract_model_spec(), extract_wflw(), extract_wflw_fit(), extract_wflw_pred(), get_model()
library(recipes, quietly = TRUE)
rec_obj <- recipe(mpg ~ ., data = mtcars)
fr_tbl <- fast_regression(
  mtcars,
  rec_obj,
  .parsnip_eng = c("lm", "glm"),
  .parsnip_fns = "linear_reg"
)
extract_regression_residuals(fr_tbl)
extract_regression_residuals(fr_tbl, .pivot_long = TRUE)
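To make the shape of the residual output concrete, here is a base-R sketch for a single lm() fit. It is an illustration only: the column names, the residual convention (actual minus predicted), the fixed train/test rows, and the long layout are assumptions, not taken from the package source.

```r
# Fit one model on a training subset and score the held-out rows
train <- mtcars[1:24, ]
test  <- mtcars[25:32, ]
fit   <- lm(mpg ~ wt + hp, data = train)

# Wide form: one row per observation (column names are hypothetical)
resid_tbl <- data.frame(
  .model_type = "lm - linear_reg",
  .actual     = test$mpg,
  .predicted  = unname(predict(fit, newdata = test)),
  stringsAsFactors = FALSE
)
resid_tbl$.resid <- resid_tbl$.actual - resid_tbl$.predicted

# Long form: one row per (observation, quantity) pair
long_tbl <- data.frame(
  .model_type = resid_tbl$.model_type[1],
  name  = rep(c(".actual", ".predicted", ".resid"), each = nrow(resid_tbl)),
  value = c(resid_tbl$.actual, resid_tbl$.predicted, resid_tbl$.resid),
  stringsAsFactors = FALSE
)
head(long_tbl)
```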
Extract a model workflow from a tidyAML model tibble.
extract_wflw(.data, .model_id = NULL)
.data |
The model table, which must have the class "tidyaml_mod_spec_tbl". |
.model_id |
The model number(s) that you want to select. Must be an integer or a sequence of integers. |
This function allows you to get one or more model workflows from a tibble with a class of "tidyaml_mod_spec_tbl". Models are selected by the .model_id column, using either a single integer or a sequence of integers.
A tibble with the chosen model workflow(s).
Steven P. Sanderson II, MPH
Other Extractor: extract_model_spec(), extract_regression_residuals(), extract_wflw_fit(), extract_wflw_pred(), get_model()
library(recipes)
rec_obj <- recipe(mpg ~ ., data = mtcars)
frt_tbl <- fast_regression(
  mtcars,
  rec_obj,
  .parsnip_eng = c("lm", "glm"),
  .parsnip_fns = "linear_reg"
)
extract_wflw(frt_tbl, 1)
extract_wflw(frt_tbl, 1:2)
Extract a model fitted workflow from a tidyAML model tibble.
extract_wflw_fit(.data, .model_id = NULL)
.data |
The model table, which must have the class "tidyaml_mod_spec_tbl". |
.model_id |
The model number(s) that you want to select. Must be an integer or a sequence of integers. |
This function allows you to get one or more fitted model workflows from a tibble with a class of "tidyaml_mod_spec_tbl". Models are selected by the .model_id column, using either a single integer or a sequence of integers.
A tibble with the chosen model workflow(s).
Steven P. Sanderson II, MPH
Other Extractor: extract_model_spec(), extract_regression_residuals(), extract_wflw(), extract_wflw_pred(), get_model()
library(recipes)
rec_obj <- recipe(mpg ~ ., data = mtcars)
frt_tbl <- fast_regression(
  mtcars,
  rec_obj,
  .parsnip_eng = c("lm", "glm"),
  .parsnip_fns = "linear_reg"
)
extract_wflw_fit(frt_tbl, 1)
extract_wflw_fit(frt_tbl, 1:2)
Extract model workflow predictions from a tidyAML model tibble.
extract_wflw_pred(.data, .model_id = NULL)
.data |
The model table, which must have the class "tidyaml_mod_spec_tbl". |
.model_id |
The model number(s) that you want to select. Must be an integer or a sequence of integers. |
This function allows you to get the workflow predictions of one or more models from a tibble with a class of "tidyaml_mod_spec_tbl". Models are selected by the .model_id column, using either a single integer or a sequence of integers.
A tibble with the chosen model workflow(s).
Steven P. Sanderson II, MPH
Other Extractor: extract_model_spec(), extract_regression_residuals(), extract_wflw(), extract_wflw_fit(), get_model()
library(recipes)
rec_obj <- recipe(mpg ~ ., data = mtcars)
frt_tbl <- fast_regression(
  mtcars,
  rec_obj,
  .parsnip_eng = c("lm", "glm"),
  .parsnip_fns = "linear_reg"
)
extract_wflw_pred(frt_tbl, 1)
extract_wflw_pred(frt_tbl, 1:2)
parsnip
Creates a list/tibble of parsnip model specifications.
fast_classification(
  .data,
  .rec_obj,
  .parsnip_fns = "all",
  .parsnip_eng = "all",
  .split_type = "initial_split",
  .split_args = NULL,
  .drop_na = TRUE
)
.data |
The data being passed to the function for the classification problem |
.rec_obj |
The recipe object being passed. |
.parsnip_fns |
The default is 'all' which will create all possible classification model specifications supported. |
.parsnip_eng |
The default is 'all' which will create all possible classification model specifications supported. |
.split_type |
The default is 'initial_split'. You can pass any type of split supported by rsample. |
.split_args |
The default is NULL, when NULL then the default parameters of the split type will be executed for the rsample split type. |
.drop_na |
The default is TRUE, which will drop all NA's from the data. |
With this function you can generate a tibble output of any classification model specification and its fitted workflow object. As the recipes documentation for step_string2factor() recommends, convert your predictor to a factor before you create your recipe.
A list or a tibble.
Steven P. Sanderson II, MPH
Other Model_Generator: create_model_spec(), fast_regression()
library(recipes)
library(dplyr)
library(tidyr)
df <- Titanic |>
  as_tibble() |>
  uncount(n) |>
  mutate(across(everything(), as.factor))
rec_obj <- recipe(Survived ~ ., data = df)
fct_tbl <- fast_classification(
  .data = df,
  .rec_obj = rec_obj,
  .parsnip_eng = c("glm", "earth")
)
fct_tbl
parsnip
Creates a tibble of parsnip classification model specifications.
fast_classification_parsnip_spec_tbl(
  .parsnip_fns = "all",
  .parsnip_eng = "all"
)
.parsnip_fns |
The default is "all". |
.parsnip_eng |
The default is "all". |
Creates a tibble of parsnip classification model specifications. This will create a tibble of 32 different classification model specifications which can be filtered. The model specs are created first and then filtered out. This will only create models for classification problems. To find all of the supported models in this package you can visit https://www.tidymodels.org/find/parsnip/
A tibble with an added class of 'fst_class_spec_tbl'
Steven P. Sanderson II, MPH
Other Utility: check_duplicate_rows(), core_packages(), create_splits(), create_workflow_set(), fast_regression_parsnip_spec_tbl(), full_internal_make_wflw(), install_deps(), load_deps(), match_args(), quantile_normalize()
fast_classification_parsnip_spec_tbl(.parsnip_fns = "logistic_reg")
fast_classification_parsnip_spec_tbl(.parsnip_eng = c("earth", "dbarts"))
parsnip
Creates a list/tibble of parsnip model specifications.
fast_regression(
  .data,
  .rec_obj,
  .parsnip_fns = "all",
  .parsnip_eng = "all",
  .split_type = "initial_split",
  .split_args = NULL,
  .drop_na = TRUE
)
.data |
The data being passed to the function for the regression problem |
.rec_obj |
The recipe object being passed. |
.parsnip_fns |
The default is 'all' which will create all possible regression model specifications supported. |
.parsnip_eng |
The default is 'all' which will create all possible regression model specifications supported. |
.split_type |
The default is 'initial_split'. You can pass any type of split supported by rsample. |
.split_args |
The default is NULL, when NULL then the default parameters of the split type will be executed for the rsample split type. |
.drop_na |
The default is TRUE, which will drop all NA's from the data. |
With this function you can generate a tibble output of any regression model specification and its fitted workflow object.
A list or a tibble.
Steven P. Sanderson II, MPH
Other Model_Generator: create_model_spec(), fast_classification()
library(recipes, quietly = TRUE)
rec_obj <- recipe(mpg ~ ., data = mtcars)
frt_tbl <- fast_regression(
  mtcars,
  rec_obj,
  .parsnip_eng = c("lm", "glm", "gee"),
  .parsnip_fns = "linear_reg"
)
frt_tbl
parsnip
Creates a tibble of parsnip regression model specifications.
fast_regression_parsnip_spec_tbl(.parsnip_fns = "all", .parsnip_eng = "all")
.parsnip_fns |
The default is "all". |
.parsnip_eng |
The default is "all". |
Creates a tibble of parsnip regression model specifications. This will create a tibble of 46 different regression model specifications which can be filtered. The model specs are created first and then filtered out. This will only create models for regression problems. To find all of the supported models in this package you can visit https://www.tidymodels.org/find/parsnip/
A tibble with an added class of 'fst_reg_spec_tbl'
Steven P. Sanderson II, MPH
Other Utility: check_duplicate_rows(), core_packages(), create_splits(), create_workflow_set(), fast_classification_parsnip_spec_tbl(), full_internal_make_wflw(), install_deps(), load_deps(), match_args(), quantile_normalize()
fast_regression_parsnip_spec_tbl(.parsnip_fns = "linear_reg")
fast_regression_parsnip_spec_tbl(.parsnip_eng = c("lm", "glm"))
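The "create first, then filter" behaviour of the spec table functions can be sketched in base R. The table below is a small illustrative subset, not the package's list of supported models, and filter_specs is a hypothetical helper that treats "all" as "keep everything".

```r
# Toy base table of function/engine pairs (illustrative, not the real list)
base_tbl <- data.frame(
  .parsnip_engine = c("lm", "glm", "glmnet", "cubist", "earth"),
  .parsnip_fns    = c("linear_reg", "linear_reg", "linear_reg",
                      "cubist_rules", "mars"),
  stringsAsFactors = FALSE
)

# Hypothetical filter: "all" keeps every row, otherwise keep the matches
filter_specs <- function(tbl, fns = "all", eng = "all") {
  keep_fns <- if (identical(fns, "all")) TRUE else tbl$.parsnip_fns %in% fns
  keep_eng <- if (identical(eng, "all")) TRUE else tbl$.parsnip_engine %in% eng
  tbl[keep_fns & keep_eng, , drop = FALSE]
}

filter_specs(base_tbl, fns = "linear_reg")
filter_specs(base_tbl, eng = c("lm", "glm"))
```

Supplying both filters keeps only rows that satisfy both, so an engine that does not belong to the chosen function yields an empty table.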
This function creates a full internal workflow for a model and recipe combination.
full_internal_make_wflw(.model_tbl, .rec_obj)
.model_tbl |
A model specification table of class 'tidyaml_mod_spec_tbl'. |
.rec_obj |
A recipe object. |
The function checks if the input model specification table inherits the class 'tidyaml_mod_spec_tbl'. It then manipulates the input table, making adjustments for factors and creating a list of grouped models. For each model-recipe pair, it uses the appropriate internal function based on the model type to create a workflow object. The specific internal function is selected using a switch statement based on the class of the model.
The function returns a workflow object for the first model-recipe pair based on the internal function selected.
Steven P. Sanderson II, MPH
Other Utility: check_duplicate_rows(), core_packages(), create_splits(), create_workflow_set(), fast_classification_parsnip_spec_tbl(), fast_regression_parsnip_spec_tbl(), install_deps(), load_deps(), match_args(), quantile_normalize()
library(dplyr)
library(recipes)
rec_obj <- recipe(mpg ~ ., data = mtcars)
mod_tbl <- make_regression_base_tbl()
mod_tbl <- mod_tbl |>
  filter(
    .parsnip_engine %in% c("lm", "glm") &
      .parsnip_fns == "linear_reg"
  )
class(mod_tbl) <- c("tidyaml_mod_spec_tbl", class(mod_tbl))
mod_spec_tbl <- internal_make_spec_tbl(mod_tbl)
result <- full_internal_make_wflw(mod_spec_tbl, rec_obj)
result
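The class-based switch mentioned in the details can be pictured as follows. This is a sketch under assumptions: the class names and builder functions are hypothetical, and the builders return labels rather than real workflow objects so that the example runs in base R.

```r
# Hypothetical builders; they return labels instead of real workflows
build_default_wflw <- function(spec) paste("default workflow:", spec$engine)
build_gee_wflw     <- function(spec) paste("gee workflow:", spec$engine)

# Pick a builder from the first class of the model specification
build_wflw <- function(spec) {
  switch(
    class(spec)[1],
    gee_linear_reg = build_gee_wflw(spec),
    build_default_wflw(spec)  # fallback for every other class
  )
}

# Toy model specifications with hypothetical class names
lm_spec  <- structure(list(engine = "lm"), class = "linear_reg")
gee_spec <- structure(list(engine = "gee"), class = "gee_linear_reg")

wflws <- vapply(list(lm_spec, gee_spec), build_wflw, character(1))
wflws
```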
Get a model from a tidyAML model tibble.
get_model(.data, .model_id = NULL)
.data |
The model table, which must have the class "tidyaml_mod_spec_tbl". |
.model_id |
The model number(s) that you want to select. Must be an integer or a sequence of integers. |
This function allows you to get a model or models from a tibble with a class of "tidyaml_mod_spec_tbl". Models are selected by the .model_id column, using either a single integer or a sequence of integers.
A tibble with the chosen models.
Steven P. Sanderson II, MPH
Other Extractor: extract_model_spec(), extract_regression_residuals(), extract_wflw(), extract_wflw_fit(), extract_wflw_pred()
library(recipes)
rec_obj <- recipe(mpg ~ ., data = mtcars)
spec_tbl <- fast_regression_parsnip_spec_tbl(
  .parsnip_fns = "linear_reg",
  .parsnip_eng = c("lm", "glm")
)
get_model(spec_tbl, 1)
get_model(spec_tbl, 1:2)
Installs all dependencies listed by the core_packages() function.
install_deps()
No return value, called for side effects
Steven P. Sanderson II, MPH
Other Utility: check_duplicate_rows(), core_packages(), create_splits(), create_workflow_set(), fast_classification_parsnip_spec_tbl(), fast_regression_parsnip_spec_tbl(), full_internal_make_wflw(), load_deps(), match_args(), quantile_normalize()
## Not run:
install_deps()
## End(Not run)
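Before installing anything, it can be useful to see which packages from a character vector are actually missing. The helper below is a hypothetical base-R sketch, not what install_deps() does internally; it only reports missing names and leaves the installation step commented out.

```r
# Hypothetical helper: which of the requested packages are not installed?
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!installed])
}

# Base packages ship with R, so nothing should be reported as missing here
to_install <- missing_packages(c("stats", "utils"))
to_install

# With tidyAML loaded, a call might look like this (not run):
# install.packages(missing_packages(core_packages()))
```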
Safely Make a fitted workflow from a model spec tibble.
internal_make_fitted_wflw(.model_tbl, .splits_obj)
.model_tbl |
The model table that is generated from a function like
|
.splits_obj |
The splits object from the auto_ml function. It is internal
to the |
Create a fitted parsnip model from a workflow object.
A list object of workflows.
Steven P. Sanderson II, MPH
Other Internals: internal_make_spec_tbl(), internal_make_wflw(), internal_make_wflw_gee_lin_reg(), internal_make_wflw_predictions(), internal_set_args_to_tune(), make_classification_base_tbl(), make_regression_base_tbl()
library(recipes, quietly = TRUE)
mod_spec_tbl <- fast_regression_parsnip_spec_tbl(
  .parsnip_eng = c("lm", "glm"),
  .parsnip_fns = "linear_reg"
)
rec_obj <- recipe(mpg ~ ., data = mtcars)
splits_obj <- create_splits(mtcars, "initial_split")
mod_tbl <- mod_spec_tbl |>
  mutate(wflw = full_internal_make_wflw(mod_spec_tbl, rec_obj))
internal_make_fitted_wflw(mod_tbl, splits_obj)
Make a Model Spec tibble.
internal_make_spec_tbl(.model_tbl)
.model_tbl |
This is the data that should be coming from inside of the regression/classification to parsnip spec functions. |
A model spec tbl.
Steven P. Sanderson II, MPH
Other Internals: internal_make_fitted_wflw(), internal_make_wflw(), internal_make_wflw_gee_lin_reg(), internal_make_wflw_predictions(), internal_set_args_to_tune(), make_classification_base_tbl(), make_regression_base_tbl()
make_regression_base_tbl() |>
  internal_make_spec_tbl()
make_classification_base_tbl() |>
  internal_make_spec_tbl()
Safely Make a workflow from a model spec tibble.
internal_make_wflw(.model_tbl, .rec_obj)
.model_tbl |
The model table that is generated from a function like
|
.rec_obj |
The recipe object that is going to be used to make the workflow object. |
Create a model specification tibble that has a workflows::workflow() list column.
A list object of workflows.
Steven P. Sanderson II, MPH
Other Internals: internal_make_fitted_wflw(), internal_make_spec_tbl(), internal_make_wflw_gee_lin_reg(), internal_make_wflw_predictions(), internal_set_args_to_tune(), make_classification_base_tbl(), make_regression_base_tbl()
library(recipes, quietly = TRUE)
mod_spec_tbl <- fast_regression_parsnip_spec_tbl(
  .parsnip_eng = c("lm", "glm", "gee"),
  .parsnip_fns = "linear_reg"
)
rec_obj <- recipe(mpg ~ ., data = mtcars)
internal_make_wflw(mod_spec_tbl, rec_obj)
Safely Make a workflow from a model spec tibble.
internal_make_wflw_gee_lin_reg(.model_tbl, .rec_obj)
.model_tbl |
The model table that is generated from a function like
|
.rec_obj |
The recipe object that is going to be used to make the workflow object. |
Create a model specification tibble that has a workflows::workflow() list column.
A list object of workflows.
Steven P. Sanderson II, MPH
Other Internals: internal_make_fitted_wflw(), internal_make_spec_tbl(), internal_make_wflw(), internal_make_wflw_predictions(), internal_set_args_to_tune(), make_classification_base_tbl(), make_regression_base_tbl()
library(dplyr)
library(recipes)
library(multilevelmod)
mod_tbl <- make_regression_base_tbl()
mod_tbl <- mod_tbl |>
  filter(
    .parsnip_engine %in% c("gee") &
      .parsnip_fns == "linear_reg"
  )
class(mod_tbl) <- c("tidyaml_mod_spec_tbl", class(mod_tbl))
mod_spec_tbl <- internal_make_spec_tbl(mod_tbl)
rec_obj <- recipe(mpg ~ ., data = mtcars)
internal_make_wflw_gee_lin_reg(mod_spec_tbl, rec_obj)
Safely Make predictions on a fitted workflow from a model spec tibble.
internal_make_wflw_predictions(.model_tbl, .splits_obj)
.model_tbl |
The model table that is generated from a function like
|
.splits_obj |
The splits object from the auto_ml function. It is internal
to the |
Create predictions on a fitted parsnip model from a workflow object.
A list object tibble of the outcome variable and its values along with the testing and training predictions in a single tibble.
.data_category | .data_type | .value |
actual | actual | 21.0 |
actual | actual | 21.0 |
actual | actual | 22.8 |
... | ... | ... |
predicted | training | 21.0 |
... | ... | ... |
predicted | training | 21.0 |
Steven P. Sanderson II, MPH
Other Internals: internal_make_fitted_wflw(), internal_make_spec_tbl(), internal_make_wflw(), internal_make_wflw_gee_lin_reg(), internal_set_args_to_tune(), make_classification_base_tbl(), make_regression_base_tbl()
library(recipes, quietly = TRUE)
mod_spec_tbl <- fast_regression_parsnip_spec_tbl(
  .parsnip_eng = c("lm", "glm"),
  .parsnip_fns = "linear_reg"
)
rec_obj <- recipe(mpg ~ ., data = mtcars)
splits_obj <- create_splits(mtcars, "initial_split")
mod_tbl <- mod_spec_tbl |>
  mutate(wflw = full_internal_make_wflw(mod_spec_tbl, rec_obj))
mod_fitted_tbl <- mod_tbl |>
  mutate(fitted_wflw = internal_make_fitted_wflw(mod_tbl, splits_obj))
internal_make_wflw_predictions(mod_fitted_tbl, splits_obj)
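The three-column layout shown in the value table (.data_category, .data_type, .value) can be reproduced by hand for a single lm() fit. The sketch below is only an illustration of that layout: the fixed train/test rows and the model formula are assumptions, and the package builds its table from the splits object instead.

```r
# Assumed split: first 24 rows for training, the rest for testing
train <- mtcars[1:24, ]
test  <- mtcars[25:32, ]
fit   <- lm(mpg ~ wt, data = train)

# Stack actual values and both sets of predictions into one long table
pred_tbl <- rbind(
  data.frame(.data_category = "actual", .data_type = "actual",
             .value = mtcars$mpg),
  data.frame(.data_category = "predicted", .data_type = "training",
             .value = unname(predict(fit, newdata = train))),
  data.frame(.data_category = "predicted", .data_type = "testing",
             .value = unname(predict(fit, newdata = test)))
)
table(pred_tbl$.data_type)
```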
Make a tuned model specification object.
internal_set_args_to_tune(.model_tbl)
.model_tbl |
The model table that is generated from a function like
|
This will take a model specification that is created from a function like fast_regression_parsnip_spec_tbl() and update the model_spec args to tune::tune(). This is done dynamically, meaning you do not need to know the names of the parameters inside of the model specification.
A list object of workflows.
Steven P. Sanderson II, MPH
Other Internals: internal_make_fitted_wflw(), internal_make_spec_tbl(), internal_make_wflw(), internal_make_wflw_gee_lin_reg(), internal_make_wflw_predictions(), make_classification_base_tbl(), make_regression_base_tbl()
library(dplyr)
mod_tbl <- fast_regression_parsnip_spec_tbl()
mod_tbl$model_spec[[1]]
updated_mod_tbl <- mod_tbl |>
  mutate(model_spec = internal_set_args_to_tune(mod_tbl))
updated_mod_tbl$model_spec[[1]]
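The "dynamic" part of this update means looping over whatever arguments a specification happens to have, without naming them. A minimal base-R sketch of that idea follows; set_all_args_to_tune is a hypothetical helper, the argument list is a toy stand-in for a model spec's args, and quote(tune()) is used as a placeholder so that the tune package is not required.

```r
# Hypothetical helper: replace every argument, whatever its name,
# with an unevaluated tune() call
set_all_args_to_tune <- function(args) {
  lapply(args, function(arg) quote(tune()))
}

# Toy argument list standing in for the args of a model specification
spec_args <- list(penalty = NULL, mixture = 0.5)
tuned_args <- set_all_args_to_tune(spec_args)
tuned_args
```

The argument names are preserved by lapply(), so the result can be matched back to the original specification.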
Load all the core packages necessary to run all potential modeling algorithms.
load_deps()
No return value, called for side effects
Steven P. Sanderson II, MPH
Other Utility: check_duplicate_rows(), core_packages(), create_splits(), create_workflow_set(), fast_classification_parsnip_spec_tbl(), fast_regression_parsnip_spec_tbl(), full_internal_make_wflw(), install_deps(), match_args(), quantile_normalize()
## Not run:
load_deps()
## End(Not run)
Creates a base tibble to create parsnip classification model specifications.
make_classification_base_tbl()
A tibble
Steven P. Sanderson II, MPH
Other Internals: internal_make_fitted_wflw(), internal_make_spec_tbl(), internal_make_wflw(), internal_make_wflw_gee_lin_reg(), internal_make_wflw_predictions(), internal_set_args_to_tune(), make_regression_base_tbl()
make_classification_base_tbl()
Creates a base tibble to create parsnip regression model specifications.
make_regression_base_tbl()
A tibble
Steven P. Sanderson II, MPH
Other Internals: internal_make_fitted_wflw(), internal_make_spec_tbl(), internal_make_wflw(), internal_make_wflw_gee_lin_reg(), internal_make_wflw_predictions(), internal_set_args_to_tune(), make_classification_base_tbl()
make_regression_base_tbl()
Match a function's arguments.
match_args(f, args)
f |
The parsnip function such as |
args |
The arguments you want to supply to |
Match a function's arguments. Invalid arguments are rejected, and the remaining valid ones are returned.
A list of matched arguments.
Steven P. Sanderson II, MPH
Other Utility: check_duplicate_rows(), core_packages(), create_splits(), create_workflow_set(), fast_classification_parsnip_spec_tbl(), fast_regression_parsnip_spec_tbl(), full_internal_make_wflw(), install_deps(), load_deps(), quantile_normalize()
match_args(
  f = "linear_reg",
  args = list(
    mode = "regression",
    engine = "lm",
    trees = 1,
    mtry = 1
  )
)
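The idea of keeping only the arguments a function accepts can be illustrated with base R alone. The following is a minimal sketch, not the tidyAML implementation: `match_args_sketch()` is a hypothetical helper, and it assumes the target is a regular closure (primitives have no `formals()`) and is demonstrated on `rnorm()` so that it runs without parsnip installed.

```r
# Hypothetical helper; NOT the tidyAML implementation of match_args().
# Assumes `f` is a closure (or the name of one) and `args` is a named list.
match_args_sketch <- function(f, args) {
  fn <- match.fun(f)                # accepts a function or its name as a string
  accepted <- names(formals(fn))    # formal argument names of the function
  keep <- names(args) %in% accepted
  rejected <- names(args)[!keep]
  if (length(rejected) > 0) {
    message("Dropping unmatched arguments: ", paste(rejected, collapse = ", "))
  }
  args[keep]                        # only the arguments the function accepts
}

# rnorm() accepts n, mean and sd, so `trees` is dropped here
matched <- match_args_sketch("rnorm", list(n = 3, mean = 10, trees = 1))
names(matched)
```

Comparing against `formals()` keeps the check purely name-based; it does not validate the values supplied for the surviving arguments.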
Create a ggplot2 plot of regression predictions.
plot_regression_predictions(.data, .output = "list")
.data |
The data from the output of the |
.output |
The default is "list" which will return a list of plots. The other option is "facet" which will return a single faceted plot. |
Create a ggplot2 plot of regression predictions showing the actual, training, and testing values. The output can be either a list of plots or a single faceted plot. This function takes the output of extract_wflw_pred().
A list of ggplot2 plots or a faceted plot.
Steven P. Sanderson II, MPH
Other Plotting: plot_regression_residuals()
library(recipes)

rec_obj <- recipe(mpg ~ ., data = mtcars)
frt_tbl <- fast_regression(
  mtcars,
  rec_obj,
  .parsnip_eng = c("lm", "glm"),
  .parsnip_fns = "linear_reg"
)

extract_wflw_pred(frt_tbl, 1) |>
  plot_regression_predictions()

extract_wflw_pred(frt_tbl, 1:nrow(frt_tbl)) |>
  plot_regression_predictions(.output = "facet")
Create a ggplot2 plot of regression residuals.
plot_regression_residuals(.data)
.data |
The data from the output of the |
Create a ggplot2 plot of regression residuals. The output of this function can either be a list of plots or a single faceted plot. This function takes the output of the extract_regression_residuals() function.
A list of ggplot2 plots or a faceted plot.
Steven P. Sanderson II, MPH
Other Plotting: plot_regression_predictions()
library(recipes)

rec_obj <- recipe(mpg ~ ., data = mtcars)
frt_tbl <- fast_regression(
  mtcars,
  rec_obj,
  .parsnip_eng = c("lm", "glm"),
  .parsnip_fns = "linear_reg"
)

extract_regression_residuals(frt_tbl, FALSE)[1] |>
  plot_regression_residuals()

extract_regression_residuals(frt_tbl, TRUE)[1] |>
  plot_regression_residuals()
This function will perform quantile normalization on two or more distributions of equal length. Quantile normalization is a technique used to make the distribution of values across different samples more similar. It ensures that the distributions of values for each sample have the same quantiles. This function takes a numeric matrix as input and returns a quantile-normalized matrix.
quantile_normalize(.data, .return_tibble = FALSE)
.data |
A numeric matrix where each column represents a sample. |
.return_tibble |
A logical value that determines if the output should be a tibble. Default is 'FALSE'. |
This function performs quantile normalization on a numeric matrix by following these steps:
Sort each column of the input matrix.
Calculate the mean of each row across the sorted columns.
Replace each column's sorted values with the row means.
Unsort the columns to their original order.
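The four steps above can be sketched in base R. This is an illustrative sketch under stated assumptions, not the package's own code: `quantile_normalize_sketch()` is a hypothetical name, it returns only the normalized matrix rather than the full list described for `quantile_normalize()`, it assumes a numeric matrix with at least two rows and no missing values, and it breaks ties by order of appearance.

```r
# Hypothetical helper; NOT the tidyAML implementation of quantile_normalize().
# Assumes a numeric matrix with >= 2 rows and no NA values.
quantile_normalize_sketch <- function(x) {
  stopifnot(is.matrix(x), is.numeric(x), nrow(x) > 1, !anyNA(x))

  # Step 1: sort each column independently
  sorted <- apply(x, 2, sort)

  # Step 2: mean of each row across the sorted columns
  row_means <- rowMeans(sorted)

  # Steps 3-4: give every value the row mean that matches its rank, which
  # places the means back in each column's original order
  ranks <- apply(x, 2, rank, ties.method = "first")
  normalized <- apply(ranks, 2, function(r) row_means[r])

  dimnames(normalized) <- dimnames(x)
  normalized
}

set.seed(123)
m <- matrix(rnorm(20), ncol = 4)
qn <- quantile_normalize_sketch(m)

# After normalization every column holds the same set of values
apply(qn, 2, sort)
```

Using `rank()` combines the "replace" and "unsort" steps into one lookup; an equivalent approach is to keep the `order()` indices from the sort and assign the row means back through them.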
A list object that has the following:
A numeric matrix that has been quantile normalized.
The row means of the quantile normalized matrix.
The sorted data
The ranked indices
Steven P. Sanderson II, MPH
rowMeans: Calculate row means.
apply: Apply a function over the margins of an array.
order: Order the elements of a vector.
Other Utility: check_duplicate_rows(), core_packages(), create_splits(), create_workflow_set(), fast_classification_parsnip_spec_tbl(), fast_regression_parsnip_spec_tbl(), full_internal_make_wflw(), install_deps(), load_deps(), match_args()
# Create a sample numeric matrix
data <- matrix(rnorm(20), ncol = 4)

# Perform quantile normalization
normalized_data <- quantile_normalize(data)
normalized_data

as.data.frame(normalized_data$normalized_data) |>
  sapply(function(x) quantile(x, probs = seq(0, 1, 1 / 4)))

quantile_normalize(data, .return_tibble = TRUE)