Title: | Veras Miscellaneous |
---|---|
Description: | Contains a collection of useful functions for basic data computation and manipulation, wrapper functions for generating 'ggplot2' graphics, including statistical model diagnostic plots, methods for computing statistical models quality measures (such as AIC, BIC, r squared, root mean squared error) and general utilities. |
Authors: | Lucas Veras [aut, cre] |
Maintainer: | Lucas Veras <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.2 |
Built: | 2025-02-18 04:58:11 UTC |
Source: | https://github.com/verasls/lvmisc |
Creates a custom error condition with rlang::abort(), carrying a (hopefully) more useful error message and metadata.
abort_argument_type(arg, must, not)
abort_argument_class(arg, must, not)
abort_argument_length(arg, must, not)
abort_argument_diff_length(arg1, arg2)
abort_argument_value(arg, valid_values)
arg |
A character string with the argument name. |
must |
A character string specifying a condition the argument must fulfill. |
not |
Either a character string specifying a condition the argument
must not fulfill or the bare (unquoted) argument name. In the last case,
the function evaluates the argument type ( |
arg1, arg2 |
A character string with the argument name. |
valid_values |
A character vector with the valid values. |
Each function returns a classed error condition: abort_argument_type() returns an error_argument_type class, abort_argument_length() returns an error_argument_length class, abort_argument_diff_length() returns an error_argument_diff_length class and abort_argument_value() returns an error_argument_value class.
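Because each condition carries its own class, a caller can handle one kind of failure without parsing the message. The following is a minimal base R sketch of that pattern. It assumes a hypothetical constructor, abort_type_sketch(), built on errorCondition() rather than on rlang::abort(), and the message wording is made up; it is not the package's implementation.

```r
# Hypothetical helper (not part of lvmisc): signals an error that carries
# a custom class, in the spirit of the abort_*() family.
abort_type_sketch <- function(arg, must, not) {
  msg <- sprintf("`%s` must be %s; not %s.", arg, must, not)
  stop(errorCondition(msg, class = "error_argument_type"))
}

# The handler is selected by the condition class, not by the message text.
result <- tryCatch(
  abort_type_sketch("x", "numeric", "character"),
  error_argument_type = function(e) conditionMessage(e)
)
result
```

Handlers registered for the generic "error" class would still catch this condition, since the custom class is added in front of the usual ones.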
See also: abort_column_not_found(), abort_no_method_for_class()
Creates a custom error condition with rlang::abort(), carrying a (hopefully) more useful error message and metadata.
abort_column_not_found(data, col_name)
data |
A data frame. |
col_name |
A character vector with the column name. |
Returns an error condition of class error_column_not_found.
See also: abort_argument_type(), abort_argument_class(), abort_argument_length(), abort_argument_diff_length(), abort_no_method_for_class(), abort_package_not_installed()
Creates a custom error condition with rlang::abort(), carrying a (hopefully) more useful error message and metadata.
abort_no_method_for_class(fun, class, ...)
fun |
A character vector with the function name. |
class |
A character vector with the class name. |
... |
Extra message to be added to the error message. Must be a character string. |
Returns an error condition of class error_no_method_for_class.
See also: abort_argument_type(), abort_argument_class(), abort_argument_length(), abort_argument_diff_length(), abort_column_not_found(), abort_package_not_installed()
Creates a custom error condition with rlang::abort(), carrying a (hopefully) more useful error message and metadata.
abort_package_not_installed(package)
package |
A character string with the required package name. |
Returns an error condition of class error_package_not_installed.
See also: abort_argument_type(), abort_argument_class(), abort_argument_length(), abort_argument_diff_length(), abort_column_not_found(), abort_no_method_for_class()
Computes some common model accuracy indices, such as the R squared, mean absolute error, mean absolute percent error and root mean square error.
accuracy(model, na.rm = FALSE)
## Default S3 method:
accuracy(model, na.rm = FALSE)
## S3 method for class 'lvmisc_cv'
accuracy(model, na.rm = FALSE)
## S3 method for class 'lm'
accuracy(model, na.rm = FALSE)
## S3 method for class 'lmerMod'
accuracy(model, na.rm = FALSE)
model |
An object of class |
na.rm |
A logical value indicating whether or not to strip |
The method for the lm class (or for the lvmisc_cv class of an lm) returns a data frame with the columns AIC (Akaike information criterion), BIC (Bayesian information criterion), R2 (R squared), R2_adj (adjusted R squared), MAE (mean absolute error), MAPE (mean absolute percent error) and RMSE (root mean square error).
The method for the lmerMod class (or for the lvmisc_cv class of an lmerMod) returns a data frame with the columns R2_marg and R2_cond instead of the columns R2 and R2_adj. All the other columns are the same as in the method for lm. R2_marg is the marginal R squared, which considers only the variance explained by the fixed effects of a mixed model, and R2_cond is the conditional R squared, which considers the variance explained by both fixed and random effects.
An object of class lvmisc_accuracy. See "Details" for more information.
mtcars <- tibble::as_tibble(mtcars, rownames = "car")
m <- stats::lm(disp ~ mpg, mtcars)
cv <- loo_cv(m, mtcars, car, keep = "used")
accuracy(m)
accuracy(cv)
Computes the bias (mean error) between the input vectors.
bias(actual, predicted, na.rm = FALSE)
actual |
A numeric vector with the actual values. |
predicted |
A numeric vector with the predicted values. Each element in
this vector must be a prediction for the corresponding element in
|
na.rm |
A logical value indicating whether |
A double scalar with the bias value.
actual <- runif(10)
predicted <- runif(10)
bias(actual, predicted)
bmi calculates the BMI in kilograms per meter squared.
bmi(mass, height)
mass, height |
A numerical vector with body mass and height data. |
Returns a double vector with the element-wise body mass index (BMI).
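Since the result is expressed in kilograms per meter squared, the computation amounts to dividing body mass (kg) by the square of height (m). A minimal sketch with a hypothetical stand-in function, bmi_sketch(), which is not the package's code:

```r
# Hypothetical stand-in for illustration: mass in kg over height in m, squared.
bmi_sketch <- function(mass, height) {
  mass / height^2
}

mass <- c(70, 85)        # kilograms
height <- c(1.75, 1.80)  # meters
bmi_values <- bmi_sketch(mass, height)
round(bmi_values, 1)
```

The division is vectorized, so each element of mass is paired with the corresponding element of height.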
mass <- sample(50:100, 20)
height <- rnorm(20, mean = 1.7, sd = 0.2)
bmi(mass, height)
bmi_cat returns the element-wise BMI category as a factor with 6 levels:
Underweight (BMI < 18.5)
Normal weight (18.5 ≤ BMI < 25)
Overweight (25 ≤ BMI < 30)
Obesity class I (30 ≤ BMI < 35)
Obesity class II (35 ≤ BMI < 40)
Obesity class III (BMI ≥ 40)
bmi_cat(bmi)
bmi |
A numeric vector with BMI data. |
A vector of class factor with 6 levels: "Underweight", "Normal weight", "Overweight", "Obesity class I", "Obesity class II" and "Obesity class III".
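One way to obtain such a factor in base R is cut() with left-closed intervals. The sketch below assumes the usual BMI cut-points (18.5, 25, 30, 35 and 40) with each lower bound included in its category; bmi_cat_sketch() is a hypothetical name and the package may implement the categorization differently.

```r
# Hypothetical stand-in: bin BMI values into six ordered categories.
bmi_cat_sketch <- function(bmi) {
  cut(
    bmi,
    breaks = c(-Inf, 18.5, 25, 30, 35, 40, Inf),
    labels = c(
      "Underweight", "Normal weight", "Overweight",
      "Obesity class I", "Obesity class II", "Obesity class III"
    ),
    right = FALSE  # intervals are [lower, upper), so 25 is "Overweight"
  )
}

categories <- bmi_cat_sketch(c(17, 18.5, 24.9, 25, 32, 37, 40))
categories
```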
mass <- sample(50:100, 20)
height <- rnorm(20, mean = 1.7, sd = 0.2)
bmi <- bmi(mass, height)
bmi_cat(bmi)
Center a variable by subtracting the mean from each element. Centering can be performed by the grand mean when by = NULL (the default), or by group means when by is a factor variable.
center_variable(variable, scale = FALSE, by = NULL)
variable |
A numeric vector. |
scale |
A logical vector. If |
by |
A vector with the |
A numeric vector.
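The difference between grand-mean and group-mean centering can be seen in a small base R sketch. It uses ave() to obtain per-group means and is only an illustration of the idea, not the package's implementation; the data are made up.

```r
x <- c(60, 70, 80, 40, 60)
g <- factor(c("A", "A", "A", "B", "B"))

# Grand-mean centering: subtract the overall mean (62) from every element.
grand_centered <- x - mean(x)

# Group-mean centering: subtract each element's own group mean
# (70 for group A, 50 for group B).
group_centered <- x - ave(x, g)

grand_centered
group_centered
```

After group-mean centering, the values sum to zero within each group, whereas grand-mean centering only guarantees a zero sum overall.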
df <- data.frame(
  id = 1:20,
  group = as.factor(sample(c("A", "B"), 20, replace = TRUE)),
  body_mass = rnorm(20, mean = 65, sd = 12)
)
df$body_mass_centered <- center_variable(df$body_mass, by = df$group)
df
Clears the console by printing the new line character ("\n") 50 times.
cl()
Prints to the console. Called for its side effects.
Replace valid observations by NAs when a given subject has more than max_na missing values.
clean_observations(data, id, var, max_na)
data |
A data frame, or data frame extension (e.g. a tibble). |
id |
The bare (unquoted) name of the column that identifies each subject. |
var |
The bare (unquoted) name of the column to be cleaned. |
max_na |
An integer indicating the maximum number of |
The original data with the var observations matching the max_na criterion replaced by NA.
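The rule can be sketched in base R by counting the missing values per subject and blanking the variable for subjects above the threshold. The data and the score_clean column name are made up for illustration, and this is not the package's implementation.

```r
data <- data.frame(
  id = rep(1:3, each = 4),
  score = c(1, NA, 3, 4, NA, NA, 2, 5, 1, 2, 3, 4)
)
max_na <- 1

# Number of missing values per subject, repeated on every row of that subject.
n_missing <- ave(as.numeric(is.na(data$score)), data$id, FUN = sum)

# Subjects with more than max_na missing values lose their valid observations.
data$score_clean <- ifelse(n_missing > max_na, NA, data$score)
data
```

Here subject 1 has a single NA and is kept as-is, while subject 2 has two NAs and is set entirely to NA.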
set.seed(10)
data <- data.frame(
  id = rep(1:5, each = 4),
  time = rep(1:4, 5),
  score = sample(c(1:5, rep(NA, 2)), 20, replace = TRUE)
)
clean_observations(data, id, score, 1)
Computes some common model accuracy indices of several different models at once, allowing model comparison.
compare_accuracy(..., rank_by = NULL, quiet = FALSE)
... |
A list of models. The models can be of the same or of different
classes, including |
rank_by |
A character string with the name of an accuracy index to rank the models by. |
quiet |
A logical indicating whether or not to show any warnings. If
|
A data.frame with a model per row and an index per column.
m1 <- lm(Sepal.Length ~ Species, data = iris)
m2 <- lm(Sepal.Length ~ Species + Petal.Length, data = iris)
m3 <- lm(Sepal.Length ~ Species * Petal.Length, data = iris)
compare_accuracy(m1, m2, m3)
if (require(lme4, quietly = TRUE)) {
  mtcars <- tibble::as_tibble(mtcars, rownames = "cars")
  m1 <- lm(Sepal.Length ~ Species, data = iris)
  m2 <- lmer(
    Sepal.Length ~ Sepal.Width + Petal.Length + (1 | Species),
    data = iris
  )
  m3 <- lm(disp ~ mpg * hp, mtcars)
  cv3 <- loo_cv(m3, mtcars, cars)
  compare_accuracy(m1, m2, cv3, rank_by = "AIC")
}
Creates a project structure, including sub-directories, and optionally initializes a git repository.
create_proj(
  path,
  sub_dirs = "default",
  use_git = TRUE,
  use_gitignore = "default",
  use_readme = TRUE
)
path |
A path to a directory that does not exist. |
sub_dirs |
A character vector. If |
use_git |
A logical value indicating whether or not to initialize a git
repository. Defaults to |
use_gitignore |
A character vector. If |
use_readme |
A logical value. If |
Path to the newly created project, invisibly.
Creates a factor based on equally spaced quantiles of a variable.
divide_by_quantile(data, n, na.rm = TRUE)
data |
A numeric vector. |
n |
An integer specifying the number of levels in the factor to be created. |
na.rm |
A logical vector indicating whether the |
A vector of class factor indicating the quantile to which each element of data belongs.
x <- c(sample(1:20, 9), NA)
divide_by_quantile(x, 3)
Computes the element-wise error between the input vectors.
error(actual, predicted)
actual |
A numeric vector with the actual values |
predicted |
A numeric vector with the predicted values. Each element in
this vector must be a prediction for the corresponding element in
|
Returns a double vector with the element-wise error values.
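The element-wise error and its percent, absolute, absolute percent and squared variants can be sketched in base R as below. The sign convention (actual minus predicted) and the use of the actual value as the denominator for the percent variants are assumptions made for illustration; the package's definitions may differ.

```r
actual <- c(10, 20, 40)
predicted <- c(12, 18, 40)

# Assumed convention: error = actual - predicted.
err <- actual - predicted      # -2, 2, 0
err_pct <- err / actual        # -0.2, 0.1, 0 (as proportions)
err_abs <- abs(err)            # 2, 2, 0
err_abs_pct <- abs(err_pct)    # 0.2, 0.1, 0
err_sqr <- err^2               # 4, 4, 0
```

All five results have the same length as the inputs, one value per actual/predicted pair.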
See also: error_pct(), error_abs(), error_abs_pct(), error_sqr().
actual <- runif(10)
predicted <- runif(10)
error(actual, predicted)
Computes the element-wise absolute errors between the input vectors.
error_abs(actual, predicted)
actual |
A numeric vector with the actual values |
predicted |
A numeric vector with the predicted values. Each element in
this vector must be a prediction for the corresponding element in
|
Returns a double vector with the element-wise absolute error values.
See also: error(), error_pct(), error_abs_pct(), error_sqr().
actual <- runif(10)
predicted <- runif(10)
error_abs(actual, predicted)
Computes the element-wise absolute percent errors between the input vectors.
error_abs_pct(actual, predicted)
actual |
A numeric vector with the actual values |
predicted |
A numeric vector with the predicted values. Each element in
this vector must be a prediction for the corresponding element in
|
Returns a double vector of class lvmisc_percent with the element-wise absolute percent error values.
See also: error(), error_pct(), error_abs(), error_sqr().
actual <- runif(10)
predicted <- runif(10)
error_abs_pct(actual, predicted)
Computes the element-wise percent error between the input vectors.
error_pct(actual, predicted)
actual |
A numeric vector with the actual values |
predicted |
A numeric vector with the predicted values. Each element in
this vector must be a prediction for the corresponding element in
|
Returns a double vector of class lvmisc_percent with the element-wise percent error values.
See also: error(), error_abs(), error_abs_pct(), error_sqr().
actual <- runif(10)
predicted <- runif(10)
error_pct(actual, predicted)
Computes the element-wise squared errors between the input vectors.
error_sqr(actual, predicted)
actual |
A numeric vector with the actual values |
predicted |
A numeric vector with the predicted values. Each element in
this vector must be a prediction for the corresponding element in
|
Returns a double vector with the element-wise squared error values.
See also: error(), error_pct(), error_abs(), error_abs_pct().
actual <- runif(10)
predicted <- runif(10)
error_sqr(actual, predicted)
Extracts information from the trained models of a cross-validation.
get_cv_fixed_eff(cv)
get_cv_r2(cv)
cv |
An object of class |
get_cv_fixed_eff() returns a tibble with the estimated value for each coefficient of each trained model and its associated standard error. get_cv_r2() returns a tibble with the R squared for each of the trained models.
is_outlier returns a logical vector indicating whether a value is an outlier, based on the rule of 1.5 times the interquartile range above the third quartile or below the first quartile.
is_outlier(x, na.rm = FALSE)
x |
A numerical vector |
na.rm |
A logical value indicating whether |
A logical vector.
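The 1.5 times IQR rule can be sketched in base R with stats::quantile(). The function name is_outlier_sketch() is hypothetical, and the default quantile algorithm (type 7) is an assumption; the package may compute the quartiles differently.

```r
# Hypothetical stand-in: flag values beyond 1.5 * IQR from the quartiles.
is_outlier_sketch <- function(x, na.rm = FALSE) {
  q <- stats::quantile(x, c(0.25, 0.75), na.rm = na.rm)
  iqr <- q[[2]] - q[[1]]
  lower_fence <- q[[1]] - 1.5 * iqr
  upper_fence <- q[[2]] + 1.5 * iqr
  x < lower_fence | x > upper_fence
}

x <- c(1:8, NA, 15)
flags <- is_outlier_sketch(x, na.rm = TRUE)
flags
```

With these data the quartiles are 3 and 7, so the fences are -3 and 13 and only the value 15 is flagged; the NA stays NA in the result.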
See also: stats::IQR(), stats::quantile()
x <- c(1:8, NA, 15)
is_outlier(x, na.rm = TRUE)
Computes the Bland-Altman limits of agreement between the input vectors.
loa(actual, predicted, na.rm = FALSE)
actual |
A numeric vector with the actual values. |
predicted |
A numeric vector with the predicted values. Each element in
this vector must be a prediction for the corresponding element in
|
na.rm |
A logical value indicating whether |
A named list with the lower and upper limits of agreement values, respectively.
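In the Bland-Altman approach, the limits of agreement are conventionally the mean difference plus and minus 1.96 standard deviations of the differences. The sketch below follows that convention; the function name loa_sketch(), the element names lower and upper, and the sign of the difference (actual minus predicted) are assumptions, not the package's documented behaviour.

```r
# Hypothetical stand-in: mean difference +/- 1.96 * SD of the differences.
loa_sketch <- function(actual, predicted, na.rm = FALSE) {
  diffs <- actual - predicted
  centre <- mean(diffs, na.rm = na.rm)
  spread <- 1.96 * stats::sd(diffs, na.rm = na.rm)
  list(lower = centre - spread, upper = centre + spread)
}

limits <- loa_sketch(c(10, 12, 14, 16), c(11, 11, 15, 15))
limits
```

In this toy example the differences are -1, 1, -1 and 1, so the mean difference is zero and the limits are symmetric around it.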
actual <- runif(10)
predicted <- runif(10)
loa(actual, predicted)
Cross-validates the model using the leave-one-out approach. In this method, each subject's data is separated into a testing data set, and all other subjects' data are kept in the training data set, with as many resamples as the number of subjects in the original data set. It computes the model's predicted value in the testing data set for each subject.
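The resampling scheme can be sketched in base R as a loop over subjects: hold one subject out, refit the model on the rest, and predict the held-out rows. This is only an illustration of the procedure with a fixed lm() formula, not the package's implementation; the .actual and .predicted column names mirror those in the returned tibble, and everything else is made up.

```r
data <- mtcars
data$car <- rownames(mtcars)  # each car acts as one "subject"

folds <- lapply(unique(data$car), function(subject) {
  train <- data[data$car != subject, ]  # all other subjects
  test <- data[data$car == subject, ]   # the held-out subject
  fit <- lm(disp ~ mpg, data = train)
  data.frame(
    car = test$car,
    .actual = test$disp,
    .predicted = unname(predict(fit, newdata = test))
  )
})
cv <- do.call(rbind, folds)
head(cv)
```

Because every prediction comes from a model that never saw the corresponding subject, comparing .actual with .predicted gives an out-of-sample view of the model's accuracy.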
loo_cv(model, data, id, keep = "all")
## Default S3 method:
loo_cv(model, data, id, keep = "all")
## S3 method for class 'lm'
loo_cv(model, data, id, keep = "all")
## S3 method for class 'lmerMod'
loo_cv(model, data, id, keep = "all")
model |
An object containing a model. |
data |
A data frame. |
id |
The bare (unquoted) name of the column which identifies subjects. |
keep |
A character string which controls which columns are present in the output. Can be one of three options:
|
Returns an object of class lvmisc_cv: a tibble containing the ".actual" and ".predicted" columns.
mtcars$car <- row.names(mtcars)
m <- stats::lm(disp ~ mpg, mtcars)
loo_cv(m, mtcars, car, keep = "used")
lt() prints the last error and the full backtrace and le() returns the last error with a simplified backtrace. These functions are just wrappers to rlang::last_trace() and rlang::last_error(), respectively.
lt()
le()
An object of class rlang_trace.
An object of class rlang_error.
lunique returns the number of non-NA unique elements and lna returns the number of NAs.
lunique(x)
lna(x)
x |
A vector. |
A non-negative integer.
x <- sample(c(1:3, NA), 10, replace = TRUE)
lunique(x)
lna(x)
Computes the average error between the input vectors.
mean_error(actual, predicted, na.rm = FALSE)
actual |
A numeric vector with the actual values. |
predicted |
A numeric vector with the predicted values. Each element in
this vector must be a prediction for the corresponding element in
|
na.rm |
A logical value indicating whether |
Returns a double scalar with the mean error value.
See also: mean_error_pct(), mean_error_abs(), mean_error_abs_pct(), mean_error_sqr(), mean_error_sqr_root()
actual <- runif(10)
predicted <- runif(10)
mean_error(actual, predicted)
Computes the average absolute error between the input vectors.
mean_error_abs(actual, predicted, na.rm = FALSE)
actual |
A numeric vector with the actual values. |
predicted |
A numeric vector with the predicted values. Each element in
this vector must be a prediction for the corresponding element in
|
na.rm |
A logical value indicating whether |
Returns a double scalar with the mean absolute error value.
See also: mean_error(), mean_error_pct(), mean_error_abs_pct(), mean_error_sqr(), mean_error_sqr_root()
actual <- runif(10)
predicted <- runif(10)
mean_error_abs(actual, predicted)
Computes the average absolute percent error between the input vectors.
mean_error_abs_pct(actual, predicted, na.rm = FALSE)
actual |
A numeric vector with the actual values. |
predicted |
A numeric vector with the predicted values. Each element in
this vector must be a prediction for the corresponding element in
|
na.rm |
A logical value indicating whether |
Returns a double scalar of class lvmisc_percent with the mean absolute percent error value.
See also: mean_error(), mean_error_abs(), mean_error_pct(), mean_error_sqr(), mean_error_sqr_root()
actual <- runif(10)
predicted <- runif(10)
mean_error_abs_pct(actual, predicted)
Computes the average percent error between the input vectors.
mean_error_pct(actual, predicted, na.rm = FALSE)
actual |
A numeric vector with the actual values. |
predicted |
A numeric vector with the predicted values. Each element in
this vector must be a prediction for the corresponding element in
|
na.rm |
A logical value indicating whether |
Returns a double scalar of class lvmisc_percent with the mean percent error value.
See also: mean_error(), mean_error_abs(), mean_error_abs_pct(), mean_error_sqr(), mean_error_sqr_root()
actual <- runif(10)
predicted <- runif(10)
mean_error_pct(actual, predicted)
Computes the average square error between the input vectors.
mean_error_sqr(actual, predicted, na.rm = FALSE)
actual |
A numeric vector with the actual values. |
predicted |
A numeric vector with the predicted values. Each element in
this vector must be a prediction for the corresponding element in
|
na.rm |
A logical value indicating whether |
Returns a double scalar with the mean square error value.
See also: mean_error(), mean_error_abs(), mean_error_pct(), mean_error_abs_pct(), mean_error_sqr_root()
actual <- runif(10)
predicted <- runif(10)
mean_error_sqr(actual, predicted)
Computes the root mean square error between the input vectors.
mean_error_sqr_root(actual, predicted, na.rm = FALSE)
actual |
A numeric vector with the actual values. |
predicted |
A numeric vector with the predicted values. Each element in
this vector must be a prediction for the corresponding element in
|
na.rm |
A logical value indicating whether |
Returns a double scalar with the root mean square error value.
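The averaged error measures are related in a simple way: each one is the mean of an element-wise error variant, and the root mean square error is the square root of the mean square error. A base R sketch, assuming the error is defined as actual minus predicted (the package's sign convention may differ):

```r
actual <- c(3, 5, 7, 9)
predicted <- c(2, 5, 9, 9)

err <- actual - predicted   # 1, 0, -2, 0 (assumed sign convention)

mean_err <- mean(err)       # mean error: -0.25
mae <- mean(abs(err))       # mean absolute error: 0.75
mse <- mean(err^2)          # mean square error: 1.25
rmse <- sqrt(mse)           # root mean square error
```

Squaring before averaging makes the RMSE weigh large errors more heavily than the mean absolute error does, which is why it is larger here.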
See also: mean_error(), mean_error_abs(), mean_error_pct(), mean_error_abs_pct(), mean_error_sqr()
actual <- runif(10)
predicted <- runif(10)
mean_error_sqr_root(actual, predicted)
Value matching
x %!in% table
x |
Vector with the values to be matched. |
table |
Vector with the values to be matched against. |
A logical vector indicating which values are not in table.
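An operator with this behaviour can be sketched as the negation of base R's %in%. The name %not_in% below is a hypothetical stand-in used to avoid masking the package's operator; it is not how the package necessarily defines it.

```r
# Hypothetical stand-in: TRUE where an element of x is absent from table.
`%not_in%` <- function(x, table) {
  !(x %in% table)
}

x <- 8:12
not_in <- x %not_in% 1:10
not_in
```

Only 11 and 12 fall outside 1:10, so the last two elements of the result are TRUE.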
x <- 8:12
x %!in% 1:10
Shortcut to print all rows of a data frame or tibble. Useful to inspect the whole tibble, as it prints by default only the first 20 rows.
pa(data)
data |
A data frame or tibble. |
Prints data and returns it invisibly.
See also: print() and printing tibbles.
df <- dplyr::starwars
pa(df)
percent vector
Creates a double vector that represents percentages. When printed, it is multiplied by 100 and suffixed with %.
percent(x = double())
is_percent(x)
as_percent(x)
x |
|
An S3 vector of class lvmisc_percent.
percent(c(0.25, 0.5, 0.75))
percent_change returns the element-wise percent change between two numeric vectors.
percent_change(baseline, followup)
baseline, followup |
A numeric vector with data to compute the percent change. |
A vector of class lvmisc_percent.
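Percent change is conventionally the difference between follow-up and baseline, relative to the baseline. The base R sketch below assumes that definition and stores the result as a proportion, formatting it as a percentage only for display; it does not use the lvmisc_percent class and may not match the package's exact computation.

```r
baseline <- c(20, 40, 50)
followup <- c(25, 30, 50)

# Assumed definition: (followup - baseline) / baseline, kept as a proportion.
change <- (followup - baseline) / baseline

# Display as percentages: multiply by 100 and add the % suffix.
labels <- paste0(change * 100, "%")
labels
```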
baseline <- sample(20:40, 10)
followup <- baseline * runif(10, min = 0.5, max = 1.5)
percent_change(baseline, followup)
Create a Bland-Altman plot as described by Bland & Altman (1986).
plot_bland_altman(x, ...)
x |
An object of class |
... |
Additional arguments to be passed to |
A ggplot object.
Bland, J.M. & Altman, D.G. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, 1(8476), 307-310. doi:10.1016/S0140-6736(86)90837-8
mtcars <- tibble::as_tibble(mtcars, rownames = "car")
m <- stats::lm(disp ~ mpg, mtcars)
cv <- loo_cv(m, mtcars, car)
plot_bland_altman(cv, colour = as.factor(am))
Plotting functions for some common model diagnostics.
plot_model(model)
plot_model_residual_fitted(model)
plot_model_scale_location(model)
plot_model_qq(model)
plot_model_cooks_distance(model)
plot_model_multicollinearity(model)
model |
An object containing a model. |
plot_model_residual_fitted() plots the model residuals versus the fitted values. plot_model_scale_location() plots the square root of the absolute value of the model residuals versus the fitted values. plot_model_qq() plots a QQ plot of the model standardized residuals. plot_model_cooks_distance() plots a bar chart of each observation's Cook's distance value. plot_model_multicollinearity() plots a bar chart of the variance inflation factor (VIF) for each of the model terms. plot_model() returns a plot grid with all the diagnostic plots applicable to a given model.
A ggplot object.
m <- lm(disp ~ mpg + hp + cyl + mpg:cyl, mtcars)
plot_model(m)
plot_model_residual_fitted(m)
plot_model_scale_location(m)
plot_model_qq(m)
plot_model_cooks_distance(m)
plot_model_multicollinearity(m)
These functions are intended to be used to quickly generate simple exploratory plots using the package ggplot2.
plot_scatter(data, x, y, ...)
plot_line(data, x, y, ...)
plot_hist(data, x, bin_width = NULL, ...)
plot_qq(data, x, ...)
data |
A data frame. |
x, y |
x and y aesthetics as the bare (unquoted) name of a column in
|
... |
Additional arguments to be passed to the |
bin_width |
The width of the bins in a histogram. When |
A ggplot object.
plot_scatter(mtcars, disp, mpg, color = factor(cyl))
plot_line(Orange, age, circumference, colour = Tree)
plot_hist(iris, Petal.Width, bin_width = "FD")
plot_qq(mtcars, mpg)
Returns the R squared values according to the model class.
r2(model)
## Default S3 method:
r2(model)
## S3 method for class 'lm'
r2(model)
## S3 method for class 'lmerMod'
r2(model)
model |
An object containing a model. |
R squared computations.
If the model is a linear model, it returns a data.frame with the R squared and adjusted R squared values. If the model is a linear mixed model, it returns a data.frame with the marginal and conditional R squared values as described by Nakagawa and Schielzeth (2013). See the formulas for the computations in "Details".
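$$R^2 = \frac{\sigma^2_{\hat{y}}}{\sigma^2_{\hat{y}} + \sigma^2_{\varepsilon}}$$
Where $\sigma^2_{\hat{y}}$ is the variance explained by the model and $\sigma^2_{\varepsilon}$ is the residual variance.
$$R^2_{adj} = 1 - (1 - R^2)\frac{n - 1}{n - p - 1}$$
Where $n$ is the number of data points and $p$ is the number of predictors in the model.
$$R^2_{marg} = \frac{\sigma^2_{f}}{\sigma^2_{f} + \sigma^2_{r} + \sigma^2_{\varepsilon}}$$
$$R^2_{cond} = \frac{\sigma^2_{f} + \sigma^2_{r}}{\sigma^2_{f} + \sigma^2_{r} + \sigma^2_{\varepsilon}}$$
Where $\sigma^2_{f}$ is the variance of the fixed effects, $\sigma^2_{r}$ is the variance of the random effects and $\sigma^2_{\varepsilon}$ is the residual variance.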
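For an ordinary linear model fitted with an intercept, the R squared can be computed as the variance of the fitted values over the sum of that variance and the residual variance, and the adjusted value rescales it by (n - 1) / (n - p - 1). The base R sketch below checks this against summary(); the variable names are made up and this is not the package's code.

```r
m <- lm(Sepal.Length ~ Species, data = iris)

var_fit <- var(fitted(m))     # variance explained by the model
var_res <- var(residuals(m))  # residual variance
r2_manual <- var_fit / (var_fit + var_res)

n <- nrow(iris)               # number of data points
p <- length(coef(m)) - 1      # number of predictors (intercept excluded)
r2_adj_manual <- 1 - (1 - r2_manual) * (n - 1) / (n - p - 1)

c(r2 = r2_manual, r2_adj = r2_adj_manual)
```

Both values agree with the r.squared and adj.r.squared elements of summary(m), because fitted values and residuals are uncorrelated in a least-squares fit with an intercept.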
Nakagawa, S., & Schielzeth, H. (2013). A general and simple method for obtaining R2 from generalized linear mixed-effects models. Methods in Ecology and Evolution, 4(2), 133–142. doi:10.1111/j.2041-210x.2012.00261.x.
m1 <- lm(Sepal.Length ~ Species, data = iris)
r2(m1)
if (require(lme4, quietly = TRUE)) {
  m2 <- lmer(
    Sepal.Length ~ Sepal.Width + Petal.Length + (1 | Species),
    data = iris
  )
  r2(m2)
}
Returns a vector with length equal to the number of rows in data, with the baseline value of var repeated for every time value of each id.
repeat_baseline_values(data, var, id, time, baseline_level, repeat_NA = TRUE)
data |
A data frame. |
var |
The bare (unquoted) name of the column with the values to be repeated. |
id |
The bare (unquoted) name of the column that identifies each subject. |
time |
The bare (unquoted) name of the column with the time values. |
baseline_level |
The value of |
repeat_NA |
A logical vector indicating whether or not |
A vector of the same length and class as var.
df <- data.frame(
  id = rep(1:5, each = 4),
  time = rep(1:4, 5),
  score = rnorm(20, mean = 10, sd = 2)
)
df$baseline_score <- repeat_baseline_values(df, score, id, time, 1)
df
Captures the sequence of calls that lead to the current function. It is just a wrapper to rlang::trace_back().
tb(...)
... |
Passed to |
An object of class rlang_trace.
Computes the variance inflation factor (VIF). The VIF is a measure of how much the variance of a regression coefficient is increased due to collinearity.
vif(model)
## Default S3 method:
vif(model)
## S3 method for class 'lm'
vif(model)
## S3 method for class 'lmerMod'
vif(model)
model |
An object containing a model. |
As a rule of thumb for the interpretation of the VIF value, a VIF less than 5 indicates a low correlation of a given model term with the others, a VIF between 5 and 10 indicates a moderate correlation and a VIF greater than 10 indicates a high correlation.
It returns a data.frame with three columns: the name of the model term, the VIF value and its classification (see "Details").
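For numeric predictors without interactions, the textbook VIF of a term is 1 / (1 - R squared), where the R squared comes from regressing that term on all the other predictors. The base R sketch below implements that definition and applies the rule of thumb with cut(); the labels "low", "moderate" and "high" are assumptions, and the package may compute the VIF differently (for instance for factors or interaction terms).

```r
X <- mtcars[, c("mpg", "hp", "wt")]  # predictors of a hypothetical model

# Regress each predictor on the others and convert its R squared to a VIF.
vif_manual <- vapply(names(X), function(term) {
  others <- setdiff(names(X), term)
  fit <- lm(reformulate(others, response = term), data = X)
  1 / (1 - summary(fit)$r.squared)
}, numeric(1))

# Rule-of-thumb classification: below 5, between 5 and 10, above 10.
classification <- cut(
  vif_manual,
  breaks = c(-Inf, 5, 10, Inf),
  labels = c("low", "moderate", "high")
)
data.frame(term = names(vif_manual), vif = vif_manual, classification)
```

A VIF can never fall below 1, which corresponds to a term that is uncorrelated with all the others.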
James, G., Witten, D., Hastie, T., & Tibshirani, R. (eds.). (2013). An introduction to statistical learning: with applications in R. New York: Springer.
m <- lm(disp ~ mpg + cyl + mpg:cyl, mtcars)
vif(m)