For complete documentation, see the Full TidyDensity Wiki.
{TidyDensity} is a comprehensive R package that makes working with
random numbers and probability distributions easy, intuitive, and tidy.
Whether you're simulating data, exploring distributions, or performing
statistical analysis, TidyDensity provides a unified interface that
integrates seamlessly with the tidyverse ecosystem.
- 35+ Probability Distributions: Generate random data from a wide variety of continuous and discrete distributions
- Tidy Output: All functions return tibbles with a consistent, predictable structure
- Rich Metadata: Each distribution includes density (d_), probability (p_), quantile (q_), and random generation (r_) components
- Beautiful Visualizations: Built-in plotting functions with support for multiple plot types
- Parameter Estimation: Estimate distribution parameters from empirical data using MLE, MME, and MVUE methods
- Bootstrap Analysis: Perform bootstrap resampling with integrated plotting and analysis tools
- Mixture Models: Create and analyze mixture distributions
- Interactive Plots: Generate interactive visualizations with plotly integration
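The density, probability, quantile, and random generation roles listed above play the same parts as base R's d/p/q/r function families. The following is a minimal base R sketch of those four roles using the exponential distribution. It does not call TidyDensity, and the variable names (`rate`, `dens`, `prob`, `quant`, `draws`) are illustrative only.

```r
# Base R sketch of the four roles, using the exponential distribution.
rate <- 2
x <- 1.2

dens  <- dexp(x, rate = rate)     # d: density at x
prob  <- pexp(x, rate = rate)     # p: cumulative probability P(X <= x)
quant <- qexp(prob, rate = rate)  # q: quantile, the inverse of p

set.seed(42)
draws <- rexp(1000, rate = rate)  # r: random generation

# The quantile function undoes the cumulative probability
stopifnot(isTRUE(all.equal(quant, x)))
```

The same pattern holds for the other base R families, for example `dnorm()`, `pnorm()`, `qnorm()`, and `rnorm()`.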
Install the released version from CRAN:
install.packages("TidyDensity")
Or install the development version from GitHub:
# install.packages("devtools")
devtools::install_github("spsanderson/TidyDensity")
Generate random data from a normal distribution and visualize it:
library(TidyDensity)
library(dplyr)
library(ggplot2)
# Generate data from normal distribution
tn <- tidy_normal(.n = 100, .mean = 0, .sd = 1, .num_sims = 6)
# View the tibble structure
tn
#> # A tibble: 600 × 7
#> sim_number x y dx dy p q
#> <fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 1 -0.626 -3.51 0.000235 0.266 -0.626
#> 2 1 2 0.184 -3.37 0.000617 0.573 0.184
#> 3 1 3 -0.836 -3.22 0.00147 0.202 -0.836
#> 4 1 4 1.60 -3.07 0.00322 0.945 1.60
#> # ... with 596 more rows
All tidy_ distribution functions return a tibble with the following
columns:
- sim_number: Simulation identifier
- x: Index of generated point
- y: The randomly generated value
- dx: Density function x-values
- dy: Density function y-values (PDF)
- p: Cumulative probability (CDF)
- q: Quantile values
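To make the column meanings concrete, here is a base R sketch that builds a data frame of the same shape for a single simulated standard normal sample. It is not the package's implementation. In particular, taking `dx` and `dy` from `stats::density()` is an assumption. The printed rows above are at least consistent with `p` being the normal CDF evaluated at `y` and `q` mapping `p` back to `y`. The object name `sketch` is hypothetical.

```r
# Sketch only: one simulation of a standard normal, base R, no TidyDensity.
set.seed(123)
n <- 100

y <- rnorm(n, mean = 0, sd = 1)  # randomly generated values
d <- density(y, n = n)           # kernel density estimate on an n-point grid (assumed source of dx/dy)
p <- pnorm(y, mean = 0, sd = 1)  # cumulative probability of each generated value
q <- qnorm(p, mean = 0, sd = 1)  # quantile for each probability (recovers y)

sketch <- data.frame(
  sim_number = factor(1),
  x  = seq_len(n),
  y  = y,
  dx = d$x,
  dy = d$y,
  p  = p,
  q  = q
)

head(sketch, 4)
```

Note that `dx` and `dy` describe an evenly spaced grid over the range of the data, so row `i` of `dx` is not paired with row `i` of `y`.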
TidyDensity includes tidy_autoplot() for quick, publication-ready
visualizations:
# Density plot
tidy_autoplot(tn, .plot_type = "density")
# Quantile plot
tidy_autoplot(tn, .plot_type = "quantile")
# Probability plot
tidy_autoplot(tn, .plot_type = "probability")
# QQ plot
tidy_autoplot(tn, .plot_type = "qq")
When simulating many distributions, the legend is automatically hidden for clarity:
tn <- tidy_normal(.n = 100, .num_sims = 20)
tidy_autoplot(tn, .plot_type = "density")
TidyDensity supports 35+ probability distributions:
- Normal Family: Normal, Log-Normal, Inverse Normal
- Exponential Family: Exponential, Inverse Exponential
- Gamma Family: Gamma, Inverse Gamma
- Beta Family: Beta, Generalized Beta
- Pareto Family: Pareto, Inverse Pareto, Single Parameter Pareto, Generalized Pareto
- Weibull Family: Weibull, Inverse Weibull
- Burr Family: Burr, Inverse Burr
- Other: Cauchy, Chi-Square, F-Distribution, t-Distribution, Logistic, Paralogistic, Triangular, Uniform
- Bernoulli
- Binomial
- Zero-Truncated Binomial
- Geometric
- Zero-Truncated Geometric
- Hypergeometric
- Negative Binomial
- Poisson
- Zero-Truncated Poisson
Each distribution has a corresponding tidy_*() function, e.g.,
tidy_beta(), tidy_gamma(), tidy_poisson().
Estimate distribution parameters from empirical data:
# Generate sample data
x <- mtcars$mpg
# Estimate normal distribution parameters
est <- util_normal_param_estimate(x, .auto_gen_empirical = TRUE)
# View parameter estimates
est$parameter_tbl
#> # A tibble: 2 × 7
#> dist_type samp_size min max mean method shape_est
#> <chr> <int> <dbl> <dbl> <dbl> <chr> <dbl>
#> 1 Gaussian 32 10.4 33.9 20.1 MLE/MME 6.03
#> 2 Gaussian 32 10.4 33.9 20.1 MVUE 6.10
# Compare empirical data with fitted distribution
est$combined_data_tbl |>
  tidy_combined_autoplot()
Perform bootstrap resampling for robust statistical inference:
# Bootstrap resampling
bs <- tidy_bootstrap(mtcars$mpg, .num_sims = 2000)
# Bootstrap statistics
bootstrap_stat <- tidy_bootstrap(mtcars$mpg) |>
  bootstrap_unnest_tbl() |>
  summarise(
    mean_est = mean(y),
    sd_est = sd(y),
    ci_lower = quantile(y, 0.025),
    ci_upper = quantile(y, 0.975)
  )
Create mixture distributions by combining multiple distributions:
# Create a mixture of two normal distributions
mix <- tidy_mixture_density(
  .tbl_list = list(
    tidy_normal(.n = 100, .mean = -2, .sd = 0.5),
    tidy_normal(.n = 100, .mean = 2, .sd = 0.5)
  ),
  .mixture_type = "add"
)
tidy_autoplot(mix, .plot_type = "density")
Work directly with your own data:
# Create empirical distribution from data
emp <- tidy_empirical(mtcars$mpg, .num_sims = 5)
# Plot empirical distribution
tidy_autoplot(emp, .plot_type = "density")
Compare multiple distributions with different parameters:
# Create multiple simulations with different parameters
multi <- tidy_multi_single_dist(
  .tidy_dist = "tidy_normal",
  .param_list = list(
    list(.n = 100, .mean = 0, .sd = 1),
    list(.n = 100, .mean = 0, .sd = 2),
    list(.n = 100, .mean = 2, .sd = 1)
  )
)
tidy_autoplot(multi, .plot_type = "density")
Contributions are welcome! Here's how you can help:
- Report bugs or request features via GitHub Issues
- Submit pull requests for bug fixes or new features
- Improve documentation or add examples
- Star the repository to show your support
Please follow our Code of Conduct when participating in this project.
If you use TidyDensity in your research, please cite it:
citation("TidyDensity")
- Read the documentation
- Report bugs at GitHub Issues
- Ask questions on GitHub Discussions
Steven P. Sanderson II, MPH
- Email: spsanderson@gmail.com
- ORCID: 0009-0006-7661-8247
MIT License - see LICENSE.md for details