Introduction to Lomax Random Numbers in R
Lomax random numbers are a type of discrete distribution used to model real-world phenomena where the probability of occurrence decreases as the value increases. In this article, we will explore how to generate Lomax random numbers using both the VGAM package and an alternative inverse transform sampling method.
Background on Lomax Distribution
The Lomax distribution is a type of Pareto-type II distribution, which is characterized by its probability density function (PDF):
[ f(x) = \alpha \left( \frac{1}{x} + \frac{\beta}{x^\alpha} \right)^{-\alpha} ]
where $\alpha$ and $\beta$ are shape and scale parameters, respectively. The cumulative distribution function (CDF) is given by:
[ F(x) = 1 - \left( \frac{1}{x + \beta} \right)^\alpha ]
Generating Lomax Random Numbers using VGAM
The VGAM package provides a function called rlomax() to generate Lomax random numbers. This function takes three arguments: the number of observations, the shape parameter, and the scale parameter.
library(VGAM)
N <- 1e5
set.seed(2017)
x.VGAM <- rlomax(N, scale = 1, shape3.q = 2)
Generating Lomax Random Numbers using Inverse Transform Sampling
To generate Lomax random numbers using inverse transform sampling, we need to solve the following equation for $F^{-1}(x)$:
[ F(x) = x ]
where $U \sim Uniform(0, 1)$.
rlomax.its <- function(N, scale, shape) {
scale * ((1 - runif(N)) ^ (-1/shape) - 1)
}
Comparison of the Two Methods
We can compare the two methods by generating Lomax random numbers with different sample sizes and plotting their density plots.
set.seed(2017)
bind_rows(map(
setNames(2:5, paste0("N=10^", 2:5)),
~list(ITS = rlomax.its(10^(.x), 1, 2), VGAM = rlomax(10^(.x), 1, 2))),
.id = "N") %>%
gather(key, value, -N) %>%
ggplot(aes(log10(value), fill = key)) +
geom_density(alpha = 0.4) +
facet_wrap(~ N)
Benchmarking the Two Methods
We can benchmark the two methods using the microbenchmark() function to see which one is faster.
library(microbenchmark)
res <- microbenchmark(
ITS = rlomax.its(1e6, 1, 2),
VGAM = rlomax(1e6, 1, 2))
#Unit: milliseconds
# expr min lq mean median uq max neval cld
# ITS 79.22709 84.11703 88.48358 86.29181 91.07074 109.3536 100 a
# VGAM 159.56578 175.88731 218.92212 183.09769 222.64697 359.9311 100 b
Dependence of Runtime on Sample Size
We can plot the dependence of runtime on sample size to see how the two methods compare.
library(tidyverse)
library(ggthemes);
res <- map_df(seq(2, 6, length.out = 20), function(x)
cbind(x = 10^(x), microbenchmark(
ITS = rlomax.its(10^(x), 1, 2),
VGAM = rlomax(10^(x), 1, 2))))
res %>%
mutate(N = factor(as.numeric(factor(x)))) %>%
ggplot(aes(x = N, y = log10(time), colour = expr)) +
geom_tufteboxplot(outlier.colour="transparent") +
theme_minimal() +
scale_x_discrete(
breaks = c(1, 5, 10, 15, 20),
labels = paste0("10^", 2:6))
Conclusion
In conclusion, we have seen how to generate Lomax random numbers using both the VGAM package and an alternative inverse transform sampling method. We have also compared the two methods in terms of their density plots and runtime performance. The inverse transform sampling method is slightly faster on average, but its performance depends on the sample size.
Last modified on 2023-06-24