How to Generate Lomax Random Numbers in R: A Comparison of Two Methods

Introduction to Lomax Random Numbers in R

Lomax random numbers are a type of discrete distribution used to model real-world phenomena where the probability of occurrence decreases as the value increases. In this article, we will explore how to generate Lomax random numbers using both the VGAM package and an alternative inverse transform sampling method.

Background on Lomax Distribution

The Lomax distribution is a type of Pareto-type II distribution, which is characterized by its probability density function (PDF):

[ f(x) = \alpha \left( \frac{1}{x} + \frac{\beta}{x^\alpha} \right)^{-\alpha} ]

where $\alpha$ and $\beta$ are shape and scale parameters, respectively. The cumulative distribution function (CDF) is given by:

[ F(x) = 1 - \left( \frac{1}{x + \beta} \right)^\alpha ]

Generating Lomax Random Numbers using VGAM

The VGAM package provides a function called rlomax() to generate Lomax random numbers. This function takes three arguments: the number of observations, the shape parameter, and the scale parameter.

library(VGAM)
N <- 1e5
set.seed(2017)
x.VGAM <- rlomax(N, scale = 1, shape3.q = 2)

Generating Lomax Random Numbers using Inverse Transform Sampling

To generate Lomax random numbers using inverse transform sampling, we need to solve the following equation for $F^{-1}(x)$:

[ F(x) = x ]

where $U \sim Uniform(0, 1)$.

rlomax.its <- function(N, scale, shape) {
    scale * ((1 - runif(N)) ^ (-1/shape) - 1)
}

Comparison of the Two Methods

We can compare the two methods by generating Lomax random numbers with different sample sizes and plotting their density plots.

set.seed(2017)
bind_rows(map(
    setNames(2:5, paste0("N=10^", 2:5)),
    ~list(ITS = rlomax.its(10^(.x), 1, 2), VGAM = rlomax(10^(.x), 1, 2))),
    .id = "N") %&gt;%
    gather(key, value, -N) %&gt;%
    ggplot(aes(log10(value), fill = key)) +
    geom_density(alpha = 0.4) +
    facet_wrap(~ N)

Benchmarking the Two Methods

We can benchmark the two methods using the microbenchmark() function to see which one is faster.

library(microbenchmark)
res <- microbenchmark(
    ITS = rlomax.its(1e6, 1, 2),
    VGAM = rlomax(1e6, 1, 2))
#Unit: milliseconds
# expr       min        lq      mean    median        uq      max neval cld
#  ITS  79.22709  84.11703  88.48358  86.29181  91.07074 109.3536   100  a
# VGAM 159.56578 175.88731 218.92212 183.09769 222.64697 359.9311   100   b

Dependence of Runtime on Sample Size

We can plot the dependence of runtime on sample size to see how the two methods compare.

library(tidyverse)
library(ggthemes);
res <- map_df(seq(2, 6, length.out = 20), function(x)
    cbind(x = 10^(x), microbenchmark(
        ITS = rlomax.its(10^(x), 1, 2),
        VGAM = rlomax(10^(x), 1, 2))))
res %&gt;%
    mutate(N = factor(as.numeric(factor(x)))) %&gt;%
    ggplot(aes(x = N, y = log10(time), colour = expr)) +
    geom_tufteboxplot(outlier.colour="transparent") +
    theme_minimal() +
    scale_x_discrete(
        breaks = c(1, 5, 10, 15, 20),
        labels = paste0("10^", 2:6))

Conclusion

In conclusion, we have seen how to generate Lomax random numbers using both the VGAM package and an alternative inverse transform sampling method. We have also compared the two methods in terms of their density plots and runtime performance. The inverse transform sampling method is slightly faster on average, but its performance depends on the sample size.


Last modified on 2023-06-24