[Experimental]

Computes the difference in means between two samples along with a confidence interval. The interval is computed using a t-distribution framework via safe_t_test().

Supports both independent and paired samples. For paired data, observations are matched using paired_by, and inference is based on the within-pair differences (a paired t-test).

Usage

s_diff_mean_ci(
  df1,
  df2,
  .var,
  paired = FALSE,
  paired_by = NULL,
  conf.level = 0.95,
  ...
)

Arguments

df1

(data.frame)
Dataset for the first sample.

df2

(data.frame)
Dataset for the second sample.

.var

(character(1))
Column name in df1 and df2 containing numeric values.

paired

(logical(1))
Whether the samples are paired.

paired_by

(character or NULL)
Column name(s) in df1 and df2 used to match observations between datasets. Required when paired = TRUE and must uniquely identify each pair in both datasets.

conf.level

(proportion)
Confidence level for the interval.

...

Additional arguments passed to safe_t_test().

Value

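A named list with a single element, diff_mean_ci: a named numeric vector holding the difference in means (diff_mean) and the lower and upper confidence limits (ci_lwr, ci_upr), with a "label" attribute describing the estimate and confidence level.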

Details

The first sample is taken from df1[[.var]] and the second from df2[[.var]]. The difference is reported as the first sample minus the second, as the Examples show.

If paired = TRUE, observations are matched using paired_by. In this case, the difference in means and its confidence interval are computed using a t-statistic for paired data (based on within-pair differences). Otherwise, a t-statistic for two independent samples is used.

Rows with NA or NaN in a column specified by paired_by are excluded from matching (see merge(..., incomparables = c(NA, NaN))).
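The matching step can be illustrated with base R alone. This is a minimal sketch, not the package's implementation: the data are made up, and rows with a missing key are dropped explicitly before merge() rather than through the incomparables argument that the text above refers to.

```r
# Toy data (hypothetical): each dataset has one row with a missing key.
df1 <- data.frame(USUBJID = c("X01", NA, "X03"), CHG = c(4, 1, -1))
df2 <- data.frame(USUBJID = c("X01", NA, "X04"), CHG = c(-2, 4, 5))

# Drop rows whose key is missing so they can never form a pair.
# (Explicit filtering is an assumption of this sketch.)
k1 <- df1[!is.na(df1$USUBJID), ]
k2 <- df2[!is.na(df2$USUBJID), ]

# Inner join on the key; the shared value column gets a suffix per dataset.
pairs <- merge(k1, k2, by = "USUBJID", suffixes = c(".1", ".2"))
pairs
```

Only X01 appears in both datasets with a usable key, so pairs has a single row with columns USUBJID, CHG.1 and CHG.2.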

When paired = TRUE, only complete pairs are passed to safe_t_test() (i.e., rows with missing values in .var are removed prior to computation). For unpaired cases, missing values are removed separately from each sample before computation.
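As a rough cross-check of the two modes, the intervals can be computed directly with stats::t.test(). This sketch assumes safe_t_test() behaves like stats::t.test() with its defaults (Welch's test for independent samples), which the output in the Examples appears consistent with; the internals of safe_t_test() are not shown here. The vectors x and y are the CHG columns from the Examples, already in matched order.

```r
x <- c(4, 1, -1, 9, -2)   # df1$CHG
y <- c(-2, 4, -2, 5, 2)   # df2$CHG

# Paired: keep complete pairs only, then work on within-pair differences.
ok <- !is.na(x) & !is.na(y)
d <- x[ok] - y[ok]
se_d <- sd(d) / sqrt(length(d))
paired_manual <- mean(d) + c(-1, 1) * qt(0.975, df = length(d) - 1) * se_d
paired_ci <- stats::t.test(x[ok], y[ok], paired = TRUE, conf.level = 0.95)$conf.int

# Unpaired: remove missing values from each sample separately.
# stats::t.test() defaults to Welch's test (var.equal = FALSE).
unpaired_ci <- stats::t.test(x[!is.na(x)], y[!is.na(y)], conf.level = 0.95)$conf.int

diff_mean <- mean(x, na.rm = TRUE) - mean(y, na.rm = TRUE)
```

Here paired_manual and paired_ci agree, and both intervals are centered on the same diff_mean; the unpaired interval is wider because it does not use the pairing.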

Examples

df1 <- data.frame(
  USUBJID = c("X01", "X02", "X03", "X04", "X05"),
  CHG = c(4, 1, -1, 9, -2)
)
df2 <- data.frame(
  USUBJID = c("X01", "X02", "X03", "X04", "X05"),
  CHG = c(-2, 4, -2, 5, 2)
)

# Paired
s_diff_mean_ci(df1, df2, "CHG", paired = TRUE, paired_by = "USUBJID")
#> $diff_mean_ci
#> diff_mean    ci_lwr    ci_upr 
#>  0.800000 -4.569389  6.169389 
#> attr(,"label")
#> [1] "Difference in Means + 95% CI"
#> 

# Unpaired
s_diff_mean_ci(df1, df2, "CHG")
#> $diff_mean_ci
#> diff_mean    ci_lwr    ci_upr 
#>  0.800000 -4.980938  6.580938 
#> attr(,"label")
#> [1] "Difference in Means + 95% CI"
#>