Group Sequential Methods
When to Use This Skill
- Designing group sequential trials with interim analyses
- Implementing alpha spending functions
- Setting futility stopping rules
- Calculating information fractions
- Using sim_gs_n() for GS simulations
- Integrating with gsDesign2 package
Fundamental Concepts
Group Sequential Design
A group sequential design allows for:
- Early stopping for efficacy: If treatment effect is larger than expected
- Early stopping for futility: If treatment effect is unlikely to reach significance
- Reduced expected sample size: When the trial stops early for efficacy or futility
Information Fraction
Information fraction at analysis k:
I_k / I_K = (events at analysis k) / (total planned events)
For time-to-event trials, statistical information is approximately proportional to the number of events, so event counts serve as the information scale.
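As a small worked example, information fractions follow directly from planned event counts. The targets below (100, 200, 350 events) mirror the create_cut() examples in this document; the variable names are illustrative only.

```r
# Planned cumulative events at IA1, IA2, and FA (illustrative values)
planned_events <- c(ia1 = 100, ia2 = 200, fa = 350)

# Information fraction at analysis k: events at k / total planned events
info_frac <- planned_events / planned_events[["fa"]]
print(round(info_frac, 3))
```

Note that these targets give unevenly spaced looks (roughly 29%, 57%, 100%), which is allowed as long as the spending function is evaluated at the actual fractions.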
Type I Error Spending
The key constraint: Σ α_k ≤ α, where α_k is the incremental Type I error spent at analysis k and α is the overall Type I error.
Spending functions distribute alpha across analyses.
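To see why the constraint matters, the sketch below simulates two looks under H0 and tests at the unadjusted one-sided 2.5% critical value each time. It assumes equally spaced looks (50% and 100% information) and the standard independent-increments structure of the z-statistics; it is a base R illustration, not part of simtrial.

```r
set.seed(2024)
n_sim <- 100000

# Under H0, z at 100% information combines z at 50% information
# with an independent increment of equal size.
z1 <- rnorm(n_sim)
z2 <- (z1 + rnorm(n_sim)) / sqrt(2)

crit <- qnorm(1 - 0.025)  # unadjusted one-sided 2.5% critical value

# Reject if either look crosses the unadjusted critical value
naive_type1 <- mean(z1 > crit | z2 > crit)
# Single fixed-sample test at the final analysis only
fixed_type1 <- mean(z2 > crit)

print(c(naive = naive_type1, fixed = fixed_type1))
```

The naive rate comes out noticeably above the nominal 2.5%, while the single final test stays near it. Spending functions remove this inflation by raising the early bounds.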
Alpha Spending Functions
O'Brien-Fleming (OBF)
Properties:
- Conservative at early analyses
- Nearly full alpha at final analysis
- Difficult to stop early
- Maintains nominal Type I error
Formula:
α*(t) = 2 - 2Φ(z_{α/2} / √t)
where z_{α/2} = Φ⁻¹(1 - α/2) and t is the information fraction.
When to Use:
- Want maximum power at final analysis
- Early efficacy stopping unlikely
- Regulatory preference for conservative early bounds
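The OBF-type formula above can be evaluated in base R as a quick check. This is a minimal sketch; `sf_obf` is a hypothetical helper name, not a function from simtrial or gsDesign2.

```r
# Lan-DeMets O'Brien-Fleming-type spending function (hypothetical helper).
# alpha: total Type I error; t: information fraction in (0, 1].
sf_obf <- function(alpha, t) {
  2 - 2 * pnorm(qnorm(1 - alpha / 2) / sqrt(t))
}

t <- c(0.25, 0.5, 0.75, 1)
obf_spend <- sf_obf(alpha = 0.025, t = t)

# Cumulative alpha spent at each information fraction
print(signif(obf_spend, 3))
```

Cumulative spending is tiny at early looks and reaches the full alpha at t = 1, which is the "conservative early, nearly full alpha at final" behaviour listed above.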
Pocock
Properties:
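- Approximately equal critical values (nominal p-values) at each analysis
- Easier to stop early
- Smaller nominal alpha left for the final analysis (stricter final bound than OBF)
- Lower power at final analysis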
Formula:
α*(t) = α × log(1 + (e-1)t)
When to Use:
- Early stopping is a priority
- Treatment effect expected to be large
- Willing to sacrifice final analysis power
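For comparison, the Pocock-type formula above can be evaluated the same way. Again a minimal base R sketch; `sf_pocock` is a hypothetical helper name.

```r
# Lan-DeMets Pocock-type spending function (hypothetical helper).
# alpha: total Type I error; t: information fraction in (0, 1].
sf_pocock <- function(alpha, t) {
  alpha * log(1 + (exp(1) - 1) * t)
}

t <- c(0.25, 0.5, 0.75, 1)
pocock_spend <- sf_pocock(alpha = 0.025, t = t)

# Cumulative alpha spent at each information fraction
print(signif(pocock_spend, 3))
```

The function is concave in t, so more than half of the alpha is already spent at 50% information.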
Hwang-Shih-DeCani (HSD)
Properties:
- Flexible family indexed by γ
- γ = -4: Similar to OBF
- γ = 1: Similar to Pocock
- γ = 0: Linear spending, α*(t) = αt (limiting case, between OBF and Pocock)
Formula:
α*(t) = α × (1 - e^{-γt}) / (1 - e^{-γ}) for γ ≠ 0; α*(t) = αt for γ = 0
When to Use:
- Want flexibility between OBF and Pocock
- Customized spending pattern needed
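The effect of γ is easiest to see by tabulating the HSD formula for a few values. The sketch below is base R only; `sf_hsd` is a hypothetical helper name, and the γ = 0 branch implements the linear limiting case.

```r
# Hwang-Shih-DeCani spending function (hypothetical helper).
sf_hsd <- function(alpha, t, gamma) {
  if (gamma == 0) {
    return(alpha * t)  # limiting linear case
  }
  alpha * (1 - exp(-gamma * t)) / (1 - exp(-gamma))
}

t <- c(0.25, 0.5, 0.75, 1)
gammas <- c(-4, 0, 1)

# Rows: information fractions; columns: gamma values
hsd_spend <- sapply(gammas, function(g) sf_hsd(0.025, t, g))
dimnames(hsd_spend) <- list(paste0("t=", t), paste0("gamma=", gammas))
print(signif(hsd_spend, 3))
```

More negative γ pushes spending toward the final analysis; positive γ front-loads it.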
Spending Function Comparison
| Function | Early Spending | Final Power | Early Stopping |
|---|---|---|---|
| OBF | Low | High | Difficult |
| Pocock | High | Lower | Easier |
| HSD(γ=-4) | Low | High | Difficult |
| HSD(γ=1) | High | Lower | Easier |
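
The qualitative comparison in the table can be made concrete by computing the incremental alpha spent at each look. This sketch assumes three equally spaced analyses and one-sided α = 0.025, and uses the OBF-type and Pocock-type formulas given earlier.

```r
alpha <- 0.025
t <- c(1 / 3, 2 / 3, 1)

# Cumulative spending under each function
obf <- 2 - 2 * pnorm(qnorm(1 - alpha / 2) / sqrt(t))
pocock <- alpha * log(1 + (exp(1) - 1) * t)

# Incremental alpha at each analysis = difference of cumulative spending
incremental <- data.frame(
  info_frac = round(t, 2),
  obf = diff(c(0, obf)),
  pocock = diff(c(0, pocock))
)
print(incremental)

# Both columns sum to the overall alpha
print(colSums(incremental[, c("obf", "pocock")]))
```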
Futility Boundaries
Binding Futility
- If futility boundary crossed, trial MUST stop
- Affects Type I error calculation
- Slightly more powerful than non-binding at the same sample size, because the efficacy bounds can be lowered
Non-Binding Futility
- Crossing futility boundary is advisory
- Trial can continue at investigator discretion
- Conservative: assumes no early stopping for futility in Type I error
Beta-Spending for Futility
Similar to alpha-spending, but for Type II error:
β*(t) = spending function × β
simtrial GS Implementation
create_cut() - Define Analysis Timing
library(simtrial)
# Interim Analysis 1
ia1_cut <- create_cut(
  planned_calendar_time = 20,           # Minimum 20 months
  target_event_overall = 100,           # Target 100 events
  max_extension_for_target_event = 24,  # Wait up to 24 months for events
  min_n_overall = 200,                  # At least 200 enrolled
  min_followup = 12                     # 12 months minimum follow-up
)
# Interim Analysis 2
ia2_cut <- create_cut(
  planned_calendar_time = 32,
  target_event_overall = 200,
  max_extension_for_target_event = 34,
  min_time_after_previous_analysis = 10  # At least 10 months after IA1
)
# Final Analysis
fa_cut <- create_cut(
  planned_calendar_time = 45,
  target_event_overall = 350
)
sim_gs_n() - Run GS Simulations
library(simtrial)
library(gsDesign2)
# Define enrollment (4 * 10 + 12 * 30 = 400 subjects)
enroll_rate <- define_enroll_rate(
  duration = c(4, 12),
  rate = c(10, 30)
)
# Define failure rates
fail_rate <- define_fail_rate(
  duration = c(3, 100),
  fail_rate = log(2) / 9,
  hr = c(1, 0.6),
  dropout_rate = 0.001
)
# Run simulation
results <- sim_gs_n(
  n_sim = 1000,
  sample_size = 400,
  enroll_rate = enroll_rate,
  fail_rate = fail_rate,
  test = wlr,
  cut = list(ia1 = ia1_cut, ia2 = ia2_cut, fa = fa_cut),
  weight = fh(rho = 0, gamma = 0)
)
Integration with gsDesign2
library(simtrial)
library(gsDesign2)
# Design with gsDesign2
design <- gs_design_ahr(
  enroll_rate = define_enroll_rate(duration = c(4, 12), rate = c(10, 30)),
  fail_rate = define_fail_rate(
    duration = c(3, 100),
    fail_rate = log(2) / 9,
    hr = c(1, 0.6),
    dropout_rate = 0.001
  ),
  alpha = 0.025,
  beta = 0.1,
  analysis_time = c(24, 36, 48),
  upper = gs_spending_bound,
  upar = list(sf = gsDesign::sfLDOF, total_spend = 0.025),
  lower = gs_spending_bound,
  lpar = list(sf = gsDesign::sfHSD, param = -4, total_spend = 0.1)
) |> to_integer()
# Simulate with design object
sim_results <- sim_gs_n(
  n_sim = 1000,
  sample_size = max(design$analysis$n),
  enroll_rate = design$enroll_rate,
  fail_rate = design$fail_rate,
  test = wlr,
  cut = NULL, # Auto-generated from design
  original_design = design,
  weight = fh(rho = 0, gamma = 0)
)
Bound Updates with sim_gs_n()
When using original_design, sim_gs_n() can compute updated bounds:
# Results include planned and updated bounds
results <- sim_gs_n(
  # ... parameters ...
  original_design = design,
  ia_alpha_spending = "min_planned_actual", # Conservative
  fa_alpha_spending = "full_alpha"          # Spend full alpha at FA
)
# Output includes:
# - planned_upper_bound, planned_lower_bound
# - updated_upper_bound, updated_lower_bound
Alpha Spending Options:
| ia_alpha_spending | Description |
|---|---|
| "min_planned_actual" | Conservative: spend the minimum of the planned and actual alpha spending |
| "actual" | Spend based on actual information |

| fa_alpha_spending | Description |
|---|---|
| "full_alpha" | Spend all remaining alpha at the final analysis |
| "info_frac" | Spend according to the final information fraction (relevant when the final event target is under-run) |
Different Tests Across Analyses
# Different tests at each analysis
ia1_test <- create_test(wlr, weight = fh(rho = 0, gamma = 0))
ia2_test <- create_test(wlr, weight = fh(rho = 0, gamma = 0.5))
fa_test <- create_test(wlr, weight = mb(delay = 6, w_max = Inf))
results <- sim_gs_n(
  n_sim = 1000,
  sample_size = 400,
  enroll_rate = enroll_rate,
  fail_rate = fail_rate,
  test = list(ia1 = ia1_test, ia2 = ia2_test, fa = fa_test),
  cut = list(ia1 = ia1_cut, ia2 = ia2_cut, fa = fa_cut)
)
Common GS Patterns
Two-Look Design (1 IA + FA)
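# IA at 50% information, FA at 100%
ia_cut <- create_cut(target_event_overall = 150) # 50%
fa_cut <- create_cut(target_event_overall = 300) # 100%
# enroll_rate and fail_rate are required; reuse the objects defined earlier
sim_gs_n(
  n_sim = 1000,
  sample_size = 400,
  enroll_rate = enroll_rate,
  fail_rate = fail_rate,
  test = wlr,
  cut = list(ia = ia_cut, fa = fa_cut),
  weight = fh(rho = 0, gamma = 0)
)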
Three-Look Design (2 IA + FA)
# Standard 33%, 67%, 100% information
ia1_cut <- create_cut(target_event_overall = 100)
ia2_cut <- create_cut(target_event_overall = 200)
fa_cut <- create_cut(target_event_overall = 300)
Event-Driven with Calendar Constraints
# Events-based but with minimum calendar time
ia_cut <- create_cut(
  target_event_overall = 150,
  planned_calendar_time = 18,          # At least 18 months
  max_extension_for_target_event = 24  # Max 24 months
)
Operating Characteristics
Key Metrics to Evaluate
- Power: P(reject H0 | H1 true)
- Type I Error: P(reject H0 | H0 true)
- Expected Sample Size: E[N] under H0 and H1
- Expected Events: E[events] at each analysis
- Stopping Probabilities: P(stop at analysis k)
Simulation Summary
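# Summarize simulation results by analysis
library(dplyr)
results_summary <- results |>
  group_by(analysis) |>
  summarise(
    mean_events = mean(event),
    mean_z = mean(z),
    # Share of simulations crossing a fixed one-sided 2.5% threshold at this
    # analysis; this is not the overall group sequential power.
    # Assumes negative z favors treatment; use z > qnorm(0.975) if positive z
    # favors treatment in your simtrial version.
    nominal_reject_rate = mean(z < qnorm(0.025)),
    .groups = "drop"
  )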
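Overall power and stopping probabilities in a group sequential design depend on the first analysis at which each simulated trial crosses its efficacy bound, not on per-analysis averages. The sketch below shows that calculation on a small hand-made data frame. The column names (`sim_id`, `analysis`, `z`, `upper_bound`) and the values are hypothetical, and it assumes larger z favors treatment; adapt both to the actual sim_gs_n() output.

```r
# Toy per-analysis results: 4 simulated trials x 2 analyses (hypothetical)
toy <- data.frame(
  sim_id = rep(1:4, each = 2),
  analysis = rep(1:2, times = 4),
  z = c(3.1, 3.4,   # crosses at IA
        1.2, 2.3,   # crosses at FA only
        0.4, 1.1,   # never crosses
        2.9, 1.8),  # crosses at IA, so the FA value is irrelevant
  upper_bound = rep(c(2.8, 2.0), times = 4)
)
toy$crossed <- toy$z >= toy$upper_bound

# First analysis at which each trial crosses the bound (NA if never)
first_cross <- tapply(seq_len(nrow(toy)), toy$sim_id, function(idx) {
  hit <- toy$analysis[idx][toy$crossed[idx]]
  if (length(hit) == 0) NA_integer_ else min(hit)
})

# Overall power: share of trials that cross at any analysis
overall_power <- mean(!is.na(first_cross))
# Probability of stopping for efficacy at each analysis
stop_prob <- table(factor(first_cross, levels = 1:2)) / length(first_cross)

print(overall_power)
print(stop_prob)
```

Counting only the first crossing avoids double-counting trials that would have stopped at an interim analysis.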
Best Practices
- Information Fraction: Target evenly spaced (e.g., 50%, 100% or 33%, 67%, 100%)
- Alpha Spending: OBF-type spending is the most common choice in regulatory submissions
- Futility: Use non-binding to preserve flexibility
- Validation: Compare simulated power to gsDesign analytical results
- Documentation: Record all boundary calculations for regulatory submission
- Parallelization: Use plan("multisession") for large simulations
Regulatory Considerations
- Pre-specify number and timing of interim analyses
- Pre-specify spending function and parameters
- Document stopping rules clearly in protocol
- Consider DSMB recommendations for unblinded reviews
- Maintain blinding for operational team
