askill
statistical-analysis

Probability, distributions, hypothesis testing, and statistical inference. Use for A/B testing, experimental design, or statistical validation.

Updated 2/7/2026

Statistical Analysis

Apply statistical methods to understand data and validate findings.

Quick Start

from scipy import stats
import numpy as np

# Descriptive statistics
data = np.array([1, 2, 3, 4, 5])
print(f"Mean: {np.mean(data)}")
print(f"Std: {np.std(data, ddof=1)}")  # ddof=1 gives the sample standard deviation

# Hypothesis testing
group1 = [23, 25, 27, 29, 31]
group2 = [20, 22, 24, 26, 28]
t_stat, p_value = stats.ttest_ind(group1, group2)
print(f"P-value: {p_value}")

Core Tests

T-Test (Compare Means)

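# One-sample: compare the sample mean to a hypothesized population mean
stats.ttest_1samp(data, popmean=100)

# Two-sample: compare two independent groups
stats.ttest_ind(group1, group2)

# Paired: before/after measurements on the same subjects (illustrative values)
before = [72, 68, 75, 80, 66]
after = [70, 65, 74, 76, 65]
stats.ttest_rel(before, after)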

Chi-Square (Categorical Data)

from scipy.stats import chi2_contingency

observed = np.array([[10, 20], [15, 25]])
chi2, p_value, dof, expected = chi2_contingency(observed)
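The chi-square approximation is generally considered less reliable when expected cell counts are small. A minimal sketch, assuming the common rule of thumb of at least 5 expected observations per cell (a convention, not something this guide states) and falling back to Fisher's exact test for a 2x2 table:

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

observed = np.array([[10, 20], [15, 25]])
chi2, p_value, dof, expected = chi2_contingency(observed)
test_used = "chi-square"

# Assumed rule of thumb: every expected count should be at least 5
if (expected < 5).any():
    # Fisher's exact test applies to 2x2 tables and does not rely on
    # the large-sample approximation
    _, p_value = fisher_exact(observed)
    test_used = "fisher-exact"

print(f"{test_used}: p = {p_value:.3f}, dof = {dof}")
```

The `expected` array returned by `chi2_contingency` holds the counts implied by independence, so it can be inspected directly before trusting the p-value.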

ANOVA (Multiple Groups)

group3 = [30, 32, 34, 36, 38]  # illustrative third group
f_stat, p_value = stats.f_oneway(group1, group2, group3)

Confidence Intervals

from scipy import stats

confidence_level = 0.95
mean = np.mean(data)
se = stats.sem(data)
ci = stats.t.interval(confidence_level, df=len(data) - 1, loc=mean, scale=se)

print(f"{confidence_level:.0%} CI: [{ci[0]:.2f}, {ci[1]:.2f}]")

Correlation

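# Illustrative paired observations
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Pearson (linear relationship)
r, p_pearson = stats.pearsonr(x, y)

# Spearman (rank-based, monotonic relationship)
rho, p_spearman = stats.spearmanr(x, y)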

Distributions

# Normal
x = np.linspace(-3, 3, 100)
pdf = stats.norm.pdf(x, loc=0, scale=1)

# Sampling
samples = np.random.normal(0, 1, 1000)

# Test normality
stat, p_value = stats.shapiro(data)

A/B Testing Framework

def ab_test(control, treatment, alpha=0.05):
    """
    Run an A/B test on a continuous metric with a two-sample t-test.

    Returns a dict with keys:
      significant (bool), p_value (float),
      improvement (str, relative change in the mean as a percentage)
    """
    t_stat, p_value = stats.ttest_ind(control, treatment)

    significant = bool(p_value < alpha)
    # Relative change of the treatment mean against the control mean
    improvement = (np.mean(treatment) - np.mean(control)) / np.mean(control) * 100

    return {
        'significant': significant,
        'p_value': float(p_value),
        'improvement': f"{improvement:.2f}%"
    }
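The function above compares means of a continuous metric. Many A/B tests instead compare conversion rates, where each observation is a yes/no outcome. A minimal sketch of a pooled two-proportion z-test for that case; the function name `ab_test_proportions`, its parameters, and the counts are hypothetical and not part of the original skill:

```python
import math
from scipy import stats

def ab_test_proportions(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Two-sided pooled z-test for a difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both arms convert equally
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal survival function
    p_value = 2 * stats.norm.sf(abs(z))
    return {
        "z": z,
        "p_value": float(p_value),
        "significant": bool(p_value < alpha),
        "absolute_lift": p_b - p_a,
    }

# Illustrative counts: 120/1000 conversions in control, 150/1000 in treatment
result = ab_test_proportions(120, 1000, 150, 1000)
print(result)
```

The normal approximation assumes reasonably large counts in both arms; for small samples an exact test may be more appropriate.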

Interpretation

p-value < alpha (commonly 0.05): Reject the null hypothesis (statistically significant)

p-value >= alpha: Fail to reject the null hypothesis (not significant; this is not evidence that the null is true)

Common Pitfalls

  • Multiple testing without correction
  • Small sample sizes
  • Ignoring assumptions (normality, independence)
  • Confusing correlation with causation
  • p-hacking (searching for significance)

Troubleshooting

Common Issues

Problem: Non-normal data for t-test

# Check normality of each group first
_, p1 = stats.shapiro(group1)
_, p2 = stats.shapiro(group2)
if p1 < 0.05 or p2 < 0.05:
    # Use a non-parametric alternative instead of ttest_ind
    stat, p = stats.mannwhitneyu(group1, group2)

Problem: Multiple comparisons inflating false positives

from statsmodels.stats.multitest import multipletests

# Apply Bonferroni correction
p_values = [0.01, 0.03, 0.04, 0.02, 0.06]
rejected, p_adjusted, _, _ = multipletests(p_values, method='bonferroni')
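To make the correction less of a black box, here is a sketch of the Bonferroni arithmetic written out by hand with NumPy, using the same illustrative p-values. It is meant to show the idea rather than replace `multipletests`:

```python
import numpy as np

p_values = np.array([0.01, 0.03, 0.04, 0.02, 0.06])
m = len(p_values)  # number of tests performed
alpha = 0.05

# Bonferroni: multiply each p-value by the number of tests, capped at 1
p_adjusted = np.minimum(p_values * m, 1.0)

# Equivalent decision rule: compare raw p-values against alpha / m
rejected = p_values <= alpha / m

print(p_adjusted)
print(rejected)
```

Bonferroni controls the family-wise error rate and becomes conservative as the number of tests grows.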

Problem: Underpowered study (sample too small)

from statsmodels.stats.power import TTestIndPower

# Calculate required sample size
power_analysis = TTestIndPower()
sample_size = power_analysis.solve_power(
    effect_size=0.5,  # Medium effect (Cohen's d)
    power=0.8,        # 80% power
    alpha=0.05        # 5% significance
)
print(f"Required n per group: {int(np.ceil(sample_size))}")  # round up to whole participants

Problem: Heterogeneous variances

# Check with Levene's test
stat, p = stats.levene(group1, group2)
if p < 0.05:
    # Use Welch's t-test; scipy's ttest_ind defaults to equal_var=True (Student's t-test)
    t, p = stats.ttest_ind(group1, group2, equal_var=False)

Problem: Outliers affecting results

from scipy.stats import zscore

# Detect outliers (|z| > 3)
z_scores = np.abs(zscore(data))
clean_data = data[z_scores < 3]

# Or use robust statistics
median = np.median(data)
mad = np.median(np.abs(data - median))  # Median Absolute Deviation
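The z-score rule can miss outliers in small samples, because the outlier itself inflates the standard deviation. A sketch of the IQR alternative, assuming the conventional 1.5 x IQR fences; the helper name `iqr_outlier_mask` and the sample values are illustrative:

```python
import numpy as np

def iqr_outlier_mask(values, k=1.5):
    """Return a boolean mask that is True where a value lies outside the IQR fences."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (values < lower) | (values > upper)

sample = np.array([10, 12, 11, 13, 12, 11, 95])
mask = iqr_outlier_mask(sample)
clean = sample[~mask]  # keep only values inside the fences
print(clean)
```

Whether a flagged point should be dropped is a judgment call; it is often worth checking whether it is a data error before removing it.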

Debug Checklist

  • Check sample size adequacy (power analysis)
  • Test normality assumption (Shapiro-Wilk)
  • Test homogeneity of variance (Levene's)
  • Check for outliers (z-scores, IQR)
  • Apply multiple testing correction if needed
  • Report effect sizes, not just p-values
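For the last checklist item, one common effect size for a two-group comparison is Cohen's d. A minimal sketch using the pooled sample standard deviation; the helper `cohens_d` is hypothetical and the groups reuse the Quick Start values:

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = len(a), len(b)
    # Pooled variance weights each sample variance by its degrees of freedom
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

group1 = [23, 25, 27, 29, 31]
group2 = [20, 22, 24, 26, 28]
d = cohens_d(group1, group2)
print(f"Cohen's d: {d:.2f}")
```

Reporting d alongside the p-value shows how large the difference is, not only whether it is detectable.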

Install

Requires askill CLI v1.0+

AI Quality Score

95/100, analyzed 2/13/2026

A comprehensive and highly actionable guide to statistical analysis in Python using scipy and numpy. It covers descriptive statistics, hypothesis testing, confidence intervals, and A/B testing with clear code examples. The inclusion of troubleshooting steps for common statistical issues (normality, power analysis) and safety warnings makes it an excellent resource.

Metadata

License: unknown
Version: -
Updated: 2/7/2026
Publisher: benchflow-ai

Tags

ci-cd, testing