askill
clonalstats

clonalstatsSafety 88Repository

Generate comprehensive clonality statistics and diversity visualizations for TCR/BCR repertoire analysis. Quantifies clonal expansion, measures diversity metrics (Shannon, Simpson, Gini), and creates publication-ready plots.

20 stars
1.2k downloads
Updated 2/15/2026

Package Files

Loading files...
SKILL.md

ClonalStats Process Configuration

Purpose

Generate comprehensive clonality statistics and diversity visualizations for TCR/BCR repertoire analysis. Quantifies clonal expansion, measures diversity metrics (Shannon, Simpson, Gini), and creates publication-ready plots.

When to Use

  • To quantify clonal expansion patterns in TCR/BCR data
  • For diversity analysis comparing multiple samples or conditions
  • To identify hyperexpanded clones and their distribution
  • For rarefaction analysis to assess sampling depth
  • After ScRepCombiningExpression to analyze integrated TCR+RNA data

Configuration Structure

Process Enablement

[ClonalStats]
cache = true

Input Specification

[ClonalStats.in]
screpfile = ["ScRepCombiningExpression"]

Core Environment Variables

[ClonalStats.envs]
# Clone definition: "gene" (VDJC), "aa" (CDR3 amino acid), "nt" (CDR3 nucleotide)
clone_call = "aa"
# Chain analysis: "both", "TRA", "TRB", "TRG", "IGH", "IGL"
chain = "both"
# Data transformations (dplyr::mutate syntax)
mutaters = {}
# Data filtering (dplyr::filter syntax)
subset = null
# Output device parameters
devpars = {width = 800, height = 600, res = 100}
# Save code and data (large files - use with caution)
save_code = false
save_data = false

Case-Based Plot Generation

[ClonalStats.envs.cases."Case Name"]
viz_type = "volume"  # volume, abundance, length, residency, stat,
                    # composition, overlap, diversity, geneusage,
                    # positional, kmer, rarefaction

Diversity Metrics

MetricRangeInterpretationBest For
shannon0 - ∞Higher = more diversityGeneral comparison
inv.simpson1 - ∞Higher = more diversityCommon clones
gini.coeff0 - 10 = equality, 1 = inequalityClonality dominance
norm.entropy0 - 1Higher = more diversityEvenness-focused
chao1≥ richnessEstimates total richnessSmall samples
d50CountClones making up 50%Practical dominance

Interpretation:

  • High diversity = Many unique clones, even distribution (healthy repertoire)
  • Low diversity = Few dominant clones (antigen-specific response, infection, cancer)
  • Gini ≈ 1 = Very skewed, few clones dominate
  • Gini ≈ 0 = Even distribution

Visualization Types

viz_type options:

  • volume - Number of clones per sample/group
  • abundance - Clone abundance distribution (trend/histogram/density)
  • length - CDR3 sequence length distribution
  • residency - Clones present across groups (venn/upset)
  • stat - Expanded clone analysis (pies/sankey)
  • diversity - Diversity metrics (bar/box/violin)
  • geneusage - V/D/J gene usage frequency
  • rarefaction - Sampling depth assessment

Configuration Examples

Minimal Configuration

[ClonalStats.in]
screpfile = ["ScRepCombiningExpression"]

Standard Diversity Analysis

[ClonalStats.in]
screpfile = ["ScRepCombiningExpression"]

[ClonalStats.envs.cases."Diversity"]
viz_type = "diversity"
method = "shannon"
plot_type = "box"
group_by = "Diagnosis"
comparisons = true

[ClonalStats.envs.cases."Gini Coeff"]
viz_type = "diversity"
method = "gini.coeff"
plot_type = "violin"
group_by = "Diagnosis"
add_box = true

Expanded Clone Analysis

[ClonalStats.in]
screpfile = ["ScRepCombiningExpression"]

[ClonalStats.envs.cases."Expanded Clones"]
viz_type = "stat"
plot_type = "pies"
group_by = "Diagnosis"
subgroup_by = "seurat_clusters"
clones = {"Expanded (>2)" = "sel(Colitis > 2)"}

Rarefaction Analysis

[ClonalStats.in]
screpfile = ["ScRepCombiningExpression"]

[ClonalStats.envs.cases."Rarefaction"]
viz_type = "rarefaction"
group_by = "Patient"
q = 1  # 0=richness, 1=shannon, 2=simpson
n_boots = 20

Complete Analysis Suite

[ClonalStats.in]
screpfile = ["ScRepCombiningExpression"]

[ClonalStats.envs.cases."Volume"]
viz_type = "volume"

[ClonalStats.envs.cases."Abundance"]
viz_type = "abundance"
plot_type = "density"

[ClonalStats.envs.cases."Diversity"]
viz_type = "diversity"
method = "shannon"

[ClonalStats.envs.cases."Rarefaction"]
viz_type = "rarefaction"

Common Patterns

Disease vs Healthy

[ClonalStats.envs.cases."Comparison"]
viz_type = "diversity"
method = "gini.coeff"
plot_type = "box"
group_by = "Condition"
comparisons = true

Time Course

[ClonalStats.envs.cases."Timepoint"]
viz_type = "volume"
x = "Timepoint"

[ClonalStats.envs.cases."Diversity"]
viz_type = "diversity"
method = "shannon"
group_by = "Timepoint"

Treatment Response

[ClonalStats.envs.cases."Response"]
viz_type = "diversity"
method = "gini.coeff"
group_by = "Response"
plot_type = "box"
comparisons = true

Dependencies

  • Upstream: ScRepCombiningExpression (required)
  • Related: ScRepLoading, CDR3Clustering, TESSA (optional)

Validation Rules

  • Input must be valid scRepertoire object
  • For viz_type = "diversity", method must be supported
  • For rarefaction, n_boots should be ≥ 10
  • Use sel() syntax in clones parameter for filtering

Troubleshooting

Sample column not found: Input must have Sample column or specify x parameter.

Strange diversity values: Small repertoire sizes cause bias. Use plot_type = "box".

Rarefaction curves noisy: Increase n_boots (try 50-100).

Too many clones in stat plots: Use subset or stricter clones thresholds.

Plot generation slow: Use clone_call = "gene" for speed, apply subset.

Missing comparisons: Set comparisons = true to add significance tests.

Best Practices

  1. Start with default cases to see standard visualizations
  2. Use multiple diversity metrics: Shannon + Gini
  3. Check rarefaction curves to ensure sufficient sampling
  4. Document clone thresholds when defining expanded clones
  5. Use clone_call = "gene" for speed, "aa" for granularity
  6. Set save_data = true for debugging (watch disk space)
  7. Validate findings with complementary diversity indices
  8. Consider sample size: small samples underestimate richness

Install

Download ZIP
Requires askill CLI v1.0+

AI Quality Score

82/100Analyzed 2/19/2026

Comprehensive SKILL.md for ClonalStats process in immunopipe pipeline. Provides extensive TOML configuration examples, diversity metrics reference table, visualization types, and troubleshooting guidance. Highly actionable with clear structure. Somewhat project-specific but well-structured technical reference suitable for the immunopipe ecosystem. Scores bonus points for 'When to Use' section, structured steps, tags, skills folder location, and high-density technical content.

88
88
65
85
90

Metadata

Licenseunknown
Version-
Updated2/15/2026
Publisherpwwang

Tags

observability