askill
ensemble-content-scorer


Multi-model consensus scoring for content ideas. Scores the same idea with Claude, GPT-4o, Gemini, and optionally Grok in parallel, then aggregates the results into a single balanced verdict. Intended to reduce single-model bias in virality predictions.

1 stars
1.2k downloads
Updated 1/3/2026

SKILL.md

Ensemble Content Scorer

Wisdom of crowds, but for AI. This skill scores your content ideas with multiple AI models, then aggregates the scores into a consensus. The aim is a prediction that is more reliable than any single model's.


WHAT IT DOES

                Content Idea
                     │
    ┌────────────────┼────────────────┐
    │                │                │
    ▼                ▼                ▼
[Claude]        [GPT-4o]         [Gemini]
  Score            Score            Score
    │                │                │
    └────────────────┼────────────────┘
                     │
                     ▼
            [Aggregator (Claude)]
                     │
                     ▼
         Consensus Score + Verdict
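
The fan-out step in the diagram can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `score_in_parallel` and the stub scorers are hypothetical names, and real scorers would call each provider's API instead of returning fixed totals.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical scorer type: takes an idea, returns a total score out of 60.
Scorer = Callable[[str], int]


def score_in_parallel(idea: str, scorers: Dict[str, Scorer]) -> Dict[str, int]:
    """Send the same idea to every scorer concurrently and collect the totals."""
    with ThreadPoolExecutor(max_workers=max(1, len(scorers))) as pool:
        futures = {name: pool.submit(fn, idea) for name, fn in scorers.items()}
        # result() blocks until that scorer has finished.
        return {name: future.result() for name, future in futures.items()}


# Stub scorers standing in for real API calls (assumption: fixed totals
# borrowed from the sample output later in this document).
stub_scorers: Dict[str, Scorer] = {
    "claude": lambda idea: 50,
    "gpt4o": lambda idea: 50,
    "gemini": lambda idea: 47,
}

totals = score_in_parallel("Statins myth-busting for Indian audience", stub_scorers)
print(totals)
```

Threads suit this step because each call spends most of its time waiting on the network; the aggregation step then runs once all totals are in.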

WHY MULTI-MODEL?

| Single Model | Ensemble |
|--------------|----------|
| May have biases | Biases cancel out |
| One perspective | Multiple perspectives |
| Black box score | Transparent reasoning |
| May miss nuances | Catches different angles |

TRIGGERS

Use this skill when you say:

  • "Score this content idea"
  • "Is this topic worth pursuing?"
  • "Rate my video concept"
  • "Predict if this will go viral"
  • "Ensemble score: [topic]"

USAGE

In Claude Code (Recommended)

"Ensemble score: Statins myth-busting for Indian audience"

"Score this video idea: Why your LDL target depends on your risk"

"Rate these ideas and rank them:
1. GLP-1 agonists explained
2. Heart attack warning signs
3. Is coconut oil heart-healthy?"

CLI Mode

# Score single idea
python scripts/score_content.py --idea "Statins myth-busting for Indian audience"

# Score multiple ideas
python scripts/score_content.py --ideas "GLP-1 explained" "Statin myths" "CAC scoring"

# Use specific models
python scripts/score_content.py --idea "Topic" --models claude,gpt4o,gemini
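
The contents of `scripts/score_content.py` are not shown here, so the following is only a guess at how its flags could be parsed with `argparse`. The flag names come from the commands above; the default model list and the helper names `build_parser` and `parse_cli` are assumptions.

```python
import argparse
from typing import List, Tuple


def build_parser() -> argparse.ArgumentParser:
    """Parser for the flags shown in the CLI examples (sketch only)."""
    parser = argparse.ArgumentParser(description="Ensemble content scorer (sketch)")
    parser.add_argument("--idea", help="Single idea to score")
    parser.add_argument("--ideas", nargs="+", default=[], help="Several ideas to score")
    # Assumption: the default model list is not documented in the source.
    parser.add_argument("--models", default="claude,gpt4o,gemini",
                        help="Comma-separated model names")
    parser.add_argument("--batch", action="store_true", help="Rank ideas in one table")
    return parser


def parse_cli(argv: List[str]) -> Tuple[List[str], List[str], bool]:
    """Return (ideas, models, batch) from a list of command-line arguments."""
    args = build_parser().parse_args(argv)
    ideas = ([args.idea] if args.idea else []) + list(args.ideas)
    models = [name.strip() for name in args.models.split(",") if name.strip()]
    return ideas, models, args.batch


print(parse_cli(["--idea", "Topic", "--models", "claude,gpt4o,gemini"]))
```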

SCORING DIMENSIONS

Each model scores on these dimensions (1-10):

| Dimension | What It Measures |
|-----------|------------------|
| Relevance | How relevant is it to the target audience (Indian patients/doctors)? |
| Novelty | How fresh is the angle? Has it been covered before? |
| Expertise Match | Does it match your expertise as an interventional cardiologist? |
| Engagement Potential | Will it capture and hold attention? |
| Shareability | Will people share this? Controversy potential? |
| Evergreen Factor | Will this be relevant in 6 months? |

Total Score: 6-60 (six dimensions, each scored 1-10)
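
One way to hold a single model's scores is a small validated record. The sketch below assumes the six dimensions listed above and the 1-10 range; the class name `DimensionScores` and its field names are illustrative, not taken from the skill's code.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DimensionScores:
    """One model's scores for a single idea (hypothetical structure)."""
    relevance: int
    novelty: int
    expertise: int
    engagement: int
    shareability: int
    evergreen: int

    def __post_init__(self) -> None:
        # Reject anything outside the documented 1-10 range.
        for field in fields(self):
            value = getattr(self, field.name)
            if not 1 <= value <= 10:
                raise ValueError(f"{field.name} must be between 1 and 10, got {value}")

    @property
    def total(self) -> int:
        """Sum of the six dimension scores (maximum 60)."""
        return sum(getattr(self, field.name) for field in fields(self))


# Claude's scores from the sample output: 9 + 7 + 9 + 8 + 8 + 9
claude = DimensionScores(9, 7, 9, 8, 8, 9)
print(claude.total)  # 50
```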


OUTPUT FORMAT

# ENSEMBLE CONTENT SCORE

**Idea:** Statins myth-busting for Indian audience - why most "side effects" aren't real

**Date:** 2025-01-01

---

## INDIVIDUAL MODEL SCORES

### Claude (Anthropic)
| Dimension | Score | Reasoning |
|-----------|-------|-----------|
| Relevance | 9/10 | High - statins widely prescribed in India, misinformation common |
| Novelty | 7/10 | Topic covered before, but Indian-specific angle is fresher |
| Expertise | 9/10 | Perfect for interventional cardiologist |
| Engagement | 8/10 | Controversial enough to spark discussion |
| Shareability | 8/10 | Will trigger debates |
| Evergreen | 9/10 | Statin myths persist |
| **Total** | **50/60** | |

### GPT-4o (OpenAI)
| Dimension | Score | Reasoning |
|-----------|-------|-----------|
| Relevance | 9/10 | Very relevant for Indian audience |
| Novelty | 6/10 | Many statin videos exist |
| Expertise | 10/10 | Perfect fit |
| Engagement | 9/10 | Myth-busting format works |
| Shareability | 8/10 | Good controversy factor |
| Evergreen | 8/10 | Will stay relevant |
| **Total** | **50/60** | |

### Gemini (Google)
| Dimension | Score | Reasoning |
|-----------|-------|-----------|
| Relevance | 8/10 | Good for health-conscious Indians |
| Novelty | 7/10 | Indian angle adds freshness |
| Expertise | 9/10 | Great fit |
| Engagement | 7/10 | Educational more than viral |
| Shareability | 7/10 | Moderate share potential |
| Evergreen | 9/10 | Long-lasting relevance |
| **Total** | **47/60** | |

---

## CONSENSUS SCORE

| Model | Total Score |
|-------|-------------|
| Claude | 50/60 |
| GPT-4o | 50/60 |
| Gemini | 47/60 |
| **Average** | **49/60 (81.7%)** |
| **Std Dev** | 1.7 (High Consensus) |

---

## VERDICT

🟡 **WORTH PURSUING** (Score: 49/60, Consensus: High)

All models agree this is a strong content idea. The combination of:
- High relevance to your audience
- Perfect expertise match
- Good controversy factor
- Evergreen potential

Makes this a priority topic for your content calendar.

---

## RECOMMENDATIONS

1. **Angle Enhancement**: Focus on the "nocebo effect" - most statin "side effects" are psychosomatic
2. **Hook Suggestion**: "90% of statin side effects aren't real - here's the data"
3. **Format**: 12-15 minute deep dive with studies
4. **Hinglish Tip**: Use "side effect ka drama" for relatability

---

## DISSENT ANALYSIS

- **Gemini** scored lower on engagement (7 vs 8-9)
- Suggests: May need stronger hook to maximize viral potential
- Consider: Adding patient testimonial or counter-narrative
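
The source does not say how dissent is detected. One simple possibility, shown here as a sketch, is to flag any dimension where the gap between the highest and lowest model score reaches a threshold. The function name `find_dissent` and the threshold of 2 are assumptions; the scores are copied from the sample tables above.

```python
from typing import Dict, List, Tuple


def find_dissent(scores: Dict[str, Dict[str, int]],
                 min_spread: int = 2) -> List[Tuple[str, str, int]]:
    """Return (dimension, lowest-scoring model, spread) where max - min >= min_spread."""
    dissent = []
    dimensions = list(next(iter(scores.values())).keys())
    for dimension in dimensions:
        by_model = {model: dims[dimension] for model, dims in scores.items()}
        spread = max(by_model.values()) - min(by_model.values())
        if spread >= min_spread:
            lowest = min(by_model, key=by_model.get)
            dissent.append((dimension, lowest, spread))
    return dissent


sample = {
    "Claude": {"Relevance": 9, "Novelty": 7, "Expertise": 9,
               "Engagement": 8, "Shareability": 8, "Evergreen": 9},
    "GPT-4o": {"Relevance": 9, "Novelty": 6, "Expertise": 10,
               "Engagement": 9, "Shareability": 8, "Evergreen": 8},
    "Gemini": {"Relevance": 8, "Novelty": 7, "Expertise": 9,
               "Engagement": 7, "Shareability": 7, "Evergreen": 9},
}

print(find_dissent(sample))  # [('Engagement', 'Gemini', 2)]
```

With the sample scores, only Engagement has a spread of 2; every other dimension differs by 1 point at most.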

SCORING TIERS

| Score Range | Verdict | Action |
|-------------|---------|--------|
| 50-60 | 🟢 STRONG PURSUE | High priority, create immediately |
| 40-49 | 🟡 WORTH PURSUING | Good idea, add to calendar |
| 30-39 | 🟠 NEEDS REFINEMENT | Has potential, needs angle work |
| 20-29 | 🔴 RECONSIDER | Weak idea, low priority |
| 0-19 | ⛔ SKIP | Not worth the effort |
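
The tier table maps directly onto a threshold lookup. The sketch below assumes that a fractional average falls into the tier whose lower bound it has reached (so 49.5 is still WORTH PURSUING); the source does not specify this, and the name `verdict_for` is illustrative.

```python
def verdict_for(score: float) -> str:
    """Map a total or averaged score (0-60) to its verdict tier."""
    if not 0 <= score <= 60:
        raise ValueError(f"score must be between 0 and 60, got {score}")
    # Thresholds are the lower bounds of each tier in the table above.
    if score >= 50:
        return "STRONG PURSUE"
    if score >= 40:
        return "WORTH PURSUING"
    if score >= 30:
        return "NEEDS REFINEMENT"
    if score >= 20:
        return "RECONSIDER"
    return "SKIP"


print(verdict_for(50))  # STRONG PURSUE
print(verdict_for(45))  # WORTH PURSUING
```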

CONSENSUS INTERPRETATION

| Std Deviation | Interpretation |
|---------------|----------------|
| < 3 | High consensus - models agree |
| 3-5 | Moderate consensus - some disagreement |
| > 5 | Low consensus - divisive idea (may be worth exploring!) |
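
The average and consensus label can be computed with the standard library. This sketch assumes the sample standard deviation is intended: for the totals 50, 50, and 47 in the sample output it gives about 1.7, which matches the figure shown there, whereas the population standard deviation would be about 1.4. The function name `consensus` is hypothetical.

```python
from statistics import mean, stdev
from typing import List, Tuple


def consensus(totals: List[int]) -> Tuple[float, float, str]:
    """Return (average, sample std dev, consensus label) for per-model totals."""
    average = mean(totals)
    # stdev needs at least two values; a single model has no spread to report.
    spread = stdev(totals) if len(totals) > 1 else 0.0
    if spread < 3:
        label = "High"
    elif spread <= 5:
        label = "Moderate"
    else:
        label = "Low"
    return average, spread, label


average, spread, label = consensus([50, 50, 47])
print(f"{average:.1f}/60 ({average / 60:.1%}), std dev {spread:.1f}, {label} consensus")
# prints: 49.0/60 (81.7%), std dev 1.7, High consensus
```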

INTEGRATION

Enhances:

  • viral-content-predictor - More reliable predictions
  • youtube-script-master - Validate topics before scripting
  • content-repurposer - Know which content to repurpose

Workflow:

Idea Generation → Ensemble Score → [High Score?] → Create Content
                         ↓
                   [Low Score?] → Refine or Skip

MODELS USED

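| Model | Provider | Cost | Notes |
|-------|----------|------|-------|
| Claude Sonnet | Anthropic | Subscription | Your primary |
| GPT-4o | OpenAI | API | Strong analysis |
| Gemini Pro | Google | FREE | Good for fact-checking |
| Grok | xAI | API | Twitter trend awareness |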

Minimum required: 2 models (Claude + one other)

Recommended: 3+ models for robust consensus


DEPENDENCIES

anthropic>=0.18.0
openai>=1.0.0           # For GPT-4o
google-generativeai>=0.3.0  # For Gemini
python-dotenv>=1.0.0
rich>=13.0.0

API KEYS NEEDED

| Key | Purpose | Status |
|-----|---------|--------|
| ANTHROPIC_API_KEY | Claude | Already have |
| OPENAI_API_KEY | GPT-4o | Already have |
| GOOGLE_API_KEY | Gemini | Already have |
| XAI_API_KEY | Grok (optional) | Already have |
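
A start-up check along these lines could decide which models to call and enforce the documented minimum of Claude plus one other model. It is a sketch only: the short name `grok` and the function `available_models` are assumptions, and a real script would likely load a `.env` file with python-dotenv before reading the environment.

```python
import os
from typing import List, Mapping

# Short model names follow the --models example; "grok" is an assumed name.
KEY_FOR_MODEL = {
    "claude": "ANTHROPIC_API_KEY",
    "gpt4o": "OPENAI_API_KEY",
    "gemini": "GOOGLE_API_KEY",
    "grok": "XAI_API_KEY",
}


def available_models(env: Mapping[str, str] = os.environ) -> List[str]:
    """List models whose API key is set; require Claude plus at least one other."""
    models = [model for model, key in KEY_FOR_MODEL.items() if env.get(key)]
    if "claude" not in models or len(models) < 2:
        raise RuntimeError("Need ANTHROPIC_API_KEY plus at least one other provider key")
    return models


# Passing a plain dict keeps the example independent of the real environment.
print(available_models({"ANTHROPIC_API_KEY": "x", "GOOGLE_API_KEY": "y"}))
# ['claude', 'gemini']
```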

BATCH SCORING

For scoring multiple ideas at once:

python scripts/score_content.py --batch \
    --ideas "GLP-1 for heart failure" \
            "Statin myth-busting" \
            "CAC scoring guide" \
            "Why LDL matters" \
            "Exercise for heart health"

Output:

| Rank | Idea | Score | Verdict |
|------|------|-------|---------|
| 1 | Statin myth-busting | 49/60 | 🟡 WORTH PURSUING |
| 2 | GLP-1 for heart failure | 45/60 | 🟡 WORTH PURSUING |
| 3 | CAC scoring guide | 42/60 | 🟡 WORTH PURSUING |
| 4 | Why LDL matters | 38/60 | 🟠 NEEDS REFINEMENT |
| 5 | Exercise for heart health | 35/60 | 🟠 NEEDS REFINEMENT |
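
The ranking step amounts to sorting ideas by consensus score and rendering a table. A minimal sketch, assuming the scores have already been computed; `rank_ideas` is a hypothetical helper and the Verdict column is left out for brevity.

```python
from typing import Dict


def rank_ideas(scores: Dict[str, int]) -> str:
    """Render ideas as a pipe table, highest consensus score first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    lines = ["| Rank | Idea | Score |", "|------|------|-------|"]
    for rank, (idea, score) in enumerate(ranked, start=1):
        lines.append(f"| {rank} | {idea} | {score}/60 |")
    return "\n".join(lines)


# Scores taken from the batch output above.
print(rank_ideas({
    "GLP-1 for heart failure": 45,
    "Statin myth-busting": 49,
    "CAC scoring guide": 42,
}))
```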

NOTES

  • Speed: ~30 seconds for single idea (parallel API calls)
  • Cost: Minimal - short prompts to each model
  • Reliability: Consensus typically more accurate than single model
  • When to ignore: If YOU have strong conviction, trust your expertise

This skill helps you invest your time in content that's more likely to succeed.

Install

Requires askill CLI v1.0+


Metadata

License: unknown
Version: -
Updated: 1/3/2026
Publisher: drshailesh88

Tags

api, github-actions, llm