mit-exam-generator

Generate rigorous MIT PhD-level qualifying examinations from Markdown/Obsidian notes. This skill should be used when users request quiz creation, exam generation, assessment materials, practice questions, or study guides. Triggers on "create quiz", "generate exam", "make practice questions", "assessment", "test me on", or any request for educational testing materials from source content.

143 stars
2.9k downloads
Updated 2/22/2026

Package Files

SKILL.md

MIT PhD Qualifying Exam Generator

Generate rigorous academic assessments from structured Markdown content.

What This Skill Does

  • Generates 200-question PhD qualifying exams from Markdown/Obsidian notes
  • Scales question count proportionally for sparse content
  • Auto-detects difficulty from content complexity
  • Merges multiple source documents with weighted question distribution
  • Validates answer distribution, difficulty spread, and source coverage

What This Skill Does NOT Do

  • Process PDFs, images, or non-Markdown formats
  • Quote source text verbatim in answer explanations (protects exam integrity)
  • Create exams from external web content
  • Provide an answer key before the full exam has been generated

Required Clarifications

Before generating, clarify with user:

| Question | Options | Default |
|----|-----|---------|
| Multi-doc strategy | Merge thematically / Separate sections per source | Merge thematically |
| Difficulty emphasis | Balanced / Favor higher levels / Favor foundational | Balanced |
| Include timing guidance | Yes (with per-section time) / No | Yes |

Optional Clarifications

Ask only if relevant:

  • Custom question count override?
  • Specific sections to emphasize or exclude?
  • Target audience adjustment (undergrad vs PhD)?

If User Doesn't Respond

Use defaults and note assumptions in exam header:

**Assumptions:** Merged thematically, balanced difficulty, standard timing

Before Implementation

| Source | Gather |
|----|-----|
| Source Files | Read all specified Markdown files completely |
| Content Depth | Assess complexity for difficulty calibration |
| Key Concepts | Extract testable facts, definitions, relationships |
| Section Structure | Map headings for source references |

Exam Specifications

| Parameter | Standard | Scaled (Sparse) |
|----|-----|---------|
| Questions | 200 | Min 25, proportional to content |
| Duration | 180 min | 15 min per 25 questions |
| Points | 1 per question | Same |
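
The scaled-duration rule reduces to a one-line formula. A minimal sketch, assuming the 15-min-per-25 rule applies only below the full 200-question exam (the function name is illustrative, not part of the skill files):

```python
import math

def exam_duration_minutes(question_count: int) -> int:
    """180 min for a full 200-question exam; otherwise 15 minutes
    per block of 25 questions, rounded up to a whole block."""
    if question_count >= 200:
        return 180
    return 15 * math.ceil(question_count / 25)
```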

Grading Scale

| Grade | % | Classification |
|----|-----|---------|
| A+ | 95-100 | Exceptional - PhD qualifying |
| A | 90-94.99 | Strong mastery |
| B+ | 85-89.99 | Good foundation |
| B | 80-84.99 | Satisfactory |
| C | 70-79.99 | Marginal pass |
| F | <70 | Fail - Retake required |
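
In code, the scale is a simple threshold lookup; this helper is a sketch for illustration, not part of the skill's reference files:

```python
def letter_grade(percent: float) -> str:
    """Map a percentage score to the grading scale above."""
    for cutoff, grade in [(95, "A+"), (90, "A"), (85, "B+"),
                          (80, "B"), (70, "C")]:
        if percent >= cutoff:
            return grade
    return "F"  # below 70: retake required
```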

Generation Workflow

1. ANALYZE
   └── Read source files → Extract concepts → Map sections
   └── Calculate: content_density = concepts / sections

2. CALIBRATE
   └── question_count = min(200, concepts * 2)
   └── difficulty_profile = analyze_complexity(content)

3. DISTRIBUTE
   └── Allocate questions by type (see references/question-patterns.md)
   └── Allocate by Bloom's level (see references/bloom-taxonomy.md)
   └── Weight by source document size (multi-doc)

4. GENERATE
   └── Create questions following type patterns
   └── Ensure distractors are plausible (each ~70-90% correct but wrong on a critical detail)
   └── Track source section for each question

5. VALIDATE
   └── Run all checks (see references/validation-rules.md)
   └── Fix any failures before delivery

6. OUTPUT
   └── Save to exam-[source-name].md alongside source
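
The ANALYZE/CALIBRATE arithmetic in runnable form. This is a sketch under the assumption that concept extraction happens upstream: `concepts` and `section_count` stand in for the ANALYZE outputs, and difficulty profiling is omitted.

```python
def calibrate(concepts: list[str], section_count: int) -> tuple[float, int]:
    """CALIBRATE step: derive content density and the target
    question count, capped at the 200-question full exam."""
    content_density = len(concepts) / max(section_count, 1)
    question_count = min(200, len(concepts) * 2)
    return content_density, question_count
```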

Question Type Distribution

| Type | % | Purpose |
|----|-----|---------|
| Precision Recall | 10 | Exact values, definitions |
| Conceptual Distinction | 15 | Paired/contrasting concepts |
| Decision Matrix | 12.5 | Multi-criteria scenarios |
| Architecture Analysis | 12.5 | System components, flows |
| Economic/Quantitative | 10 | Calculations, comparisons |
| Specification Design | 10 | Framework application |
| Critical Evaluation | 12.5 | Trade-offs, judgments |
| Strategic Synthesis | 10 | Multi-concept integration |
| Research Extension | 7.5 | Novel scenario extrapolation |

See references/question-patterns.md for templates and examples.
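
Turning these percentages into whole-question counts needs a rounding rule the skill does not spell out; largest-remainder rounding is one reasonable choice. The function below is an assumption about implementation, and the same routine serves the Bloom's distribution in the next section:

```python
def allocate(total: int, percents: dict[str, float]) -> dict[str, int]:
    """Split `total` questions by percentage, handing any rounding
    shortfall to the entries with the largest fractional remainders
    so the counts sum exactly to `total`."""
    raw = {k: total * p / 100 for k, p in percents.items()}
    counts = {k: int(v) for k, v in raw.items()}
    by_remainder = sorted(raw, key=lambda k: raw[k] - counts[k], reverse=True)
    for k in by_remainder[: total - sum(counts.values())]:
        counts[k] += 1
    return counts
```

For a scaled 50-question exam, for example, the 12.5% types come out to 6 whole questions (6.25 truncated), with the rounding slack going to the types whose fractional remainders are largest.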


Bloom's Taxonomy Distribution

| Level | % | Question Characteristics |
|----|-----|---------|
| Remember/Understand | 25 | Recall facts, explain concepts |
| Apply | 20 | Use in new situations |
| Analyze | 25 | Break down, compare, contrast |
| Evaluate | 18 | Judge, critique, justify |
| Create/Synthesize | 12 | Design, propose, integrate |

See references/bloom-taxonomy.md for level indicators.


Answer Construction Rules

  1. Options: Never "All/None of the above"
  2. Correct Answer: Exactly one clearly correct option
  3. Distractors: Plausible but fail on a critical detail (each roughly 70-90% correct)
  4. Distribution: Roughly equal A:B:C:D across the exam
  5. Sequences: No more than 3 consecutive same-letter answers
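
Rules 4 and 5 are mechanically checkable. A minimal sketch, assuming the answer key is available as a list of letters; the 20-30% band is taken from the Quick Checklist below, and the function name is mine:

```python
def check_answer_letters(answers: list[str]) -> list[str]:
    """Flag violations of rules 4-5: per-letter share outside
    20-30%, or more than 3 identical answers in a row."""
    problems = []
    for letter in "ABCD":
        share = answers.count(letter) / len(answers)
        if not 0.20 <= share <= 0.30:
            problems.append(f"{letter}: {share:.0%} of answers")
    run = 1
    for prev, cur in zip(answers, answers[1:]):
        run = run + 1 if cur == prev else 1
        if run == 4:  # report once, at the point the rule breaks
            problems.append(f"4+ consecutive '{cur}' answers")
    return problems
```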

Multi-Document Handling

When multiple source files provided:

weight[doc] = word_count[doc] / total_word_count
questions[doc] = round(total_questions * weight[doc])

Create distinct sections per source or merge thematically (user preference).
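
As a sketch, the weighting formula in runnable form; words are counted as whitespace tokens, and any off-by-one from per-document rounding is left unresolved since the skill does not say how to reconcile it:

```python
def questions_per_doc(docs: dict[str, str], total_questions: int) -> dict[str, int]:
    """Allocate questions to each source document in proportion
    to its share of the total word count."""
    words = {name: len(text.split()) for name, text in docs.items()}
    total_words = sum(words.values())
    return {name: round(total_questions * n / total_words)
            for name, n in words.items()}
```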


Output Format

# [Exam Title]
## MIT PhD Qualifying Examination

**Source:** [file(s)]
**Questions:** [N]
**Duration:** [X] minutes
**Generated:** [date]

---

### PART A: [Topic] ([X] Questions)

**Q1.** [Question stem]
A) [Option]
B) [Option]
C) [Option]
D) [Option]

[Continue all questions...]

---

## ANSWER KEY

| Q# | Ans | Section | Difficulty | Bloom |
|----|-----|---------|------------|-------|
| 1 | C | Part A | Medium | Apply |

---

## EXPLANATIONS

### Q1
**Correct: C**
[Explanation with section reference - NO direct quotes]
Section: [Heading from source]

Scaling Algorithm

def calculate_questions(content):
    """Scale exam size to the amount of testable content.
    `extract_testable_concepts` is the concept-extraction pass
    from step 1 of the generation workflow."""
    concepts = extract_testable_concepts(content)

    if len(concepts) >= 100:
        return 200  # Full exam
    elif len(concepts) >= 50:
        return 100  # Half exam
    elif len(concepts) >= 25:
        return 50   # Quarter exam
    else:
        return max(25, len(concepts))  # Minimum viable

Edge Case Handling

| Situation | Action |
|----|-----|
| Conflicting info in source | Flag in exam notes; create question testing the distinction |
| Ambiguous concepts | Skip or ask user for clarification before generating |
| Too few testable facts | Scale down; warn user if <25 questions possible |
| Highly technical jargon | Include definition in question stem if needed |
| Multiple valid interpretations | Avoid or phrase as "According to [source]..." |
| Source has errors | Do not correct; test what source states (note discrepancy) |

Validation Pipeline

Run ALL checks before delivery. See references/validation-rules.md.

Quick Checklist

  • Question count matches calculated target
  • Each question has exactly 4 options (A-D)
  • Answer distribution within 20-30% per letter
  • No >3 consecutive same-letter answers
  • All Bloom levels represented per distribution
  • All question types represented per distribution
  • Every question has section reference
  • No direct quotes in explanations
  • Difficulty distribution matches content complexity
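
Several of these items can be verified with a scripted pass over the generated file. A sketch of the four-options check, assuming the exam follows the Output Format shown earlier (the regexes encode that assumption):

```python
import re

def check_option_counts(exam_md: str) -> list[str]:
    """Verify each question block has exactly options A) through D)."""
    problems = []
    for block in re.split(r"\n(?=\*\*Q\d+\.\*\*)", exam_md):
        m = re.match(r"\*\*Q(\d+)\.\*\*", block)
        if not m:
            continue  # preamble, section headers, answer key, etc.
        options = re.findall(r"^([A-D])\)", block, flags=re.M)
        if options != ["A", "B", "C", "D"]:
            problems.append(f"Q{m.group(1)}: options {options}")
    return problems
```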

Reference Files

| File | Purpose |
|----|-----|
| references/question-patterns.md | Templates for each question type |
| references/bloom-taxonomy.md | Cognitive level classification |
| references/validation-rules.md | Quality validation criteria |

Install

Download ZIP
Requires askill CLI v1.0+

AI Quality Score

77/100 (analyzed 2/24/2026)

Comprehensive and well-structured skill for generating MIT PhD-level exams from Markdown content. Strong on completeness, clarity, and actionability with detailed workflows, specifications, and validation rules. Main weaknesses: very low reusability due to extreme specialization (MIT PhD exams only) and completely mismatched tags (ci-cd/github-actions/testing for an exam generator). Relies on external reference files not included. Despite good structural quality, the mismatch between tags and content suggests possible auto-generation or copy-paste error.


Metadata

License: unknown
Version: -
Updated: 2/22/2026
Publisher: panaversity

Tags

ci-cd, github-actions, testing