Convert DOCX → Markdown
Convert the Word document at $1 to clean markdown format optimized for LLM context.
Steps
- Install dependency: Use the `dstoic:install-dependency` skill to ensure `markitdown` is installed
- Parse arguments: Input file `$1` (required), output dir `$2` (optional, defaults to `./converted/`), resolved to `$OUTPUT_DIR`
- Prepare output dir: Create `$OUTPUT_DIR` if it doesn't exist
- Convert: `source .venv/bin/activate && python -m markitdown "$1" -o "$OUTPUT_DIR/output.md"`
- Report: Output location, file sizes (original vs converted), structure summary (headings, tables, lists), any warnings
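The steps above could be wired together roughly as follows. This is a minimal sketch, not the skill's actual implementation: the helper names (`resolve_output_dir`, `output_path`, `file_size`, `convert_docx`) are hypothetical, it assumes a virtualenv at `.venv` that already has `markitdown` installed, and the heading count is only a rough line-prefix match.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of the skill.

# Resolve the output directory: the given value, or ./converted/ when empty.
resolve_output_dir() {
  printf '%s\n' "${1:-./converted/}"
}

# Build the output file path, tolerating a trailing slash on the directory.
output_path() {
  printf '%s/output.md\n' "${1%/}"
}

# File size in bytes (wc -c behaves the same on GNU and BSD systems).
file_size() {
  wc -c < "$1" | tr -d '[:space:]'
}

convert_docx() {
  local input="$1" output_dir output_file
  output_dir="$(resolve_output_dir "${2:-}")"
  output_file="$(output_path "$output_dir")"

  if [ ! -f "$input" ]; then
    echo "error: input file not found: $input" >&2
    return 1
  fi

  mkdir -p "$output_dir" || return 1

  # Assumption: .venv exists and contains markitdown.
  # The subshell keeps the activated environment from leaking into the caller.
  ( . .venv/bin/activate && python -m markitdown "$input" -o "$output_file" ) || return 1

  # Report: location, sizes, and a rough structure summary.
  echo "output:    $output_file"
  echo "original:  $(file_size "$input") bytes"
  echo "converted: $(file_size "$output_file") bytes"
  echo "headings:  $(grep -c '^#' "$output_file")"
}

# Run only when arguments are supplied, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
  convert_docx "$@"
fi
```

Saved under a hypothetical name such as `convert.sh`, it would be called as `bash convert.sh report.docx ./out`. Omitting the second argument falls back to `./converted/`.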
Goal
Extract clean markdown for LLM context — analysis, summarization, Q&A. NOT for round-tripping back to DOCX.
No content is sent to any LLM during conversion; extraction is purely script-based.
