askill
convert-to-md

convert-to-mdSafety --Repository

Batch convert documents (DOCX, PDF, XLSX, TXT, PPTX, MSG, DOC) to markdown, preserving tracked changes and comments.

1 stars
1.2k downloads
Updated 2/15/2026

Package Files

Loading files...
SKILL.md

Document to Markdown Conversion

Batch convert documents to markdown format, preserving tracked changes, comments, and other markup.

Usage

/convert-to-md [directory]

Supported Formats

FormatMethodNotes
DOCXpandoc --track-changes=allPreserves comments & tracked changes
PDFPyMuPDFText extraction
XLSXpandasConverts to markdown tables
TXTrenameDirect rename to .md
PPTXpandocSlide content to markdown
MSGextract-msgEmail metadata + body
DOCtextutilmacOS native (fallback)
DOTXpandocWord templates

Process

  1. Install dependencies (if needed):

    uv add pymupdf pandas openpyxl tabulate extract-msg
    
  2. Convert DOCX (preserves comments/edits):

    for f in *.docx; do
      pandoc --track-changes=all -f docx -t markdown -o "${f%.docx}.md" "$f" && rm "$f"
    done
    
  3. Convert PDF:

    import fitz
    from pathlib import Path
    for pdf in Path(".").glob("*.pdf"):
        doc = fitz.open(pdf)
        text = "\n\n".join(page.get_text() for page in doc)
        pdf.with_suffix(".md").write_text(text.strip())
        pdf.unlink()
    
  4. Convert XLSX to tables:

    import pandas as pd
    for xlsx in Path(".").glob("*.xlsx"):
        xls = pd.ExcelFile(xlsx)
        content = f"# {xlsx.stem}\n\n"
        for sheet in xls.sheet_names:
            df = pd.read_excel(xlsx, sheet_name=sheet)
            content += f"## {sheet}\n\n{df.to_markdown(index=False)}\n\n"
        xlsx.with_suffix(".md").write_text(content)
        xlsx.unlink()
    
  5. Convert TXT: for f in *.txt; do mv "$f" "${f%.txt}.md"; done

  6. Convert MSG:

    import extract_msg
    msg = extract_msg.Message("file.msg")
    content = f"# {msg.subject}\n\n**From:** {msg.sender}\n**Date:** {msg.date}\n\n{msg.body}"
    
  7. Clean up: Remove *:Zone.Identifier files (Windows metadata)

Behavior

  • Deletes original files after successful conversion
  • Skips files that already have a .md counterpart
  • Reports failures without stopping batch

Dependencies

  • pandoc (system): DOCX, PPTX, DOTX conversion
  • textutil (macOS): DOC fallback
  • pymupdf (Python): PDF text extraction
  • pandas, openpyxl, tabulate (Python): XLSX tables
  • extract-msg (Python): Outlook MSG files

Install

Download ZIP
Requires askill CLI v1.0+

AI Quality Score

AI review pending.

Metadata

Licenseunknown
Version1.0.0
Updated2/15/2026
Publishernicsuzor

Tags

No tags yet.