PluginBench
Audit score 70

firecrawl-parse

firecrawl/cli

Convert local files (PDF, DOCX, XLSX, HTML, etc.) to clean markdown with optional AI summaries.

What is firecrawl-parse?

Extracts and converts local documents into well-formatted markdown saved to disk. Use this when you have a file on your computer that needs parsing, summarizing, or content extraction—ideal for PDFs, Word docs, spreadsheets, and HTML files.

  • Parse local files (PDF, DOCX, DOC, ODT, RTF, XLSX, XLS, HTML) into clean markdown
  • Generate AI-powered summaries of document content
  • Answer specific questions about parsed file content
  • Save output to disk to avoid context bloat
  • Support for files up to 50 MB

How to install firecrawl-parse

npx skills add firecrawl/cli --skill firecrawl-parse
Prerequisites
  • Firecrawl CLI installed (via `npx skills add firecrawl/cli --skill firecrawl-parse`)
  • Local file path to parse
  • `.firecrawl/` directory created (recommended: `mkdir -p .firecrawl`)
  • Add `.firecrawl/` to `.gitignore` to avoid committing large parsed files
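
The two housekeeping prerequisites can be made idempotent so they are safe to re-run. This is a minimal sketch, assuming it runs from the project root; `ensure_firecrawl_dir` is a hypothetical helper name, not part of the Firecrawl CLI.

```shell
# Create the output directory and make sure git ignores it.
# ensure_firecrawl_dir is an illustrative name, not a Firecrawl command.
ensure_firecrawl_dir() {
  mkdir -p .firecrawl || return 1
  touch .gitignore
  # If the last line lacks a trailing newline, add one before appending.
  if [ -n "$(tail -c 1 .gitignore)" ]; then
    echo >> .gitignore
  fi
  # -x matches the whole line, -F treats the pattern literally.
  grep -qxF '.firecrawl/' .gitignore || echo '.firecrawl/' >> .gitignore
}
```

Calling `ensure_firecrawl_dir` twice leaves a single `.firecrawl/` entry in `.gitignore`.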

How to use firecrawl-parse

  1. Create the output directory: `mkdir -p .firecrawl`
  2. Run the parse command with the `-o` flag to save to disk: `firecrawl parse ./document.pdf -o .firecrawl/output.md`
  3. For an AI summary, add the `-S` flag: `firecrawl parse ./document.pdf -S -o .firecrawl/summary.md`
  4. To answer a question, use the `-Q` flag: `firecrawl parse ./document.pdf -Q "Your question here" -o .firecrawl/qa.md`
  5. Read the output incrementally (e.g., `head`, `grep`, `rg`) rather than loading the entire file into context
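
The steps above can be strung together in one helper. This is a sketch rather than an official workflow: `parse_doc` and the way it derives the output name from the input name are assumptions, and only the `firecrawl parse <file> -o <path>` invocation comes from the steps themselves.

```shell
# Parse a local file to .firecrawl/<basename>.md and preview the first lines.
# parse_doc is a hypothetical helper name, not a Firecrawl command.
parse_doc() {
  local src name out
  src="$1"
  name=$(basename -- "$src")
  out=".firecrawl/${name%.*}.md"     # ./report.pdf -> .firecrawl/report.md

  mkdir -p .firecrawl || return 1
  firecrawl parse "$src" -o "$out" || return 1

  # Preview only; avoid loading the whole file into context.
  head -n 20 "$out"
}
```

For example, `parse_doc "./My Doc.pdf"` would write `.firecrawl/My Doc.md` and print its first 20 lines.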

Use cases

Good for
  • User uploads a PDF and asks 'what does this say?'—parse it and summarize
  • Extract text from a DOCX contract and answer 'what are the payment terms?'
  • Convert an XLSX spreadsheet to markdown for analysis
  • Parse an HTML document and save structured output for downstream processing
  • Batch process multiple documents and check credit usage
Who it's for
  • Developers processing user-uploaded documents
  • Analysts extracting insights from PDFs and reports
  • Teams automating document workflows
  • Anyone needing to convert local files to structured markdown

firecrawl-parse FAQ

When should I use firecrawl-parse vs. firecrawl-scrape?

Use parse for local files on disk (PDF, DOCX, etc.). Use scrape for URLs and web content.
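
This routing rule can be written as a small check on the input. The sketch below assumes that anything starting with `http://` or `https://` counts as a URL and everything else is a local path; `pick_command` is a hypothetical helper that only prints which skill applies and does not call the CLI.

```shell
# Print "scrape" for URLs and "parse" for everything else.
# pick_command is an illustrative name, not part of the Firecrawl CLI.
pick_command() {
  case $1 in
    http://*|https://*) echo scrape ;;
    *)                  echo parse ;;
  esac
}
```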

What file formats are supported?

PDF, DOCX, DOC, ODT, RTF, XLSX, XLS, HTML, HTM, and XHTML.
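
A pre-flight check against this list can catch unsupported files before any credits are spent. The sketch assumes the file extension is a reliable indicator of the format, which the source does not state; `is_supported` is a hypothetical helper name.

```shell
# Return 0 if the file extension is in the supported list (case-insensitive).
# is_supported is an illustrative name, not a Firecrawl command.
is_supported() {
  local name ext
  name=$(basename -- "$1")
  case $name in
    *.*) ext=${name##*.} ;;
    *)   return 1 ;;                 # no extension at all
  esac
  ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')
  case $ext in
    pdf|docx|doc|odt|rtf|xlsx|xls|html|htm|xhtml) return 0 ;;
    *) return 1 ;;
  esac
}
```

A caller might guard the parse step with `is_supported "$file" || echo "unsupported: $file" >&2`.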

Why save to `.firecrawl/` instead of stdout?

Parsed documents can be hundreds of KB. Saving to disk prevents bloating your context window.

How much does parsing cost?

Approximately 1 credit per PDF page; HTML files cost 1 credit flat. Check balance with `firecrawl credit-usage`.
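
As a worked example of this arithmetic, a batch of PDFs totalling 42 pages plus 5 HTML files comes to roughly 42 + 5 = 47 credits. The helper below is a sketch; `estimate_credits` is a hypothetical name, and rates for other formats (DOCX, XLSX, and so on) are not given in the source, so they are left out.

```shell
# Rough estimate: ~1 credit per PDF page plus 1 credit per HTML file.
# Usage: estimate_credits <total_pdf_pages> <html_file_count>
estimate_credits() {
  echo $(( $1 + $2 ))
}
```

Treat the result as an approximation and confirm the real balance with `firecrawl credit-usage`.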

What is the file size limit?

Maximum 50 MB per file.
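
The limit can be checked locally before uploading. This sketch assumes "50 MB" means 50 × 1024 × 1024 bytes; the source does not say whether the limit is decimal or binary, so adjust the constant if needed. `within_size_limit` is a hypothetical helper name.

```shell
# Return 0 if the file exists and is no larger than the assumed 50 MB limit.
# Assumption: 50 MB is read as 50 * 1024 * 1024 bytes.
within_size_limit() {
  local size limit
  [ -f "$1" ] || return 1
  limit=$((50 * 1024 * 1024))
  size=$(($(wc -c < "$1")))          # arithmetic expansion trims wc's padding
  [ "$size" -le "$limit" ]
}
```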

Full instructions (SKILL.md)

Source of truth, from firecrawl/cli.


name: firecrawl-parse
description: |
  Efficiently extract and convert the contents of any local file—such as PDF, DOCX, DOC, ODT, RTF, XLSX, XLS, or HTML—into clean, well-formatted markdown saved to disk. Use this skill whenever the user requests to parse, read, or extract information from a file on their computer, including phrases like “parse this PDF”, “convert this document”, “read this file”, “extract text from”, or when a local file path (not a URL) is provided. This skill offers advanced options like generating AI-powered summaries and answering questions based on the file's content. Prefer this tool over scrape when handling local files to deliver precise, structured outputs for downstream tasks.
allowed-tools:
  - Bash(firecrawl *)
  - Bash(npx firecrawl *)

firecrawl parse

Turn a local document into clean markdown on disk. Supports PDF, DOCX, DOC, ODT, RTF, XLSX, XLS, HTML/HTM/XHTML.

When to use

  • You have a file on disk (not a URL) and want its text as markdown
  • User drops a PDF/DOCX and asks what it says, or to summarize it
  • Use scrape instead when the source is a URL

Quick start

Always save to .firecrawl/ with -o — parsed docs can be hundreds of KB and blow up context if streamed to stdout. Add .firecrawl/ to .gitignore.

mkdir -p .firecrawl

# File → markdown
firecrawl parse ./paper.pdf -o .firecrawl/paper.md

# AI summary
firecrawl parse ./paper.pdf -S -o .firecrawl/paper-summary.md

# Ask a question about the doc
firecrawl parse ./paper.pdf -Q "What are the main conclusions?" \
  -o .firecrawl/paper-qa.md

Then inspect the output with head, grep, rg, or similar tools, or read the file incrementally. Don't load the whole thing at once.
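
Incremental reading can be as simple as listing the headings first and then pulling one slice at a time. The sketch below assumes the parsed markdown uses `#`-style headings; `outline` and `read_chunk` are hypothetical helper names built on standard `grep` and `sed`.

```shell
# List markdown headings with line numbers, to decide which part to read.
outline() {
  grep -n '^#' "$1"
}

# Print <count> lines starting at line <start> (1-based).
# Usage: read_chunk <file> <start> <count>
read_chunk() {
  local file="$1" start="$2" count="$3"
  sed -n "${start},$((start + count - 1))p" "$file"
}
```

For example, `outline .firecrawl/paper.md` shows where each section starts, and `read_chunk .firecrawl/paper.md 120 40` reads lines 120 to 159 only.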

Options

| Option | Description |
| --- | --- |
| `-S, --summary` | AI-generated summary |
| `-Q, --query <prompt>` | Ask a question about the parsed content |
| `-o, --output <path>` | Output file path (always use this) |
| `-f, --format <fmt>` | `markdown` (default), `html`, `summary` |
| `--timeout <ms>` | Timeout for the parse job |
| `--timing` | Show request duration |
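
The less common flags listed above can be combined with `-o` in the same call. This is a sketch: `parse_as_html` is a hypothetical wrapper, the 60000 ms timeout is an arbitrary illustrative value, and it assumes the flags can be freely combined, which the source does not confirm.

```shell
# Parse to HTML instead of markdown, with a timeout and timing output.
# parse_as_html is an illustrative name; 60000 ms is an example value.
parse_as_html() {
  local src="$1" out="$2"
  firecrawl parse "$src" --format html --timeout 60000 --timing -o "$out"
}
```

A call such as `parse_as_html ./paper.pdf .firecrawl/paper.html` would keep the HTML output on disk alongside any markdown output.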

Tips

  • Quote paths with spaces: `firecrawl parse "./My Doc.pdf" -o .firecrawl/mydoc.md`
  • Max upload size: 50 MB per file.
  • Credits: ~1 per PDF page; HTML is 1 flat.
  • Check .firecrawl/ before re-parsing the same file.
  • To check your credit balance (recommended for batch processing and similar workflows), use the firecrawl credit-usage command.
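
The last two tips combine naturally in a batch loop: skip anything already present in `.firecrawl/`, then check the balance once at the end. This is a sketch; `parse_all` is a hypothetical helper, it handles only `*.pdf` for brevity, and it treats any non-empty output file as up to date, so a changed source file would not be re-parsed.

```shell
# Parse every PDF in a directory, skipping files that were already parsed.
# parse_all is an illustrative name, not a Firecrawl command.
parse_all() {
  local dir="$1" src out
  mkdir -p .firecrawl || return 1
  for src in "$dir"/*.pdf; do
    [ -e "$src" ] || continue                    # glob matched nothing
    out=".firecrawl/$(basename -- "${src%.*}").md"
    if [ -s "$out" ]; then
      echo "skip: $out already exists" >&2
      continue
    fi
    firecrawl parse "$src" -o "$out" || echo "failed: $src" >&2
  done
  firecrawl credit-usage                         # check the balance afterwards
}
```

Running `parse_all ./docs` a second time should spend no further parse credits, because every output file already exists.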

See also