firecrawl-agent
firecrawl/cli
AI-powered autonomous data extraction from complex websites, returning structured JSON.
What is firecrawl-agent?
The firecrawl-agent skill uses AI to navigate multi-page websites and extract structured data automatically. Use it when you need JSON-formatted data such as pricing tiers, product listings, or directory entries from complex sites, especially when manual scraping would require navigating many pages.
- Navigates complex multi-page websites autonomously to find and extract data
- Returns structured JSON output matching a provided schema or freeform data
- Handles multi-page extraction that would be tedious to scrape manually
- Accepts JSON schemas to enforce predictable, structured output
- Supports credit limits to control spending on agent runs
- Writes results to a file with --output, or prints them inline when run with --wait
How to install firecrawl-agent
npx skills add null --skill firecrawl-agent
- Firecrawl CLI installed and configured with API credentials
- Node.js and npm (to run npx commands)
How to use firecrawl-agent
1. Run the firecrawl agent command with a natural language instruction describing what to extract
2. Optionally provide a JSON schema using --schema or --schema-file for structured output
3. Specify starting URLs with --urls if needed, or let the agent discover them
4. Use the --wait flag to block until the agent completes (a run typically takes 2-5 minutes)
5. Specify --output to save results to a file, or omit it to print to stdout
6. Review the returned JSON data in your specified output file or console
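The steps above can be sketched as a small shell wrapper. This is only an illustration: the function name run_agent, the example URL, and the output path are assumptions, and only flags named in the steps are used.

```shell
# Hypothetical wrapper around the steps above; run_agent is an illustrative name.
# Usage: run_agent "<instruction>" <output-path> [start-url]
run_agent() {
  instruction=$1
  output=$2
  start_url=${3:-}

  # Create the output directory up front so the result file has somewhere to go.
  mkdir -p "$(dirname "$output")"

  if [ -n "$start_url" ]; then
    # A starting URL was given: pass it with --urls.
    firecrawl agent "$instruction" --urls "$start_url" --wait -o "$output"
  else
    # No URL given: let the agent discover the pages itself.
    firecrawl agent "$instruction" --wait -o "$output"
  fi
}

# Example call (placeholder URL):
# run_agent "extract all pricing tiers" .firecrawl/pricing.json "https://example.com/pricing"
```

Wrapping the call in a function keeps --wait and -o from being forgotten, which matters because a run without --wait returns only a job ID.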
Use cases
- Extract all product listings and pricing from an e-commerce site with multiple pages
- Pull pricing tiers and feature comparisons from a SaaS pricing page
- Gather directory entries or contact information across paginated results
- Extract structured product data (name, price, description) as JSON with a defined schema
- Collect feature lists or specifications from multi-page documentation sites
- Data analysts needing structured data from complex websites
- Developers building integrations that require website data as JSON
- Business intelligence teams extracting competitive pricing or product information
- Anyone needing multi-page web data extraction without writing custom scrapers
firecrawl-agent FAQ
Use firecrawl-agent for complex multi-page sites where the AI needs to navigate and figure out where data lives. Use firecrawl-scrape for simple single-page extraction—it's faster and cheaper.
Agent runs typically take 2–5 minutes to complete. Always use --wait to block and get results inline.
With a JSON schema, the agent returns predictable, structured output matching your schema. Without one, the agent returns freeform extracted data.
Use --max-credits to set a spending limit for the agent run. Agent runs consume more credits than simple scrapes, so use firecrawl-scrape for single-page extraction when possible.
Yes, use --urls to provide starting URLs. The agent will navigate from there to find and extract the data you request.
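To illustrate the schema answer above, the following sketch writes a schema to a file so it can be passed with --schema-file. The fields mirror the quick-start example in SKILL.md; the function name write_product_schema and the path .firecrawl/product.schema.json are assumptions made for this example.

```shell
# Write an illustrative JSON schema to disk (function name and path are placeholders).
write_product_schema() {
  schema_path=$1
  mkdir -p "$(dirname "$schema_path")"
  cat > "$schema_path" <<'EOF'
{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "price": { "type": "number" }
  }
}
EOF
}

write_product_schema .firecrawl/product.schema.json

# The file could then be handed to the agent, for example:
# firecrawl agent "extract products" --schema-file .firecrawl/product.schema.json --wait -o .firecrawl/products.json
```

Keeping the schema in a file avoids shell-quoting problems with inline JSON and lets the same schema be reused across runs.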
Full instructions (SKILL.md)
Source of truth, from firecrawl/cli.
name: firecrawl-agent
description: |
  AI-powered autonomous data extraction that navigates complex sites and returns structured JSON. Use this skill when the user wants structured data from websites, needs to extract pricing tiers, product listings, directory entries, or any data as JSON with a schema. Triggers on "extract structured data", "get all the products", "pull pricing info", "extract as JSON", or when the user provides a JSON schema for website data. More powerful than simple scraping for multi-page structured extraction.
allowed-tools:
  - Bash(firecrawl *)
  - Bash(npx firecrawl *)
firecrawl agent
AI-powered autonomous extraction. The agent navigates sites and extracts structured data (takes 2-5 minutes).
When to use
- You need structured data from complex multi-page sites
- Manual scraping would require navigating many pages
- You want the AI to figure out where the data lives
Quick start
# Extract structured data
firecrawl agent "extract all pricing tiers" --wait -o .firecrawl/pricing.json
# With a JSON schema for structured output
firecrawl agent "extract products" --schema '{"type":"object","properties":{"name":{"type":"string"},"price":{"type":"number"}}}' --wait -o .firecrawl/products.json
# Focus on specific pages
firecrawl agent "get feature list" --urls "<url>" --wait -o .firecrawl/features.json
Options
| Option | Description |
|---|---|
| --urls <urls> | Starting URLs for the agent |
| --model <model> | Model to use: spark-1-mini or spark-1-pro |
| --schema <json> | JSON schema for structured output |
| --schema-file <path> | Path to JSON schema file |
| --max-credits <n> | Credit limit for this agent run |
| --wait | Wait for agent to complete |
| --pretty | Pretty print JSON output |
| -o, --output <path> | Output file path |
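As a rough sketch of how several of these options might be combined, the helper below adds --model and --max-credits only when a value is supplied. The function name agent_with_limits and the variables AGENT_MODEL and AGENT_MAX_CREDITS are hypothetical names invented for this example, not settings the CLI reads, and the credit value in the sample call is arbitrary.

```shell
# Hypothetical helper: adds optional flags only when the matching variable is set.
# AGENT_MODEL and AGENT_MAX_CREDITS are illustrative variable names, not CLI settings.
agent_with_limits() {
  instruction=$1
  output=$2

  # Start with the flags that are always passed.
  set -- agent "$instruction" --wait --pretty -o "$output"

  # --model accepts spark-1-mini or spark-1-pro according to the options table.
  if [ -n "${AGENT_MODEL:-}" ]; then
    set -- "$@" --model "$AGENT_MODEL"
  fi

  # --max-credits sets the credit limit for this single run.
  if [ -n "${AGENT_MAX_CREDITS:-}" ]; then
    set -- "$@" --max-credits "$AGENT_MAX_CREDITS"
  fi

  firecrawl "$@"
}

# Example call (the credit value is arbitrary):
# AGENT_MODEL=spark-1-mini AGENT_MAX_CREDITS=50 agent_with_limits "extract products" .firecrawl/products.json
```

Building the argument list with set -- keeps each argument intact even when the instruction contains spaces.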
Tips
- Always use --wait to get results inline. Without it, returns a job ID.
- Use --schema for predictable, structured output; otherwise the agent returns freeform data.
- Agent runs consume more credits than simple scrapes. Use --max-credits to cap spending.
- For simple single-page extraction, prefer scrape; it's faster and cheaper.
See also
- firecrawl-scrape — simpler single-page extraction
- firecrawl-interact — scrape + interact for manual page interaction (more control)
- firecrawl-crawl — bulk extraction without AI