🔥 The API to search, scrape, and interact with the web for AI. Three integrated capabilities (Search, Scrape, Interact) exposed through one API. Open source under AGPL-3.0 and self-hostable via docker-compose, the engine also powers the firecrawl.dev cloud SaaS run by the same team.

Open-source web data API Capability
Installation · manual
Self-host (docker-compose) $ git clone https://github.com/firecrawl/firecrawl && cd firecrawl && docker compose up
Python SDK $ pip install firecrawl-py
Node.js SDK $ npm install @mendable/firecrawl-js

What it does

The infrastructure for “clean, LLM-ready data” from the live web is a real bottleneck for AI agents and RAG pipelines. General scrapers leave you to handle JavaScript rendering, complex markup, robots.txt, and multi-step interactions yourself, and the output rarely lands in a shape that an LLM can consume directly.

Firecrawl bundles that infrastructure into one API. Quoting firecrawl.dev: “the infrastructure layer that helps AI find, read, and act on the live web.” Output is returned as LLM-ready markdown or structured data from the start.

Key features: three integrated capabilities

  • Search: web search

    Run a query and get search results, with optional content extraction for each hit in the same call.

  • Scrape: page → clean data

    Extract a single URL into JSON, markdown, or branding formats. JavaScript rendering and complex markup are handled automatically.

  • Interact: page automation

    Automate clicks, typing, and navigation to reach content that static scraping cannot.

Additional endpoints include Agent (autonomous multi-source research), Crawl (multi-page extraction with depth and page limits), Map (discover indexed URLs on a site), and Batch Scrape (parallel processing of many URLs).
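As a rough illustration of the Scrape capability, the sketch below builds the JSON body for a single-page request without sending it. The `/v2/scrape` path and the `url` and `formats` field names are assumptions patterned on the documented `/v2/search` call, and `build_scrape_body` is a hypothetical helper; check the API reference before relying on any of them.

```python
import json
from urllib.parse import urlparse

# Assumed endpoint: same host and version prefix as the documented /v2/search call.
SCRAPE_URL = "https://api.firecrawl.dev/v2/scrape"


def build_scrape_body(url: str, formats=("markdown",)) -> bytes:
    """Serialize a Scrape request body for one page (hypothetical helper).

    The "url" and "formats" field names are assumptions to verify
    against the API reference.
    """
    parsed = urlparse(url)
    # Reject relative or scheme-less input before it reaches the API.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {url!r}")
    return json.dumps({"url": url, "formats": list(formats)}).encode("utf-8")
```

Validating the URL locally keeps obviously malformed input from consuming a request; the body would then be POSTed to `SCRAPE_URL` with the same headers as any other call.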

Cloud vs Open Source

| Aspect | Open Source (this repo) | Cloud (firecrawl.dev) |
| --- | --- | --- |
| Operator | You | Firecrawl team |
| License | AGPL-3.0 (SDKs / some UI = MIT) | SaaS terms |
| Extra features | Core engine | Additional cloud-only features (see README comparison) |
| Cost | Your infra cost | 1,000 credits/month free + paid plans |
| Data control | Full self-control | Routed through Firecrawl infrastructure |
| Best for | Strict data residency, cost or customization control | Fast start without infrastructure overhead |

SDKs

| Language | Install |
| --- | --- |
| Python | `pip install firecrawl-py` |
| Node.js | `npm install @mendable/firecrawl-js` |
| Java | JitPack via Gradle / Maven (`com.github.firecrawl:firecrawl-java-sdk:2.0`) |
| Elixir | `{:firecrawl, "~> 1.0"}` |
| Rust | `firecrawl = "2"` |

A community Go SDK is linked separately in the README.

Usage

Cloud (fastest start): generate an API key at firecrawl.dev and call the API directly.

curl -X POST 'https://api.firecrawl.dev/v2/search' \
  -H 'Authorization: Bearer fc-YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{"query": "firecrawl", "limit": 5}'
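The same search call can be sketched in Python with only the standard library. The endpoint, headers, and body fields are taken from the curl command above; the function names are hypothetical, and the response is returned as decoded JSON without assuming its shape.

```python
import json
import urllib.request

API_URL = "https://api.firecrawl.dev/v2/search"  # endpoint from the curl example


def build_search_request(api_key: str, query: str, limit: int = 5) -> urllib.request.Request:
    """Build (but do not send) the POST request that the curl command issues."""
    payload = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def search(api_key: str, query: str, limit: int = 5) -> dict:
    """Send the request and decode the JSON body; no response schema is assumed."""
    req = build_search_request(api_key, query, limit)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Splitting request construction from sending makes the request easy to inspect or unit-test without network access; `search("fc-YOUR_API_KEY", "firecrawl")` would perform the actual call.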

Self-host: use the docker-compose stack at the repo root.

git clone https://github.com/firecrawl/firecrawl
cd firecrawl
docker compose up

See SELF_HOST.md in the repo for environment setup and dependencies.

From Claude Code: use the Firecrawl MCP. Point it at a self-hosted instance via FIRECRAWL_API_URL to keep the cloud out of the loop entirely.
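The cloud-versus-self-hosted choice comes down to which environment variable is set. The sketch below shows one way a client script might apply that rule. `resolve_target` is a hypothetical helper, not the MCP server's actual logic, and treating the API key as optional for a self-hosted instance is an assumption.

```python
import os
from typing import Mapping

CLOUD_BASE_URL = "https://api.firecrawl.dev"  # host taken from the curl example


def resolve_target(env: Mapping[str, str] = os.environ) -> dict:
    """Pick the API base URL and headers from FIRECRAWL_* variables.

    Hypothetical helper: it illustrates the selection rule only.
    """
    base_url = env.get("FIRECRAWL_API_URL", "").rstrip("/")
    api_key = env.get("FIRECRAWL_API_KEY", "")
    if not base_url:
        # No self-hosted URL configured: fall back to the cloud, which needs a key.
        if not api_key:
            raise ValueError(
                "set FIRECRAWL_API_URL (self-hosted) or FIRECRAWL_API_KEY (cloud)"
            )
        base_url = CLOUD_BASE_URL
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: a self-hosted instance may run without auth, so the key is optional there.
        headers["Authorization"] = f"Bearer {api_key}"
    return {
        "base_url": base_url,
        "headers": headers,
        "self_hosted": base_url != CLOUD_BASE_URL,
    }
```

Giving `FIRECRAWL_API_URL` precedence means a configured self-hosted deployment is never silently bypassed in favor of the cloud, even when a cloud key is also present in the environment.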

Notes

  • AGPL-3.0 has real obligations: review the copyleft terms before integrating the engine source into a commercial product. Simply calling the API as a client (via MCP or SDK) is generally unaffected.
  • SDKs and some UI components are MIT: this is explicit in the README. Client-side integration draws only on the MIT-licensed parts.
  • robots.txt respected by default: the README states “Firecrawl respects robots.txt by default,” and adds: “It is the sole responsibility of end users to respect websites’ policies when scraping.”
  • Adoption: firecrawl.dev cites over one million signups and customers including Apple, Canva, and Lovable.
  • Actively maintained: near-daily commits since the first commit in April 2024.

Frequently Asked Questions

What is Firecrawl?

Quoting the README: "The API to search, scrape, and interact with the web for AI." It is a full-stack backend service written in TypeScript, Python, Rust, and Java. It powers the [firecrawl.dev](https://firecrawl.dev) cloud SaaS and is also open source under AGPL-3.0 for anyone to self-host.

Is it open source? What's the license?

Yes. It is published on GitHub, primarily under AGPL-3.0. From the README: "This project is primarily licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). The SDKs and some UI components are licensed under the MIT License." In short, the core engine is AGPL, while the SDKs and some UI components are MIT.

How does it relate to firecrawl.dev?

firecrawl.dev is the cloud SaaS run by the same Firecrawl team: a hosted version of this engine with additional cloud-only features (see the README's "Open Source vs Cloud" comparison). The free plan starts at 1,000 credits per month.

How do I self-host it?

Use the `docker-compose.yaml` in the repo root and follow the `SELF_HOST.md` guide. It runs as a containerized stack (with services like Redis as dependencies). It is not a single `docker run`, but it is lighter than a bare-metal deployment.

Which SDKs are available?

Officially supported: Python (`firecrawl-py`), Node.js (`@mendable/firecrawl-js`), Java (Gradle/Maven via JitPack), Elixir (`firecrawl`), and Rust (`firecrawl`). A community Go SDK is also linked in the README.

How do I use it from Claude Code?

Through the [Firecrawl MCP](/en/tools/firecrawl-mcp/). The MCP server can target either the cloud (`FIRECRAWL_API_KEY`) or a self-hosted instance (`FIRECRAWL_API_URL`), so you can use your own deployment from inside Claude as well.