Open source · MIT licensed · Free forever

Web data.
On your terms.

Scrape, crawl, extract, and schedule with one self-hosted API. Multi-LLM support. No rate limits. No cloud bill.

Self-host on GitHub

This is a showcase deploy — accounts are disabled. Run docker compose up to use the full dashboard.

Terminal

# Clone, run, you're online ↓

$ git clone https://github.com/Ramcode64/netleaf

$ docker compose up

✓ Postgres ready

✓ Redis ready

✓ Migrations applied

→ API live at http://localhost:3000

Run it locally — docker compose up, then curl http://localhost:3000. No sign-up, no auth.

Seven capabilities, ten endpoints

Everything you need
to turn the web into data.

One self-hosted binary. Zero dollars. No rate limits.

Scrape

Single-page extraction to clean Markdown, HTML, or text. Playwright-powered for JS-heavy sites.

Crawl

Recursive site crawls with configurable depth. Async BullMQ queue — survives restarts.

Map

Instant URL discovery via sitemaps and link extraction. Returns thousands of URLs in seconds.

Extract

Structured JSON from any page using your own JSON Schema. Choose Claude, OpenAI, or fully offline Ollama.

Search

Brave Search integration with optional parallel scraping of every result.

Schedule

Cron-based recurring crawls. Set it once, get fresh data on your own schedule.

Diff

Cryptographic change detection between crawl runs. Know exactly what changed and when.

Open source

MIT licensed. Fork it, extend it, or contribute upstream.

View on GitHub →

Plain HTTP

A request away
from data.

No SDK required. Works from curl, Python, Go, Node — any language that can make an HTTP request.

Bearer token auth — one header, every endpoint
Consistent JSON response envelope everywhere
Webhooks for async crawl job completion

curl -X POST localhost:3000/v1/scrape \
"Authorization: Bearer nl_your_key" \
"Content-Type: application/json" \
'{
    "url": "https://example.com",
    "formats": ["markdown"]
  }'

Compare

Netleaf vs the field

Same job. No lock-in, no meter, no cloud dependency.

Feature	Netleaf	Firecrawl	Apify	Diffbot	ScrapingBee	Crawlee
Self-hosted	docker compose up	Complex setup				Yes (library)
Rate limits	None	500 credits/mo free	$5 credits/mo free	10K calls/mo trial	1K credits trial
AI/LLM extraction	Claude · GPT · Ollama	Cloud only		Proprietary AI
Scheduled crawls
Change detection
Export formats	JSON · CSV · XML · ZIP	Markdown, JSON	JSON, CSV, Excel, XML	JSON	HTML, JSON	JSON, CSV
Local-only mode
Open source	MIT	AGPL-3.0	SDK only (MIT)			Apache 2.0
Pricing	Free forever	$16 – $333/mo	$49+/mo	$299+/mo	$49+/mo	Free

← swipe to compare →Data accurate as of 2026. Crawlee is an open-source library, not a hosted API — included for reference. = partial support.

Pricing

Free. Forever.

Self-host on any machine. The only cost is your compute.

per month. For everything. Always.

All 10 endpoints — unlimited requests
Multi-LLM extraction (Claude · OpenAI · Ollama)
Scheduled crawls + cron management
Cryptographic change detection
CSV · XML · ZIP exports
No rate limits — your hardware, your rules
MIT licensed — fork, extend, ship

No credit card required. docker compose up and you're running.

Start in 30 seconds.

No cloud account. No credit card. Just Docker.