Open source · MIT licensed · Free forever

Web data.
On your terms.

Scrape, crawl, extract, and schedule with one self-hosted API. Multi-LLM support. No rate limits. No cloud bill.

This is a showcase deploy — accounts are disabled. Run docker compose up to use the full dashboard.

Terminal
# Clone, run, you're online ↓
$ git clone https://github.com/Ramcode64/netleaf
$ cd netleaf
$ docker compose up
Postgres ready
Redis ready
Migrations applied
API live at http://localhost:3000

Run it locally — docker compose up, then curl http://localhost:3000. No sign-up, no auth.

Seven capabilities, ten endpoints

Everything you need
to turn the web into data.

One self-hosted stack. Zero dollars. No rate limits.

Scrape

01

Single-page extraction to clean Markdown, HTML, or text. Playwright-powered for JS-heavy sites.

Crawl

02

Recursive site crawls with configurable depth. Async BullMQ queue — survives restarts.
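
To make "configurable depth" concrete, here is a minimal sketch of a depth-limited breadth-first crawl. It is not Netleaf's implementation: the `crawl` function, the `get_links` callback, and the in-memory `SITE` graph are hypothetical stand-ins for real page fetches and the job queue.

```python
from collections import deque


def crawl(start_url, get_links, max_depth):
    """Breadth-first crawl that records pages up to max_depth links from the start."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth == max_depth:
            continue  # depth limit reached: record the page but do not expand it
        for link in get_links(url):
            if link not in seen:  # skip pages already queued or visited
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


# Hypothetical link graph standing in for real HTTP fetches.
SITE = {
    "/": ["/docs", "/blog"],
    "/docs": ["/docs/api", "/"],
    "/blog": ["/blog/post-1"],
    "/docs/api": ["/docs/api/v1"],
}

pages = crawl("/", lambda url: SITE.get(url, []), max_depth=1)
```

With `max_depth=1` the crawl records the start page and the pages it links to directly, and nothing further.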

Map

03

Instant URL discovery via sitemaps and link extraction. Returns thousands of URLs in seconds.
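
The two discovery sources named above can be sketched with the Python standard library alone. This is an illustration of the general technique, not Netleaf's code; the function names and sample inputs are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urljoin

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Return every <loc> value found in a sitemap <urlset>."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text]


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def map_site(sitemap_xml, html_text, base_url):
    """Merge sitemap URLs and page links, dropping duplicates in first-seen order."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    return list(dict.fromkeys(urls_from_sitemap(sitemap_xml) + collector.links))


# Hypothetical sample inputs.
SAMPLE_SITEMAP = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc>https://example.com/docs</loc></url>"
    "</urlset>"
)
SAMPLE_HTML = '<a href="/docs">Docs</a> <a href="/pricing">Pricing</a>'

discovered = map_site(SAMPLE_SITEMAP, SAMPLE_HTML, "https://example.com/")
```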

Extract

04

Structured JSON from any page using your own JSON Schema. Choose Claude, OpenAI, or fully offline Ollama.
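
As a rough sketch of schema-driven extraction, the snippet below defines a JSON Schema, builds a request body around it, and runs a shallow check on a result. The payload field names `schema` and `provider` and the helper functions are hypothetical; check the API reference in the repository for the actual request shape.

```python
import json

# A JSON Schema describing the fields to pull from a product page.
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["title", "price"],
}

JSON_TYPES = {"string": str, "number": (int, float), "boolean": bool, "object": dict, "array": list}


def build_extract_payload(url, schema, provider="ollama"):
    """Serialize an extract request. The "schema" and "provider" keys are assumptions."""
    return json.dumps({"url": url, "schema": schema, "provider": provider})


def missing_or_mistyped(result, schema):
    """Shallow check: required keys that are absent, or top-level keys with the wrong type."""
    problems = [key for key in schema.get("required", []) if key not in result]
    for name, spec in schema.get("properties", {}).items():
        if name not in result:
            continue
        expected = JSON_TYPES.get(spec.get("type"))
        value = result[name]
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        is_stray_bool = isinstance(value, bool) and spec.get("type") != "boolean"
        if expected and (not isinstance(value, expected) or is_stray_bool):
            problems.append(name)
    return problems


payload = build_extract_payload("https://example.com/product", PRODUCT_SCHEMA)
```

The check covers top-level properties only; a full JSON Schema validator would be the better choice for nested schemas.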

Search

05

Brave Search integration with optional parallel scraping of every result.
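
Scraping every search result in parallel can be sketched with a thread pool. This is an illustration only: `scrape_all` and the stub `fake_scrape` are hypothetical names, and a real scraper would perform an HTTP request where the stub returns a string.

```python
from concurrent.futures import ThreadPoolExecutor


def scrape_all(urls, scrape_one, max_workers=4):
    """Run scrape_one over every URL concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(scrape_one, urls))


def fake_scrape(url):
    """Stub standing in for a real page fetch."""
    return f"# Content of {url}"


search_results = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
scraped = scrape_all(search_results, fake_scrape)
```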

Schedule

06

Cron-based recurring crawls. Set it once, get fresh data on your own schedule.
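
For readers new to cron syntax, the sketch below shows how a five-field expression (minute, hour, day of month, month, day of week) decides whether a job is due. It is a simplified matcher written for illustration, not the scheduler Netleaf uses: it supports `*`, `*/n`, ranges, and comma lists, requires every field to match, and does not treat 7 as Sunday.

```python
from datetime import datetime


def field_matches(expr, value, minimum):
    """Match one cron field against a value; minimum is the field's lowest legal value."""
    for part in expr.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            if (value - minimum) % int(part[2:]) == 0:
                return True
        elif "-" in part:
            low, high = map(int, part.split("-"))
            if low <= value <= high:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression, when):
    """Return True if the datetime falls on a minute selected by the expression."""
    minute, hour, day, month, weekday = expression.split()
    cron_weekday = (when.weekday() + 1) % 7  # cron counts Sunday as 0, Python counts Monday as 0
    return (
        field_matches(minute, when.minute, 0)
        and field_matches(hour, when.hour, 0)
        and field_matches(day, when.day, 1)
        and field_matches(month, when.month, 1)
        and field_matches(weekday, cron_weekday, 0)
    )


# "0 */6 * * *" selects minute 0 of every sixth hour: 00:00, 06:00, 12:00, 18:00.
due = cron_matches("0 */6 * * *", datetime(2026, 1, 5, 12, 0))
```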

Diff

07

Cryptographic change detection between crawl runs. Know exactly what changed and when.
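
One common way to detect changes between runs is to compare cryptographic hashes of page content. The sketch below assumes SHA-256 and a simple URL-to-content mapping; the source does not say which hash or data model Netleaf uses, so treat the function names and structure as hypothetical.

```python
import hashlib


def fingerprint(pages):
    """Map each URL to the SHA-256 hex digest of its content."""
    return {url: hashlib.sha256(body.encode("utf-8")).hexdigest() for url, body in pages.items()}


def diff_runs(previous, current):
    """Classify URLs as added, removed, or changed between two crawl runs."""
    old, new = fingerprint(previous), fingerprint(current)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(url for url in old.keys() & new.keys() if old[url] != new[url]),
    }


# Hypothetical content captured by two consecutive runs.
run_1 = {"/": "Welcome", "/pricing": "$10/mo", "/old": "Legacy page"}
run_2 = {"/": "Welcome", "/pricing": "$12/mo", "/new": "Fresh page"}

changes = diff_runs(run_1, run_2)
```

Storing only digests between runs keeps the comparison cheap, at the cost of knowing that a page changed without knowing which part.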

Open source

MIT licensed. Fork it, extend it, or contribute upstream.

View on GitHub → https://github.com/Ramcode64/netleaf

Plain HTTP

A request away
from data.

No SDK required. Works from curl, Python, Go, Node — any language that can make an HTTP request.

  • Bearer token auth — one header, every endpoint
  • Consistent JSON response envelope everywhere
  • Webhooks for async crawl job completion
curl -X POST localhost:3000/v1/scrape \
  -H "Authorization: Bearer nl_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "formats": ["markdown"]
  }'
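
The same scrape request can be made from Python with only the standard library. This is a minimal sketch: the endpoint, headers, and body mirror the curl example, while the function names, the 30-second timeout, and the assumption that the response body is JSON are illustrative choices.

```python
import json
import urllib.request


def build_scrape_request(api_key, url, formats, base_url="http://localhost:3000"):
    """Build the POST /v1/scrape request without sending it."""
    body = json.dumps({"url": url, "formats": formats}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/scrape",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scrape(api_key, url, formats=("markdown",)):
    """Send the request to a running instance and return the parsed JSON response."""
    request = build_scrape_request(api_key, url, list(formats))
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_scrape_request("nl_your_key", "https://example.com", ["markdown"])
```

Calling `scrape("nl_your_key", "https://example.com")` requires the API to be running locally, for example after `docker compose up`.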

Compare

Netleaf vs the field

Same job. No lock-in, no meter, no cloud dependency.

Feature | Netleaf | Firecrawl | Apify | Diffbot | ScrapingBee | Crawlee
Self-hosted | docker compose up | Complex setup | | | | Yes (library)
Rate limits | None | 500 credits/mo free | $5 credits/mo free | 10K calls/mo trial | 1K credits trial |
AI/LLM extraction | Claude · GPT · Ollama | Cloud only | | Proprietary AI | |
Scheduled crawls | | | | | |
Change detection | | | | | |
Export formats | JSON · CSV · XML · ZIP | Markdown, JSON | JSON, CSV, Excel, XML | JSON | HTML, JSON | JSON, CSV
Local-only mode | | | | | |
Open source | MIT | AGPL-3.0 | SDK only (MIT) | | | Apache 2.0
Pricing | Free forever | $16 – $333/mo | $49+/mo | $299+/mo | $49+/mo | Free

Data accurate as of 2026. Crawlee is an open-source library, not a hosted API; it is included for reference.

Pricing

Free. Forever.

Self-host on any machine. The only cost is your compute.

$0

per month. For everything. Always.

  • All 10 endpoints — unlimited requests
  • Multi-LLM extraction (Claude · OpenAI · Ollama)
  • Scheduled crawls + cron management
  • Cryptographic change detection
  • CSV · XML · ZIP exports
  • No rate limits — your hardware, your rules
  • MIT licensed — fork, extend, ship

No credit card required. docker compose up and you're running.

Start in 30 seconds.

No cloud account. No credit card. Just Docker.

$ git clone https://github.com/Ramcode64/netleaf
$ cd netleaf
$ cp apps/api/.env.example apps/api/.env
$ docker compose up