POST /v1/extract

Structured extraction

Scrapes a URL, then passes the page content to an LLM to extract structured data matching a JSON Schema. Supports Claude (Anthropic), GPT-4o-mini (OpenAI), and Ollama (local, free, fully offline). Validates the output with AJV; if validation fails, retries once with the validation errors included as context.
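
The validate-then-retry-once behavior described above can be sketched roughly as follows. This is an illustration, not the service's actual code: `extract_with_retry`, `call_llm`, and `validate` are hypothetical names, the validator is a caller-supplied stand-in for AJV, and the shape of the failure result is an assumption, since only the success response is documented.

```python
import json
from typing import Any, Callable, Dict, List

# Hypothetical signatures; the real service talks to an LLM provider and AJV.
LlmCall = Callable[[str], str]            # prompt -> raw model text
Validator = Callable[[Any], List[str]]    # parsed data -> list of error messages


def extract_with_retry(page_text: str, schema: Dict[str, Any],
                       call_llm: LlmCall, validate: Validator) -> Dict[str, Any]:
    """Ask the model for JSON, validate it, and retry once on failure."""
    prompt = (
        "Extract data matching this JSON Schema:\n"
        f"{json.dumps(schema)}\n\nPage content:\n{page_text}"
    )
    errors: List[str] = []
    for _attempt in range(2):  # first try plus a single retry
        if errors:
            # Feed the validation errors back so the model can correct itself.
            prompt += "\n\nYour previous output was invalid:\n" + "\n".join(errors)
        raw = call_llm(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors = [f"not valid JSON: {exc}"]
            continue
        errors = validate(data)
        if not errors:
            return {"success": True, "data": data}
    # Assumed failure shape; the documented response only covers success.
    return {"success": False, "errors": errors}
```

Passing the validator in as a function keeps the sketch free of third-party dependencies; any JSON Schema validator could be plugged in.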

Request body

Name          Type    Required  Description
url           string  required  Page to extract from
schema        object  required  JSON Schema describing the shape of the data to extract
instructions  string  optional  Extra instructions appended to the LLM prompt (max 4096 chars)
provider      string  optional  "claude" | "openai" | "ollama" | "auto" (default: "auto")
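
As a client-side illustration of the fields above, the following sketch assembles a request body and checks the documented constraints before anything is sent. The helper name `build_extract_request` is hypothetical, and how the server itself responds to invalid input is not specified here. The length check counts Python string characters, which may differ from how the API counts "chars".

```python
from typing import Any, Dict, Optional

PROVIDERS = {"auto", "claude", "openai", "ollama"}
MAX_INSTRUCTIONS_CHARS = 4096


def build_extract_request(url: str, schema: Dict[str, Any],
                          instructions: Optional[str] = None,
                          provider: str = "auto") -> Dict[str, Any]:
    """Return a JSON-serializable body for POST /v1/extract."""
    if not url:
        raise ValueError("url is required")
    if not isinstance(schema, dict):
        raise ValueError("schema must be a JSON Schema object")
    if provider not in PROVIDERS:
        raise ValueError(f"provider must be one of {sorted(PROVIDERS)}")
    if instructions is not None and len(instructions) > MAX_INSTRUCTIONS_CHARS:
        raise ValueError("instructions exceeds 4096 characters")

    body: Dict[str, Any] = {"url": url, "schema": schema, "provider": provider}
    if instructions is not None:
        # Optional field: only included when the caller supplies it.
        body["instructions"] = instructions
    return body
```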

Example

bash
curl -X POST http://localhost:3000/v1/extract \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://news.ycombinator.com",
    "provider": "ollama",
    "schema": {
      "type": "object",
      "properties": {
        "topStories": { "type": "array", "items": { "type": "string" } }
      },
      "required": ["topStories"]
    }
  }'
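
For callers who prefer a script over curl, the same request can be sketched with Python's standard library. The base URL mirrors the curl example; the helper names and the 120-second timeout are assumptions for illustration, not values taken from the API.

```python
import json
import urllib.request
from typing import Any, Dict


def make_extract_request(body: Dict[str, Any],
                         base_url: str = "http://localhost:3000") -> urllib.request.Request:
    """Build (but do not send) the POST /v1/extract request."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/extract",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_extract_request(body: Dict[str, Any], timeout: float = 120.0) -> Dict[str, Any]:
    """Send the request and decode the JSON response (needs the API running)."""
    req = make_extract_request(body)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the request without a running server.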

Response

json
{
  "success": true,
  "data": {
    "topStories": [
      "Ask HN: What are you working on?",
      "LLMs are getting cheaper"
    ]
  },
  "meta": { "provider": "ollama", "model": "llama3" }
}
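
A response of this shape could be unpacked as in the sketch below. The function name `unwrap_extract_response` is hypothetical, and because the error response format is not documented here, the sketch only checks the `success` flag and otherwise raises with the raw payload.

```python
import json
from typing import Any, Dict, Tuple


def unwrap_extract_response(response_text: str) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (data, meta) from a successful /v1/extract response."""
    payload = json.loads(response_text)
    if payload.get("success") is not True:
        # Error shape is unknown, so surface the whole payload.
        raise RuntimeError(f"extraction failed: {payload}")
    return payload["data"], payload.get("meta", {})


sample = """{
  "success": true,
  "data": {"topStories": ["Ask HN: What are you working on?", "LLMs are getting cheaper"]},
  "meta": {"provider": "ollama", "model": "llama3"}
}"""

data, meta = unwrap_extract_response(sample)
print(len(data["topStories"]), meta["provider"])  # prints: 2 ollama
```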

Try it

Live preview only works when the API is running on your machine.

Run docker compose up and the Try-It panel will become interactive. Until then, copy the curl example above.