Recurring crawls
Schedule a crawl to run automatically on any interval that a standard cron expression can describe.
Cron syntax
Standard 5-field format: minute, hour, day of month, month, day of week.
| Expression | Meaning |
|---|---|
| `0 2 * * *` | Every day at 02:00 |
| `0 */6 * * *` | Every 6 hours |
| `0 9 * * 1` | Every Monday at 09:00 |
| `*/30 * * * *` | Every 30 minutes |
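To make the field order concrete, here is a minimal Python sketch of how a 5-field expression can be checked against a timestamp. It is an illustration only, not the scheduler's actual parser: `field_matches` and `cron_matches` are hypothetical names, and the sketch assumes Sunday = 0 for the weekday field and simply requires every field to match (standard cron has special rules when both day fields are restricted, which are not modeled here).

```python
from datetime import datetime


def field_matches(field: str, value: int, minimum: int) -> bool:
    """Return True if one cron field (e.g. "*/6", "1,15", "9-17") matches value."""
    for part in field.split(","):
        # Split off an optional step, e.g. "*/30" -> base "*", step 30.
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = minimum, None
        elif "-" in base:
            low, high = base.split("-")
            start, end = int(low), int(high)
        else:
            start = int(base)
            # A bare number matches only itself; "5/10" runs from 5 upward.
            end = None if step_text else start
        if value < start or (end is not None and value > end):
            continue
        if (value - start) % step == 0:
            return True
    return False


def cron_matches(expression: str, moment: datetime) -> bool:
    """Check a 5-field cron expression against a datetime (simplified sketch)."""
    minute, hour, day, month, weekday = expression.split()
    # Assumption: cron weekdays count from Sunday = 0.
    # isoweekday() returns Monday = 1 ... Sunday = 7, so modulo 7 maps Sunday to 0.
    cron_weekday = moment.isoweekday() % 7
    return (
        field_matches(minute, moment.minute, 0)
        and field_matches(hour, moment.hour, 0)
        and field_matches(day, moment.day, 1)
        and field_matches(month, moment.month, 1)
        and field_matches(weekday, cron_weekday, 0)
    )
```

For example, `cron_matches("0 9 * * 1", datetime(2024, 1, 1, 9, 0))` checks the "Every Monday at 09:00" row against a Monday morning timestamp.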
Create a schedule
```bash
curl -X POST http://localhost:3000/v1/schedule \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Nightly docs crawl",
    "cronExpression": "0 2 * * *",
    "url": "https://docs.example.com",
    "options": { "maxPages": 200, "formats": ["markdown"] }
  }'
```

The in-process scheduler polls every 60 s and enqueues a crawl job whenever `nextRunAt` has passed.
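The polling behaviour can be pictured with a short Python sketch. It is a rough model under stated assumptions rather than the scheduler's real code: the `Schedule` record, `tick`, and `next_day` are hypothetical names, and advancing the next run from the current time (so a long outage produces one catch-up run rather than a backlog) is an assumption, not documented behaviour.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable


@dataclass
class Schedule:
    # Hypothetical in-memory record; next_run_at stands in for nextRunAt.
    name: str
    url: str
    next_run_at: datetime


def tick(
    schedules: list[Schedule],
    now: datetime,
    enqueue: Callable[[Schedule], None],
    compute_next: Callable[[Schedule, datetime], datetime],
) -> int:
    """Enqueue every schedule whose next_run_at has passed; return the count."""
    due = 0
    for schedule in schedules:
        if schedule.next_run_at <= now:
            enqueue(schedule)
            # Assumption: the next run is computed from `now`, so missed
            # runs collapse into a single catch-up job.
            schedule.next_run_at = compute_next(schedule, now)
            due += 1
    return due


def next_day(schedule: Schedule, now: datetime) -> datetime:
    # Stand-in for real cron evaluation: always run again 24 hours later.
    return now + timedelta(days=1)


# A driver loop would call tick() once per polling interval, for example:
#   while True:
#       tick(schedules, datetime.now(timezone.utc), enqueue_job, next_day)
#       time.sleep(60)
```

Because the check runs once per poll, a job in this model can start up to one polling interval after its scheduled time.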
Manage schedules
```bash
# List all schedules
curl http://localhost:3000/v1/schedule

# Delete a schedule
curl -X DELETE http://localhost:3000/v1/schedule/YOUR_SCHEDULE_ID
```
Combine with webhooks
Add `webhookUrl` inside `options` to receive a POST each time a scheduled crawl completes:

```json
{
  "name": "Nightly with webhook",
  "cronExpression": "0 2 * * *",
  "url": "https://docs.example.com",
  "options": {
    "maxPages": 200,
    "webhookUrl": "https://your-app.com/hooks/netleaf"
  }
}
```

When `WEBHOOK_SECRET` is set, each payload is signed with HMAC-SHA256 and the signature is sent as `X-Netleaf-Signature`.
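On the receiving side, the signature can be checked before the payload is trusted. The Python sketch below shows one way this might look. The source only states that HMAC-SHA256 and `WEBHOOK_SECRET` are used, so the details here are assumptions: that the signature is the lowercase hex digest of the raw request body, and that the value carries no prefix such as `sha256=`. The function names are hypothetical.

```python
import hashlib
import hmac


def expected_signature(secret: str, raw_body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw body (hex encoding is an assumption)."""
    return hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: str, raw_body: bytes, signature: str) -> bool:
    """Compare the received X-Netleaf-Signature value with the expected one."""
    expected = expected_signature(secret, raw_body).encode("utf-8")
    received = signature.strip().lower().encode("utf-8")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, received)
```

Verify against the exact bytes received, before any JSON parsing or re-serialization, since even a whitespace change alters the digest.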