Turn any webpage into structured data
AI-powered extraction that understands content, not just HTML. No brittle CSS selectors. Define your schema, get consistent output.
$2.00 / 1,000 extractions • Plans start at $5/month
The problem
CSS selectors break. AI doesn't.
Traditional web scraping is brittle. Sites update their HTML, your selectors break, and your pipeline fails at 3am.
Selector: document.querySelector('.product-price') stops matching once the class is renamed to .price-value.
Schema: schema: { price: { type: 'number' } }
AI extraction understands content semantically. It doesn't matter whether the price sits in a <span>, <div>, or <p>; the AI finds it. Your schema defines what you want, not where to find it.
Extraction modes
Four ways to extract data
Choose the extraction mode that fits your use case. From fully automatic to schema-driven.
Auto Mode
mode: 'auto'
AI automatically identifies and extracts the most relevant content. Great for articles, blog posts, and product pages.

    const auto = await stack0.extraction.extractAndWait({
      url: 'https://example.com/blog/article',
      mode: 'auto',
    })

    // AI determines what's important
    console.log(auto.extractedData)
    // { title: '...', content: '...', author: '...', date: '...' }
Markdown Mode
mode: 'markdown'
Converts page content to clean, formatted markdown. Preserves headings, lists, code blocks, and links.

    const markdown = await stack0.extraction.extractAndWait({
      url: 'https://example.com/documentation',
      mode: 'markdown',
      includeLinks: true,
      includeImages: true,
    })

    // Clean markdown output
    console.log(markdown.extractedData)
    // # Documentation Title\n\nContent in markdown...
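Markdown output is convenient to split by heading before indexing or embedding it. The sketch below shows one way to do that. It assumes `extractedData` is a plain markdown string, as the comment above suggests; `splitMarkdownByHeading` and the `Section` shape are hypothetical helpers, not part of the Stack0 SDK.

```typescript
// Hypothetical helper, not part of the Stack0 SDK.
interface Section {
  heading: string; // heading text without the leading '#' characters
  level: number;   // 1 for '#', 2 for '##', ...; 0 for text before any heading
  body: string;    // text between this heading and the next one
}

// Three backticks, built at runtime so this example can itself live in a fenced block.
const FENCE = '`'.repeat(3);

function splitMarkdownByHeading(markdown: string): Section[] {
  const sections: Section[] = [];
  let current: Section = { heading: '', level: 0, body: '' };
  let inFence = false;

  for (const line of markdown.split('\n')) {
    // Track fenced code blocks so '#' comments inside code are not read as headings.
    if (line.trim().startsWith(FENCE)) inFence = !inFence;

    const match = inFence ? null : /^(#{1,6})\s+(.*)$/.exec(line);
    if (match) {
      if (current.heading !== '' || current.body.trim() !== '') sections.push(current);
      current = { heading: match[2].trim(), level: match[1].length, body: '' };
    } else {
      current.body += line + '\n';
    }
  }
  if (current.heading !== '' || current.body.trim() !== '') sections.push(current);

  return sections.map((s) => ({ ...s, body: s.body.trim() }));
}

// Stand-in for `markdown.extractedData`.
const sample = '# Documentation Title\n\nIntro text.\n\n## Install\n\nRun the installer.';
console.log(splitMarkdownByHeading(sample));
```

Each section keeps its heading level, so a downstream indexer can decide how deep to split.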
Schema Mode
mode: 'schema'
Define your data structure with JSON Schema. Get strongly typed, consistent output every time.

    const product = await stack0.extraction.extractAndWait({
      url: 'https://store.example.com/product/123',
      mode: 'schema',
      schema: {
        type: 'object',
        properties: {
          name: { type: 'string' },
          price: { type: 'number' },
          inStock: { type: 'boolean' },
          rating: { type: 'number' },
        },
      },
    })
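The return type of `extractAndWait` is not shown here, so a cautious client can treat `extractedData` as `unknown` and narrow it before use. The sketch below mirrors the schema above; the `Product` interface and `isProduct` guard are illustrative, not SDK exports. The guard is also stricter than the schema: it requires every field, while the schema declares no `required` list.

```typescript
// Mirrors the schema above. Illustrative only; not exported by the SDK.
interface Product {
  name: string;
  price: number;
  inStock: boolean;
  rating: number;
}

// Type guard: returns true only when every field is present with the expected type.
function isProduct(value: unknown): value is Product {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v['name'] === 'string' &&
    typeof v['price'] === 'number' &&
    typeof v['inStock'] === 'boolean' &&
    typeof v['rating'] === 'number'
  );
}

// Stand-in for `product.extractedData`; the values are invented for the example.
const extracted: unknown = { name: 'Desk Lamp', price: 39.99, inStock: true, rating: 4.5 };

if (isProduct(extracted)) {
  // Inside this branch, `extracted` is typed as Product.
  console.log(`${extracted.name}: ${extracted.price.toFixed(2)}`);
} else {
  console.error('Extraction did not match the expected shape');
}
```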
HTML Mode
mode: 'html'
Returns raw HTML content for custom parsing. Useful when you need full control over extraction logic.

    const html = await stack0.extraction.extractAndWait({
      url: 'https://example.com',
      mode: 'html',
    })

    // Raw HTML for custom processing
    console.log(html.extractedData)
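As a rough idea of what custom processing might look like, the sketch below pulls the page title and link targets out of an HTML string. It assumes `extractedData` is the HTML as a string; `extractTitle` and `extractLinks` are hypothetical helpers. Regular expressions only cope with simple markup, so a real HTML parser is the better choice for anything beyond this.

```typescript
// Illustrative helpers for simple cases; prefer a real HTML parser for production use.
function extractTitle(html: string): string | null {
  const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  return match ? match[1].trim() : null;
}

function extractLinks(html: string): string[] {
  const links: string[] = [];
  // Matches href="..." or href='...' inside <a> tags; unquoted hrefs are ignored.
  const pattern = /<a\b[^>]*\bhref\s*=\s*(["'])(.*?)\1/gi;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    links.push(match[2]);
  }
  return links;
}

// Stand-in for `html.extractedData`.
const page =
  '<html><head><title> Example Domain </title></head>' +
  '<body><a href="/docs">Docs</a> <a class="x" href=\'https://example.com/a\'>A</a></body></html>';

console.log(extractTitle(page)); // Example Domain
console.log(extractLinks(page)); // [ '/docs', 'https://example.com/a' ]
```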
Schema extraction
Define your structure. Get consistent output.
JSON Schema support with nested objects, arrays, and all primitive types. Add custom prompts to guide extraction.
Product Data
E-commerce product with specs
    {
      type: 'object',
      properties: {
        name: { type: 'string' },
        price: { type: 'number' },
        currency: { type: 'string' },
        description: { type: 'string' },
        inStock: { type: 'boolean' },
        rating: { type: 'number' },
        reviewCount: { type: 'number' },
        images: {
          type: 'array',
          items: { type: 'string' },
        },
        specifications: {
          type: 'object',
          properties: {
            brand: { type: 'string' },
            model: { type: 'string' },
            weight: { type: 'string' },
          },
        },
      },
    }
News/Articles
List of stories from a news page
    {
      type: 'object',
      properties: {
        stories: {
          type: 'array',
          items: {
            type: 'object',
            properties: {
              title: { type: 'string' },
              url: { type: 'string' },
              points: { type: 'number' },
              comments: { type: 'number' },
              author: { type: 'string' },
            },
          },
        },
      },
    }
Team/People
Team members with roles and bios
    {
      type: 'object',
      properties: {
        teamMembers: {
          type: 'array',
          items: {
            type: 'object',
            properties: {
              name: { type: 'string' },
              role: { type: 'string' },
              bio: { type: 'string' },
              linkedIn: { type: 'string' },
            },
          },
        },
      },
    }
Job Listings
Open positions with requirements
    {
      type: 'object',
      properties: {
        jobs: {
          type: 'array',
          items: {
            type: 'object',
            properties: {
              title: { type: 'string' },
              department: { type: 'string' },
              location: { type: 'string' },
              salary: { type: 'string' },
              requirements: {
                type: 'array',
                items: { type: 'string' },
              },
            },
          },
        },
      },
    }
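To show how nested objects, arrays, and primitives compose in schemas like these, here is a minimal client-side checker for that subset. It is a sketch, not how the service validates output: the `Schema` type and `validate` function are hypothetical, cover only the keywords used in the examples above, and treat missing properties as acceptable because none of the schemas declare `required`.

```typescript
// A tiny subset of JSON Schema: primitives, arrays with `items`, objects with `properties`.
type Schema =
  | { type: 'string' | 'number' | 'boolean' }
  | { type: 'array'; items: Schema }
  | { type: 'object'; properties: Record<string, Schema> };

// Returns human-readable problems; an empty list means the value conforms.
function validate(value: unknown, schema: Schema, path: string = '$'): string[] {
  if (schema.type === 'array') {
    if (!Array.isArray(value)) return [`${path}: expected array`];
    const errors: string[] = [];
    for (let i = 0; i < value.length; i++) {
      errors.push(...validate(value[i], schema.items, `${path}[${i}]`));
    }
    return errors;
  }
  if (schema.type === 'object') {
    if (typeof value !== 'object' || value === null || Array.isArray(value)) {
      return [`${path}: expected object`];
    }
    const record = value as Record<string, unknown>;
    const errors: string[] = [];
    for (const key of Object.keys(schema.properties)) {
      const sub = schema.properties[key];
      // Missing properties are skipped rather than reported.
      if (sub !== undefined && key in record) {
        errors.push(...validate(record[key], sub, `${path}.${key}`));
      }
    }
    return errors;
  }
  // Remaining cases are primitive types, which map directly onto `typeof`.
  return typeof value === schema.type ? [] : [`${path}: expected ${schema.type}`];
}

const jobsSchema: Schema = {
  type: 'object',
  properties: {
    jobs: {
      type: 'array',
      items: {
        type: 'object',
        properties: {
          title: { type: 'string' },
          requirements: { type: 'array', items: { type: 'string' } },
        },
      },
    },
  },
};

// Invented sample data; the second job has a malformed `requirements` entry.
const sampleJobs = {
  jobs: [
    { title: 'Backend Engineer', requirements: ['TypeScript', 'PostgreSQL'] },
    { title: 'Designer', requirements: ['Figma', 42] },
  ],
};

console.log(validate(sampleJobs, jobsSchema));
// [ '$.jobs[1].requirements[1]: expected string' ]
```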
Guide extraction with prompts
Add natural language instructions to help the AI focus on what matters.
    const guided = await stack0.extraction.extractAndWait({
      url: 'https://example.com/team',
      mode: 'schema',
      prompt: 'Extract information about team members, focusing on their roles and technical expertise. Ignore marketing staff.',
      schema: {
        type: 'object',
        properties: {
          engineers: {
            type: 'array',
            items: {
              type: 'object',
              properties: {
                name: { type: 'string' },
                role: { type: 'string' },
                expertise: { type: 'array', items: { type: 'string' } },
              },
            },
          },
        },
      },
    })
Use cases
Built for the AI era
From AI agents that need structured world knowledge to research automation and lead enrichment.
For AI Agents
RAG Data Pipelines
Feed structured web content into your retrieval-augmented generation systems
Agent World Knowledge
Give AI agents structured understanding of web pages they visit
Tool Use
Let agents extract data from URLs as part of multi-step workflows
For Research
Price Monitoring
Track competitor pricing across hundreds of products automatically
Market Research
Extract structured data from industry reports and directories
Trend Analysis
Monitor news and social content for emerging patterns
For Lead Gen
Company Enrichment
Extract company details, team info, and tech stack from websites
Contact Discovery
Find team members and their roles from about/team pages
Job Board Parsing
Monitor competitors' hiring to understand their growth areas
For Content
News Aggregation
Build custom feeds from multiple sources with consistent structure
Content Migration
Convert web content to markdown for CMS imports
Documentation Sync
Keep external docs in sync with your knowledge base
Advanced features
Handle dynamic content. Process in batch.
Wait for Dynamic Content
Handle SPAs and lazy-loaded content. Wait for elements or timeouts before extracting.
    const dynamic = await stack0.extraction.extractAndWait({
      url: 'https://example.com/spa',
      mode: 'schema',
      waitForSelector: '.content-loaded',
      waitForTimeout: 3000,
      schema: { ... },
    })
Batch Processing
Extract from multiple URLs with a shared schema. Process in parallel with webhook notifications.
    const batch = await stack0.extraction.batchAndWait({
      urls: [
        'https://store.example.com/product/1',
        'https://store.example.com/product/2',
        'https://store.example.com/product/3',
      ],
      config: {
        mode: 'schema',
        schema: {
          type: 'object',
          properties: {
            name: { type: 'string' },
            price: { type: 'number' },
          },
        },
      },
    })
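For long URL lists, it can help to deduplicate and split them before submitting. Whether the API caps batch size is not stated here, so the chunk size below is an arbitrary assumption, and `chunkUrls` is a hypothetical helper rather than an SDK function.

```typescript
// Hypothetical helper: drops duplicate URLs (keeping first occurrences) and splits the rest into chunks.
function chunkUrls(urls: string[], size: number): string[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const seen = new Set<string>();
  const unique: string[] = [];
  for (const url of urls) {
    if (!seen.has(url)) {
      seen.add(url);
      unique.push(url);
    }
  }
  const chunks: string[][] = [];
  for (let i = 0; i < unique.length; i += size) {
    chunks.push(unique.slice(i, i + size));
  }
  return chunks;
}

const productUrls = [
  'https://store.example.com/product/1',
  'https://store.example.com/product/2',
  'https://store.example.com/product/1', // duplicate, dropped
  'https://store.example.com/product/3',
];

// A chunk size of 2 is only for demonstration.
console.log(chunkUrls(productUrls, 2));
```

Each inner array could then be passed as the `urls` field of one batch call.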
Async with Webhooks
Start extractions and receive results via webhook. Perfect for background processing pipelines.
    // Start extraction (returns immediately)
    const { id } = await stack0.extraction.extract({
      url: 'https://example.com',
      mode: 'schema',
      schema: { ... },
      webhookUrl: 'https://yourapp.com/webhook',
      webhookSecret: 'your-secret',
    })

    // Webhook receives:
    {
      event: 'extraction.completed',
      data: {
        id: 'ext_abc123',
        status: 'completed',
        extractedData: { ... },
        processingTimeMs: 1840,
      }
    }
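How deliveries are signed with `webhookSecret` is not described here. The sketch below assumes one common scheme: an HMAC-SHA256 of the raw request body, keyed with the secret and sent hex-encoded alongside the request. That scheme and the names `signPayload` and `verifySignature` are assumptions to check against the actual webhook documentation; only the payload shape comes from the example above.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// ASSUMPTION: signature = hex(HMAC-SHA256(secret, rawBody)). Not confirmed by the source.
function signPayload(rawBody: string, secret: string): string {
  return createHmac('sha256', secret).update(rawBody, 'utf8').digest('hex');
}

function verifySignature(rawBody: string, signature: string, secret: string): boolean {
  const expected = Buffer.from(signPayload(rawBody, secret), 'hex');
  const received = Buffer.from(signature, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}

// Shape taken from the webhook payload shown above.
interface ExtractionCompleted {
  event: 'extraction.completed';
  data: { id: string; status: string; extractedData: unknown; processingTimeMs: number };
}

// Simulate a delivery locally: sign a body, then verify it as a receiver would.
const secret = 'your-secret';
const rawBody = JSON.stringify({
  event: 'extraction.completed',
  data: { id: 'ext_abc123', status: 'completed', extractedData: {}, processingTimeMs: 1840 },
});
const signature = signPayload(rawBody, secret);

if (verifySignature(rawBody, signature, secret)) {
  const payload = JSON.parse(rawBody) as ExtractionCompleted;
  console.log(`Extraction ${payload.data.id} finished in ${payload.data.processingTimeMs} ms`);
}
```

Verifying against the raw body, before any JSON parsing, avoids mismatches caused by re-serialization.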
Reliability
Works when sites change
Semantic Understanding
AI understands content meaning, not just HTML structure
Consistent Output
Schema validation ensures you always get the structure you expect
No Maintenance
No selectors to update when sites change their HTML
Pricing
Simple, usage-based pricing
AI tokens included. No hidden costs for complex pages.
AI-powered content extraction and parsing.
Plans start at $5/month. No long-term contracts.
Stop writing brittle scrapers
Define your schema once, extract structured data from any page. AI-powered extraction that works when sites change.