Extract, classify, and summarize documents at scale
Process PDFs, Word docs, and scanned images with AI workflows. Parse content, classify document types, extract entities, and generate summaries in a single pipeline.
import { Stack0 } from '@stack0/sdk'

const stack0 = new Stack0({ apiKey: process.env.STACK0_API_KEY })

// Process a document through extraction, classification, and summarization
const result = await stack0.workflows.run({
  steps: [
    {
      id: 'parse',
      type: 'tool',
      tool: 'document.parse',
      input: {
        url: '{{input.documentUrl}}',
        ocr: true, // Enable OCR for scanned documents
      },
    },
    {
      id: 'classify',
      type: 'llm',
      model: 'gpt-4o-mini',
      prompt: 'Classify this document into one of: invoice, contract, report, letter, form. Document: {{steps.parse.output.text}}',
      outputSchema: {
        category: 'string',
        confidence: 'number',
      },
      dependsOn: ['parse'],
    },
    {
      id: 'extract',
      type: 'llm',
      model: 'gpt-4o',
      prompt: 'Extract key entities from this {{steps.classify.output.category}}: {{steps.parse.output.text}}',
      outputSchema: {
        entities: '{ name: string, type: string, value: string }[]',
        dates: 'string[]',
        amounts: '{ value: number, currency: string }[]',
      },
      dependsOn: ['parse', 'classify'],
    },
    {
      id: 'summarize',
      type: 'llm',
      model: 'gpt-4o-mini',
      prompt: 'Summarize this document in 2-3 sentences: {{steps.parse.output.text}}',
      dependsOn: ['parse'],
    },
  ],
  input: {
    documentUrl: 'https://example.com/invoice-2024-001.pdf',
  },
})

console.log(result.steps.classify.output) // { category: 'invoice', confidence: 0.97 }
console.log(result.steps.extract.output) // { entities: [...], dates: [...], amounts: [...] }
console.log(result.steps.summarize.output) // "This invoice from Acme Corp..."
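The dependsOn fields in the workflow above describe a small dependency graph: classify and summarize both wait only on parse, while extract waits on parse and classify. The sketch below groups steps into waves that could run together. It is only an illustration of that graph, assuming dependsOn is what determines ordering. The executionWaves helper is hypothetical and is not part of the SDK, and the platform's actual scheduler may behave differently.

```typescript
// Minimal step shape, mirroring the `id` and `dependsOn` fields used in the workflow.
interface StepRef {
  id: string
  dependsOn?: string[]
}

// Group steps into waves: every step in a wave depends only on steps from earlier waves.
// Hypothetical helper for illustration; not the platform's scheduler.
function executionWaves(steps: StepRef[]): string[][] {
  const done = new Set<string>()
  const waves: string[][] = []
  let remaining = [...steps]

  while (remaining.length > 0) {
    // A step is ready once all of its dependencies have completed.
    const ready = remaining.filter((s) => (s.dependsOn ?? []).every((d) => done.has(d)))
    if (ready.length === 0) {
      // Nothing can make progress: a cycle, or a dependency on an unknown step id.
      throw new Error('Cycle or unknown step id in dependsOn')
    }
    waves.push(ready.map((s) => s.id))
    for (const s of ready) done.add(s.id)
    remaining = remaining.filter((s) => !done.has(s.id))
  }
  return waves
}

const waves = executionWaves([
  { id: 'parse' },
  { id: 'classify', dependsOn: ['parse'] },
  { id: 'extract', dependsOn: ['parse', 'classify'] },
  { id: 'summarize', dependsOn: ['parse'] },
])
console.log(waves) // [['parse'], ['classify', 'summarize'], ['extract']]
```

Under this reading, summarize does not need to wait for extract, so the longest chain in the pipeline is three steps deep rather than four.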
What's included
PDF & Word Parsing
Extract text, tables, and metadata from PDFs and Word documents. Preserves structure and formatting.
OCR
Automatic OCR for scanned documents and images. Handles multiple languages, tables, and handwriting.
Classification
Categorize documents by type with confidence scores. Customizable categories for your domain.
Summarization
Generate concise summaries of long documents. Configurable length and focus areas.
Entity Extraction
Extract people, companies, dates, amounts, and custom entities into structured JSON.
Batch Processing
Submit up to 1,000 documents per batch. Parallel processing with webhook delivery.
Built for production
Multi-format parsing
PDF, DOCX, XLSX, images, and scanned documents. One API handles all formats and automatically detects pages that need OCR.
Intelligent classification
Automatically categorize documents by type. Invoices, contracts, reports, and custom categories for your domain.
Structured entity extraction
Pull out names, dates, amounts, and custom entities. Returns typed JSON you can store directly in your database.
Batch processing
Submit thousands of documents in batches of up to 1,000. Process in parallel with results delivered via webhook.
TypeScript SDK
Full type safety for document processing results. Schema validation ensures your output matches expectations.
Simple pricing
$0.001 per step execution. A four-step pipeline (parse, classify, extract, summarize) costs $0.004 per document, under half a cent.
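To make the pricing arithmetic concrete, here is a minimal estimator. It assumes each step in a workflow counts as exactly one step execution at the $0.001 rate quoted above, and that nothing else is billed. The estimateCostMicroUsd and formatUsd helpers are hypothetical names, not part of the SDK.

```typescript
// Price taken from the text: $0.001 per step execution.
// Stored in micro-dollars (millionths of a dollar) so the arithmetic stays in integers.
const PRICE_PER_STEP_MICRO_USD = 1000

// Assumption: every step in the workflow is billed as one step execution.
function estimateCostMicroUsd(documents: number, stepsPerDocument: number): number {
  if (!Number.isInteger(documents) || documents < 0) {
    throw new RangeError('documents must be a non-negative integer')
  }
  if (!Number.isInteger(stepsPerDocument) || stepsPerDocument < 0) {
    throw new RangeError('stepsPerDocument must be a non-negative integer')
  }
  return documents * stepsPerDocument * PRICE_PER_STEP_MICRO_USD
}

// Convert micro-dollars to a display string with three decimal places.
function formatUsd(microUsd: number): string {
  return `$${(microUsd / 1000000).toFixed(3)}`
}

// One document through parse, classify, extract, and summarize (4 steps).
const perDocument = estimateCostMicroUsd(1, 4)
// A full batch of 1,000 documents through the same pipeline.
const perBatch = estimateCostMicroUsd(1000, 4)

console.log(formatUsd(perDocument)) // $0.004
console.log(formatUsd(perBatch)) // $4.000
```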
Common implementations
Invoice Processing
Extract line items, totals, vendor details, and payment terms from invoices for accounts payable automation.
Contract Analysis
Parse contracts to identify key clauses, obligations, renewal dates, and risk factors.
Resume Screening
Extract skills, experience, education, and contact info from resumes to populate your ATS.
Insurance Claims
Process claim documents to extract incident details, policy numbers, and damage assessments.
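In use cases like these, extracted values usually end up in a database, so it can be worth checking the shape of the result at runtime before storing it. The sketch below validates an object against the extract step's outputSchema from the example workflow (entities, dates, amounts). The interfaces and guard functions are written for illustration and are not types exported by the SDK. The sample values are made up.

```typescript
// Shapes mirroring the outputSchema of the `extract` step in the example workflow.
interface Entity {
  name: string
  type: string
  value: string
}
interface Amount {
  value: number
  currency: string
}
interface ExtractOutput {
  entities: Entity[]
  dates: string[]
  amounts: Amount[]
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null && !Array.isArray(v)
}

function isEntity(v: unknown): v is Entity {
  if (!isRecord(v)) return false
  const { name, type, value } = v
  return typeof name === 'string' && typeof type === 'string' && typeof value === 'string'
}

function isAmount(v: unknown): v is Amount {
  if (!isRecord(v)) return false
  const { value, currency } = v
  // Reject NaN and Infinity as well as non-numbers.
  return typeof value === 'number' && Number.isFinite(value) && typeof currency === 'string'
}

// Runtime check that an unknown value has the full extract output shape.
function isExtractOutput(v: unknown): v is ExtractOutput {
  if (!isRecord(v)) return false
  const { entities, dates, amounts } = v
  return (
    Array.isArray(entities) && entities.every(isEntity) &&
    Array.isArray(dates) && dates.every((d) => typeof d === 'string') &&
    Array.isArray(amounts) && amounts.every(isAmount)
  )
}

// Made-up sample data for illustration only.
const sample: unknown = {
  entities: [{ name: 'Acme Corp', type: 'company', value: 'Acme Corp' }],
  dates: ['2024-01-15'],
  amounts: [{ value: 1250, currency: 'USD' }],
}

if (isExtractOutput(sample)) {
  // Inside this branch, `sample` is typed as ExtractOutput.
  console.log(sample.amounts[0].currency)
}
```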
FAQ
Frequently asked questions
The document parser supports PDF, DOCX, XLSX, PPTX, TXT, CSV, HTML, and common image formats (PNG, JPG, TIFF) with OCR. Scanned PDFs are automatically detected and processed with OCR.
When OCR is enabled, we detect whether pages contain text layers or are image-based. Image-based pages are processed with high-accuracy OCR that handles multiple languages, tables, and handwriting. OCR adds roughly 1-2 seconds per page.
Batch processing is supported. Use the batch endpoint to submit up to 1,000 documents per request. Documents are processed in parallel and results are delivered via webhook or polling. Batch processing is ideal for backfill jobs and nightly imports.
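For a backlog larger than the 1,000-document limit, the list has to be split into several batch requests. A minimal sketch of that split follows. The chunk helper and the example URLs are hypothetical, and the call that actually submits each batch is deliberately left out because its name and signature are not shown here.

```typescript
// Limit taken from the text: up to 1,000 documents per batch request.
const MAX_DOCUMENTS_PER_BATCH = 1000

// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer')
  }
  const chunks: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size))
  }
  return chunks
}

// Hypothetical backlog of 2,500 document URLs.
const urls = Array.from({ length: 2500 }, (_, i) => `https://example.com/docs/${i}.pdf`)

const batches = chunk(urls, MAX_DOCUMENTS_PER_BATCH)
console.log(batches.map((b) => b.length)) // [1000, 1000, 500]
```

Each element of batches would then be sent as one batch request, with results collected via webhook or polling as described above.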
Classification accuracy depends on the model and document type. GPT-4o achieves 95%+ accuracy on standard business documents like invoices, contracts, and reports. You can improve accuracy by customizing the classification categories for your domain.
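Since no classifier is right every time, one common pattern is to accept a result only when the reported confidence clears a threshold and to send everything else to manual review. The sketch below works on the { category, confidence } shape from the example workflow. The 0.9 threshold and the routeClassification helper are assumptions for illustration. A model-reported confidence is not necessarily a calibrated probability, so the threshold should be tuned against a labeled sample of your own documents.

```typescript
// Shape of the `classify` step output in the example workflow.
interface Classification {
  category: string
  confidence: number
}

type Route =
  | { action: 'accept'; category: string }
  | { action: 'review'; reason: string }

// Hypothetical default threshold; not a recommended or documented value.
const MIN_CONFIDENCE = 0.9

function routeClassification(
  result: Classification,
  allowed: readonly string[],
  minConfidence: number = MIN_CONFIDENCE,
): Route {
  // The model may return a label outside the requested set.
  if (!allowed.includes(result.category)) {
    return { action: 'review', reason: `unknown category: ${result.category}` }
  }
  // Written as a negated >= so that NaN also falls through to review.
  if (!(result.confidence >= minConfidence)) {
    return { action: 'review', reason: `low confidence: ${result.confidence}` }
  }
  return { action: 'accept', category: result.category }
}

// Categories from the classification prompt in the example workflow.
const categories = ['invoice', 'contract', 'report', 'letter', 'form']

const confident = routeClassification({ category: 'invoice', confidence: 0.97 }, categories)
const uncertain = routeClassification({ category: 'invoice', confidence: 0.62 }, categories)
console.log(confident.action) // accept
console.log(uncertain.action) // review
```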
Individual documents can be up to 50MB and 500 pages. For larger documents, split them into sections before processing. The parser preserves page numbers and section headers for traceability.
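As a rough illustration of the splitting step, the sketch below computes page ranges that each stay within the 500-page limit. It only plans the ranges. Cutting the file itself requires a PDF library, which is not shown, and the 50MB size limit would need a separate check on each resulting section. The splitIntoPageRanges helper is a hypothetical name.

```typescript
// Limit taken from the text: up to 500 pages per document.
const MAX_PAGES_PER_DOCUMENT = 500

// 1-based, inclusive page range.
interface PageRange {
  start: number
  end: number
}

// Plan consecutive page ranges of at most `maxPages` pages each.
function splitIntoPageRanges(
  totalPages: number,
  maxPages: number = MAX_PAGES_PER_DOCUMENT,
): PageRange[] {
  if (!Number.isInteger(totalPages) || totalPages < 0) {
    throw new RangeError('totalPages must be a non-negative integer')
  }
  if (!Number.isInteger(maxPages) || maxPages < 1) {
    throw new RangeError('maxPages must be a positive integer')
  }
  const ranges: PageRange[] = []
  for (let start = 1; start <= totalPages; start += maxPages) {
    ranges.push({ start, end: Math.min(start + maxPages - 1, totalPages) })
  }
  return ranges
}

const ranges = splitIntoPageRanges(1200)
console.log(ranges)
// [{ start: 1, end: 500 }, { start: 501, end: 1000 }, { start: 1001, end: 1200 }]
```

Keeping the original page numbers in each range makes it possible to map extracted entities back to their location in the full document.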