Extract, classify, and summarize documents at scale

Process PDFs, Word docs, and scanned images with AI workflows. Parse content, classify document types, extract entities, and generate summaries in a single pipeline.

PDF & Word Parsing | OCR | Classification | Entity Extraction
import { Stack0 } from '@stack0/sdk'

const stack0 = new Stack0({ apiKey: process.env.STACK0_API_KEY })

// Process a document through extraction, classification, and summarization
const result = await stack0.workflows.run({
  steps: [
    {
      id: 'parse',
      type: 'tool',
      tool: 'document.parse',
      input: {
        url: '{{input.documentUrl}}',
        ocr: true, // Enable OCR for scanned documents
      },
    },
    {
      id: 'classify',
      type: 'llm',
      model: 'gpt-4o-mini',
      prompt: 'Classify this document into one of: invoice, contract, report, letter, form. Document: {{steps.parse.output.text}}',
      outputSchema: {
        category: 'string',
        confidence: 'number',
      },
      dependsOn: ['parse'],
    },
    {
      id: 'extract',
      type: 'llm',
      model: 'gpt-4o',
      prompt: 'Extract key entities from this {{steps.classify.output.category}}: {{steps.parse.output.text}}',
      outputSchema: {
        entities: '{ name: string, type: string, value: string }[]',
        dates: 'string[]',
        amounts: '{ value: number, currency: string }[]',
      },
      dependsOn: ['parse', 'classify'],
    },
    {
      id: 'summarize',
      type: 'llm',
      model: 'gpt-4o-mini',
      prompt: 'Summarize this document in 2-3 sentences: {{steps.parse.output.text}}',
      dependsOn: ['parse'],
    },
  ],
  input: {
    documentUrl: 'https://example.com/invoice-2024-001.pdf',
  },
})

console.log(result.steps.classify.output) // { category: 'invoice', confidence: 0.97 }
console.log(result.steps.extract.output) // { entities: [...], dates: [...], amounts: [...] }
console.log(result.steps.summarize.output) // "This invoice from Acme Corp..."

What's included

PDF & Word Parsing

Extract text, tables, and metadata from PDFs and Word documents. Preserves structure and formatting.

OCR

Automatic OCR for scanned documents and images. Handles multiple languages, tables, and handwriting.

Classification

Categorize documents by type with confidence scores. Customizable categories for your domain.
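Confidence scores are most useful when something acts on them. The sketch below shows one way to route a classification result: accept it when the category is expected and the score is high enough, otherwise send it to manual review. The `Classification` shape mirrors the `classify` step's output schema in the example above; `routeClassification`, `Route`, and the 0.8 threshold are hypothetical names and values chosen for illustration, not part of the SDK.

```typescript
// Shape taken from the `classify` step's outputSchema in the example above.
interface Classification {
  category: string
  confidence: number
}

// Hypothetical result type for this sketch.
type Route =
  | { action: 'accept'; category: string }
  | { action: 'review'; reason: string }

// Assumed threshold; tune it against your own labeled documents.
const MIN_CONFIDENCE = 0.8

function routeClassification(
  result: Classification,
  allowed: readonly string[],
  minConfidence: number = MIN_CONFIDENCE,
): Route {
  // A category outside the prompt's list means the model drifted off-schema.
  if (allowed.indexOf(result.category) === -1) {
    return { action: 'review', reason: `unexpected category: ${result.category}` }
  }
  if (result.confidence < minConfidence) {
    return { action: 'review', reason: `low confidence: ${result.confidence}` }
  }
  return { action: 'accept', category: result.category }
}

const categories = ['invoice', 'contract', 'report', 'letter', 'form']

console.log(routeClassification({ category: 'invoice', confidence: 0.97 }, categories))
console.log(routeClassification({ category: 'invoice', confidence: 0.41 }, categories))
```

Checking the category before the score keeps an off-list label from being accepted just because the model reported high confidence.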

Summarization

Generate concise summaries of long documents. Configurable length and focus areas.

Entity Extraction

Extract people, companies, dates, amounts, and custom entities into structured JSON.

Batch Processing

Submit up to 1,000 documents per batch. Parallel processing with webhook delivery.


Built for production

Multi-format parsing

PDF, DOCX, XLSX, images, and scanned documents. One API handles all formats with automatic OCR detection.

Intelligent classification

Automatically categorize documents by type. Invoices, contracts, reports, and custom categories for your domain.

Structured entity extraction

Pull out names, dates, amounts, and custom entities. Returns typed JSON you can store directly in your database.
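Before storing extracted entities, it can help to check the payload at runtime. The following is a minimal sketch of that check, assuming the output follows the `extract` step's `outputSchema` from the example above. The interfaces and the `isExtractOutput` guard are hand-written for illustration and are not SDK exports; the sample payload is made up.

```typescript
// Types mirroring the `extract` step's outputSchema in the example above.
interface Entity { name: string; type: string; value: string }
interface Amount { value: number; currency: string }
interface ExtractOutput { entities: Entity[]; dates: string[]; amounts: Amount[] }

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null && !Array.isArray(v)
}

function isEntity(v: unknown): v is Entity {
  return (
    isRecord(v) &&
    typeof v.name === 'string' &&
    typeof v.type === 'string' &&
    typeof v.value === 'string'
  )
}

function isAmount(v: unknown): v is Amount {
  return (
    isRecord(v) &&
    typeof v.value === 'number' &&
    isFinite(v.value) &&
    typeof v.currency === 'string'
  )
}

// Type guard: true only when every field has the expected shape.
function isExtractOutput(v: unknown): v is ExtractOutput {
  return (
    isRecord(v) &&
    Array.isArray(v.entities) && v.entities.every(isEntity) &&
    Array.isArray(v.dates) && v.dates.every((d: unknown) => typeof d === 'string') &&
    Array.isArray(v.amounts) && v.amounts.every(isAmount)
  )
}

// Made-up payload for illustration only.
const sample: unknown = JSON.parse(
  '{"entities":[{"name":"Vendor","type":"company","value":"Acme Corp"}],' +
  '"dates":["2024-01-15"],"amounts":[{"value":1200.5,"currency":"USD"}]}',
)

if (isExtractOutput(sample)) {
  // Inside this branch, `sample` is typed as ExtractOutput.
  console.log(sample.amounts[0].currency)
}
```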

Batch processing

Submit up to 1,000 documents per batch. Documents are processed in parallel, with results delivered via webhook or polling.

TypeScript SDK

Full type safety for document processing results. Schema validation ensures your output matches expectations.

Simple pricing

$0.001 per step execution. Parsing, classifying, extracting, and summarizing a document takes four steps, or $0.004, well under a cent.
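To see how the per-step price adds up, here is a small worked calculation. It counts only the $0.001 step fee quoted above; whether anything else (for example model usage) is billed separately is not stated here, so treat the result as a lower bound. `stepCostUsd` is a hypothetical helper, not an SDK function.

```typescript
// Price from the pricing note above: $0.001 per step execution.
// Stored as an integer number of thousandths of a dollar to avoid float drift.
const PRICE_PER_STEP_MILLS = 1

function stepCostUsd(stepsPerDocument: number, documents: number): number {
  const isCount = (n: number) => n % 1 === 0 && n >= 0
  if (!isCount(stepsPerDocument) || !isCount(documents)) {
    throw new RangeError('steps and documents must be non-negative integers')
  }
  const totalMills = PRICE_PER_STEP_MILLS * stepsPerDocument * documents
  return totalMills / 1000
}

// The example pipeline has four steps: parse, classify, extract, summarize.
console.log(stepCostUsd(4, 1))    // 0.004
console.log(stepCostUsd(4, 1000)) // 4
```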


Common implementations

Invoice Processing

Extract line items, totals, vendor details, and payment terms from invoices for accounts payable automation.

Contract Analysis

Parse contracts to identify key clauses, obligations, renewal dates, and risk factors.

Resume Screening

Extract skills, experience, education, and contact info from resumes to populate your ATS.

Insurance Claims

Process claim documents to extract incident details, policy numbers, and damage assessments.


FAQ

Frequently asked questions

The document parser supports PDF, DOCX, XLSX, PPTX, TXT, CSV, HTML, and common image formats (PNG, JPG, TIFF) with OCR. Scanned PDFs are automatically detected and processed with OCR.

When OCR is enabled, we detect whether pages contain text layers or are image-based. Image-based pages are processed with high-accuracy OCR that handles multiple languages, tables, and handwriting. OCR adds roughly 1-2 seconds per page.

Batch processing is supported. Use the batch endpoint to submit up to 1,000 documents per request. Documents are processed in parallel and results are delivered via webhook or polling. Batch processing is ideal for backfill jobs and nightly imports.
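For a backfill larger than the 1,000-document limit, the list has to be split across several requests. The sketch below shows only the splitting step, in plain TypeScript; it deliberately does not call the batch endpoint, whose exact method name and payload are not shown on this page. `chunkIntoBatches` and the example URLs are hypothetical.

```typescript
// Limit taken from the batch description above.
const MAX_BATCH_SIZE = 1000

// Split a list into consecutive batches of at most `maxSize` items.
function chunkIntoBatches<T>(items: readonly T[], maxSize: number = MAX_BATCH_SIZE): T[][] {
  if (maxSize % 1 !== 0 || maxSize < 1) {
    throw new RangeError('maxSize must be a positive integer')
  }
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += maxSize) {
    batches.push(items.slice(i, i + maxSize))
  }
  return batches
}

// Placeholder URLs for illustration.
const documentUrls: string[] = []
for (let i = 0; i < 2500; i++) {
  documentUrls.push(`https://example.com/docs/${i}.pdf`)
}

const batches = chunkIntoBatches(documentUrls)
console.log(batches.map((b) => b.length)) // [ 1000, 1000, 500 ]
```

Each resulting batch can then be submitted as its own request, with results collected by webhook or polling as described above.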

Classification accuracy depends on the model and document type. GPT-4o achieves 95%+ accuracy on standard business documents like invoices, contracts, and reports. You can improve accuracy by customizing the classification categories for your domain.

Individual documents can be up to 50 MB and 500 pages. For larger documents, split them into sections before processing. The parser preserves page numbers and section headers for traceability.
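When a document exceeds the 500-page limit, the first step is deciding where to cut. The sketch below computes 1-based, inclusive page ranges that each fit within the limit. It only plans the ranges; producing the actual section files requires a PDF tool of your choice, and the 50 MB size limit still needs a separate check. `splitPageRanges` and `PageRange` are hypothetical names.

```typescript
// Page limit taken from the answer above.
const MAX_PAGES_PER_DOCUMENT = 500

// 1-based, inclusive page range.
interface PageRange {
  start: number
  end: number
}

function splitPageRanges(
  totalPages: number,
  maxPages: number = MAX_PAGES_PER_DOCUMENT,
): PageRange[] {
  const isPositiveInt = (n: number) => n % 1 === 0 && n >= 1
  if (!isPositiveInt(totalPages) || !isPositiveInt(maxPages)) {
    throw new RangeError('totalPages and maxPages must be positive integers')
  }
  const ranges: PageRange[] = []
  for (let start = 1; start <= totalPages; start += maxPages) {
    ranges.push({ start, end: Math.min(start + maxPages - 1, totalPages) })
  }
  return ranges
}

console.log(splitPageRanges(1200))
// [ { start: 1, end: 500 }, { start: 501, end: 1000 }, { start: 1001, end: 1200 } ]
```

Keeping the original page numbers in each range makes it straightforward to map results from a section back to the source document.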


Ready to build?

Get started in minutes.
