Graflows - Turn Your Documents into Structured Data

Introduction

Graflows is a Document Intelligence API that converts unstructured PDFs and images into structured JSON while preserving complex layouts and table structures.

Layout Aware

Maintains the spatial relationships of text and tables for precise extraction.

High Performance

Optimized for speed, from small receipts to thousand-page reports.

Authentication

Every Graflows API request must include an API key in the Authorization header. Requests without a valid key return a 401 Unauthorized response.

Authorization: Bearer <YOUR_API_KEY>
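As a minimal Python sketch of the header described above: the helper name auth_headers and the environment variable GRAFLOWS_API_KEY are assumptions made for this example, not part of the Graflows API.

```python
import os
from typing import Dict, Optional


def auth_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return the Authorization header expected by the API."""
    # GRAFLOWS_API_KEY is a hypothetical variable name chosen for this sketch.
    key = api_key or os.environ.get("GRAFLOWS_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {"Authorization": f"Bearer {key}"}
```

Failing early on a missing key avoids a round trip that would only return 401 Unauthorized.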


Extraction Modes

Synchronous Extraction (Recommended for UI)

Best for real-time applications where immediate feedback is required. The connection stays open until the processing is complete.

Limit: 3 pages per document

Timeout: 60 seconds

Asynchronous Jobs (Batch Processing)

Designed for processing large volumes of documents or long files. Upload the file, receive a job_id, and poll for results.

Limit: Up to 1,000 pages (Enterprise)

Webhook support coming soon
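The choice between the two modes can be sketched as a small Python helper. It assumes the 3-page synchronous limit is inclusive and uses the 1,000-page Enterprise figure as the asynchronous ceiling; other plans may have a lower ceiling. The function name choose_endpoint is hypothetical.

```python
SYNC_PAGE_LIMIT = 3       # "Limit: 3 pages per document" (treated as inclusive)
ASYNC_PAGE_LIMIT = 1000   # "Up to 1,000 pages (Enterprise)"


def choose_endpoint(page_count: int) -> str:
    """Pick the endpoint path that fits a document of the given length."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if page_count <= SYNC_PAGE_LIMIT:
        return "/parse/extract"   # synchronous extraction
    if page_count <= ASYNC_PAGE_LIMIT:
        return "/jobs/"           # asynchronous job
    raise ValueError("document exceeds the asynchronous page limit")
```

For example, choose_endpoint(2) selects the synchronous endpoint, while choose_endpoint(250) selects the jobs endpoint.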

Schema Guide

Define exactly what you want to extract using a dynamic JSON schema. Graflows supports simple key-value pairs, complex nested structures, and optional schema memory for reusing approved examples.

Complex Invoice Example

JSON Schema

{
  "invoice_number": "Invoice ID string",
  "vendor_name": "Name of the issuer",
  "total_amount": "Total amount in numeric value",
  "line_items": {
    "_type": "list",
    "_description": "A list of individual charges",
    "description": "Item description",
    "quantity": "Number of units",
    "price": "Price per unit"
  }
}

Use the "_type": "list" convention to handle repeating structures like invoice line items or table rows.
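To illustrate the convention, here is a short Python sketch that inspects a schema written this way. Treating keys that start with an underscore as metadata is an inference from the example above, and both helper names are hypothetical.

```python
from typing import Any, Dict, List

invoice_schema: Dict[str, Any] = {
    "invoice_number": "Invoice ID string",
    "vendor_name": "Name of the issuer",
    "total_amount": "Total amount in numeric value",
    "line_items": {
        "_type": "list",
        "_description": "A list of individual charges",
        "description": "Item description",
        "quantity": "Number of units",
        "price": "Price per unit",
    },
}


def list_fields(schema: Dict[str, Any]) -> List[str]:
    """Return the top-level keys declared with "_type": "list"."""
    return [
        key for key, value in schema.items()
        if isinstance(value, dict) and value.get("_type") == "list"
    ]


def item_fields(list_schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return the per-item fields, skipping underscore-prefixed metadata keys."""
    return {k: v for k, v in list_schema.items() if not k.startswith("_")}
```

Here list_fields(invoice_schema) identifies line_items as the repeating structure, and item_fields(invoice_schema["line_items"]) yields the three fields expected on each row.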

For saved schemas, set settings.memory_enabled to true to let approved files become reusable examples.

Document Memory

Reuse approved files as examples

Graflows can reuse a user's approved files as one-shot examples for future extractions of the same saved schema. Memory stays private to the current user and is only applied when you extract with schema_id.

Private by default

Memory is scoped to the current user and never shared across accounts.

Approval controlled

Only explicitly approved files become reusable examples for future runs.

Saved-schema only

Memory is used only when extracting against a saved schema via schema_id.

Step 1: Enable memory on a saved schema

Turn memory on at schema creation time, or patch an existing schema later.

Create schema with memory

curl -X POST "http://api.graflows.com/api/v1/schemas/" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Invoices",
    "description": "Invoice extraction",
    "definition": {
      "invoice_number": "Invoice number",
      "vendor": "Vendor name",
      "total_amount": "Total amount"
    },
    "settings": {
      "memory_enabled": true
    }
  }'

Enable memory on an existing schema

curl -X PATCH "http://api.graflows.com/api/v1/schemas/{schema_id}" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "settings": {
      "memory_enabled": true
    }
  }'
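The two curl calls above can be mirrored in Python with the standard library. This is a sketch that only builds the request object; the helper name schema_request is hypothetical, and the base URL is copied as written in the curl examples.

```python
import json
import urllib.request
from typing import Optional

# Base URL copied from the curl examples above.
BASE_URL = "http://api.graflows.com/api/v1"


def schema_request(api_key: str, body: dict,
                   schema_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a create or update request for a saved schema."""
    if schema_id is None:
        url, method = f"{BASE_URL}/schemas/", "POST"
    else:
        url, method = f"{BASE_URL}/schemas/{schema_id}", "PATCH"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Enable memory on an existing schema; "schema-uuid" is a placeholder ID.
req = schema_request("<API_KEY>", {"settings": {"memory_enabled": True}},
                     schema_id="schema-uuid")
```

Passing the result to urllib.request.urlopen(req) would send it; that step is left out so the sketch has no network dependency.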

Step 2: Extract with the saved schema

When a similar approved file is found, the response includes the matched source file and similarity score.

Extract with schema_id

curl -X POST "http://api.graflows.com/api/v1/parse/extract" \
  -H "Authorization: Bearer <API_KEY>" \
  -F "schema_id={schema_id}" \
  -F "file=@invoice.pdf"

Relevant response fields

{
  "id": "file-uuid",
  "filename": "invoice.pdf",
  "status": "processed",
  "review_status": "pending",
  "schema_id": "schema-uuid",
  "schema_name": "Invoices",
  "schema_version_number": 1,
  "normalized_result": {
    "invoice_number": "INV-1001",
    "vendor": "Acme Corp",
    "total_amount": "$1250.00"
  },
  "raw_result": {
    "invoice_number": "INV-1001",
    "vendor": "Acme Corp",
    "total_amount": "$1250.00"
  },
  "approved_at": null,
  "memory_source_file_id": "file-prior-approved",
  "memory_similarity": 0.94
}
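A caller may want to know whether memory was applied to a given extraction. The Python sketch below reads the two memory fields from the response shown above. It assumes these fields are null or absent when no approved file matched, which this document does not state; the helper name memory_match is hypothetical.

```python
from typing import Any, Dict, Optional, Tuple


def memory_match(response: Dict[str, Any]) -> Optional[Tuple[str, Optional[float]]]:
    """Return (source_file_id, similarity) if a memory example was used, else None."""
    # Assumption: the field is null or missing when no approved file matched.
    source_id = response.get("memory_source_file_id")
    if source_id is None:
        return None
    return source_id, response.get("memory_similarity")


example = {
    "id": "file-uuid",
    "memory_source_file_id": "file-prior-approved",
    "memory_similarity": 0.94,
}
match = memory_match(example)
```

A low similarity score could be used to flag the file for closer manual review before approval.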

Step 3: Approve the corrected normalized result

Approval should include the edited normalized_result. That approved JSON becomes the reusable example output for future extractions.

Approve a file

curl -X PATCH "http://api.graflows.com/api/v1/files/{file_id}/review" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "review_status": "approved",
    "normalized_result": {
      "invoice_number": "INV-1001",
      "vendor": "Acme Corp",
      "total_amount": "$1250.00"
    }
  }'

Step 4: Reject files you do not want reused

Rejected files stay out of memory and can still be inspected in the dashboard.

Reject a file

curl -X PATCH "http://api.graflows.com/api/v1/files/{file_id}/review" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "review_status": "rejected"
  }'
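The approve and reject bodies from Steps 3 and 4 can be produced by one small Python helper. This is a sketch; the function name review_body is hypothetical, and requiring normalized_result on approval reflects the guidance above rather than a documented server-side rule.

```python
import json
from typing import Any, Dict, Optional


def review_body(approved: bool,
                normalized_result: Optional[Dict[str, Any]] = None) -> str:
    """Build the JSON body for PATCH /files/{file_id}/review."""
    if not approved:
        return json.dumps({"review_status": "rejected"})
    if normalized_result is None:
        # Approval should carry the edited result that becomes the example.
        raise ValueError("approval requires the edited normalized_result")
    return json.dumps({
        "review_status": "approved",
        "normalized_result": normalized_result,
    })
```

For example, review_body(True, {"invoice_number": "INV-1001"}) yields an approval body, and review_body(False) yields the rejection body.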

API Reference

POST /parse/extract

Performs immediate extraction on documents within the synchronous limit of 3 pages. Use extraction_schema for ad hoc runs, or pass schema_id to target a saved schema with optional memory.

Ad hoc extraction

curl -X POST http://api.graflows.com/api/v1/parse/extract \
  -H "Authorization: Bearer <API_KEY>" \
  -F "file=@invoice.pdf" \
  -F 'extraction_schema={"total": "Total amount"}'
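The -F flags in the curl call send a multipart/form-data body, with extraction_schema passed as a JSON string. The Python sketch below builds an equivalent body with the standard library. The helper name encode_multipart is hypothetical, and the exact content type the server expects for the file part is an assumption.

```python
import json
import mimetypes
import uuid
from typing import Dict, Tuple


def encode_multipart(fields: Dict[str, str], filename: str,
                     file_bytes: bytes) -> Tuple[bytes, str]:
    """Return (body, content_type) for a multipart/form-data upload."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields, such as extraction_schema or schema_id.
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The file part; the MIME type is guessed from the file name.
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: {mime}\r\n\r\n'.encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


# The schema is serialized to a JSON string, as in the curl example.
fields = {"extraction_schema": json.dumps({"total": "Total amount"})}
body, content_type = encode_multipart(fields, "invoice.pdf", b"%PDF-1.4 sample")
```

The returned content_type goes in the Content-Type header next to the Authorization header. Swapping the fields dictionary for {"schema_id": "..."} gives the saved-schema variant.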

Saved-schema extraction

curl -X POST "http://api.graflows.com/api/v1/parse/extract" \
  -H "Authorization: Bearer <API_KEY>" \
  -F "schema_id={schema_id}" \
  -F "file=@invoice.pdf"

Saved-schema response fields

{
  "id": "file-uuid",
  "filename": "invoice.pdf",
  "status": "processed",
  "review_status": "pending",
  "schema_id": "schema-uuid",
  "schema_name": "Invoices",
  "schema_version_number": 1,
  "normalized_result": {
    "invoice_number": "INV-1001",
    "vendor": "Acme Corp",
    "total_amount": "$1250.00"
  },
  "raw_result": {
    "invoice_number": "INV-1001",
    "vendor": "Acme Corp",
    "total_amount": "$1250.00"
  },
  "approved_at": null,
  "memory_source_file_id": "file-prior-approved",
  "memory_similarity": 0.94
}

POST /schemas/

Create a saved schema and enable memory with settings.memory_enabled when you want approved files to become reusable examples.

Create schema

curl -X POST "http://api.graflows.com/api/v1/schemas/" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Invoices",
    "description": "Invoice extraction",
    "definition": {
      "invoice_number": "Invoice number",
      "vendor": "Vendor name",
      "total_amount": "Total amount"
    },
    "settings": {
      "memory_enabled": true
    }
  }'

PATCH /schemas/{schema_id}

Update an existing schema to turn memory on later without changing the schema definition.

Enable memory on schema

curl -X PATCH "http://api.graflows.com/api/v1/schemas/{schema_id}" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "settings": {
      "memory_enabled": true
    }
  }'

PATCH /files/{file_id}/review

Approve or reject a processed file. Approval should include the edited normalized_result; rejected files are excluded from memory.

Approve file

curl -X PATCH "http://api.graflows.com/api/v1/files/{file_id}/review" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "review_status": "approved",
    "normalized_result": {
      "invoice_number": "INV-1001",
      "vendor": "Acme Corp",
      "total_amount": "$1250.00"
    }
  }'

Reject file

curl -X PATCH "http://api.graflows.com/api/v1/files/{file_id}/review" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "review_status": "rejected"
  }'

POST /jobs/

Creates a background processing job for large documents.

Request (CURL)

curl -X POST http://api.graflows.com/api/v1/jobs/ \
  -H "Authorization: Bearer <API_KEY>" \
  -F "file=@large-report.pdf"

GET /jobs/{id}

Retrieve the status and results of an async job.
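Since results are retrieved by polling, a client needs a loop with a delay and an upper bound. The Python sketch below shows one way to structure it. The job response shape is not documented here, so the "status" field and the terminal values "processed" and "failed" are assumptions, and fetch_status stands in for whatever function performs the GET request.

```python
import time
from typing import Any, Callable, Dict, Iterable


def poll_job(fetch_status: Callable[[], Dict[str, Any]],
             interval: float = 2.0,
             max_attempts: int = 30,
             terminal_statuses: Iterable[str] = ("processed", "failed"),
             sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call fetch_status until the job reaches a terminal status or attempts run out."""
    terminal = set(terminal_statuses)  # assumed status names, adjust as needed
    for _ in range(max_attempts):
        job = fetch_status()
        if job.get("status") in terminal:
            return job
        sleep(interval)
    raise TimeoutError(f"job not finished after {max_attempts} attempts")
```

Injecting fetch_status and sleep keeps the loop independent of any HTTP library and makes it straightforward to exercise with canned responses.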

POST /parse/markdown

Utility endpoint to export document contents as clean Markdown.

Request (CURL)

curl -X POST http://api.graflows.com/api/v1/parse/markdown \
  -H "Authorization: Bearer <API_KEY>" \
  -F "file=@document.pdf"
