Skip to the content.

API Reference

Use the Open Receipt OCR REST API to integrate receipt processing into your applications.

Base URL

http://localhost:3000

In production, replace with your deployment URL.

Authentication

Currently, the API does not require authentication. For production deployments, consider adding API key authentication.

OCR Jobs

Upload Files for OCR

Upload one or more files for OCR processing.

Endpoint: POST /ocr-jobs/upload

Request:

Multipart form data with the following fields:

Field Type Required Description
file File Yes The image file (JPEG, PNG, WebP, etc.)
ocrProvider_<index> String Yes OCR provider for the file at <index>
jobName String No Human-readable name for the job

The <index> in ocrProvider_<index> corresponds to the position of the file in the multipart request (starting at 0).

Example:

curl -X POST http://localhost:3000/ocr-jobs/upload \
  -F "file=@receipt.jpg" \
  -F "ocrProvider_0=mistral" \
  -F "jobName=Weekly Groceries"

Response:

{
  "id": 123
}

The id can be used to check job status and retrieve results.

Supported Providers:

Provider must be configured and available (see Configuration Guide).

Get Job Details

Retrieve job status and OCR results.

Endpoint: GET /ocr-jobs/:id

Example:

curl http://localhost:3000/ocr-jobs/123

Response:

{
  "id": 123,
  "name": "Weekly Groceries",
  "status": "completed",
  "createdAt": "2024-01-15T10:30:00Z",
  "updatedAt": "2024-01-15T10:35:00Z",
  "files": [
    {
      "id": 1,
      "originalName": "receipt.jpg",
      "storageKey": "uploads/123/receipt.jpg",
      "executions": [
        {
          "id": 1,
          "status": "completed",
          "ocrProvider": "mistral",
          "ocrData": "{\"markdown\": \"# Receipt\\n\\n...\"}",
          "createdAt": "2024-01-15T10:30:00Z",
          "updatedAt": "2024-01-15T10:35:00Z"
        }
      ]
    }
  ]
}

Job Status Values:

File Execution Status:

List Jobs

Get all jobs (pagination supported).

Endpoint: GET /ocr-jobs

Query Parameters:

Parameter Type Default Description
page Number 1 Page number
limit Number 20 Results per page
sort String createdAt:desc Sort field and direction

Example:

curl "http://localhost:3000/ocr-jobs?page=1&limit=10&sort=createdAt:desc"

Response:

{
  "data": [
    {
      "id": 123,
      "name": "Weekly Groceries",
      "status": "completed",
      "createdAt": "2024-01-15T10:30:00Z"
    }
  ],
  "pagination": {
    "total": 42,
    "page": 1,
    "limit": 10,
    "pages": 5
  }
}

Delete Job

Remove a job and its associated files.

Endpoint: DELETE /ocr-jobs/:id

Example:

curl -X DELETE http://localhost:3000/ocr-jobs/123

Response:

{
  "success": true
}

File Operations

Get File

Retrieve a specific file from a job.

Endpoint: GET /ocr-jobs/:jobId/files/:fileId

Example:

curl http://localhost:3000/ocr-jobs/123/files/1

Response:

Binary file content (original uploaded file).

Delete File

Remove a specific file from a job.

Endpoint: DELETE /ocr-jobs/:jobId/files/:fileId

Example:

curl -X DELETE http://localhost:3000/ocr-jobs/123/files/1

Health Check

Check if the API is running and healthy.

Endpoint: GET /health

Response:

{
  "status": "ok",
  "timestamp": "2024-01-15T10:30:00Z"
}

Error Responses

API errors follow this format:

{
  "statusCode": 400,
  "message": "Bad Request",
  "details": "Provider 'invalid' is not available"
}

Common Status Codes:

Code Meaning
200 Success
400 Bad request (invalid parameters)
404 Resource not found
409 Conflict (job already processing)
500 Server error

Rate Limiting

Currently not implemented, but recommended for production deployments.

Webhooks

Currently not implemented. Poll /ocr-jobs/:id for job status updates.

Examples

Upload Multiple Files

curl -X POST http://localhost:3000/ocr-jobs/upload \
  -F "file=@receipt1.jpg" \
  -F "ocrProvider_0=mistral" \
  -F "file=@receipt2.jpg" \
  -F "ocrProvider_1=openai" \
  -F "jobName=Batch OCR"

Monitor Job Status

#!/bin/bash

JOB_ID=$1

while true; do
  STATUS=$(curl -s http://localhost:3000/ocr-jobs/$JOB_ID | jq -r '.status')
  echo "Job status: $STATUS"
  
  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then
    break
  fi
  
  sleep 2
done

Extract Text from Completed Job

curl -s http://localhost:3000/ocr-jobs/123 | \
  jq '.files[0].executions[0].ocrData | fromjson'

Best Practices

  1. Handle Retries: Implement exponential backoff when polling job status
  2. Validate Input: Check file format and size before uploading
  3. Use Appropriate Providers: Choose providers based on your accuracy and cost requirements
  4. Clean Up: Periodically delete old jobs to manage storage
  5. Monitor Costs: Track API usage if using cloud OCR providers

Integration Examples

JavaScript/Node.js

const FormData = require('form-data');
const fs = require('fs');
const axios = require('axios');

async function uploadReceipt(filePath, provider) {
  const form = new FormData();
  form.append('file', fs.createReadStream(filePath));
  form.append('ocrProvider_0', provider);
  form.append('jobName', 'Receipt');

  const response = await axios.post(
    'http://localhost:3000/ocr-jobs/upload',
    form,
    { headers: form.getHeaders() }
  );

  return response.data.id;
}

async function waitForCompletion(jobId) {
  while (true) {
    const response = await axios.get(
      `http://localhost:3000/ocr-jobs/${jobId}`
    );

    if (response.data.status === 'completed') {
      return response.data;
    }

    await new Promise(resolve => setTimeout(resolve, 1000));
  }
}

Python

import requests
import time
import json

def upload_receipt(file_path, provider):
    with open(file_path, 'rb') as f:
        files = {'file': f}
        data = {'ocrProvider_0': provider}
        response = requests.post(
            'http://localhost:3000/ocr-jobs/upload',
            files=files,
            data=data
        )
    return response.json()['id']

def wait_for_completion(job_id):
    while True:
        response = requests.get(f'http://localhost:3000/ocr-jobs/{job_id}')
        data = response.json()
        
        if data['status'] == 'completed':
            return data
        
        time.sleep(1)

cURL

See examples above for basic cURL usage.