olmOCR

Adaptive Content-Aware Processing: Automatically classifies document content types including tables, diagrams, and mathematical equations to apply specialized OCR strategies for enhanced accuracy
Reinforcement Learning Optimization: GRPO RL training specifically enhances accuracy on mathematical equations, tables, and other difficult OCR cases
Excellent Benchmark Performance: Scores 82.4 overall on olmOCR-bench with strong results across arXiv documents, old scans, headers, footers, and multi-column layouts
Specialized Document Processing: Optimized for document images whose longest dimension is 1288 pixels; requires specific metadata prompts for best results
Scalable Toolkit Support: Designed to work with the olmOCR toolkit for efficient vLLM-based inference capable of processing millions of documents
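
The 1288-pixel longest-dimension figure above implies a resize step before inference. As a rough illustration only, the sketch below computes target dimensions that preserve aspect ratio. The function name fit_longest_side is hypothetical and is not part of the olmOCR toolkit, which may already handle rendering and resizing itself.

```python
def fit_longest_side(width: int, height: int, target: int = 1288) -> tuple[int, int]:
    """Scale (width, height) so the longer side equals `target`, keeping aspect ratio.

    Hypothetical helper for illustration; not an olmOCR toolkit function.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = target / max(width, height)
    # Round to the nearest pixel, but never let a side collapse to zero.
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    return new_w, new_h


# A US Letter page rendered at 300 DPI is 2550 x 3300 pixels.
print(fit_longest_side(2550, 3300))
```

Note that this sketch also upscales images smaller than the target; whether upscaling is desirable is an assumption to check against the toolkit's own preprocessing.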

Tags: ai model vision ocr

Last modified 22 March 2026