Getting Started

Installation

Basic installation (inference only):

pip install manuscript-ocr

Installation with training support (includes PyTorch):

pip install manuscript-ocr[dev]

This installs additional dependencies for model training:

PyTorch and TorchVision
ONNX export tools
Training utilities (albumentations, tensorboard, etc.)
Development tools (pytest, black, flake8, etc.)

GPU acceleration (NVIDIA CUDA):

pip install manuscript-ocr
pip install onnxruntime-gpu

Apple Silicon acceleration (CoreML):

pip install manuscript-ocr
pip install onnxruntime-silicon

Quick Start

Basic usage example:

from manuscript import Pipeline

# Create pipeline
pipeline = Pipeline()

# Process image
result = pipeline.predict("document.jpg")

# Get recognized text
text = pipeline.get_text(result["page"])
print(text)

Example Notebooks

Current example notebooks are available in the repository notebooks folder:

Main Components

Pipeline - High-level OCR pipeline
YOLO - ONNX text detector for YOLO-family models
EAST - Text detector
SimpleSorting - Layout ordering model
TRBA - Text recognizer
CharLM - Character-level text corrector
Page - Page data structure
Block - Block data structure
Line - Line data structure
TextSpan - Smallest OCR text region

Model Zoo

For the list of built-in presets and release artifacts documented for this documentation version, see Model Zoo.