Getting Started
Installation
Minimum System Requirements
Linux (Ubuntu 20.04+), Windows 10+, or macOS 11+
At least 2 CPU cores
At least 8 GB RAM
16 GB RAM or more is recommended for model training
NVIDIA GPU with CUDA support is recommended for acceleration
At least 4 GB VRAM is recommended for GPU execution
CPU-only execution is supported, but performance may be significantly lower than on GPU. Model training currently targets CUDA-capable NVIDIA GPUs and falls back to CPU when CUDA is unavailable.
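As a rough illustration, the CPU and RAM minimums above can be checked from Python before installing. This is a minimal sketch using only the standard library, not part of manuscript-ocr. The helper names total_ram_gb and check_requirements are hypothetical, RAM detection is best-effort (it returns None on Windows), and the reported value is often slightly below the nominal installed amount.

```python
import os
import platform

MIN_CPU_CORES = 2  # minimum from the requirements list above
MIN_RAM_GB = 8     # minimum from the requirements list above


def total_ram_gb():
    """Best-effort total RAM in GiB; None where it cannot be determined."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None  # e.g. Windows, where os.sysconf is unavailable
    return pages * page_size / 1024 ** 3


def check_requirements(cpu_cores, ram_gb):
    """Return a list of problems; an empty list means the minimums are met."""
    problems = []
    if cpu_cores is not None and cpu_cores < MIN_CPU_CORES:
        problems.append(f"only {cpu_cores} CPU core(s), need at least {MIN_CPU_CORES}")
    if ram_gb is not None and ram_gb < MIN_RAM_GB:
        problems.append(f"only {ram_gb:.1f} GB RAM, need at least {MIN_RAM_GB}")
    return problems


print(platform.system(), check_requirements(os.cpu_count(), total_ram_gb()))
```

Values that cannot be detected are skipped rather than reported as failures, so an empty list means "no problem found", not a guarantee.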
Basic installation (inference only):
pip install manuscript-ocr
Installation with training support (includes PyTorch):
pip install "manuscript-ocr[dev]"
This installs additional dependencies for model training:
PyTorch and TorchVision
ONNX export tools
Training utilities (albumentations, tensorboard, etc.)
Development tools (pytest, black, flake8, etc.)
GPU acceleration (NVIDIA CUDA):
If you are switching an existing installation from CPU to GPU:
Remove the CPU version of ONNX Runtime and install the GPU version:
pip uninstall onnxruntime
pip install onnxruntime-gpu
If you are working in Jupyter Notebook, JupyterLab, VS Code notebooks, or Google Colab, restart the kernel or runtime after installation.
Reinstalling manuscript-ocr is not required.
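To confirm which ONNX Runtime builds are actually installed after the switch, the package metadata can be inspected. This is a small sketch using the standard-library importlib.metadata module. The helper name installed_onnxruntime_packages is hypothetical. It assumes the CPU and GPU builds are distributed as onnxruntime and onnxruntime-gpu, as in the pip commands above.

```python
from importlib import metadata


def installed_onnxruntime_packages(names=("onnxruntime", "onnxruntime-gpu")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


packages = installed_onnxruntime_packages()
print(packages)
if packages["onnxruntime"] and packages["onnxruntime-gpu"]:
    # Both builds provide the same "onnxruntime" import name and may conflict.
    print("Both CPU and GPU builds are installed; consider removing onnxruntime.")
```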
You can select the device for individual models and pipeline components
explicitly with the device parameter, for example device="cuda" for an
NVIDIA GPU or device="cpu" for CPU:
from manuscript.detectors import EAST
from manuscript.recognizers import TRBA
from manuscript.correctors import CharLM
detector = EAST(device="cuda")
recognizer = TRBA(device="cuda")
corrector = CharLM(device="cuda")
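Instead of hard-coding the device, it can be derived from the execution providers that ONNX Runtime reports. The sketch below is one possible approach, not behavior built into manuscript-ocr. The helper name choose_device is hypothetical, and it returns the device strings used on this page ("cuda", "coreml", "cpu").

```python
def choose_device(available_providers):
    """Pick a device string from a list of ONNX Runtime provider names."""
    if "CUDAExecutionProvider" in available_providers:
        return "cuda"
    if "CoreMLExecutionProvider" in available_providers:
        return "coreml"
    return "cpu"


try:
    import onnxruntime as ort
    providers = ort.get_available_providers()
except ImportError:
    providers = []  # ONNX Runtime is not installed; fall back to CPU

device = choose_device(providers)
print(device)
```

The resulting value can then be passed to the constructors shown above, for example EAST(device=device).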
Diagnostics
If the pipeline still does not switch to GPU, first check which execution providers ONNX Runtime reports:
import onnxruntime as ort
print(ort.get_available_providers())
Case 1. "CUDAExecutionProvider" is missing
Install additional CUDA/cuDNN runtime packages:
pip install nvidia-cudnn-cu12 nvidia-cublas-cu12 nvidia-cuda-runtime-cu12 nvidia-cufft-cu12
Then restart the kernel or runtime and create the Pipeline again.
If ONNX Runtime appears to be installed but still behaves incorrectly in a notebook environment, perform a clean GPU reinstall:
pip uninstall -y onnxruntime
pip install --no-cache-dir --force-reinstall onnxruntime-gpu==1.24.4
pip install --no-cache-dir nvidia-cudnn-cu12 nvidia-cublas-cu12 nvidia-cuda-runtime-cu12 nvidia-cufft-cu12
After that, restart the kernel or runtime again and re-import manuscript.
Case 2. "CUDAExecutionProvider" is present, but the models still fall back to CPU
In some notebook environments, ONNX Runtime may require an explicit preload
step before importing manuscript:
import onnxruntime as ort
ort.preload_dlls(directory="")
After that, import manuscript and create the Pipeline again.
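The two cases above can be told apart programmatically. The following sketch assumes you have two lists of provider names: the output of ort.get_available_providers(), and the providers actually in use by a session, as returned by onnxruntime.InferenceSession.get_providers(). How to reach the underlying session of a manuscript model is not covered on this page, so the example calls use literal lists. The function name diagnose is hypothetical.

```python
def diagnose(available, active):
    """Classify a GPU fallback problem from two lists of provider names.

    available: providers reported by ort.get_available_providers()
    active:    providers a session actually uses (session.get_providers())
    """
    if "CUDAExecutionProvider" not in available:
        return "case 1: CUDAExecutionProvider is missing"
    if "CUDAExecutionProvider" not in active:
        return "case 2: CUDAExecutionProvider is present but not in use"
    return "ok: CUDA is active"


# Example calls with literal provider lists.
print(diagnose(["CPUExecutionProvider"], ["CPUExecutionProvider"]))
print(diagnose(["CUDAExecutionProvider", "CPUExecutionProvider"],
               ["CPUExecutionProvider"]))
```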
Apple Silicon acceleration (CoreML):
pip install manuscript-ocr
pip install onnxruntime-silicon
Then use device="coreml" for the relevant models or pipeline components.
Quick Start
Basic usage example:
from manuscript import Pipeline
# Create pipeline
pipeline = Pipeline()
# Process image
result = pipeline.predict("document.jpg")
# Get recognized text
text = pipeline.get_text(result["page"])
print(text)
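The same two calls extend naturally to a folder of images. The sketch below relies only on the predict and get_text calls shown in the basic example. The helper name recognize_folder and the default "*.jpg" pattern are illustrative assumptions. The pipeline is passed in as an argument so the helper does not depend on how it was constructed.

```python
from pathlib import Path


def recognize_folder(pipeline, folder, pattern="*.jpg"):
    """Run the pipeline on every matching image and return {filename: text}."""
    texts = {}
    for image_path in sorted(Path(folder).glob(pattern)):
        result = pipeline.predict(str(image_path))
        texts[image_path.name] = pipeline.get_text(result["page"])
    return texts
```

With the library installed, this could be called as recognize_folder(Pipeline(), "scans"), where "scans" is a placeholder folder name.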
Example Notebooks
Current example notebooks are available in the repository's notebooks
folder.
Main Components
Pipeline - High-level OCR pipeline
YOLO - ONNX text detector for YOLO-family models
EAST - Text detector
SimpleSorting - Layout ordering model
TRBA - Text recognizer
CharLM - Character-level text corrector
Page - Page data structure
Block - Block data structure
Line - Line data structure
TextSpan - Smallest OCR text region
Model Zoo
For the built-in presets and release artifacts covered by this version of the documentation, see Model Zoo.