Library Structure
=================
This page gives an overview of the architecture of the manuscript-ocr library.
.. mermaid::
graph LR
%% Main package
manuscript[manuscript]
%% Core modules
manuscript --> Pipeline["Pipeline
Main OCR class"]
manuscript --> data["data
Data structures"]
manuscript --> detectors["detectors
Text detectors"]
manuscript --> recognizers["recognizers
Text recognizers"]
manuscript --> utils["utils
Utilities"]
manuscript --> api["api
API base classes"]
%% Pipeline methods
Pipeline --> p_predict["predict()
→ Dict | Tuple[Dict, Image]"]
Pipeline --> p_get_text["get_text()
→ str"]
%% Data structures
data --> Page["Page"]
data --> Block["Block"]
data --> Line["Line"]
data --> Word["Word"]
Page --> page_blocks["blocks: List[Block]"]
Block --> block_lines["lines: List[Line]"]
Block --> block_order["order: Optional[int]"]
Line --> line_words["words: List[Word]"]
Line --> line_order["order: Optional[int]"]
Word --> word_polygon["polygon: List[(x,y)]"]
Word --> word_det_conf["detection_confidence: float"]
Word --> word_text["text: Optional[str]"]
Word --> word_rec_conf["recognition_confidence: Optional[float]"]
Word --> word_order["order: Optional[int]"]
%% Detectors
detectors --> EAST["EAST
Efficient and Accurate Scene Text Detector"]
EAST --> east_predict["predict()
→ Dict[str, Any]"]
EAST --> east_train["train()
→ None"]
EAST --> east_export["export()
→ str"]
%% Recognizers
recognizers --> TRBA["TRBA
Text Recognition with BiLSTM + Attention"]
TRBA --> trba_predict["predict()
→ List[Dict[str, Any]]"]
TRBA --> trba_train["train()
→ None"]
TRBA --> trba_export["export()
→ str"]
%% Utils submodules
utils --> io["io
Read/write"]
utils --> visualization["visualization
Result visualization"]
utils --> sorting["sorting
Sorting and organization"]
utils --> training["training
Model training"]
%% IO functions
io --> read_image["read_image()
→ np.ndarray"]
%% Visualization functions
visualization --> visualize_page["visualize_page()
→ Image"]
%% Sorting functions
sorting --> organize_page["organize_page()
→ Page"]
%% Training functions
training --> set_seed["set_seed()
→ None"]
%% API
api --> BaseModel["BaseModel
Base class for models"]
BaseModel --> base_predict["predict() abstract"]
%% Styles
style manuscript fill:#1f2937,color:#ffffff,stroke:#111827,stroke-width:2px
style Pipeline fill:#fde68a,color:#111827,stroke:#92400e,stroke-width:2px
style data fill:#bbf7d0,color:#064e3b,stroke:#047857
style detectors fill:#bfdbfe,color:#1e3a8a,stroke:#2563eb
style recognizers fill:#ddd6fe,color:#4c1d95,stroke:#7c3aed
style utils fill:#fed7aa,color:#7c2d12,stroke:#ea580c
style api fill:#fecaca,color:#7f1d1d,stroke:#dc2626
Module Descriptions
-------------------
**Pipeline**
The main high-level interface that combines detection and recognition
into a single OCR workflow.
**data**
Data structures (``Page``, ``Block``, ``Line``, ``Word``) for representing
OCR results in a hierarchical format.
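
To make the hierarchy concrete, the following is a minimal sketch that mirrors the field names and types shown in the diagram using plain dataclasses. It is not the library's own implementation: the default values, the field order, and the ``page_to_text`` helper are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Word:
    # Field names and types follow the diagram; defaults are assumptions.
    polygon: List[Tuple[float, float]]
    detection_confidence: float
    text: Optional[str] = None
    recognition_confidence: Optional[float] = None
    order: Optional[int] = None


@dataclass
class Line:
    words: List[Word] = field(default_factory=list)
    order: Optional[int] = None


@dataclass
class Block:
    lines: List[Line] = field(default_factory=list)
    order: Optional[int] = None


@dataclass
class Page:
    blocks: List[Block] = field(default_factory=list)


def page_to_text(page: Page) -> str:
    """Hypothetical helper: join recognized words, one output row per line."""
    rows = []
    for block in page.blocks:
        for line in block.lines:
            # Words without recognized text (text is None) are skipped.
            row = " ".join(w.text for w in line.words if w.text)
            if row:
                rows.append(row)
    return "\n".join(rows)


box = [(0, 0), (10, 0), (10, 5), (0, 5)]
page = Page(blocks=[Block(lines=[
    Line(words=[Word(box, 0.9, text="hello"), Word(box, 0.8, text="world")]),
    Line(words=[Word(box, 0.7)]),  # detected, but not recognized yet
])])
print(page_to_text(page))
```

The optional ``text`` and ``recognition_confidence`` fields suggest that a ``Word`` can exist after detection alone and be filled in later by a recognizer, which is why the sketch tolerates words with no text.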
**detectors**
Text detection models. Currently includes ``EAST`` (Efficient and Accurate
Scene Text Detector).
**recognizers**
Text recognition models. Currently includes ``TRBA`` (Text Recognition with
BiLSTM and Attention mechanism).
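
The diagram also shows an ``api`` module whose ``BaseModel`` declares an abstract ``predict()``, and both ``EAST`` and ``TRBA`` expose ``predict()``. The sketch below illustrates how such a shared interface could look with Python's ``abc`` module. That the detectors and recognizers actually subclass ``BaseModel`` is an assumption, and the ``DummyDetector`` and ``DummyRecognizer`` classes, their argument names, and the dictionary keys are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseModel(ABC):
    """Stand-in for the abstract base class; only predict() is known from the diagram."""

    @abstractmethod
    def predict(self, *args: Any, **kwargs: Any) -> Any:
        ...


class DummyDetector(BaseModel):
    """Hypothetical detector that always returns one fixed polygon."""

    def predict(self, image: Any) -> Dict[str, Any]:
        # The "polygons" key is an assumption, not the library's output schema.
        return {"polygons": [[(0, 0), (10, 0), (10, 5), (0, 5)]]}


class DummyRecognizer(BaseModel):
    """Hypothetical recognizer that returns a placeholder result per crop."""

    def predict(self, crops: List[Any]) -> List[Dict[str, Any]]:
        # The "text" and "confidence" keys are assumptions as well.
        return [{"text": "?", "confidence": 0.0} for _ in crops]


detector = DummyDetector()
recognizer = DummyRecognizer()
detected = detector.predict(image=None)
recognized = recognizer.predict(detected["polygons"])
print(len(recognized))
```

The return annotations match the diagram (``Dict[str, Any]`` for the detector, ``List[Dict[str, Any]]`` for the recognizer). An abstract base of this kind prevents direct instantiation and forces every model to provide ``predict()``.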
**utils**
Utility functions for:
- I/O operations (``read_image``)
- Visualization (``visualize_page``)
- Sorting and organization (``organize_page``)
- Training utilities (``set_seed``)
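
As an illustration of what the sorting step in the list above has to accomplish, the sketch below groups word polygons into lines by vertical position and orders each line left to right. This is a simple heuristic chosen for illustration, not the algorithm used by ``organize_page``; the function names and the ``y_tolerance`` parameter are hypothetical.

```python
from typing import List, Tuple

Polygon = List[Tuple[float, float]]


def center(polygon: Polygon) -> Tuple[float, float]:
    """Return the mean of the polygon's vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def group_into_lines(polygons: List[Polygon], y_tolerance: float) -> List[List[Polygon]]:
    """Group polygons into lines top to bottom, then sort each line left to right."""
    lines: List[List[Polygon]] = []
    for poly in sorted(polygons, key=lambda p: center(p)[1]):
        cy = center(poly)[1]
        # Start a new line when the vertical gap to the previous word is too large.
        if lines and abs(center(lines[-1][-1])[1] - cy) <= y_tolerance:
            lines[-1].append(poly)
        else:
            lines.append([poly])
    return [sorted(line, key=lambda p: center(p)[0]) for line in lines]


def box(x: float, y: float, w: float = 10, h: float = 4) -> Polygon:
    """Build an axis-aligned rectangle as a four-point polygon."""
    return [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]


words = [box(30, 1), box(0, 0), box(0, 20), box(15, 21)]
lines = group_into_lines(words, y_tolerance=3)
print(len(lines))
```

A fixed tolerance works for roughly horizontal text; skewed or curved handwriting would need a more robust grouping rule.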