Library Structure
Overview of the manuscript-ocr library architecture.
graph LR
%% Main package
manuscript[manuscript]
%% Core modules
manuscript --> Pipeline["Pipeline<br/><i>Main class for OCR</i>"]
manuscript --> data["data<br/><i>Data structures</i>"]
manuscript --> detectors["detectors<br/><i>Text detectors</i>"]
manuscript --> recognizers["recognizers<br/><i>Text recognizers</i>"]
manuscript --> utils["utils<br/><i>Utilities</i>"]
manuscript --> api["api<br/><i>API base classes</i>"]
%% Pipeline methods
Pipeline --> p_predict["predict()<br/><i>→ Dict | Tuple[Dict, Image]</i>"]
Pipeline --> p_get_text["get_text()<br/><i>→ str</i>"]
%% Data structures
data --> Page["Page"]
data --> Block["Block"]
data --> Line["Line"]
data --> Word["Word"]
Page --> page_blocks["blocks: List[Block]"]
Block --> block_lines["lines: List[Line]"]
Block --> block_order["order: Optional[int]"]
Line --> line_words["words: List[Word]"]
Line --> line_order["order: Optional[int]"]
Word --> word_polygon["polygon: List[(x,y)]"]
Word --> word_det_conf["detection_confidence: float"]
Word --> word_text["text: Optional[str]"]
Word --> word_rec_conf["recognition_confidence: Optional[float]"]
Word --> word_order["order: Optional[int]"]
%% Detectors
detectors --> EAST["EAST<br/><i>Efficient and Accurate Scene Text Detector</i>"]
EAST --> east_predict["predict()<br/><i>→ Dict[str, Any]</i>"]
EAST --> east_train["train()<br/><i>→ None</i>"]
EAST --> east_export["export()<br/><i>→ str</i>"]
%% Recognizers
recognizers --> TRBA["TRBA<br/><i>Text Recognition with BiLSTM + Attention</i>"]
TRBA --> trba_predict["predict()<br/><i>→ List[Dict[str, Any]]</i>"]
TRBA --> trba_train["train()<br/><i>→ None</i>"]
TRBA --> trba_export["export()<br/><i>→ str</i>"]
%% Utils submodules
utils --> io["io<br/><i>Reading/writing</i>"]
utils --> visualization["visualization<br/><i>Result visualization</i>"]
utils --> sorting["sorting<br/><i>Sorting and organization</i>"]
utils --> training["training<br/><i>Model training</i>"]
%% IO functions
io --> read_image["read_image()<br/><i>→ np.ndarray</i>"]
%% Visualization functions
visualization --> visualize_page["visualize_page()<br/><i>→ Image</i>"]
%% Sorting functions
sorting --> organize_page["organize_page()<br/><i>→ Page</i>"]
%% Training functions
training --> set_seed["set_seed()<br/><i>→ None</i>"]
%% API
api --> BaseModel["BaseModel<br/><i>Base class for models</i>"]
BaseModel --> base_predict["predict() <i>abstract</i>"]
%% Styles
style manuscript fill:#1f2937,color:#ffffff,stroke:#111827,stroke-width:2px
style Pipeline fill:#fde68a,color:#111827,stroke:#92400e,stroke-width:2px
style data fill:#bbf7d0,color:#064e3b,stroke:#047857
style detectors fill:#bfdbfe,color:#1e3a8a,stroke:#2563eb
style recognizers fill:#ddd6fe,color:#4c1d95,stroke:#7c3aed
style utils fill:#fed7aa,color:#7c2d12,stroke:#ea580c
style api fill:#fecaca,color:#7f1d1d,stroke:#dc2626
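The data hierarchy in the diagram (Page, Block, Line, Word) can be sketched with plain dataclasses. This is a standalone illustration, not the library's own class definitions: the field names and types are taken from the diagram, while the default values and the `page_text` helper are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Word:
    polygon: List[Tuple[float, float]]             # corner points as (x, y)
    detection_confidence: float
    text: Optional[str] = None                     # filled in after recognition
    recognition_confidence: Optional[float] = None
    order: Optional[int] = None                    # reading order, if assigned


@dataclass
class Line:
    words: List[Word] = field(default_factory=list)
    order: Optional[int] = None


@dataclass
class Block:
    lines: List[Line] = field(default_factory=list)
    order: Optional[int] = None


@dataclass
class Page:
    blocks: List[Block] = field(default_factory=list)


def page_text(page: Page) -> str:
    """Hypothetical helper: join recognized words, one output row per Line."""
    rows = []
    for block in page.blocks:
        for line in block.lines:
            # Words without recognized text (None or empty) are skipped.
            rows.append(" ".join(w.text for w in line.words if w.text))
    return "\n".join(rows)
```

With this layout, a word that has been detected but not yet recognized simply carries `text=None`, which matches the `Optional` fields shown in the diagram.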
Module Descriptions
- Pipeline
The main high-level interface that combines detection and recognition into a single OCR workflow.
- data
Data structures (Page, Block, Line, Word) for representing OCR results in a hierarchical format.
- detectors
Text detection models. Currently includes EAST (Efficient and Accurate Scene Text Detector).
- recognizers
Text recognition models. Currently includes TRBA (Text Recognition with BiLSTM and Attention mechanism).
- utils
Utility functions for:
  - I/O operations (read_image)
  - Visualization (visualize_page)
  - Sorting and organization (organize_page)
  - Training utilities (set_seed)
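To illustrate how a pipeline can combine a detector and a recognizer that share an abstract `predict()` interface, here is a minimal toy sketch. It does not use the manuscript-ocr API: `ToyDetector`, `ToyRecognizer`, `ToyPipeline`, and the dictionary keys (`polygons`, `text`, `confidence`) are all hypothetical names chosen for the example, and the local `BaseModel` only mimics the idea of a base class with an abstract `predict()`.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseModel(ABC):
    """Stand-in for an abstract model base class; only predict() is modelled."""

    @abstractmethod
    def predict(self, inputs: Any) -> Any:
        ...


class ToyDetector(BaseModel):
    """Hypothetical detector that always returns one fixed word polygon."""

    def predict(self, inputs: Any) -> Dict[str, Any]:
        # The "polygons" key is an assumption made for this sketch.
        return {"polygons": [[(0, 0), (10, 0), (10, 5), (0, 5)]]}


class ToyRecognizer(BaseModel):
    """Hypothetical recognizer that returns a fixed string per region."""

    def predict(self, inputs: List[Any]) -> List[Dict[str, Any]]:
        # The "text" and "confidence" keys are assumptions as well.
        return [{"text": "word", "confidence": 1.0} for _ in inputs]


class ToyPipeline:
    """Chains detection and recognition into one call."""

    def __init__(self, detector: BaseModel, recognizer: BaseModel) -> None:
        self.detector = detector
        self.recognizer = recognizer

    def get_text(self, image: Any) -> str:
        detections = self.detector.predict(image)
        # A real pipeline would crop the image at each polygon;
        # here the polygons themselves stand in for the crops.
        results = self.recognizer.predict(detections["polygons"])
        return " ".join(r["text"] for r in results)


pipeline = ToyPipeline(ToyDetector(), ToyRecognizer())
print(pipeline.get_text(image=None))
```

Because both models are typed against the same base class, either stage could be swapped for another implementation without changing the pipeline code, which is the design benefit such a base class is usually meant to provide.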