Library Structure
=================

Overview of the manuscript-ocr library architecture.

.. mermaid::

   graph LR
       %% Main package
       manuscript[manuscript]

       %% Core modules
       manuscript --> Pipeline["Pipeline<br/>Main OCR class"]
       manuscript --> data["data<br/>Data structures"]
       manuscript --> detectors["detectors<br/>Text detectors"]
       manuscript --> recognizers["recognizers<br/>Text recognizers"]
       manuscript --> utils["utils<br/>Utilities"]
       manuscript --> api["api<br/>API base classes"]

       %% Pipeline methods
       Pipeline --> p_predict["predict()<br/>→ Dict | Tuple[Dict, Image]"]
       Pipeline --> p_get_text["get_text()<br/>→ str"]

       %% Data structures
       data --> Page["Page"]
       data --> Block["Block"]
       data --> Line["Line"]
       data --> Word["Word"]

       Page --> page_blocks["blocks: List[Block]"]
       Block --> block_lines["lines: List[Line]"]
       Block --> block_order["order: Optional[int]"]
       Line --> line_words["words: List[Word]"]
       Line --> line_order["order: Optional[int]"]
       Word --> word_polygon["polygon: List[(x,y)]"]
       Word --> word_det_conf["detection_confidence: float"]
       Word --> word_text["text: Optional[str]"]
       Word --> word_rec_conf["recognition_confidence: Optional[float]"]
       Word --> word_order["order: Optional[int]"]

       %% Detectors
       detectors --> EAST["EAST<br/>Efficient and Accurate Scene Text Detector"]
       EAST --> east_predict["predict()<br/>→ Dict[str, Any]"]
       EAST --> east_train["train()<br/>→ None"]
       EAST --> east_export["export()<br/>→ str"]

       %% Recognizers
       recognizers --> TRBA["TRBA<br/>Text Recognition with BiLSTM + Attention"]
       TRBA --> trba_predict["predict()<br/>→ List[Dict[str, Any]]"]
       TRBA --> trba_train["train()<br/>→ None"]
       TRBA --> trba_export["export()<br/>→ str"]

       %% Utils submodules
       utils --> io["io<br/>Read/write"]
       utils --> visualization["visualization<br/>Result visualization"]
       utils --> sorting["sorting<br/>Sorting and organization"]
       utils --> training["training<br/>Model training"]

       %% IO functions
       io --> read_image["read_image()<br/>→ np.ndarray"]

       %% Visualization functions
       visualization --> visualize_page["visualize_page()<br/>→ Image"]

       %% Sorting functions
       sorting --> organize_page["organize_page()<br/>→ Page"]

       %% Training functions
       training --> set_seed["set_seed()<br/>→ None"]

       %% API
       api --> BaseModel["BaseModel<br/>Base class for models"]
       BaseModel --> base_predict["predict() abstract"]

       %% Styles
       style manuscript fill:#1f2937,color:#ffffff,stroke:#111827,stroke-width:2px
       style Pipeline fill:#fde68a,color:#111827,stroke:#92400e,stroke-width:2px
       style data fill:#bbf7d0,color:#064e3b,stroke:#047857
       style detectors fill:#bfdbfe,color:#1e3a8a,stroke:#2563eb
       style recognizers fill:#ddd6fe,color:#4c1d95,stroke:#7c3aed
       style utils fill:#fed7aa,color:#7c2d12,stroke:#ea580c
       style api fill:#fecaca,color:#7f1d1d,stroke:#dc2626

Module Descriptions
-------------------

**Pipeline**
    The main high-level interface. It combines detection and recognition into a single OCR workflow.

**data**
    Data structures (``Page``, ``Block``, ``Line``, ``Word``) that represent OCR results as a hierarchy.

**detectors**
    Text detection models. Currently includes ``EAST`` (Efficient and Accurate Scene Text Detector).

**recognizers**
    Text recognition models. Currently includes ``TRBA`` (Text Recognition with BiLSTM and Attention).

**utils**
    Utility functions for:

    - I/O operations (``read_image``)
    - Visualization (``visualize_page``)
    - Sorting and organization (``organize_page``)
    - Training utilities (``set_seed``)
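
The ``Page`` → ``Block`` → ``Line`` → ``Word`` hierarchy can be illustrated with a minimal, self-contained sketch. The dataclasses below are stand-ins that only mirror the field names and types shown in the diagram; they are not the library's own classes, and the actual definitions in ``manuscript.data`` may differ. The default values and the ``page_to_text`` helper are assumptions made for illustration.

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import List, Optional, Tuple


    @dataclass
    class Word:
        # Field names follow the diagram; defaults are assumed.
        polygon: List[Tuple[float, float]]
        detection_confidence: float
        text: Optional[str] = None
        recognition_confidence: Optional[float] = None
        order: Optional[int] = None


    @dataclass
    class Line:
        words: List[Word] = field(default_factory=list)
        order: Optional[int] = None


    @dataclass
    class Block:
        lines: List[Line] = field(default_factory=list)
        order: Optional[int] = None


    @dataclass
    class Page:
        blocks: List[Block] = field(default_factory=list)


    def page_to_text(page: Page) -> str:
        """Hypothetical helper: one output row per Line, words joined by spaces."""
        rows = []
        for block in page.blocks:
            for line in block.lines:
                # Skip words that have been detected but not yet recognized.
                rows.append(" ".join(w.text for w in line.words if w.text))
        return "\n".join(rows)


    # Build a one-line page by hand and walk the hierarchy.
    page = Page(blocks=[Block(lines=[Line(words=[
        Word(polygon=[(0, 0), (40, 0), (40, 12), (0, 12)],
             detection_confidence=0.98, text="Hello"),
        Word(polygon=[(45, 0), (90, 0), (90, 12), (45, 12)],
             detection_confidence=0.95, text="world"),
    ])])])

    print(page_to_text(page))

Keeping ``text`` and ``recognition_confidence`` optional lets a ``Word`` exist after detection but before recognition, which matches the ``Optional`` types in the diagram.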