Data Structures
Core data structures for representing OCR results.
Data Model
The following diagram shows the relationships between data structures:
graph LR
%% Entities
Page[Page]
Block[Block]
Line[Line]
Word[Word]
%% Relations
Page -->|"blocks: List[Block]"| Block
Block -->|"lines: List[Line]"| Line
Line -->|"words: List[Word]"| Word
%% Word fields
Word --> Wpoly["polygon: List[(x, y)]<br>≥ 4 points, clockwise"]
Word --> Wdet["detection_confidence: float (0–1)"]
Word --> Wtext["text: Optional[str]"]
Word --> Wrec["recognition_confidence: Optional[float] (0–1)"]
Word --> WordOrder["order: Optional[int]<br>assigned after sorting"]
%% Line fields
Line --> LineOrder["order: Optional[int]<br>assigned after sorting"]
%% Block fields
Block --> BlockOrder["order: Optional[int]<br>assigned after sorting"]
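The hierarchy in the diagram can be walked with three nested loops. The sketch below uses plain dataclasses as stand-ins that mirror the diagram's fields; they are not the library's own models, and `iter_words` is a hypothetical helper name.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


# Stand-in classes mirroring the diagram; NOT the library's actual models.
@dataclass
class Word:
    polygon: List[Tuple[float, float]]
    detection_confidence: float
    text: Optional[str] = None
    recognition_confidence: Optional[float] = None
    order: Optional[int] = None


@dataclass
class Line:
    words: List[Word] = field(default_factory=list)
    order: Optional[int] = None


@dataclass
class Block:
    lines: List[Line] = field(default_factory=list)
    order: Optional[int] = None


@dataclass
class Page:
    blocks: List[Block] = field(default_factory=list)


def iter_words(page: Page) -> Iterator[Word]:
    """Yield every Word, walking Page -> Block -> Line -> Word."""
    for block in page.blocks:
        for line in block.lines:
            yield from line.words


page = Page(blocks=[
    Block(lines=[
        Line(words=[
            Word(polygon=[(10, 20), (50, 20), (50, 40), (10, 40)],
                 detection_confidence=0.95, text="Hello"),
            Word(polygon=[(60, 20), (110, 20), (110, 40), (60, 40)],
                 detection_confidence=0.97, text="World"),
        ]),
    ]),
])

texts = [w.text for w in iter_words(page)]
print(texts)
```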
API Reference
Data structures for manuscript OCR.
This package contains the core data structures used to represent OCR results throughout the manuscript-ocr library.
- class manuscript.data.Word(*args, **kwargs)[source]
Bases: BaseModel
A single detected or recognized word.
- Parameters:
args (Any)
kwargs (Any)
- Return type:
Any
- polygon
Polygon vertices (x, y), ordered clockwise. For quadrilateral text regions: TL → TR → BR → BL (Top-Left, Top-Right, Bottom-Right, Bottom-Left).
- text
Recognized text content (populated by OCR pipeline). None if only detection was performed.
- Type:
str, optional
- recognition_confidence
Text recognition confidence score from recognizer (0.0 to 1.0). None if only detection was performed.
- Type:
float, optional
Examples
>>> from manuscript.data import Word
>>> word = Word(
...     polygon=[(10, 20), (100, 20), (100, 40), (10, 40)],
...     detection_confidence=0.95,
...     text="Hello",
...     recognition_confidence=0.98
... )
>>> print(word.text)
Hello
Methods
__call__(*args, **kwargs): Call self as a function.
detection_confidence
order
polygon
recognition_confidence
text
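The clockwise TL → TR → BR → BL convention for `polygon` can be checked with the shoelace formula. This is a minimal sketch, assuming image coordinates in which y grows downward (which the example coordinates suggest but the reference does not state); `signed_area` and `is_clockwise` are hypothetical helper names, not part of the library.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def signed_area(polygon: List[Point]) -> float:
    """Shoelace sum divided by two.

    Assumption: y points down (image coordinates), so a positive value
    means the vertices run clockwise as seen on screen.
    """
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_clockwise(polygon: List[Point]) -> bool:
    return signed_area(polygon) > 0


quad = [(10, 20), (100, 20), (100, 40), (10, 40)]  # TL, TR, BR, BL
print(is_clockwise(quad))
```

Reversing the vertex list flips the sign, so the same check can flag polygons that were stored counter-clockwise.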
- class manuscript.data.Line(*args, **kwargs)[source]
Bases: BaseModel
A single text line containing one or more words.
- Parameters:
args (Any)
kwargs (Any)
- Return type:
Any
Examples
>>> from manuscript.data import Line, Word
>>> line = Line(words=[
...     Word(polygon=[(10, 20), (50, 20), (50, 40), (10, 40)],
...          detection_confidence=0.95, text="Hello"),
...     Word(polygon=[(60, 20), (110, 20), (110, 40), (60, 40)],
...          detection_confidence=0.97, text="World"),
... ])
>>> print(len(line.words))
2
Methods
__call__(*args, **kwargs): Call self as a function.
order
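Because `text` and `order` are both optional on a word, code that turns a line into a string has to tolerate `None` in either field. The following is a sketch only: `line_text` is a hypothetical helper, the `Word` class is a stand-in with just the two fields used here, and the policy of placing unordered words last is an assumption rather than documented library behavior.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:  # stand-in with only the fields this sketch needs
    text: Optional[str] = None
    order: Optional[int] = None


def line_text(words: List[Word]) -> str:
    """Join recognized words in reading order.

    Words with an `order` come first, sorted by it; words without one keep
    their list position (sorted() is stable). Detection-only words, whose
    text is None, are skipped.
    """
    ordered = sorted(
        words,
        key=lambda w: (w.order is None, w.order if w.order is not None else 0),
    )
    return " ".join(w.text for w in ordered if w.text is not None)


words = [Word("World", 1), Word(None, 2), Word("Hello", 0)]
print(line_text(words))
```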
- class manuscript.data.Block(*args, **kwargs)[source]
Bases: BaseModel
A logical text block (e.g., paragraph, column).
- words
Legacy: direct list of words without line structure, kept for backward compatibility. If lines is empty and words is provided, a single line is created from words.
- Type:
List[Word], optional
Examples
>>> from manuscript.data import Block, Line, Word
>>> block = Block(lines=[
...     Line(words=[Word(polygon=[(10, 20), (50, 20), (50, 40), (10, 40)],
...                      detection_confidence=0.95, text="Line 1")]),
...     Line(words=[Word(polygon=[(10, 50), (50, 50), (50, 70), (10, 70)],
...                      detection_confidence=0.97, text="Line 2")]),
... ])
>>> print(len(block.lines))
2
Methods
__call__(*args, **kwargs): Call self as a function.
lines
order
words
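One reading of the legacy `words` field is that a block with no `lines` but with `words` wraps those words in a single line. The sketch below illustrates that reading only; `effective_lines` is a hypothetical helper, the classes are stand-ins, and the library's actual fallback logic may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Word:  # stand-in: only the field this sketch needs
    text: Optional[str] = None


@dataclass
class Line:
    words: List[Word] = field(default_factory=list)


def effective_lines(lines: List[Line], words: List[Word]) -> List[Line]:
    """Prefer explicit lines; otherwise wrap legacy words in one line.

    Assumed behavior, not confirmed by the reference: returns an empty
    list when both inputs are empty.
    """
    if lines:
        return lines
    if words:
        return [Line(words=list(words))]
    return []


legacy = effective_lines(lines=[], words=[Word("Hello"), Word("World")])
print(len(legacy), [w.text for w in legacy[0].words])
```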
- class manuscript.data.Page(*args, **kwargs)[source]
Bases: BaseModel
A document page containing blocks of text.
For a full visual diagram of the data model, see DATA_MODEL.md, located in the same module directory.
- Parameters:
args (Any)
kwargs (Any)
- Return type:
Any
Examples
>>> from manuscript.data import Block, Line, Page, Word
>>> page = Page(blocks=[
...     Block(lines=[
...         Line(words=[Word(polygon=[(10, 20), (50, 20), (50, 40), (10, 40)],
...                          detection_confidence=0.95, text="Hello")])
...     ])
... ])
>>> print(len(page.blocks))
1
Methods
__call__(*args, **kwargs): Call self as a function.
from_json(source): Load Page from JSON file or string.
to_json([path, indent]): Export Page to JSON.
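The reference does not describe the JSON layout that `to_json` and `from_json` use, so the following only illustrates the round-trip idea with the standard library. The dataclasses are stand-ins, and `page_to_json` and `page_from_json` are hypothetical functions, not the library's methods.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional, Tuple


# Stand-in classes; the real models and their JSON schema may differ.
@dataclass
class Word:
    polygon: List[Tuple[float, float]]
    detection_confidence: float
    text: Optional[str] = None


@dataclass
class Line:
    words: List[Word] = field(default_factory=list)


@dataclass
class Block:
    lines: List[Line] = field(default_factory=list)


@dataclass
class Page:
    blocks: List[Block] = field(default_factory=list)


def page_to_json(page: Page, indent: int = 2) -> str:
    """Serialize the nested dataclasses to a JSON string."""
    return json.dumps(asdict(page), indent=indent)


def page_from_json(text: str) -> Page:
    """Rebuild the hierarchy; JSON arrays become tuples again for points."""
    data = json.loads(text)
    return Page(blocks=[
        Block(lines=[
            Line(words=[
                Word(
                    polygon=[tuple(pt) for pt in w["polygon"]],
                    detection_confidence=w["detection_confidence"],
                    text=w["text"],
                )
                for w in ln["words"]
            ])
            for ln in blk["lines"]
        ])
        for blk in data["blocks"]
    ])


page = Page(blocks=[Block(lines=[Line(words=[
    Word(polygon=[(10, 20), (50, 20), (50, 40), (10, 40)],
         detection_confidence=0.95, text="Hello"),
])])])

restored = page_from_json(page_to_json(page))
print(restored == page)
```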