Data Structures
Core data structures for representing OCR results.
Data Model
The following diagram shows the relationships between data structures:
graph LR
%% Entities
Page[Page]
Block[Block]
Line[Line]
TextSpan[TextSpan]
%% Relations
Page -->|"blocks: List[Block]"| Block
Block -->|"lines: List[Line]"| Line
Line -->|"text_spans: List[TextSpan]"| TextSpan
%% TextSpan fields
TextSpan --> Tpoly["polygon: List[(x, y)]<br>≥ 4 points, clockwise"]
TextSpan --> Tdet["detection_confidence: float (0–1)"]
TextSpan --> Ttext["text: Optional[str]"]
TextSpan --> Trec["recognition_confidence: Optional[float] (0–1)"]
TextSpan --> Torder["order: Optional[int]<br>assigned after sorting"]
%% Line fields
Line --> LineOrder["order: Optional[int]<br>assigned after sorting"]
%% Block fields
Block --> BlockOrder["order: Optional[int]<br>assigned after sorting"]
Block --> FlatInput["text_spans: List[TextSpan]<br>optional flat input"]
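The hierarchy in the diagram can be mirrored with plain dataclasses to show how a consumer might walk it. This is only a sketch: the classes below are stdlib stand-ins that reuse the documented field names, not the library's own models, and the helpers iter_spans and page_text are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple

Point = Tuple[float, float]


# Stand-in classes (assumption): they copy the field names from the diagram
# but are NOT the models shipped in manuscript.data.
@dataclass
class TextSpan:
    polygon: List[Point]
    detection_confidence: float
    text: Optional[str] = None
    recognition_confidence: Optional[float] = None
    order: Optional[int] = None


@dataclass
class Line:
    text_spans: List[TextSpan] = field(default_factory=list)
    order: Optional[int] = None


@dataclass
class Block:
    lines: List[Line] = field(default_factory=list)
    order: Optional[int] = None


@dataclass
class Page:
    blocks: List[Block] = field(default_factory=list)


def iter_spans(page: Page) -> Iterator[TextSpan]:
    """Yield every span by walking Page -> Block -> Line -> TextSpan."""
    for block in page.blocks:
        for line in block.lines:
            yield from line.text_spans


def page_text(page: Page) -> str:
    """Join recognized text, one output row per Line.

    Spans whose text is None (detection only) are skipped.
    """
    rows = []
    for block in page.blocks:
        for line in block.lines:
            rows.append(" ".join(s.text for s in line.text_spans if s.text is not None))
    return "\n".join(rows)
```

Because text is optional, any code that concatenates spans has to decide what to do with detection-only spans; this sketch simply drops them.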
Compatibility
The canonical names in v0_1_11 are TextSpan and text_spans. For
code and services that still target v0_1_10, Word and words
remain available as compatibility aliases on import, validation, and Python
attribute access.
When exporting OCR results, choose the schema explicitly:
page.to_dict(schema="v0_1_11")
page.to_json("result.json", schema="v0_1_10")
Use "v0_1_10" only for legacy JSON consumers. New integrations should
prefer "v0_1_11".
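To make the difference between the two schemas concrete, the sketch below converts a plain dictionary between key names. It assumes, without confirmation from the library, that the schemas differ only in the text_spans / words key; the function rename_span_keys and the table KEY_BY_SCHEMA are hypothetical and do not describe how to_dict is implemented.

```python
from typing import Any

# Assumption: the only difference between the schemas is this key name.
KEY_BY_SCHEMA = {"v0_1_11": "text_spans", "v0_1_10": "words"}
SPAN_KEYS = set(KEY_BY_SCHEMA.values())


def rename_span_keys(data: Any, schema: str) -> Any:
    """Return a copy of nested dicts/lists with span keys renamed for `schema`."""
    target = KEY_BY_SCHEMA[schema]  # raises KeyError for an unknown schema
    if isinstance(data, dict):
        return {
            (target if key in SPAN_KEYS else key): rename_span_keys(value, schema)
            for key, value in data.items()
        }
    if isinstance(data, list):
        return [rename_span_keys(item, schema) for item in data]
    return data  # scalars are returned unchanged
```

In practice, passing the schema argument to to_dict or to_json is the supported route; a helper like this is only useful for reasoning about what a legacy consumer might expect.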
Module Reference
Data structures for manuscript OCR.
This package contains the core data structures used to represent OCR results throughout the manuscript-ocr library.
- class manuscript.data.TextSpan(*args, **kwargs)
Bases: BaseModel
A single detected or recognized text span.
A text span is the smallest OCR region in the pipeline. It may correspond to a word, a whole text line, or any other contiguous text segment returned by a detector.
- Parameters:
args (Any)
kwargs (Any)
- Return type:
Any
- polygon
Polygon vertices (x, y), ordered clockwise. The public data model supports arbitrary polygons with 4 or more points. For quadrilateral text regions, the canonical order is TL -> TR -> BR -> BL (Top-Left, Top-Right, Bottom-Right, Bottom-Left).
- text
Recognized text content (populated by OCR pipeline). None if only detection was performed.
- Type:
str, optional
- recognition_confidence
Text recognition confidence score from recognizer (0.0 to 1.0). None if only detection was performed.
- Type:
float, optional
Examples
>>> text_span = TextSpan(
...     polygon=[(10, 20), (100, 20), (100, 40), (10, 40)],
...     detection_confidence=0.95,
...     text="Hello",
...     recognition_confidence=0.98
... )
>>> print(text_span.text)
Hello
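The clockwise requirement on polygon can be checked with the shoelace formula. The sketch below assumes image coordinates, where y grows downward, which matches the TL -> TR -> BR -> BL order described above; signed_area and is_clockwise are hypothetical helpers, not part of the library.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def signed_area(polygon: List[Point]) -> float:
    """Shoelace formula: half the sum of cross products of consecutive vertices."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_clockwise(polygon: List[Point]) -> bool:
    """True if the vertices run clockwise on screen.

    Assumption: y increases downward (image coordinates), so a positive
    shoelace area corresponds to a visually clockwise polygon.
    """
    if len(polygon) < 4:
        raise ValueError("the data model expects at least 4 points")
    return signed_area(polygon) > 0
```

A producer that cannot guarantee vertex order could reverse the list when is_clockwise returns False before constructing the span.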
Methods
__call__(*args, **kwargs): Call self as a function.
detection_confidence
model_config
order
polygon
recognition_confidence
text
- class manuscript.data.Line(*args, **kwargs)
Bases: BaseModel
A single text line containing one or more text spans.
Examples
>>> line = Line(text_spans=[
...     TextSpan(
...         polygon=[(10, 20), (50, 20), (50, 40), (10, 40)],
...         detection_confidence=0.95,
...         text="Hello",
...     ),
...     TextSpan(
...         polygon=[(60, 20), (110, 20), (110, 40), (60, 40)],
...         detection_confidence=0.97,
...         text="World",
...     ),
... ])
>>> print(len(line.text_spans))
2
- Attributes:
words: Backward-compatible alias for text_spans.
- Parameters:
args (Any)
kwargs (Any)
- Return type:
Any
Methods
__call__(*args, **kwargs): Call self as a function.
model_config
order
text_spans
- class manuscript.data.Block(*args, **kwargs)
Bases: BaseModel
A logical text block (e.g., paragraph, column).
- text_spans
Optional flat list of text spans used as a shorthand input. If lines is empty and text_spans are provided, they are wrapped into a single line.
- Type:
List[TextSpan], optional
Examples
>>> block = Block(lines=[
...     Line(text_spans=[
...         TextSpan(
...             polygon=[(10, 20), (50, 20), (50, 40), (10, 40)],
...             detection_confidence=0.95,
...             text="Line 1",
...         )
...     ]),
...     Line(text_spans=[
...         TextSpan(
...             polygon=[(10, 50), (50, 50), (50, 70), (10, 70)],
...             detection_confidence=0.97,
...             text="Line 2",
...         )
...     ]),
... ])
>>> print(len(block.lines))
2
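The flat text_spans shorthand can be pictured as a small normalization step. The sketch below works on plain dictionaries and lists rather than the library's models; wrap_flat_spans is a hypothetical name, and the choice to let existing lines win when both inputs are given is an assumption, since the documentation only covers the case where lines is empty.

```python
from typing import Any, Dict, List


def wrap_flat_spans(
    lines: List[Dict[str, Any]],
    text_spans: List[Any],
) -> List[Dict[str, Any]]:
    """Return the lines a block would hold after the shorthand is applied.

    If `lines` is empty and `text_spans` is not, all spans are wrapped
    into a single line. Otherwise `lines` is returned unchanged
    (assumption: explicit lines take precedence over flat input).
    """
    if not lines and text_spans:
        return [{"text_spans": list(text_spans)}]
    return lines
```

For example, wrap_flat_spans([], ["a", "b"]) yields one line holding both spans, while a block that already has lines passes through untouched.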
- Attributes:
words: Backward-compatible alias for flat text_spans input.
Methods
__call__(*args, **kwargs): Call self as a function.
lines
model_config
order
text_spans
- class manuscript.data.Page(*args, **kwargs)
Bases: BaseModel
A document page containing blocks of text.
For a full visual diagram of the data model, see DATA_MODEL.md, located in the same module directory.
- Parameters:
args (Any)
kwargs (Any)
- Return type:
Any
Examples
>>> page = Page(blocks=[
...     Block(lines=[
...         Line(text_spans=[
...             TextSpan(
...                 polygon=[(10, 20), (50, 20), (50, 40), (10, 40)],
...                 detection_confidence=0.95,
...                 text="Hello",
...             )
...         ])
...     ])
... ])
>>> print(len(page.blocks))
1
Methods
__call__(*args, **kwargs): Call self as a function.
from_json(source): Load Page from JSON file or string.
to_dict([schema]): Export Page to a plain Python dictionary.
to_json([path, indent, schema]): Export Page to JSON.
model_config