Data Structures

Core data structures for representing OCR results.

Data Model

The following diagram shows the relationships between data structures:

    graph LR

    %% Entities
    Page[Page]
    Block[Block]
    Line[Line]
    TextSpan[TextSpan]

    %% Relations
    Page -->|"blocks: List[Block]"| Block
    Block -->|"lines: List[Line]"| Line
    Line -->|"text_spans: List[TextSpan]"| TextSpan

    %% TextSpan fields
    TextSpan --> Tpoly["polygon: List[(x, y)]<br>≥ 4 points, clockwise"]
    TextSpan --> Tdet["detection_confidence: float (0–1)"]
    TextSpan --> Ttext["text: Optional[str]"]
    TextSpan --> Trec["recognition_confidence: Optional[float] (0–1)"]
    TextSpan --> Torder["order: Optional[int]<br>assigned after sorting"]

    %% Line fields
    Line --> LineOrder["order: Optional[int]<br>assigned after sorting"]

    %% Block fields
    Block --> BlockOrder["order: Optional[int]<br>assigned after sorting"]
    Block --> FlatInput["text_spans: List[TextSpan]<br>optional flat input"]
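
The hierarchy in the diagram can be walked with nested loops. The sketch below uses plain dataclasses as stand-ins that mirror the documented field names; they are not the library's Pydantic models, and the helpers iter_text_spans and page_text are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


# Stand-in classes mirroring the documented fields; NOT the library's models.
@dataclass
class TextSpan:
    polygon: List[Tuple[float, float]]
    detection_confidence: float
    text: Optional[str] = None
    recognition_confidence: Optional[float] = None
    order: Optional[int] = None


@dataclass
class Line:
    text_spans: List[TextSpan] = field(default_factory=list)
    order: Optional[int] = None


@dataclass
class Block:
    lines: List[Line] = field(default_factory=list)
    order: Optional[int] = None


@dataclass
class Page:
    blocks: List[Block] = field(default_factory=list)


def iter_text_spans(page: Page) -> Iterator[TextSpan]:
    """Yield every span by walking Page -> Block -> Line -> TextSpan."""
    for block in page.blocks:
        for line in block.lines:
            yield from line.text_spans


def page_text(page: Page) -> str:
    """Join recognized text: spaces within a line, newlines between lines."""
    rows = []
    for block in page.blocks:
        for line in block.lines:
            # Spans with text=None (detection only) are skipped.
            words = [s.text for s in line.text_spans if s.text is not None]
            rows.append(" ".join(words))
    return "\n".join(rows)


box = [(10, 20), (50, 20), (50, 40), (10, 40)]
demo_page = Page(blocks=[Block(lines=[
    Line(text_spans=[TextSpan(box, 0.95, "Hello"), TextSpan(box, 0.97, "World")]),
    Line(text_spans=[TextSpan(box, 0.90)]),  # detection only, text is None
])])
```

With the real models the same loops should apply, since they rely only on the blocks, lines, and text_spans attributes documented below.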

Module Reference

Data structures for manuscript OCR.

This package contains the core data structures used to represent OCR results throughout the manuscript-ocr library.

class manuscript.data.TextSpan(*args, **kwargs)[source]

Bases: BaseModel

A single detected or recognized text span.

A text span is the smallest OCR region in the pipeline. It may correspond to a word, a whole text line, or any other contiguous text segment returned by a detector.

Parameters:

args (Any)
kwargs (Any)

Return type:

Any

polygon

Polygon vertices (x, y), ordered clockwise. The public data model supports arbitrary polygons with 4 or more points. For quadrilateral text regions, the canonical order is TL -> TR -> BR -> BL (Top-Left, Top-Right, Bottom-Right, Bottom-Left).

Type: List[Tuple[float, float]]

detection_confidence

Text detection confidence score from the detector (0.0 to 1.0).

Type: float

text

Recognized text content (populated by the OCR pipeline). None if only detection was performed.

Type: str, optional

recognition_confidence

Text recognition confidence score from the recognizer (0.0 to 1.0). None if only detection was performed.

Type: float, optional

order

Text span position inside the line after sorting. None before sorting.

Type: int, optional
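
To illustrate how order might be filled in, here is a minimal sketch that sorts spans left to right and numbers them. The sorting rule (smallest x of the polygon), the zero-based numbering, the Span stand-in class, and the helper name assign_span_order are all assumptions made for this example; the library's actual sorting logic is not described on this page.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Span:  # hypothetical stand-in with only the fields needed here
    polygon: List[Tuple[float, float]]
    order: Optional[int] = None  # None before sorting


def assign_span_order(spans: List[Span]) -> List[Span]:
    """Sort spans by their smallest x coordinate and number them from 0."""
    ordered = sorted(spans, key=lambda s: min(x for x, _ in s.polygon))
    for index, span in enumerate(ordered):
        span.order = index
    return ordered


right = Span(polygon=[(60, 20), (110, 20), (110, 40), (60, 40)])
left = Span(polygon=[(10, 20), (50, 20), (50, 40), (10, 40)])
ordered_spans = assign_span_order([right, left])
```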

Examples

>>> text_span = TextSpan(
...     polygon=[(10, 20), (100, 20), (100, 40), (10, 40)],
...     detection_confidence=0.95,
...     text="Hello",
...     recognition_confidence=0.98
... )
>>> print(text_span.text)
Hello
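
The quadrilateral in the example above follows the TL -> TR -> BR -> BL order. A quick way to check the winding of an arbitrary polygon is the shoelace formula. The sketch below assumes image coordinates with y growing downward, which is what the example's values suggest (the top-left point has the smaller y); signed_area and is_clockwise are hypothetical helpers, not part of the library.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def signed_area(polygon: List[Point]) -> float:
    """Shoelace formula; the sign of the result encodes the vertex winding."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]  # wrap around to close the polygon
        total += x0 * y1 - x1 * y0
    return total / 2.0


def is_clockwise(polygon: List[Point]) -> bool:
    """True if vertices run clockwise on screen, assuming y grows downward."""
    if len(polygon) < 4:
        raise ValueError("expected at least 4 points")
    # With y pointing down, a positive shoelace sum means clockwise on screen.
    return signed_area(polygon) > 0


quad = [(10, 20), (100, 20), (100, 40), (10, 40)]  # TL -> TR -> BR -> BL
```

In a coordinate system where y grows upward the sign flips, so the comparison in is_clockwise would need to be reversed.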

Methods

__call__(*args, **kwargs)

Call self as a function.

detection_confidence
model_config
order
polygon
recognition_confidence
text

detection_confidence: float = Ellipsis

order: int | None = None

polygon: List[Tuple[float, float]] = Ellipsis

recognition_confidence: float | None = None

text: str | None = None

class manuscript.data.Line(*args, **kwargs)[source]

Bases: BaseModel

A single text line containing one or more text spans.

Parameters:

args (Any)
kwargs (Any)

Return type:

Any

text_spans

List of text spans in the line.

Type: List[TextSpan]

order

Line position inside a block or page after sorting. None before sorting.

Type: int, optional

Examples

>>> line = Line(text_spans=[
...     TextSpan(
...         polygon=[(10, 20), (50, 20), (50, 40), (10, 40)],
...         detection_confidence=0.95,
...         text="Hello",
...     ),
...     TextSpan(
...         polygon=[(60, 20), (110, 20), (110, 40), (60, 40)],
...         detection_confidence=0.97,
...         text="World",
...     ),
... ])
>>> print(len(line.text_spans))
2

Methods

__call__(*args, **kwargs)

Call self as a function.

model_config
order
text_spans

order: int | None = None

class manuscript.data.Block(*args, **kwargs)[source]

Bases: BaseModel

A logical text block (e.g., paragraph, column).

lines

List of text lines in the block.

Type: List[Line]

text_spans

Optional flat list of text spans used as shorthand input. If lines is empty and text_spans is provided, the spans are wrapped into a single line.

Type: List[TextSpan], optional

order

Block reading-order position after sorting. None before sorting.

Type: int, optional

Examples

>>> block = Block(lines=[
...     Line(text_spans=[
...         TextSpan(
...             polygon=[(10, 20), (50, 20), (50, 40), (10, 40)],
...             detection_confidence=0.95,
...             text="Line 1",
...         )
...     ]),
...     Line(text_spans=[
...         TextSpan(
...             polygon=[(10, 50), (50, 50), (50, 70), (10, 70)],
...             detection_confidence=0.97,
...             text="Line 2",
...         )
...     ]),
... ])
>>> print(len(block.lines))
2

Methods

__call__(*args, **kwargs)

Call self as a function.

lines
model_config
order
text_spans

__init__(**data)[source]: Initialize Block, normalizing flat text_spans into one line.
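
The normalization described for flat text_spans can be sketched as follows. This is a standalone illustration built on dataclasses, not the library's Pydantic implementation; the Span stand-in is hypothetical, and leaving text_spans unchanged after wrapping is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:  # hypothetical stand-in for TextSpan
    text: str


@dataclass
class Line:  # stand-in for the documented Line
    text_spans: List[Span] = field(default_factory=list)


@dataclass
class Block:  # stand-in for the documented Block
    lines: List[Line] = field(default_factory=list)
    text_spans: List[Span] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Wrap flat spans into one line only when no lines were given.
        if not self.lines and self.text_spans:
            self.lines = [Line(text_spans=list(self.text_spans))]


a, b = Span("Hello"), Span("World")
flat_block = Block(text_spans=[a, b])                      # shorthand input
explicit_block = Block(lines=[Line([a])], text_spans=[b])  # lines take priority
```

When lines is already populated, the sketch leaves it untouched, matching the documented condition that wrapping happens only if lines is empty.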

order: int | None = None

class manuscript.data.Page(*args, **kwargs)[source]

Bases: BaseModel

A document page containing blocks of text.

For a full visual diagram of the data model, see DATA_MODEL.md in the same module directory.

Parameters:

args (Any)
kwargs (Any)

Return type:

Any

blocks

List of text blocks on the page.

Type: List[Block]

Examples

>>> page = Page(blocks=[
...     Block(lines=[
...         Line(text_spans=[
...             TextSpan(
...                 polygon=[(10, 20), (50, 20), (50, 40), (10, 40)],
...                 detection_confidence=0.95,
...                 text="Hello",
...             )
...         ])
...     ])
... ])
>>> print(len(page.blocks))
1

Methods

`__call__`(*args, **kwargs)	Call self as a function.
`from_json`(source)	Load Page from JSON file or string.
`to_json`([path, indent])	Export Page to JSON.

model_config

classmethod from_json(source)[source]

Load Page from JSON file or string.

Parameters: source (str or Path) – Path to JSON file or JSON string.
Returns: Loaded Page object.
Return type: Page

Examples

>>> page = Page.from_json("result.json")
>>> page = Page.from_json('{"blocks": [...]}')

to_json(path=None, indent=2)[source]

Export Page to JSON.

Parameters:

path (str or Path, optional) – If provided, saves JSON to file.
indent (int, optional) – JSON indentation. Default is 2.

Returns:

JSON string representation.

Return type:

str

Examples

>>> page.to_json("result.json")  # save to file
>>> json_str = page.to_json()    # get as string
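
For orientation, the round trip that to_json and from_json provide can be mimicked with the standard json module. The nested layout below is an assumption inferred from the documented field names, not a confirmed description of the file format the library writes.

```python
import json
import tempfile
from pathlib import Path

# Assumed layout: nesting follows the documented field names.
page_dict = {
    "blocks": [{
        "order": None,
        "lines": [{
            "order": None,
            "text_spans": [{
                "polygon": [[10, 20], [50, 20], [50, 40], [10, 40]],
                "detection_confidence": 0.95,
                "text": "Hello",
                "recognition_confidence": None,
                "order": None,
            }],
        }],
    }]
}

# Serialize with the same default indentation that to_json documents.
json_str = json.dumps(page_dict, indent=2, ensure_ascii=False)

# Write to a temporary file and read it back, as a save/load cycle would.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "result.json"
    path.write_text(json_str, encoding="utf-8")
    loaded = json.loads(path.read_text(encoding="utf-8"))
```

Note that JSON has no tuple type, so polygon points come back as two-element lists after loading in this sketch.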

blocks: List[Block]