Recognizers

Text recognition models.

class manuscript.recognizers.TRBA(weights=None, config=None, charset=None, device=None, force_download=False, rotate_threshold=1.5, region_preparer='bbox', region_preparer_options=None, min_text_size=5, batch_size=128, **kwargs)

Bases: BaseRecognizer

Initializes the TRBA text recognition model using ONNX Runtime.

Methods

`__call__`(*args, **kwargs)	Call self as a function.
`export`(weights_path, config_path, ...[, ...])	Export a TRBA PyTorch model to ONNX format.
`predict`(page[, image, batch_size, ...])	Recognize text for the text regions on a `Page` and return the updated `Page`.
`runtime_providers`()	Get ONNX Runtime execution providers based on device.
`train`(train_csvs, train_roots[, val_csvs, ...])	Train the TRBA text recognition model on custom datasets.

Parameters:

weights (str | None)
config (str | None)
charset (str | None)
device (str | None)
force_download (bool)
rotate_threshold (float | None)
region_preparer (str | Callable[[...], Sequence[Any]])
region_preparer_options (Dict[str, Any] | None)
min_text_size (int)
batch_size (int)

default_weights_name: str | None = 'trba_lite_g1'

pretrained_registry: Dict[str, str] = {'trba_base_g1': 'github://konstantinkozhin/manuscript-ocr/v0.1.0/trba_base_g1.onnx', 'trba_lite_g1': 'github://konstantinkozhin/manuscript-ocr/v0.1.0/trba_lite_g1.onnx', 'trba_lite_g2': 'github://konstantinkozhin/manuscript-ocr/v0.1.0/trba_lite_g2.onnx'}

config_registry = {'trba_base_g1': 'github://konstantinkozhin/manuscript-ocr/v0.1.0/trba_base_g1.json', 'trba_lite_g1': 'github://konstantinkozhin/manuscript-ocr/v0.1.0/trba_lite_g1.json', 'trba_lite_g2': 'github://konstantinkozhin/manuscript-ocr/v0.1.0/trba_lite_g2.json'}

charset_registry = {'trba_base_g1': 'github://konstantinkozhin/manuscript-ocr/v0.1.0/trba_base_g1.txt', 'trba_lite_g1': 'github://konstantinkozhin/manuscript-ocr/v0.1.0/trba_lite_g1.txt', 'trba_lite_g2': 'github://konstantinkozhin/manuscript-ocr/v0.1.0/trba_lite_g2.txt'}
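The three registries map a model name to `github://` URIs for its weights, config, and charset files. The sketch below only splits such a URI into its parts, assuming the layout is `github://<owner>/<repo>/<tag>/<filename>` as the entries above suggest. `RegistryRef` and `parse_registry_uri` are hypothetical helpers for illustration and are not part of `manuscript`; how the library itself resolves and downloads these URIs is not documented here.

```python
from typing import NamedTuple


class RegistryRef(NamedTuple):
    """Parts of a registry URI (hypothetical helper, not a manuscript type)."""
    owner: str
    repo: str
    tag: str
    filename: str


def parse_registry_uri(uri: str) -> RegistryRef:
    """Split 'github://owner/repo/tag/filename' into its four parts.

    Assumption: every registry entry follows this four-segment layout.
    """
    scheme, sep, rest = uri.partition("://")
    if sep != "://" or scheme != "github":
        raise ValueError(f"not a github:// URI: {uri!r}")
    parts = rest.split("/")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected owner/repo/tag/filename, got: {rest!r}")
    return RegistryRef(*parts)


ref = parse_registry_uri(
    "github://konstantinkozhin/manuscript-ocr/v0.1.0/trba_lite_g1.onnx"
)
print(ref.tag, ref.filename)
```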

__init__(weights=None, config=None, charset=None, device=None, force_download=False, rotate_threshold=1.5, region_preparer='bbox', region_preparer_options=None, min_text_size=5, batch_size=128, **kwargs)

Parameters:

weights (str | None)
config (str | None)
charset (str | None)
device (str | None)
force_download (bool)
rotate_threshold (float | None)
region_preparer (str | Callable[[...], Sequence[Any]])
region_preparer_options (Dict[str, Any] | None)
min_text_size (int)
batch_size (int)
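A minimal sketch of building constructor arguments from the defaults in the signature above. `trba_kwargs` and `make_recognizer` are hypothetical helpers, not part of the library. The import of `TRBA` is deferred so the helper can be inspected without the package installed. Passing `device="cpu"` is an assumption about accepted device strings; `weights="model.onnx"` follows the form mentioned in the `export` description.

```python
from typing import Any, Dict


def trba_kwargs(**overrides: Any) -> Dict[str, Any]:
    """Return the documented constructor defaults, updated with overrides."""
    kwargs: Dict[str, Any] = {
        "weights": None,
        "config": None,
        "charset": None,
        "device": None,
        "force_download": False,
        "rotate_threshold": 1.5,
        "region_preparer": "bbox",
        "region_preparer_options": None,
        "min_text_size": 5,
        "batch_size": 128,
    }
    kwargs.update(overrides)
    return kwargs


def make_recognizer(**overrides: Any):
    """Construct a TRBA recognizer (requires the manuscript package)."""
    from manuscript.recognizers import TRBA  # deferred import
    return TRBA(**trba_kwargs(**overrides))


# Inspect the arguments without constructing the model.
# device="cpu" is an assumed value, not confirmed by this page.
print(trba_kwargs(weights="model.onnx", device="cpu", batch_size=32))
```

With the package installed, `make_recognizer(weights="model.onnx")` would forward these arguments to `TRBA`.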

predict(page, image=None, batch_size=None, debug_save_dir=None)

Recognizes text for the text regions on a Page and returns the updated Page.

Return type:

Page

Parameters:

page (Page)
image (numpy.ndarray | str | Path | Image | None)
batch_size (int | None)
debug_save_dir (str | Path | None)
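The call shape of `predict` can be sketched with a small wrapper. `recognize_page` is a hypothetical helper that forwards the documented keyword arguments. Because this page does not show how a `Page` is produced, the demonstration uses a stand-in class, `RecordingRecognizer`, which mimics only the `predict` signature and returns the page unchanged; it is not a real recognizer.

```python
from pathlib import Path
from typing import Any, Optional, Union


def recognize_page(
    recognizer: Any,
    page: Any,
    image: Any = None,
    batch_size: Optional[int] = None,
    debug_dir: Optional[Union[str, Path]] = None,
) -> Any:
    """Forward the documented predict() arguments and return the updated page."""
    return recognizer.predict(
        page,
        image=image,
        batch_size=batch_size,
        debug_save_dir=debug_dir,
    )


class RecordingRecognizer:
    """Stand-in with the same predict() signature (illustration only)."""

    def predict(self, page, image=None, batch_size=None, debug_save_dir=None):
        # Record what was passed so the call shape can be inspected.
        self.last_call = {
            "image": image,
            "batch_size": batch_size,
            "debug_save_dir": debug_save_dir,
        }
        return page


stub = RecordingRecognizer()
result = recognize_page(stub, page="PAGE", image="scan.png", batch_size=16)
print(result, stub.last_call)
```

A real `TRBA` instance could be passed in place of the stand-in, together with a `Page` obtained elsewhere in the library.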

static train(train_csvs, train_roots, val_csvs=None, val_roots=None, *, exp_dir=None, charset_path=None, encoding='utf-8', img_h=64, img_w=256, max_len=25, hidden_size=256, num_encoder_layers=3, cnn_in_channels=3, cnn_out_channels=512, cnn_backbone='seresnet31', ctc_weight=0.3, ctc_weight_decay_epochs=50, ctc_weight_min=0.0, max_grad_norm=5.0, batch_size=32, epochs=20, lr=0.001, optimizer='AdamW', scheduler='OneCycleLR', weight_decay=0.0, momentum=0.9, val_interval=1, val_size=3000, train_proportions=None, num_workers=0, seed=42, resume_from=None, save_interval=None, device='cuda', freeze_cnn='none', freeze_enc_rnn='none', freeze_attention='none', pretrain_weights='default', **extra_config)

Trains the TRBA text recognition model on custom datasets.

Parameters:

train_csvs (str | Sequence[str])
train_roots (str | Sequence[str])
val_csvs (str | Sequence[str] | None)
val_roots (str | Sequence[str] | None)
exp_dir (str | None)
charset_path (str | None)
encoding (str)
img_h (int)
img_w (int)
max_len (int)
hidden_size (int)
num_encoder_layers (int)
cnn_in_channels (int)
cnn_out_channels (int)
cnn_backbone (str)
ctc_weight (float)
ctc_weight_decay_epochs (int)
ctc_weight_min (float)
max_grad_norm (float)
batch_size (int)
epochs (int)
lr (float)
optimizer (str)
scheduler (str)
weight_decay (float)
momentum (float)
val_interval (int)
val_size (int)
train_proportions (Sequence[float] | None)
num_workers (int)
seed (int)
resume_from (str | None)
save_interval (int | None)
device (str)
freeze_cnn (str)
freeze_enc_rnn (str)
freeze_attention (str)
pretrain_weights (object | None)
extra_config (Any)
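Since `train_csvs`, `train_roots`, and `train_proportions` all accept sequences, one way to keep them aligned is to start from (CSV, image root) pairs. The sketch below assumes the sequences are matched by index and that `train_proportions` holds one value per dataset; neither is stated on this page. `build_train_kwargs` and `run_training` are hypothetical helpers, and the file paths are placeholders.

```python
from typing import Any, Dict, Optional, Sequence, Tuple


def build_train_kwargs(
    datasets: Sequence[Tuple[str, str]],
    proportions: Optional[Sequence[float]] = None,
    **overrides: Any,
) -> Dict[str, Any]:
    """Turn (csv_path, image_root) pairs into keyword arguments for train()."""
    if not datasets:
        raise ValueError("at least one dataset is required")
    # Assumption: one proportion per dataset, matched by index.
    if proportions is not None and len(proportions) != len(datasets):
        raise ValueError("proportions must have one entry per dataset")
    kwargs: Dict[str, Any] = {
        "train_csvs": [csv for csv, _ in datasets],
        "train_roots": [root for _, root in datasets],
        "train_proportions": None if proportions is None else list(proportions),
    }
    kwargs.update(overrides)  # e.g. epochs, lr, device, exp_dir
    return kwargs


def run_training(kwargs: Dict[str, Any]):
    """Start training (requires the manuscript package)."""
    from manuscript.recognizers import TRBA  # deferred import
    return TRBA.train(**kwargs)


# Placeholder paths for illustration only.
kwargs = build_train_kwargs(
    [("data/a/train.csv", "data/a/images"), ("data/b/train.csv", "data/b/images")],
    proportions=[0.7, 0.3],
    epochs=20,
    device="cuda",
)
print(kwargs["train_csvs"], kwargs["train_proportions"])
```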

static export(weights_path, config_path, charset_path, output_path, opset_version=14, simplify=True)

Exports a TRBA PyTorch model to ONNX format.

This method converts a trained TRBA model from PyTorch to ONNX format, which can be used for faster inference with ONNX Runtime. The exported model can be loaded via TRBA(weights="model.onnx").

Return type:

None

Parameters:

weights_path (str | Path)
config_path (str | Path)
charset_path (str | Path)
output_path (str | Path)
opset_version (int)
simplify (bool)
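A minimal sketch of preparing an `export` call with the documented parameters and defaults. `export_kwargs` and `run_export` are hypothetical helpers; the `.onnx` suffix check is a convenience added here, not a documented requirement, and the input file names are placeholders since this page does not say what `train` writes to its experiment directory.

```python
from pathlib import Path
from typing import Any, Dict, Union

PathLike = Union[str, Path]


def export_kwargs(
    weights_path: PathLike,
    config_path: PathLike,
    charset_path: PathLike,
    output_path: PathLike,
    opset_version: int = 14,
    simplify: bool = True,
) -> Dict[str, Any]:
    """Collect export() arguments, checking the output file extension."""
    output = Path(output_path)
    # Extra check added for this sketch; not required by the library docs.
    if output.suffix != ".onnx":
        raise ValueError(f"output_path should end with .onnx: {output}")
    return {
        "weights_path": Path(weights_path),
        "config_path": Path(config_path),
        "charset_path": Path(charset_path),
        "output_path": output,
        "opset_version": opset_version,
        "simplify": simplify,
    }


def run_export(**kwargs: Any) -> None:
    """Run the export (requires the manuscript package)."""
    from manuscript.recognizers import TRBA  # deferred import
    TRBA.export(**kwargs)


# Placeholder file names for illustration only.
args = export_kwargs("weights.pth", "config.json", "charset.txt", "model.onnx")
print(args["output_path"], args["opset_version"])
```

After a successful export, the result would be loaded as described above, via `TRBA(weights="model.onnx")`.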