You Actually Look Twice At it (YALTAi): using an object detection approach instead of region segmentation within the Kraken engine

Thibault Clérice

Preprint, Working Paper. Year: 2023

Thibault Clérice

Role: Author
PersonId: 15153
IdHAL: thibault-clerice
ORCID: 0000-0003-1852-9204
IdRef: 221533877

École nationale des chartes

Centre Jean Mabillon

Histoire et Sources des Mondes antiques

Université Jean Moulin - Lyon 3

Abstract

Layout Analysis (the identification of zones and their classification) is, along with line segmentation, the first step in Optical Character Recognition and similar tasks. The ability to distinguish the main body of text from marginal text or running titles makes the difference between extracting the full text of a digitized book and producing noisy output. We show that most segmenters focus on pixel classification and that polygonization of this output has not been used as a target for the latest competitions on historical documents (ICDAR 2017 and onwards), despite being the focus in the early 2010s. We propose to shift, for efficiency, the task from a pixel classification-based polygonization to object detection using isothetic rectangles. We compare the output of Kraken and YOLOv5 in terms of segmentation and show that the latter significantly outperforms the former on small datasets (1110 samples and below). We release two datasets for training and evaluation on historical documents as well as a new package, YALTAi, which injects YOLOv5 into the segmentation pipeline of Kraken 4.1.
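
To make the shift from polygons to isothetic (axis-aligned) rectangles concrete, the following is a minimal sketch, not the YALTAi implementation. It assumes a region is given as a list of (x, y) pixel coordinates and converts it to the normalized `class x_center y_center width height` line used by YOLO-style label files. The function names and the sample polygon are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]


def polygon_to_isothetic_box(polygon: List[Point]) -> Box:
    """Return the smallest axis-aligned rectangle enclosing the polygon.

    The result is (x_min, y_min, x_max, y_max) in pixel coordinates.
    """
    if not polygon:
        raise ValueError("polygon must contain at least one point")
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def to_yolo_label(class_id: int, box: Box, page_width: float, page_height: float) -> str:
    """Format a pixel-space box as a normalized YOLO label line."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / page_width
    y_center = (y_min + y_max) / 2 / page_height
    width = (x_max - x_min) / page_width
    height = (y_max - y_min) / page_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical main-text zone on a 1000 x 1500 pixel page scan.
zone = [(100, 200), (900, 210), (890, 1200), (110, 1190)]
box = polygon_to_isothetic_box(zone)
label = to_yolo_label(0, box, page_width=1000, page_height=1500)
print(label)  # 0 0.500000 0.466667 0.800000 0.666667
```

The rectangle discards the exact outline of the zone, which is the trade-off the abstract accepts in exchange for a simpler detection target.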

Keywords

kraken layout segmentation yolo htr ocr object detection historical document

Domains

Artificial Intelligence [cs.AI]; Humanities and Social Sciences

Main file

You_Actually_Look_Twice_At_it.pdf (2.31 MB)

Origin: Files produced by the author(s)

https://enc.hal.science/hal-03723208

Submitted on: Monday, April 3, 2023, 17:32:40

Last modified on: Tuesday, November 12, 2024, 15:20:06

Dates and versions

hal-03723208, version 1 (18-07-2022)

hal-03723208, version 2 (03-04-2023)

hal-03723208, version 3 (16-12-2023)

hal-03723208, version 4 (15-01-2024)

Identifiers

HAL Id: hal-03723208, version 2
arXiv: 2207.11230

Cite

Thibault Clérice. You Actually Look Twice At it (YALTAi): using an object detection approach instead of region segmentation within the Kraken engine. 2023. ⟨hal-03723208v2⟩

960 views

850 downloads
