Savino P., Tonazzini A.
Ancient manuscript virtual restoration, Degraded document binarization, Recto-verso registration, Bleed-through removal, Shallow multilayer neural networks
We propose a fast procedure based on neural networks (NN) to correct the typically complex background of recto-verso historical manuscripts, where the texts of the two sides often appear mixed. The purpose is to remove the interfering text that shows through from the opposite side, easing both the work of philologists and paleographers and the automatic analysis of the linguistic content. We adapt the learning phase of a very simple shallow NN to exploit the information in the registered recto and verso sides of the manuscript, without requiring a large collection of similar manuscripts. The training set is therefore self-generated from the data images, based on a theoretical mixing model that accounts for ink spreading through the paper fiber and for ink saturation in the areas where the two texts overlap. Operationally, we select pairs of patches containing clean text from the manuscript and then mix them symmetrically using the model, with parameters varied across their allowed range. This enables the NN to generalize across different degrees of ink seepage and thus to classify different manuscripts. We compare the results obtained with this NN on heavily damaged manuscripts against those of other approaches. Qualitatively, the proposed method appears promising.
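The self-generation of the training set described above can be illustrated with a minimal sketch. The abstract does not give the mixing model, so everything below is an assumption made for illustration: ink spreading is approximated by a box blur, saturation by clipping the summed ink to 1, and the seepage strength by a single hypothetical parameter `q`. The function names (`box_blur`, `mix_symmetric`, `make_training_set`) and the per-pixel feature and label layout are likewise hypothetical and are not taken from the paper.

```python
import numpy as np


def box_blur(img, radius):
    """Separable box blur: a crude, assumed stand-in for ink spreading."""
    if radius == 0:
        return img.copy()
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    padded = np.pad(img, radius, mode="edge")
    # Blur along rows, then along columns; "valid" restores the input shape.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def mix_symmetric(recto, verso, q, radius=1):
    """Mix two clean, registered patches (values in [0, 1], 1 = blank paper).

    Assumed model: each side receives its own ink plus a fraction q of the
    blurred ink of the other side; clipping mimics saturation on overlaps.
    """
    ink_r, ink_v = 1.0 - recto, 1.0 - verso
    mixed_r = np.clip(ink_r + q * box_blur(ink_v, radius), 0.0, 1.0)
    mixed_v = np.clip(ink_v + q * box_blur(ink_r, radius), 0.0, 1.0)
    return 1.0 - mixed_r, 1.0 - mixed_v


def make_training_set(recto, verso, q_values, radius=1, ink_threshold=0.5):
    """Build per-pixel samples: (mixed recto, mixed verso) -> clean recto ink."""
    target = (1.0 - recto > ink_threshold).astype(np.int64).ravel()
    features, labels = [], []
    for q in q_values:  # span the assumed range of seepage strengths
        m_r, m_v = mix_symmetric(recto, verso, q, radius)
        features.append(np.stack([m_r.ravel(), m_v.ravel()], axis=1))
        labels.append(target)
    return np.concatenate(features), np.concatenate(labels)


# Toy patches: a horizontal stroke on the recto, a vertical one on the verso.
recto = np.ones((8, 8))
recto[2, :] = 0.0
verso = np.ones((8, 8))
verso[:, 5] = 0.0
X, y = make_training_set(recto, verso, q_values=[0.2, 0.5, 0.8])
```

Here `X` holds one two-value feature vector per pixel and per value of `q`, and `y` the corresponding clean-recto label, which is the kind of input a small pixel classifier could be trained on. A faithful reproduction would replace the blur-and-clip step with the mixing model defined in the paper.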
Source: VIPERC2022: 1st International Virtual Conference on Visual Pattern Extraction and Recognition for Cultural Heritage Understanding, Online event, 12/09/2022
@inproceedings{oai:it.cnr:prodotti:471459,
  title = {A shallow neural net with model-based learning for the virtual restoration of recto-verso manuscript},
  author = {Savino P. and Tonazzini A.},
  booktitle = {VIPERC2022: 1st International Virtual Conference on Visual Pattern Extraction and Recognition for Cultural Heritage Understanding, Online event, 12/09/2022},
  year = {2022}
}