Tonazzini A, Bedini L, Salerno E
Degraded Documents; Blind Source Separation; Component Analysis
Many text documents show reduced legibility due to specific kinds of physical degradation. In these cases, recovering a clean text pattern may not be the only purpose of digital document restoration, since some of the degradation artifacts may themselves contain significant information. This is the case, for instance, of underwritings in palimpsests. In this paper, we propose a novel approach to this problem, reformulating it as a blind source separation problem and solving it with independent component analysis techniques. Under appropriate hypotheses, the spectral components of the document, taken at different bands in both the visible and the non-visible ranges, can be used to extract the individual contributions of, say, the text, the bleed-through, and the background patterns. Examples of bleed-through cancellation and recovery of underwriting from palimpsests are provided.
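As an illustration only, the blind source separation idea can be sketched on synthetic data. The sketch below assumes a linear, instantaneous mixing model in which each observed spectral channel is a weighted sum of two independent patterns (a hypothetical "main text" and "bleed-through"); the abstract does not state the exact model or ICA algorithm used in the paper. The mixing weights, pattern densities, and function names are invented for the example, and the kurtosis-based rotation search is a simple textbook form of ICA for two channels, not necessarily the authors' technique.

```python
import math
import random


def whiten(x1, x2):
    """Center two channels and transform them to unit, uncorrelated variance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    a = sum(v * v for v in c1) / n               # variance of channel 1
    d = sum(v * v for v in c2) / n               # variance of channel 2
    b = sum(u * v for u, v in zip(c1, c2)) / n   # covariance
    # Closed-form eigen-rotation of the 2x2 covariance [[a, b], [b, d]].
    theta = 0.5 * math.atan2(2 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    l1 = a * c * c + 2 * b * c * s + d * s * s   # variance along first axis
    l2 = a * s * s - 2 * b * c * s + d * c * c   # variance along second axis
    z1 = [(c * u + s * v) / math.sqrt(l1) for u, v in zip(c1, c2)]
    z2 = [(-s * u + c * v) / math.sqrt(l2) for u, v in zip(c1, c2)]
    return z1, z2


def kurtosis(y):
    """Excess kurtosis of a zero-mean, unit-variance sequence."""
    return sum(v ** 4 for v in y) / len(y) - 3.0


def separate(x1, x2, steps=90):
    """Two-channel ICA: whiten, then pick the rotation that maximizes
    the sum of squared kurtoses of the two outputs."""
    z1, z2 = whiten(x1, x2)
    best = None
    for k in range(steps):
        phi = (math.pi / 2) * k / steps
        c, s = math.cos(phi), math.sin(phi)
        y1 = [c * u + s * v for u, v in zip(z1, z2)]
        y2 = [-s * u + c * v for u, v in zip(z1, z2)]
        score = kurtosis(y1) ** 2 + kurtosis(y2) ** 2
        if best is None or score > best[0]:
            best = (score, y1, y2)
    return best[1], best[2]


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


# Synthetic "pixels": two sparse, independent ink patterns (assumed data).
rng = random.Random(0)
n = 4000
s1 = [1.0 if rng.random() < 0.10 else 0.0 for _ in range(n)]  # main text
s2 = [1.0 if rng.random() < 0.15 else 0.0 for _ in range(n)]  # bleed-through

# Two observed channels, each a different (hypothetical) mix of the patterns.
x1 = [1.0 * a + 0.5 * b for a, b in zip(s1, s2)]
x2 = [0.4 * a + 1.0 * b for a, b in zip(s1, s2)]

y1, y2 = separate(x1, x2)
print("match for s1:", max(abs(correlation(s1, y1)), abs(correlation(s1, y2))))
print("match for s2:", max(abs(correlation(s2, y1)), abs(correlation(s2, y2))))
```

The recovered outputs are defined only up to order, sign, and scale, which is why the comparison uses absolute correlations against both outputs. For real documents, each channel would be a flattened image acquired at a different spectral band, and a library implementation of ICA would normally replace the grid search.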
@inproceedings{oai:it.cnr:prodotti:91149, title = {Digital analysis of damaged documents by ICA techniques}, author = {Tonazzini A and Bedini L and Salerno E}, year = {2003} }