2004
Journal article  Unknown

Analysis and recognition of highly degraded printed characters

Tonazzini A., Vezzosi S., Bedini L.

Degraded texts, Image restoration, Wavelet denoising, Neural networks

This paper proposes an integrated system for processing and analyzing highly degraded printed documents in order to recognize text characters. Ancient printed texts are considered as a case study. The system comprises several blocks that operate sequentially. Starting from a single page of the document, background noise is reduced by wavelet-based decomposition and filtering; the text lines are detected, extracted, and segmented by a simple, fast adaptive thresholding into blobs corresponding to characters; and the blobs are analyzed by a feedforward multilayer neural network trained with a back-propagation algorithm. For each character, the probability associated with the recognition is then used as a discriminating parameter that automatically activates a feedback process, returning the system to a block that refines the segmentation. This block acts only on the small portions of the text where the recognition cannot be relied on. It uses blind deconvolution and MRF-based (Markov random field) segmentation techniques, whose high complexity is greatly reduced when they are applied to a few small subimages. The experimental results show that the proposed system segments the characters very precisely and then achieves highly effective recognition of even strongly degraded texts.

Source: International journal on document analysis and recognition (Internet) 6 (2004): 236–247.

Publisher: Springer, Heidelberg, Germany



BibTeX entry
@article{oai:it.cnr:prodotti:68266,
	title = {Analysis and recognition of highly degraded printed characters},
	author = {Tonazzini, A. and Vezzosi, S. and Bedini, L.},
	publisher = {Springer, Heidelberg, Germany},
	journal = {International journal on document analysis and recognition (Internet)},
	volume = {6},
	pages = {236--247},
	year = {2004}
}