2015
Journal article  Open Access

A non-stationary density model to separate overlapped texts in degraded documents

Tonazzini A., Savino P., Salerno E.

Non-stationary data model; Electrical and Electronic Engineering; Back-to-front interferences; Document restoration; Nonlinear data model; Palimpsests; Signal Processing

We address the removal of a text superimposed on a more important one in a document image. We consider two instances of the problem: canceling back-to-front interferences in recto and verso images of archival documents, and recovering the erased text of palimpsests from multispectral images. Both problems are approached through a model in which the ideal images of the two texts are treated as individual source patterns, mixed through a parametric operator. To cope with occlusions, ink saturation, and the spatial variability of the mixing operator, a data model for this problem should be nonlinear and space-variant. Here, we show that if pointwise non-stationarity is allowed, a linear model can compensate for the lack of a suitable nonlinearity and for other modeling errors.
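
To make the idea of a pointwise non-stationary linear model concrete, the sketch below mixes and unmixes two patterns with a separate 2x2 linear system at every pixel. It is only an illustration under stated assumptions, not the authors' algorithm: the function names, the unit-diagonal form of the mixing matrix, and the gain maps `gain_a` and `gain_b` are hypothetical, and the gains are taken as known here, whereas estimating them is the hard part that the abstract does not describe.

```python
def mix_pixel(s_recto, s_verso, a, b):
    """Linear mix at one pixel (hypothetical form): each side shows its own
    pattern plus a locally varying fraction of the opposite one."""
    x_recto = s_recto + a * s_verso
    x_verso = b * s_recto + s_verso
    return x_recto, x_verso


def unmix_pixel(x_recto, x_verso, a, b):
    """Invert the 2x2 system [[1, a], [b, 1]] in closed form."""
    det = 1.0 - a * b
    if abs(det) < 1e-12:
        raise ValueError("mixing matrix is singular at this pixel")
    s_recto = (x_recto - a * x_verso) / det
    s_verso = (x_verso - b * x_recto) / det
    return s_recto, s_verso


def unmix_image(recto, verso, gain_a, gain_b):
    """Apply unmix_pixel independently at every pixel.

    All four arguments are equally sized lists of rows. gain_a and gain_b
    hold one gain per pixel, which is what makes the model space-variant.
    """
    out_recto, out_verso = [], []
    for xr_row, xv_row, a_row, b_row in zip(recto, verso, gain_a, gain_b):
        pairs = [
            unmix_pixel(xr, xv, a, b)
            for xr, xv, a, b in zip(xr_row, xv_row, a_row, b_row)
        ]
        out_recto.append([p[0] for p in pairs])
        out_verso.append([p[1] for p in pairs])
    return out_recto, out_verso
```

Because the gains differ from pixel to pixel, a single global mixing matrix could not reproduce the same observations; that is the sense in which the linear model here is space-variant.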

Source: Signal, image and video processing (Print) 9 (2015): 155–164. doi:10.1007/s11760-014-0735-3

Publisher: Springer, London, United Kingdom



BibTeX entry
@article{oai:it.cnr:prodotti:293705,
	title = {A non-stationary density model to separate overlapped texts in degraded documents},
	author = {Tonazzini A. and Savino P. and Salerno E.},
	publisher = {Springer, London, United Kingdom},
	doi = {10.1007/s11760-014-0735-3},
	journal = {Signal, image and video processing (Print)},
	volume = {9},
	pages = {155--164},
	year = {2015}
}