2016
Conference article  Open Access

DataQ: a data flow quality monitoring system for aggregative data infrastructures

Mannocci A., Manghi P.

K.6.4 MANAGEMENT OF COMPUTING AND INFORMATION SYSTEMS. System Management  Aggregative data infrastructures  D.2.8 SOFTWARE ENGINEERING. Metrics  Data quality  Monitoring  Data flow 

Aggregative Data Infrastructures (ADIs) are information systems offering services to integrate content collected from data sources so as to form uniform and richer information spaces and support communities of users with enhanced access services to such content. The resulting information spaces are an important asset for the target communities, whose services demand guarantees on their "correctness" and "quality" over time, in terms of the expected content (structure and semantics) and of the processes generating such content. Application-level continuous monitoring of ADIs therefore becomes crucial to ensure validation of quality. However, ADIs are in most cases the result of patchworks of software components and services, in some cases developed independently, built over time to address evolving requirements. As such, they are not generally equipped with embedded monitoring components, and ADI admins must rely on third-party monitoring systems. In this paper we describe DataQ, a general-purpose system for flexible and cost-effective data flow quality monitoring in ADIs. DataQ supports ADI admins with a framework where they can (i) represent ADI data flows and the related monitoring specification, and (ii) be instructed on how to meet such specification on the ADI side to implement their monitoring functionality.

Source: TPDL 2016 - Theory and Practice of Digital Libraries. 20th International Conference, pp. 357–369, Hannover, Germany, September 5-9, 2016



BibTeX entry
@inproceedings{oai:it.cnr:prodotti:357402,
	title = {DataQ: a data flow quality monitoring system for aggregative data infrastructures},
	author = {Mannocci A. and Manghi P.},
	doi = {10.1007/978-3-319-43997-6_28},
	booktitle = {TPDL 2016 - Theory and Practice of Digital Libraries. 20th International Conference, pp. 357–369, Hannover, Germany, September 5-9, 2016},
	year = {2016}
}
