2017
Conference article  Open Access

Integration of Deep Web Sources: A Distributed Information Retrieval Approach

Calì A., Straccia U.

information integration, Deep Web

The Deep Web consists of structured data that are available only as dynamically generated pages, typically requested through HTML forms. Deep Web pages cannot be indexed by search engines, and they are notoriously difficult to query and integrate because of the limited access they offer. We propose a novel framework for integrating Deep Web sources by means of a mediated schema that represents the underlying, distributed sources. Our goal is to compute answers to queries posed on the mediated schema. To this end, we propose the use of techniques from the area of Distributed Information Retrieval. We discuss a novel approach to automated sampling, size estimation and selection of Deep Web sources, as well as a technique for merging result lists.

Source: 7th International Conference on Web Intelligence, Mining and Semantics (WIMS-17), pp. 33:1–33:4, 19-22 June 2017


BibTeX entry
@inproceedings{oai:it.cnr:prodotti:374719,
	title = {Integration of Deep Web Sources: A Distributed Information Retrieval Approach},
	author = {Calì, A. and Straccia, U.},
	doi = {10.1145/3102254.3102291},
	booktitle = {7th International Conference on Web Intelligence, Mining and Semantics (WIMS-17), 19-22 June 2017},
	pages = {33:1--33:4},
	year = {2017}
}