10 result(s)
2011 Report Unknown
Componente per la creazione di suggerimenti da comportamenti collettivi
Lucchese C., Venturini R.
This document presents the point-of-interest suggestion system developed within the VISITO Tuscany project. Project: VIsual Support to Interactive TOurism in Tuscany. Acronym: VISITO Tuscany. Grant agreement: D57E09000050007.
Source: Project report, VISITO Tuscany, 2011

See at: CNR ExploRA


2011 Journal article Restricted
Discovering Europeana Users' Search Behavior
Ceccarelli D., Gordea S., Lucchese C., Nardini F. M., Perego R., Tolomei G.
Europeana is a strategic project funded by the European Commission with the goal of making Europe's cultural and scientific heritage accessible to the public. ASSETS is a two-year Best Practice Network co-funded by the CIP PSP Programme to improve the performance, accessibility, and usability of the Europeana search engine. Here we present a characterization of the Europeana logs by showing statistics on common behavioral patterns of Europeana users.
Source: ERCIM News 86 (2011): 39–40.

See at: ercim-news.ercim.eu Restricted | CNR ExploRA


2011 Report Unknown
VISITO Tuscany - Progetto dell'architettura della piattaforma VISITO Tuscany
Atzori M., Bazzoni G., Bolettieri P., La Torre F., Loschiavo D., Lucchese C., Manfrin S., Martinelli F., Melani A., Naldi C., Pironi A., Rubichi A., Venturini R., Zanetti N.
This document falls under Operational Objective 2 of the VISITO Tuscany project, which covers the design of the entire system. In particular, it describes the system architecture on the basis of the Reference Model for Open Distributed Processing (RM-ODP), which defines five viewpoints: enterprise, information, computational, engineering, and technology.
Source: Project report, VISITO Tuscany, 2011

See at: CNR ExploRA


2011 Report Unknown
A1.1.1 Lo stato dell'arte: tecnologia ed utenti
Falchi Fabrizio, Ippolito Valentina, Loschiavo Domenico, Lucchese Claudio, Lungarotti Francesca, Melani Alessio, Minelli Sam, Pialli Saverio, Rossi Silvia, Salvadori Sauro, Scartoni Rita, Scopigno Roberto, Tavanti Francesca, La Torre Francesco, Venturini Rossano
This document reports the state of the art of the technologies relevant to the VISITO Tuscany project.
Source: Project report, VISITO Tuscany, 2011

See at: CNR ExploRA


2011 Contribution to conference Restricted
LSDS-IR'11: the 9th workshop on large-scale and distributed systems for information retrieval
Lucchese C., Cambazoglu B. B.
The growth of the Web and of its user bases leads to important performance problems for large-scale Web search engines. The LSDS-IR '11 workshop focuses on research contributions related to the scalability and efficiency of distributed information retrieval (IR) systems. The workshop also encourages contributions that propose different ways of leveraging the diversity and multiplicity of resources available in distributed systems. More specifically, we are interested in novel applications, models, and architectures that deal with the efficiency and scalability of distributed IR systems.
Source: New York: ACM, Association for Computing Machinery, 2011
DOI: 10.1145/2063576.2064054
Project(s): ASSETS

See at: dl.acm.org Restricted | doi.org Restricted | CNR ExploRA


2011 Conference article Open Access
Caching query-biased snippets for efficient retrieval
Lucchese C., Perego R., Ceccarelli D., Silvestri F., Orlando S.
Web Search Engines' result pages contain references to the top-k documents relevant to the query submitted by a user. Each document is represented by a title, a snippet, and a URL. Snippets, i.e. short sentences showing the portions of the document that are relevant to the query, help users select the most interesting results. The snippet generation process is very expensive, since it may require accessing a number of documents for each issued query. We assert that caching, a popular technique used to enhance performance at various levels of any computing system, can be very effective in this context. We design and experiment with several cache organizations, and we introduce the concept of supersnippet, that is, the set of sentences in a document that are more likely to answer future queries. We show that supersnippets can be built by exploiting query logs, and that in our experiments a supersnippet cache answers up to 62% of the requests, remarkably outperforming other caching approaches.
Source: 14th International Conference on Extending Database Technology, EDBT/ICDT '11, pp. 93–104, Uppsala, Sweden, March 21-24 2011
DOI: 10.1145/1951365.1951379
Project(s): ASSETS

See at: www.dsi.unive.it Open Access | doi.org Restricted | portal.acm.org Restricted | CNR ExploRA


2011 Conference article Restricted
Improving Europeana search experience using query logs
Ceccarelli D., Gordea S., Lucchese C., Nardini F. M., Tolomei G.
Europeana is a long-term project funded by the European Commission with the goal of making Europe's cultural and scientific heritage accessible to the public. Since 2008, about 1500 institutions have contributed to Europeana, enabling people to explore the digital resources of Europe's museums, libraries, and archives. The huge amount of collected multi-lingual, multi-media data is made available today through the Europeana portal, a search engine allowing users to explore such content through textual queries. One of the most important techniques for enhancing users' search experience in large information spaces is the exploitation of the knowledge contained in query logs. In this paper we present a characterization of the Europeana query log, showing statistics on common behavioral patterns of Europeana users. Our analysis highlights some significant differences between the Europeana query log and the historical data collected by general-purpose Web Search Engine logs. In particular, we find that both query and search session distributions show different behaviors. Finally, we use this information to design a query recommendation technique aimed at enhancing the functionality of the Europeana portal.
Source: Research and Advanced Technology for Digital Libraries. International Conference on Theory and Practice of Digital Libraries, pp. 384–395, Berlin, Germany, 26-28 September 2011
DOI: 10.1007/978-3-642-24469-8_39
Project(s): ASSETS

See at: doi.org Restricted | gateway.webofknowledge.com Restricted | www.springerlink.com Restricted | CNR ExploRA


2011 Conference article Restricted
Direct local pattern sampling by efficient two-step random procedures
Boley M., Lucchese C., Paurat D., Gartner T.
We present several exact and highly scalable local pattern sampling algorithms. They can be used as an alternative to exhaustive local pattern discovery methods (e.g., frequent set mining or optimistic-estimator-based subgroup discovery) and can substantially improve efficiency as well as controllability of pattern discovery processes. While previous sampling approaches mainly rely on the Markov chain Monte Carlo method, our procedures are direct, i.e., non process-simulating, sampling algorithms. The advantages of these direct methods are an almost optimal time complexity per pattern as well as an exactly controlled distribution of the produced patterns. Namely, the proposed algorithms can sample (item-)sets according to frequency, area, squared frequency, and a class discriminativity measure. Experiments demonstrate that these procedures can improve the accuracy of pattern-based models similar to frequent sets and often also lead to substantial gains in terms of scalability.
Source: ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD'11, pp. 582–590, San Diego, USA, 21-24 August 2011
DOI: 10.1145/2020408.2020500
Project(s): LIFT via OpenAIRE

See at: ACM Digital Library Restricted | dl.acm.org Restricted | doi.org Restricted | research.monash.edu Restricted | CNR ExploRA


2011 Contribution to conference Unknown
Direct pattern sampling with respect to pattern frequency
Lucchese C., Boley M., Gartner T., Paurat D.
We present an exact and highly scalable sampling algorithm that can be used as an alternative to exhaustive local pattern discovery methods. It samples patterns according to their frequency of occurrence and can substantially improve efficiency and controllability of the pattern discovery processes. While previous sampling approaches mainly rely on the Markov chain Monte Carlo method, our procedure is direct, i.e. a non process-simulating sampling algorithm. The advantages of this direct method are an almost optimal time complexity per pattern as well as an exactly controlled distribution of the produced patterns. In addition, we present experimental results which demonstrate that these procedures can improve the accuracy of pattern-based models similar to frequent sets and often also lead to substantial gains in terms of scalability.
Source: Workshop on Knowledge Discovery, Data Mining and Machine Learning, in conjunction with the LWA 2011. KDLM'11 - LWA 2011, Magdeburg, Germany, 28-30 September 2011
Project(s): LIFT via OpenAIRE

See at: CNR ExploRA


2011 Conference article Open Access
Identifying task-based sessions in search engine query logs
Lucchese C., Orlando S., Perego R., Silvestri F., Tolomei G.
The research challenge addressed in this paper is to devise effective techniques for identifying task-based sessions, i.e. sets of possibly non-contiguous queries issued by the user of a Web Search Engine for carrying out a given task. In order to evaluate and compare different approaches, we built, by means of a manual labeling process, a ground-truth where the queries of a given query log have been grouped in tasks. Our analysis of this ground-truth shows that users tend to perform more than one task at the same time, since about 75% of the submitted queries involve a multi-tasking activity. We formally define the Task-based Session Discovery Problem (TSDP) as the problem of best approximating the manually annotated tasks, and we propose several variants of well-known clustering algorithms, as well as a novel efficient heuristic algorithm, specifically tuned for solving the TSDP. These algorithms also exploit the collaborative knowledge collected by Wiktionary and Wikipedia for detecting query pairs that are not similar from a lexical content point of view, but are actually semantically related. The proposed algorithms have been evaluated on the above ground-truth, and are shown to perform better than state-of-the-art approaches, because they effectively take into account the multi-tasking behavior of users.
Source: Fourth ACM International Conference on Web Search and Data Mining, pp. 277–286, Hong Kong, China, 10-12 February 2011
DOI: 10.1145/1935826.1935875
Project(s): S-CUBE via OpenAIRE

See at: www.dsi.unive.it Open Access | ACM Digital Library Restricted | doi.org Restricted | portal.acm.org Restricted | CNR ExploRA