2019 Journal article Open Access

Parallel Traversal of Large Ensembles of Decision Trees
Lettich F., Lucchese C., Nardini F. M., Orlando S., Perego R., Tonellotto N., Venturini R.
Machine-learnt models based on additive ensembles of regression trees are currently deemed the best solution to address complex classification, regression, and ranking tasks. The deployment of such models is computationally demanding: to compute the final prediction, the whole ensemble must be traversed by accumulating the contributions of all its trees. In particular, traversal cost impacts applications where the number of candidate items is large, the time budget available to apply the learnt model to them is limited, and the users' expectations in terms of quality-of-service are high. Document ranking in web search, where sub-optimal ranking models are deployed to find a proper trade-off between efficiency and effectiveness of query answering, is probably the most typical example of this challenging issue. This paper investigates multi/many-core parallelization strategies for speeding up the traversal of large ensembles of regression trees, thus obtaining machine-learnt models that are, at the same time, effective, fast, and scalable. Our best results are obtained by the GPU-based parallelization of the state-of-the-art algorithm, with speedups of up to 102.6x.
Source: IEEE transactions on parallel and distributed systems (Print) 30 (2019): 2075–2089. doi:10.1109/TPDS.2018.2860982
DOI: 10.1109/tpds.2018.2860982
DOI: 10.5281/zenodo.2668379
DOI: 10.5281/zenodo.2668378
Project(s): BigDataGrapes via OpenAIRE

See at: Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari Open Access | ISTI Repository Open Access | ZENODO Open Access | IEEE Transactions on Parallel and Distributed Systems Open Access | IEEE Transactions on Parallel and Distributed Systems Restricted | ieeexplore.ieee.org Restricted | CNR ExploRA Restricted


2018 Contribution to book Open Access

Analysing trajectories of mobile users: from data warehouses to recommender systems
Nardini F. M., Orlando S., Perego R., Raffaetà A., Renso C., Silvestri C.
This chapter discusses a general framework for the analysis of trajectories of moving objects, designed around a Trajectory Data Warehouse (TDW). We argue that data warehouse technologies, combined with geographic visual analytics tools, can play an important role in granting very fast, accurate and understandable analysis of mobility data. We describe how in the last decade the TDW models have changed in order to provide the user with a more suitable model of the reality of interest and we also cope with the challenge of semantic trajectories. As a use case we illustrate how the framework can be instantiated for realizing a recommender system for tourists.
Source: A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years, edited by Sergio Flesca, Sergio Greco, Elio Masciari, Domenico Saccà, pp. 407–421, 2018
DOI: 10.1007/978-3-319-61893-7_24

See at: ISTI Repository Open Access | academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | iris.unive.it Restricted | link.springer.com Restricted | CNR ExploRA Restricted | rd.springer.com Restricted


2018 Journal article Open Access

X-CLEaVER: Learning ranking ensembles by growing and pruning trees
Lucchese C., Nardini F. M., Orlando S., Perego R., Silvestri F., Trani S.
Learning-to-Rank (LtR) solutions are commonly used in large-scale information retrieval systems such as Web search engines, which have to return highly relevant documents in response to user queries within fractions of a second. The most effective LtR algorithms adopt a gradient boosting approach to build additive ensembles of weighted regression trees. Since the required ranking effectiveness is achieved with very large ensembles, the impact on response time and query throughput of these solutions is not negligible. In this article, we propose X-CLEaVER, an iterative meta-algorithm able to build more efficient and effective ranking ensembles. X-CLEaVER interleaves the iterations of a given gradient boosting learning algorithm with pruning and re-weighting phases. First, redundant trees are removed from the given ensemble, then the weights of the remaining trees are fine-tuned by optimizing the desired ranking quality metric. We propose and analyze several pruning strategies and we assess their benefits, showing that interleaving pruning and re-weighting phases during learning is more effective than applying a single post-learning optimization step. Experiments conducted using two publicly available LtR datasets show that X-CLEaVER can be successfully exploited on top of several LtR algorithms as it is effective in optimizing the effectiveness of the learnt ensembles, thus obtaining more compact forests that are much more efficient at scoring time.
Source: ACM transactions on intelligent systems and technology (Print) 9 (2018). doi:10.1145/3205453
DOI: 10.1145/3205453
Project(s): BigDataGrapes via OpenAIRE, SoBigData via OpenAIRE

See at: ISTI Repository Open Access | ACM Transactions on Intelligent Systems and Technology Restricted | dl.acm.org Restricted | CNR ExploRA Restricted


2018 Conference article Open Access

Selective gradient boosting for effective learning to rank
Lucchese C., Nardini F. M., Perego R., Orlando S., Trani S.
Learning an effective ranking function from a large number of query-document examples is a challenging task. Indeed, training sets where queries are associated with a few relevant documents and a large number of irrelevant ones are required to model real scenarios of Web search production systems, where a query can possibly retrieve thousands of matching documents, but only a few of them are actually relevant. In this paper, we propose Selective Gradient Boosting (SelGB), an algorithm addressing the Learning-to-Rank task by focusing on those irrelevant documents that are most likely to be mis-ranked, thus severely hindering the quality of the learned model. SelGB exploits a novel technique minimizing the mis-ranking risk, i.e., the probability that two randomly drawn instances are ranked incorrectly, within a gradient boosting process that iteratively generates an additive ensemble of decision trees. Specifically, at every iteration and on a per query basis, SelGB selectively chooses among the training instances a small sample of negative examples enhancing the discriminative power of the learned model. Reproducible and comprehensive experiments conducted on a publicly available dataset show that SelGB exploits the diversity and variety of the negative examples selected to train tree ensembles that outperform models generated by state-of-the-art algorithms by achieving improvements of NDCG@10 up to 3.2%.
Source: International ACM Conference on Research and Development in Information Retrieval (SIGIR), pp. 155–164, 8-12/07/2018
DOI: 10.1145/3209978.3210048
DOI: 10.5281/zenodo.2668014
DOI: 10.5281/zenodo.2668013
Project(s): BigDataGrapes via OpenAIRE, MASTER via OpenAIRE, SoBigData via OpenAIRE

See at: Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari Open Access | ISTI Repository Open Access | ZENODO Open Access | zenodo.org Open Access | academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | dl.acm.org Restricted | iris.unive.it Restricted | CNR ExploRA Restricted


2017 Conference article Restricted

X-DART: blending dropout and pruning for efficient learning to rank
Lucchese C., Nardini F. M., Orlando S., Perego R., Trani S.
In this paper we propose X-DART, a new Learning to Rank algorithm focusing on the training of robust and compact ranking models. Motivated by the observation that the last trees of MART models impact the prediction of only a few instances of the training set, we borrow from the DART algorithm the dropout strategy, which consists in temporarily dropping some of the trees from the ensemble while new weak learners are trained. However, unlike DART, we drop these trees permanently on the basis of smart choices driven by accuracy measured on the validation set. Experiments conducted on publicly available datasets show that X-DART outperforms DART in training models providing the same effectiveness by employing up to 40% fewer trees.
Source: SIGIR '17 - 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1077–1080, Tokyo, Japan, 9-11 August 2017
DOI: 10.1145/3077136.3080725

See at: academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | dl.acm.org Restricted | doi.acm.org Restricted | doi.org Restricted | CNR ExploRA Restricted


2017 Conference article Restricted

Multicore/Manycore parallel traversal of large forests of regression trees
Lettich F., Lucchese C., Nardini F. M., Orlando S., Perego R., Tonellotto N., Venturini R.
Machine-learnt models based on additive ensembles of binary regression trees are currently considered one of the best solutions to address complex classification, regression, and ranking tasks. To evaluate these complex models over a continuous stream of data items with high throughput requirements, we need to optimize, and possibly parallelize, the traversal of thousands of trees, each including hundreds of nodes. Document ranking in Web Search is a typical example of this challenging scenario, where complex tree-based models are used to score query-document pairs and finally rank lists of document results for each incoming query (a.k.a. Learning-to-Rank). In this extended abstract, we briefly discuss some preliminary results concerning the parallelization strategies for QUICKSCORER, the state-of-the-art scoring algorithm that exploits ensembles of decision trees, by using multicore CPUs (with SIMD coprocessors) and manycore GPUs. We show that QUICKSCORER, which transforms the traversal of thousands of decision trees into a linear access to array data structures, can be parallelized very effectively, achieving very interesting speedups.
Source: HPCS 2017 - International Conference on High Performance Computing & Simulation, pp. 915–915, Genoa, Italy, 17-21 July 2017
DOI: 10.1109/hpcs.2017.154

See at: academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | doi.org Restricted | ieeexplore.ieee.org Restricted | iris.unive.it Restricted | CNR ExploRA Restricted | xplorestaging.ieee.org Restricted


2017 Conference article Open Access

QuickScorer: efficient traversal of large ensembles of decision trees
Lucchese C., Nardini F. M., Orlando S., Perego R., Tonellotto N., Venturini R.
Machine-learnt models based on additive ensembles of binary regression trees are currently deemed the best solution to address complex classification, regression, and ranking tasks. Evaluating these models is a computationally demanding task as it needs to traverse thousands of trees with hundreds of nodes each. The cost of traversing such large forests of trees significantly impacts their application to big and stream input data, when the time budget available for each prediction is limited to guarantee a given processing throughput. Document ranking in Web search is a typical example of this challenging scenario, where the exploitation of tree-based models to score query-document pairs, and finally rank lists of documents for each incoming query, is the state-of-the-art method for ranking (a.k.a. Learning-to-Rank). This paper presents QuickScorer, a novel algorithm for the traversal of huge decision tree ensembles that, thanks to a cache- and CPU-aware design, provides a 9x speedup over the best competitors.
Source: ECML PKDD - Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 383–387, Skopje, Macedonia, 18-22 September, 2017
DOI: 10.1007/978-3-319-71273-4_36

See at: arpi.unipi.it Open Access | academic.microsoft.com Restricted | arpi.unipi.it Restricted | dblp.uni-trier.de Restricted | iris.unive.it Restricted | link.springer.com Restricted | CNR ExploRA Restricted | rd.springer.com Restricted


2016 Contribution to conference Restricted

Improve ranking efficiency by optimizing tree ensembles
Lucchese C., Nardini F. M., Orlando S., Perego R., Silvestri F., Trani S.
Learning to Rank (LtR) is the machine learning method of choice for producing highly effective ranking functions. However, efficiency and effectiveness are two competing forces, and trading off effectiveness to meet the efficiency constraints typical of production systems is one of the most urgent issues. This extended abstract briefly summarizes the work in [4] proposing CLEaVER, a new framework for optimizing LtR models based on ensembles of regression trees. We summarize the results of a comprehensive evaluation showing that CLEaVER is able to prune up to 80% of the trees and provides an efficiency speed-up of up to 2.6x without affecting the effectiveness of the model.
Source: 7th Italian Information Retrieval Workshop, Venice, Italy, 30-31 May 2016

See at: ceur-ws.org Restricted | CNR ExploRA Restricted


2015 Report Open Access

Twitter for election forecasts: a joint machine learning and complex network approach applied to an italian case study
Coletto M., Lucchese C., Orlando S., Perego R., Chessa A., Puliga M.
Several studies have shown how to approximately predict real-world phenomena, such as political elections, by analyzing user activities in micro-blogging platforms. This approach has proven to be interesting but with some limitations, such as the representativeness of the sample of users, and the difficulty of understanding polarity in short messages. We believe that predictions based on social network analysis can be significantly improved by exploiting machine learning and complex network tools, where the latter provide valuable high-level features to support the former in learning an accurate prediction function.
Source: ISTI Technical reports, 2015

See at: ISTI Repository Open Access | CNR ExploRA Open Access


2015 Conference article Open Access

Electoral predictions with Twitter: a machine-learning approach
Coletto M., Lucchese C., Orlando S., Perego R.
Several studies have shown how to approximately predict public opinion, such as in political elections, by analyzing user activities in blogging platforms and on-line social networks. The task is challenging for several reasons. Sample bias and automatic understanding of textual content are two of several non-trivial issues. In this work we study how Twitter can provide some interesting insights concerning the primary elections of an Italian political party. State-of-the-art approaches rely on indicators based on tweet and user volumes, often including sentiment analysis. We investigate how to exploit and improve those indicators in order to reduce the bias of the Twitter users sample. We propose novel indicators and a novel content-based method. Furthermore, we study how a machine learning approach can learn correction factors for those indicators. Experimental results on Twitter data support the validity of the proposed methods and their improvement over the state of the art.
Source: 6th Italian Information Retrieval Workshop, pp. 1–12, Cagliari, 25/06/2015

See at: ceur-ws.org Open Access | ISTI Repository Open Access | CNR ExploRA Open Access


2015 Conference article Open Access

QuickRank: A C++ suite of learning to rank algorithms
Capannini G., Dato D., Lucchese C., Mori M., Nardini F. M., Orlando S., Perego R., Tonellotto N.
Ranking is a central task of many Information Retrieval (IR) problems, particularly challenging in the case of large-scale Web collections where it involves effectiveness requirements and efficiency constraints that are not common to other ranking-based applications. This paper describes QuickRank, a C++ suite of efficient and effective Learning to Rank (LtR) algorithms that allows high-quality ranking functions to be devised from possibly huge training datasets. QuickRank is a project with a double goal: i) answering the industrial need of Tiscali S.p.A. for a flexible and scalable LtR solution for learning ranking models from huge training datasets; ii) providing the IR research community with a flexible, extensible and efficient LtR framework to design LtR solutions and fairly compare the performance of different algorithms and ranking models. This paper presents our choices in designing QuickRank and reports some preliminary usage experience.
Source: Italian Information Retrieval Workshop (IR), Cagliari, May 2015

See at: ceur-ws.org Open Access | CNR ExploRA Open Access


2015 Conference article Restricted

Speeding up document ranking with rank-based features
Lucchese C., Nardini F. M., Orlando S., Perego R., Tonellotto N.
Learning to Rank (LtR) is an effective machine learning methodology for inducing high-quality document ranking functions. Given a query and a candidate set of documents, where query-document pairs are represented by feature vectors, a machine-learned function is used to reorder this set. In this paper we propose a new family of rank-based features, which extend the original feature vector associated with each query-document pair. Indeed, since they are derived as a function of the query-document pair and the full set of candidate documents to score, rank-based features provide additional information to better rank documents and return the most relevant ones. We report a comprehensive evaluation showing that rank-based features allow us to achieve the desired effectiveness with ranking models being up to 3.5 times smaller than models not using them, with a scoring time reduction up to 70%.
Source: 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 895–898, Santiago, Chile, 9-13 August 2015
DOI: 10.1145/2766462.2767776
Project(s): eCloud

See at: academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | dl.acm.org Restricted | iris.unive.it Restricted | CNR ExploRA Restricted


2015 Conference article Open Access

QuickScorer: a fast algorithm to rank documents with additive ensembles of regression trees
Lucchese C., Nardini F. M., Orlando S., Perego R., Tonellotto N., Venturini R.
Learning-to-Rank models based on additive ensembles of regression trees have proven to be very effective for ranking query results returned by Web search engines, a scenario where quality and efficiency requirements are very demanding. Unfortunately, the computational cost of these ranking models is high. Thus, several works have already proposed solutions aiming at improving the efficiency of the scoring process by dealing with features and peculiarities of modern CPUs and memory hierarchies. In this paper, we present QuickScorer, a new algorithm that adopts a novel bitvector representation of the tree-based ranking model, and performs an interleaved traversal of the ensemble by means of simple logical bitwise operations. The performance of the proposed algorithm is unprecedented, due to its cache-aware approach, both in terms of data layout and access patterns, and to a control flow that entails very low branch mis-prediction rates. The experiments on real Learning-to-Rank datasets show that QuickScorer is able to achieve speedups over the best state-of-the-art baseline ranging from 2x to 6.5x.
Source: 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 73–82, Santiago, Chile, 9-13 August 2015
DOI: 10.1145/2766462.2767733
Project(s): eCloud

See at: ISTI Repository Open Access | academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | dl.acm.org Restricted | doi.acm.org Restricted | iris.unive.it Restricted | pages.di.unipi.it Restricted | CNR ExploRA Restricted


2015 Patent Unknown

A method to rank documents by a computer, using additive ensembles of regression trees and cache optimisation, and search engines using such a method
Dato D., Lucchese C., Nardini F. M., Orlando S., Perego R., Tonellotto N., Venturini R.
Source: PCT29914, National

See at: patentscope.wipo.int | CNR ExploRA


2015 Contribution to conference Open Access

Twitter for election forecasts: a joint machine learning and complex network approach applied to an italian case study
Coletto M., Lucchese C., Orlando S., Perego R., Chessa A., Puliga M.
Several studies have shown how to approximately predict real-world phenomena, such as political elections, by analyzing user activities in micro-blogging platforms. This approach has proven to be interesting but with some limitations, such as the representativeness of the sample of users, and the difficulty of understanding polarity in short messages. We believe that predictions based on social network analysis can be significantly improved by exploiting machine learning and complex network tools, where the latter provide valuable high-level features to support the former in learning an accurate prediction function.
Source: International Conference on Computational Social Science (ICCSS 2015), Helsinki, Finland, 08-11/06/2015

See at: ISTI Repository Open Access | CNR ExploRA Open Access


2014 Conference article Open Access

Dexter 2.0 - an open source tool for semantically enriching data
Ceccarelli D., Lucchese C., Orlando S., Perego R., Trani S.
Entity Linking (EL) makes it possible to automatically link unstructured data with entities in a Knowledge Base. Linking unstructured data (like news, blog posts, tweets) has several important applications: for example, it allows enriching the text with useful external content or improving the categorization and the retrieval of documents. In recent years many effective approaches for performing EL have been proposed, but only a few authors published the code to perform the task. In this work we describe Dexter 2.0, a major revision of our open source framework to experiment with different EL approaches. We designed Dexter in order to make it easy to deploy and to use. The new version provides several important features: the possibility to adopt different EL strategies at run-time and to annotate semi-structured documents, as well as a well-documented REST-API. In this demo we present the current state of the system, the improvements made, its architecture and the APIs provided.
Source: ISWC 2014 Posters & Demonstrations Track. A track within the 13th International Semantic Web Conference, pp. 417–420, ISWC-P&D 2014, 21 October 2014

See at: ceur-ws.org Open Access | CNR ExploRA Open Access


2014 Conference article Restricted

Manual annotation of semi-structured documents for entity-linking
Ceccarelli D., Lucchese C., Orlando S., Perego R., Trani S.
The Entity Linking (EL) problem consists in automatically linking short fragments of text within a document to entities in a given Knowledge Base like Wikipedia. Due to its impact in several text-understanding related tasks, EL is a hot research topic. The correlated problem of devising the most relevant entities mentioned in the document, a.k.a. salient entities (SE), is also attracting increasing interest. Unfortunately, publicly available evaluation datasets that contain accurate and supervised knowledge about mentioned entities and their relevance ranking are currently very poor both in number and quality. This lack makes it very difficult to compare different EL and SE solutions on a fair basis, as well as to devise innovative techniques that rely on these datasets to train machine learning models, in turn used to automatically link and rank entities. In this demo paper we propose a Web-deployed tool that allows crowdsourcing the creation of these datasets, by supporting the collaborative human annotation of semi-structured documents. The tool, called Elianto, is an open source framework, which provides a user-friendly and reactive Web interface to support both EL and SE labelling tasks, through a guided two-step process.
Source: CIKM'14 - 23rd ACM International Conference on Conference on Information and Knowledge Management, pp. 2075–2077, Shanghai, China, 3-7 November 2014
DOI: 10.1145/2661829.2661854

See at: academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | dl.acm.org Restricted | doi.acm.org Restricted | iris.unive.it Restricted | CNR ExploRA Restricted


2014 Conference article Restricted

GPU-based computing of repeated range queries over moving objects
Orlando S., Lettich F., Silvestri C., Jensen C. S.
In this paper we investigate the use of GPUs to solve a data-intensive problem that involves huge amounts of moving objects. The scenario which we focus on regards objects that continuously move in a 2D space, where a large percentage of them also issue range queries. The processing of these queries entails a large quantity of objects falling into the range queries to be returned. In order to solve this problem while maintaining a suitable throughput, we partition the time into ticks, and defer the parallel processing of all the object events (location updates and range queries) occurring in a given tick to the next tick, thus slightly delaying the overall computation. We process in parallel all the events of each tick by adopting a hybrid approach, based on the combined use of CPU and GPU, and show the suitability of the method by discussing performance results. The exploitation of a GPU allows us to achieve a speedup of more than 20× on several datasets with respect to the best sequential algorithm solving the same problem. More importantly, we show that the adoption of a new bitmap-based intermediate data structure, which we propose to avoid memory access contention, entails a 10× speedup with respect to naive GPU-based solutions.
Source: PDP 2014 - 22nd Euromicro International Conference on Parallel, Distributed and Network-Based Processing, pp. 640–647, Torino, Italy, 12-14 February 2014
DOI: 10.1109/pdp.2014.27

See at: academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | ieeexplore.ieee.org Restricted | iris.unive.it Restricted | CNR ExploRA Restricted | PURE Aarhus University Restricted | VBN Restricted | xplorestaging.ieee.org Restricted


2014 Conference article Restricted

Quite a mess in my cookie jar!: Leveraging machine learning to protect web authentication
Calzavara S., Tolomei G., Bugliesi M., Orlando S.
Browser-based defenses have recently been advocated as an effective mechanism to protect web applications against the threats of session hijacking, fixation, and related attacks. In existing approaches, all such defenses ultimately rely on client-side heuristics to automatically detect cookies containing session information, to then protect them against theft or otherwise unintended use. While clearly crucial to the effectiveness of the resulting defense mechanisms, these heuristics have not, as yet, undergone any rigorous assessment of their adequacy. In this paper, we conduct the first such formal assessment, based on a gold set of cookies we collect from 70 popular websites of the Alexa ranking. To obtain the gold set, we devise a semi-automatic procedure that draws on a novel notion of authentication token, which we introduce to capture multiple web authentication schemes. We test existing browser-based defenses in the literature against our gold set, unveiling several pitfalls both in the heuristics adopted and in the methods used to assess them. We then propose a new detection method based on supervised learning, where our gold set is used to train a binary classifier, and report on experimental evidence that our method outperforms existing proposals. Interestingly, the resulting classification, together with our hands-on experience in the construction of the gold set, provides new insight on how web authentication is implemented in practice.
Source: WWW '14 - 23rd international conference on World Wide Web, pp. 189–199, Seoul, Korea, 7-11 April 2014
DOI: 10.1145/2566486.2568047

See at: academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | dl.acm.org Restricted | doi.org Restricted | iris.unive.it Restricted | CNR ExploRA Restricted | www.dais.unive.it Restricted


2014 Journal article Open Access

A general framework for trajectory data warehousing and visual OLAP
Leonardi L., Orlando S., Raffaetà A., Roncato A., Silvestri C., Andrienko G., Andrienko N.
In this paper we present a formal framework for modelling a trajectory data warehouse (TDW), namely a data warehouse aimed at storing aggregate information on trajectories of moving objects, which also offers visual OLAP operations for data analysis. The data warehouse model includes both temporal and spatial dimensions, and it is flexible and general enough to deal with objects that are either completely free or constrained in their movements (e.g., they move along a road network). In particular, the spatial dimension and the associated concept hierarchy reflect the structure of the environment in which the objects travel. Moreover, we cope with some issues related to the efficient computation of aggregate measures, as needed for implementing roll-up operations. The TDW and its visual interface allow one to investigate the behaviour of objects inside a given area as well as the movements of objects between areas in the same neighbourhood. A user can easily navigate the aggregate measures obtained from OLAP queries at different granularities, and get overall views in time and in space of the measures, as well as a focused view on specific measures, spatial areas, or temporal intervals. We discuss two application scenarios of our TDW, namely road traffic and vessel movement analysis, for which we built prototype systems. They mainly differ in the kind of information available for the moving objects under observation and their movement constraints.
Source: Geoinformatica (Dordrecht) 18 (2014): 273–312. doi:10.1007/s10707-013-0181-3
DOI: 10.1007/s10707-013-0181-3

See at: Fraunhofer-ePrints Open Access | GeoInformatica Restricted | link.springer.com Restricted | CNR ExploRA Restricted