2007
Conference article
Restricted
Spatio-temporal aggregations in trajectory data warehouses
Orlando S, Orsini R, Raffaetà A, Roncato A, Silvestri C
In this paper we investigate some issues related to the design of a simple Data Warehouse (DW), storing several aggregate measures about trajectories of moving objects. First we discuss the loading phase of our DW which has to deal with overwhelming streams of trajectory observations, possibly produced at different rates, and arriving in an unpredictable and unbounded way. Then, we focus on the measure presence, the most complex measure stored in our DW. Such a measure returns the number of trajectories that lie in a spatial region during a given temporal interval. We devise a novel way to compute an approximate, but very accurate, presence aggregate function, which algebraically combines a bounded amount of measures stored in the base cells of the data cube.
See at: CNR IRIS | www.springerlink.com
2007
Journal article
Restricted
Approximate mining of frequent patterns on streams
Silvestri C, Orlando S
Many critical applications, like intrusion detection or stock market analysis, require a nearly immediate result based on a continuous and infinite stream of data. In most cases finding an exact solution is not compatible with limited availability of resources and real time constraints, but an approximation of the exact result is enough for most purposes. This paper introduces a new algorithm for approximate mining of frequent itemsets from streams of transactions using a limited amount of memory. The proposed algorithm is based on the computation of frequent itemsets in recent data and an effective method for inferring the global support of previously infrequent itemsets. Both upper and lower bounds on the support of each pattern found are returned along with the interpolated support. An extensive experimental evaluation shows that AP_Stream, the proposed algorithm, yields a good approximation of the exact global result considering both the set of patterns found and their supports.
Source: INTELLIGENT DATA ANALYSIS, vol. 11 (issue 1), pp. 49-73
See at: dl.acm.org | CNR IRIS
2007
Journal article
Restricted
The many faces of the integration of instruments and the grid
Lelli F, Frizziero E, Gulmini M, Maron G, Orlando S, Petrucci A, Squizzato S
Current grid technologies offer unlimited computational power and storage capacity for scientific research and business activities in heterogeneous areas all over the world. Thanks to the grid, different virtual organisations can operate together in order to achieve common goals. However, concrete use cases demand a closer interaction between various types of instruments accessible from the grid on the one hand and the classical grid infrastructure, typically composed of Computing and Storage Elements, on the other. We cope with this open problem by proposing and realising the first release of the Instrument Element (IE), a new grid component that provides the computational/data grid with an abstraction of real instruments, and grid users with a more interactive interface to control them. In this paper we discuss in detail the implemented software architecture for this new component and we present concrete use cases where the IE has been successfully integrated.
Source: INTERNATIONAL JOURNAL OF WEB AND GRID SERVICES, vol. 3, pp. 239-266
DOI: 10.1504/ijwgs.2007.014953
See at: International Journal of Web and Grid Services | CNR IRIS | www.inderscience.com
2007
Journal article
Restricted
Trajectory data warehouses: design and implementation issues
Orlando S, Orsini R, Raffaetà A, Roncato A, Silvestri C
In this paper we investigate some issues and solutions related to the design of a Data Warehouse (DW), storing several aggregate measures about trajectories of moving objects. First we discuss the loading phase of our DW which has to deal with overwhelming streams of trajectory observations, possibly produced at different rates, and arriving in an unpredictable and unbounded way. Then, we focus on the measure presence, the most complex measure stored in our DW. Such a measure returns the number of distinct trajectories that lie in a spatial region during a given temporal interval. We devise a novel way to compute an approximate, but very accurate, presence aggregate function, which algebraically combines a bounded amount of measures stored in the base cells of the data cube. We conducted many experiments to show the effectiveness of our method to compute such an aggregate function. In addition, the feasibility of our innovative trajectory DW was validated with an implementation based on Oracle. We investigated the most challenging issues in realizing our trajectory DW using standard DW technologies: namely, the preprocessing and loading phase, and the aggregation functions to support OLAP operations.
Source: JOURNAL OF COMPUTING SCIENCE AND ENGINEERING, vol. 1 (issue 2), pp. 240-261
See at: CNR IRIS
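The abstract above defines presence as a count of distinct trajectories, which is what makes it hard to roll up. The following minimal sketch is not the authors' method; the cell layout, trajectory IDs, and function names are hypothetical. It only compares the exact distinct count over a group of base cells with the naive sum of per-cell counts, which overcounts any trajectory that crosses more than one cell.

```python
# Hypothetical base cells: each (x, y, t) grid cell maps to the set of
# trajectory IDs observed in it. A real warehouse would keep only aggregate
# measures per cell, not the raw ID sets used here for illustration.
base_cells = {
    (0, 0, 0): {"a", "b"},
    (1, 0, 0): {"b", "c"},
    (0, 0, 1): {"a"},
}


def exact_presence(cells):
    """Distinct trajectories over a group of cells (needs the raw ID sets)."""
    ids = set()
    for members in cells.values():
        ids |= members
    return len(ids)


def naive_presence(cells):
    """Sum of per-cell distinct counts; overcounts multi-cell trajectories."""
    return sum(len(members) for members in cells.values())


exact = exact_presence(base_cells)  # trajectories a, b, c
naive = naive_presence(base_cells)  # 2 + 2 + 1
print(exact, naive)
```

The gap between the two values is the error that an approximate aggregate function built only from stored per-cell measures would have to reduce.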
2005
Conference article
Restricted
Distributed approximate mining of frequent patterns
Silvestri C, Orlando S
This paper discusses a novel communication-efficient distributed algorithm for approximate mining of frequent patterns from transactional databases. The proposed algorithm consists of the distributed exact computation of locally frequent itemsets and an effective method for inferring the local support of locally infrequent itemsets. The combination of the two strategies gives a good approximation of the set of the globally frequent patterns and their supports. Several tests on publicly available datasets were conducted, aimed at evaluating the similarity between the exact result set and the approximate ones returned by our distributed algorithm as well as the scalability of the proposed method.
See at: CNR IRIS
2007
Conference article
Restricted
Nine months in the life of EGEE: a look from the South
Da Costa G, Dikaiakos M D, Orlando S
Grids have emerged as wide-scale, distributed infrastructures providing enough resources for increasingly demanding scientific experiments. EGEE is one of the largest scientific Grids in production operation today, with over 220 sites and more than 30,000 CPUs all over the world. A further evolution of EGEE needs to be based on knowledge of the deficiencies and bottlenecks of the current infrastructure and software. To provide this knowledge we analyzed nine months of job submissions on the South-East federation of EGEE. We provide information on how users submit their jobs: throughput, bursts, requirements, VO. We also study the current behavior of the EGEE middleware by evaluating its performance and its retry policy. Finally, we show that even though the middleware provides advanced functionality, most submissions are still embarrassingly parallel jobs.
See at: CNR IRIS
2007
Conference article
Restricted
Towards response time estimation in Web services
Lelli F, Maron G, Orlando S
Monitoring and control operations demand deep interaction between users and devices, and they require the adoption of highly interoperable solutions that only SOA-based Web Services can offer. When access is performed via the Internet using Web Services calls, the remote invocation time becomes crucial for understanding whether a service can be controlled properly, or whether the delays introduced by the wire and the serialization/deserialization process are unacceptable. We propose methodologies, based on a 2^k factorial analysis and a Gaussian Majorization of previous service execution times, which enable the estimation of a generic remote method execution time.
See at: CNR IRIS
2007
Conference article
Restricted
Client side estimation of a remote service execution
Lelli F, Maron G, Orlando S
Monitoring and control operations demand deep interaction between users and devices, and they require the adoption of highly interoperable solutions that only SOA-based Web Services can offer. When access is performed via the Internet using Web Services calls, the remote invocation time becomes crucial for understanding whether a service can be controlled properly, or whether the delays introduced by the wire and the serialization/deserialization process are unacceptable. We propose methodologies, based on a 2^k factorial analysis and a Gaussian Majorization of previous service execution times, which enable the estimation of a generic remote method execution time.
See at: CNR IRIS | ieeexplore.ieee.org
2007
Conference article
Restricted
Trajectory data warehouses: design issues and use cases
Orlando S, Orsini R, Raffaetà A, Roncato A, Silvestri C
In this paper we discuss how data warehousing technology can be used to store aggregate information about trajectories and to perform OLAP operations over them. To this end, we define a data cube with spatial and temporal dimensions, discretized according to a hierarchy of regular grids. Moreover, we analyse some measures of interest related to trajectories, such as the number of trajectories starting from a cell, the distance covered by the trajectories in a cell, the average and maximum speed and the average acceleration of the trajectories in the cell. We provide some algorithms to compute and load these measures in the base cells, starting from a stream of trajectory observations. Such stored values are used, along with suitable aggregate functions, to compute the roll-up operations. Finally, we envision some use cases that would benefit from such a framework, in particular in the domain of supervision systems to monitor road traffic (or movements of individuals) in a given geographical area.
See at: CNR IRIS
2008
Conference article
Restricted
Trajectory data warehouses: storing and aggregating frequent ST patterns
Orlando S, Raffaetà A, Roncato A, Silvestri C
In this paper we present an approach for storing and aggregating spatio-temporal patterns by using a Trajectory Data Warehouse (TDW). In particular, our aim is to allow the analysts to quickly evaluate frequent patterns mined from trajectories of moving objects occurring in a specific spatial zone and during a given temporal interval. We resort to a TDW, based on a data cube model, having spatial and temporal dimensions, discretized according to a hierarchy of regular grids, and whose facts are sets of trajectories which intersect the ST cells of the cube. The idea is to enrich such a TDW with a new measure: frequent patterns obtained from a data-mining process on trajectories. As a consequence these patterns can be analysed by the user at various levels of granularity with the use of OLAP queries. The research issues discussed in this paper are (1) the extraction/mining of the patterns to be stored in each cell, which requires an adequate projection phase of trajectories before mining; (2) the ST aggregation of patterns to answer roll-up queries, which poses many problems due to the holistic nature of the aggregation function.
See at: CNR IRIS
2009
Conference article
Restricted
Frequent spatio-temporal patterns in trajectory data warehouses
Leonardi L, Orlando S, Raffaetà A, Roncato A, Silvestri C
In this paper we present an approach for storing and aggregating spatio-temporal patterns by using a Trajectory Data Warehouse (TDW). In particular, our aim is to allow the analysts to quickly evaluate frequent patterns mined from trajectories of moving objects occurring in a specific spatial zone and during a given temporal interval. We resort to a TDW, based on a data cube model, having spatial and temporal dimensions, discretized according to a hierarchy of regular grids, and whose facts are sets of trajectories which intersect the spatio-temporal cells of the cube. The idea is to enrich such a TDW with a new measure: frequent patterns obtained from a data-mining process on trajectories. As a consequence these patterns can be analysed by the user at various levels of granularity by means of OLAP queries. The research issues discussed in this paper are (1) the extraction/mining of the patterns to be stored in each cell, which requires an adequate projection phase of trajectories before mining; (2) the spatio-temporal aggregation of patterns to answer roll-up queries, which poses many problems due to the holistic nature of the aggregation function.
DOI: 10.1145/1529282.1529603
See at: dl.acm.org | doi.org | CNR IRIS
2011
Journal article
Restricted
Visual mobility analysis using T-warehouse
Raffaetà Alessandra, Leonardi Luca, Marketos Gerasimos, Andrienko Gennady L, Andrienko Natalia V, Frentzos Elias, Giatrakos Nikos, Orlando Salvatore, Pelekis Nikos, Roncato Alessandro, Silvestri Claudio
Technological advances in sensing technologies and wireless telecommunication devices enable research fields related to the management of trajectory data. The challenge after storing the data is the implementation of appropriate analytics for extracting useful knowledge. However, traditional data warehousing systems and techniques were not designed for analyzing trajectory data. In this paper, the authors demonstrate a framework that transforms the traditional data cube model into a trajectory warehouse. As a proof-of-concept, the authors implement T-Warehouse, a system that incorporates all the required steps for Visual Trajectory Data Warehousing, from trajectory reconstruction and ETL processing to Visual OLAP analysis on mobility data.
DOI: 10.4018/jdwm.2011010101
DOI: 10.4018/978-1-4666-2148-0.ch001
See at: doi.org | International Journal of Data Warehousing and Mining | CNR IRIS | Fraunhofer-ePrints
2011
Conference article
Restricted
Introduction to Topic 5: Parallel and Distributed Data Management
Orlando Salvatore, Antoniu Gabriel, Ghoting Amol, Perez Maria S
The manipulation and handling of an ever increasing volume of data by current data-intensive applications require novel techniques for efficient data management. Despite recent advances in every aspect of data management (storage, access, querying, analysis, mining), future applications are expected to scale to even higher degrees, not only in terms of volumes of data handled but also in terms of users and resources, often making use of multiple, pre-existing autonomous, distributed or heterogeneous resources. The notion of parallelism and concurrent execution at all levels remains a key element in achieving scalability and efficiently managing such data-intensive applications, but the changing nature of the underlying environments requires new solutions to cope with such changes. In this context, this topic sought papers in all aspects of data management (including databases and data-intensive applications) that focus on some form of parallelism and concurrency. Each paper was reviewed by four reviewers and, after discussion, we were able to select four regular papers. The accepted papers address relevant issues on various topics such as effective data compression, GPU-based data indexing, distributed collaborative data filtering and parallel query processing.
DOI: 10.1007/978-3-642-23400-2_32
See at: CNR IRIS | www.springerlink.com
2014
Conference article
Restricted
GPU-based computing of repeated range queries over moving objects
Orlando S, Francesco L, Silvestri C, Jensen C S
In this paper we investigate the use of GPUs to solve a data-intensive problem that involves huge amounts of moving objects. The scenario which we focus on regards objects that continuously move in a 2D space, where a large percentage of them also issue range queries. The processing of these queries entails a large quantity of objects falling into the range queries to be returned. In order to solve this problem while maintaining a suitable throughput, we partition the time into ticks, and defer the parallel processing of all the object events (location updates and range queries) occurring in a given tick to the next tick, thus slightly delaying the overall computation. We process in parallel all the events of each tick by adopting a hybrid approach, based on the combined use of CPU and GPU, and show the suitability of the method by discussing performance results. The exploitation of a GPU allows us to achieve a speedup of more than 20× on several datasets with respect to the best sequential algorithm solving the same problem. More importantly, we show that the adoption of the new bitmap-based intermediate data structure that we propose to avoid memory access contention entails a 10× speedup with respect to naive GPU-based solutions. © 2014 IEEE.
DOI: 10.1109/pdp.2014.27
See at: doi.org | CNR IRIS | ieeexplore.ieee.org | PURE Aarhus University | VBN
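The tick-based deferral described in this abstract can be illustrated with a small sequential sketch. It is not the paper's CPU/GPU implementation: the event tuple layout, the function name `process_ticks`, and the choice to apply all updates of a tick before answering its queries are assumptions made only for illustration.

```python
from collections import defaultdict


def process_ticks(events, tick_length):
    """Batch events by tick and answer each tick's queries in one pass.

    events: iterable of (time, kind, object_id, payload), where kind is
    "update" with payload (x, y), or "query" with payload
    (xmin, ymin, xmax, ymax). This layout is hypothetical.
    Returns {tick: {object_id: sorted ids inside that object's range}}.
    """
    batches = defaultdict(list)
    for time, kind, obj, payload in events:
        batches[int(time // tick_length)].append((kind, obj, payload))

    positions = {}
    results = {}
    for tick in sorted(batches):
        batch = batches[tick]
        # Assumption: apply every location update of the tick first ...
        for kind, obj, payload in batch:
            if kind == "update":
                positions[obj] = payload
        # ... then answer all range queries against the updated positions.
        answers = {}
        for kind, obj, payload in batch:
            if kind == "query":
                xmin, ymin, xmax, ymax = payload
                answers[obj] = sorted(
                    o for o, (x, y) in positions.items()
                    if xmin <= x <= xmax and ymin <= y <= ymax
                )
        results[tick] = answers
    return results


events = [
    (0.1, "update", "p", (1.0, 1.0)),
    (0.2, "update", "q", (5.0, 5.0)),
    (0.3, "query", "p", (0.0, 0.0, 2.0, 2.0)),
    (1.1, "update", "q", (1.5, 1.5)),
    (1.4, "query", "p", (0.0, 0.0, 2.0, 2.0)),
]
results = process_ticks(events, 1.0)
print(results)
```

Grouping events per tick is what makes a bulk, parallel evaluation possible in principle; here the per-tick loops simply run one after the other.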
2014
Conference article
Restricted
Quite a mess in my cookie jar!: Leveraging machine learning to protect web authentication
Calzavara S, Tolomei G, Bugliesi M, Orlando S
Browser-based defenses have recently been advocated as an effective mechanism to protect web applications against the threats of session hijacking, fixation, and related attacks. In existing approaches, all such defenses ultimately rely on client-side heuristics to automatically detect cookies containing session information, to then protect them against theft or otherwise unintended use. While clearly crucial to the effectiveness of the resulting defense mechanisms, these heuristics have not, as yet, undergone any rigorous assessment of their adequacy. In this paper, we conduct the first such formal assessment, based on a gold set of cookies we collect from 70 popular websites of the Alexa ranking. To obtain the gold set, we devise a semi-automatic procedure that draws on a novel notion of authentication token, which we introduce to capture multiple web authentication schemes. We test existing browser-based defenses in the literature against our gold set, unveiling several pitfalls both in the heuristics adopted and in the methods used to assess them. We then propose a new detection method based on supervised learning, where our gold set is used to train a binary classifier, and report on experimental evidence that our method outperforms existing proposals. Interestingly, the resulting classification, together with our hands-on experience in the construction of the gold set, provides new insight into how web authentication is implemented in practice. Copyright is held by the International World Wide Web Conference Committee (IW3C2).
DOI: 10.1145/2566486.2568047
See at: dl.acm.org | doi.org | CNR IRIS
2014
Journal article
Restricted
A general framework for trajectory data warehousing and visual OLAP
Leonardi L, Orlando S, Raffaetà A, Roncato A, Silvestri C, Andrienko G, Andrienko N
In this paper we present a formal framework for modelling a trajectory data warehouse (TDW), namely a data warehouse aimed at storing aggregate information on trajectories of moving objects, which also offers visual OLAP operations for data analysis. The data warehouse model includes both temporal and spatial dimensions, and it is flexible and general enough to deal with objects that are either completely free or constrained in their movements (e.g., they move along a road network). In particular, the spatial dimension and the associated concept hierarchy reflect the structure of the environment in which the objects travel. Moreover, we cope with some issues related to the efficient computation of aggregate measures, as needed for implementing roll-up operations. The TDW and its visual interface allow one to investigate the behaviour of objects inside a given area as well as the movements of objects between areas in the same neighbourhood. A user can easily navigate the aggregate measures obtained from OLAP queries at different granularities, and get overall views in time and in space of the measures, as well as a focused view on specific measures, spatial areas, or temporal intervals. We discuss two application scenarios of our TDW, namely road traffic and vessel movement analysis, for which we built prototype systems. They mainly differ in the kind of information available for the moving objects under observation and their movement constraints.
Source: GEOINFORMATICA (DORDRECHT), vol. 18 (issue 2), pp. 273-312
DOI: 10.1007/s10707-013-0181-3
See at: GeoInformatica | CNR IRIS | link.springer.com | Fraunhofer-ePrints
2007
Journal article
Open Access
Peer-to-peer systems for discovering resources in a dynamic grid
Marzolla M, Mordacchini M, Orlando S
The convergence of the Grid and Peer-to-Peer (P2P) worlds has led to many solutions that try to efficiently solve the problem of resource discovery on Grids. Some of these solutions are extensions of P2P DHT-based networks. We believe that these systems are not flexible enough when the indexed data are very dynamic, i.e., the values of the resource attributes change very frequently over time. This is a common case for Grid metadata, like CPU loads, queue occupation, etc. Moreover, since common requests for Grid resources may be expressed as multi-attribute range queries, we think that the DHT-based P2P solutions are not sufficiently flexible or efficient in handling them. In this paper we present two P2P systems. Both are based on Routing Indexes, which are used to efficiently route queries and update messages in the presence of highly variable data. The first system uses a tree-shaped overlay network. The second one is an evolution of the first, and is based on a two-level hierarchical network topology, where tree topologies must only be maintained at the lower level of the hierarchy, i.e., within the various node groups making up the network. The main goal of the second organization is to achieve a simpler maintenance of the overall P2P graph topology while preserving the good properties of the tree-shaped topology. We discuss the results of extensive simulation studies aimed at assessing the performance and scalability of the proposed approaches. We also analyze how the network topologies affect the propagation of query and update messages.
Source: PARALLEL COMPUTING, vol. 33 (issue 4-5), pp. 339-358
DOI: 10.1016/j.parco.2007.02.006
See at: Parallel Computing | Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna | CNR IRIS | www.sciencedirect.com