2008
Conference article
Restricted
Sequence mining automata: a new technique for mining frequent sequences under regular expressions
Trasarti R., Bonchi F., Goethals B.
In this paper we study the problem of mining frequent sequences satisfying a given regular expression. Previous approaches to this problem focused on the search space, pushing (in some way) the given regular expression to prune unpromising candidate patterns. In contrast, we focus entirely on the given input data and regular expression. We introduce Sequence Mining Automata (SMA), a specialized kind of Petri Net that, while reading input sequences, produces for each sequence all and only the patterns that are contained in the sequence and satisfy the given regular expression. Based on this automaton, we develop a family of algorithms. Our thorough experimentation on different datasets and application domains confirms that in many cases our methods outperform the current state of the art in frequent sequence mining algorithms using regular expressions (in some cases by orders of magnitude).
Source: Eighth IEEE International Conference on Data Mining, pp. 1061–1066, Pisa, Italy, 15-19 December 2008
DOI: 10.1109/icdm.2008.111
See at:
doi.org | ieeexplore.ieee.org | CNR ExploRA
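The problem statement in the abstract above (find the patterns contained in each input sequence that satisfy a regular expression, then keep the frequent ones) can be illustrated with a brute-force sketch. This is not the Sequence Mining Automata technique of the paper: it simply enumerates every subsequence, which is exponential in the sequence length. The function names, the toy database, and the assumption that each item is a single character are all illustrative.

```python
import re
from itertools import combinations


def matching_patterns(sequence, regex):
    """Return the distinct subsequences of `sequence` that fully match `regex`.

    Brute force: every order-preserving selection of positions is tried.
    Assumes each item of the sequence is a single character.
    """
    compiled = re.compile(regex)
    found = set()
    n = len(sequence)
    for size in range(1, n + 1):
        for positions in combinations(range(n), size):
            candidate = "".join(sequence[i] for i in positions)
            if compiled.fullmatch(candidate):
                found.add(candidate)
    return found


def frequent_patterns(database, regex, min_support):
    """Count, for each pattern, how many input sequences contain it."""
    counts = {}
    for sequence in database:
        # Each sequence contributes at most once to a pattern's support.
        for pattern in matching_patterns(sequence, regex):
            counts[pattern] = counts.get(pattern, 0) + 1
    return {p: c for p, c in counts.items() if c >= min_support}


# Hypothetical toy database of three sequences over the items a, b, c.
database = ["abcb", "acb", "bca"]
result = frequent_patterns(database, r"a[bc]*b", min_support=2)
print(result)
```

On this toy input only the patterns "ab" and "acb" reach a support of 2. The sketch works purely from the input data and the expression, as the abstract describes, but without the automaton that makes the approach efficient.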
2012
Conference article
Restricted
Mega-modeling for big data analytics
Ceri S., Della Valle E., Pedreschi D., Trasarti R.
The availability of huge amounts of data ("big data") is changing our attitude towards science, which is moving from specialized to massive experiments and from very focused to very broad research questions. Models of all kinds, from analytic to numeric, from exact to stochastic, from simulative to predictive, from behavioral to ontological, from patterns to laws, enable massive data analysis and mining, often in real time. Scientific discovery in most cases stems from complex pipelines of data analysis and data mining methods on top of "big" experimental data, confronted and contrasted with state-of-the-art knowledge. In this setting, we propose mega-modelling as a new holistic data and model management system for the acquisition, composition, integration, management, querying and mining of data and models, capable of mastering the co-evolution of data and models and of supporting the creation of what-if analyses, predictive analytics and scenario explorations.
Source: Conceptual Modeling. 31st International Conference on Conceptual Modeling, pp. 1–15, Florence, Italy, 15-18 October 2012
DOI: 10.1007/978-3-642-34002-4_1
See at:
doi.org | link.springer.com | CNR ExploRA
2013
Conference article
Restricted
A gravity model for speed estimation over road network
Cintia P., Trasarti R., Almada Cruz L., Ferreira Costa C., De Macedo J. A. F.
The availability of inexpensive tracking devices, such as GPS-enabled devices, gives the opportunity to collect large amounts of trajectory data from vehicles. In this context, we are interested in the problem of generating traffic information in time-dependent networks using this kind of data. This problem is not trivial, since several works in the literature rely on strong assumptions about the error distribution. We want to drop these assumptions, and propose a gravitational model method to compute the average speed of road segments from trajectory data. Furthermore, we show how to generate travel-time functions from the computed average speeds, which are useful for routing systems on time-dependent networks. Our approach allows creating an accurate picture of the traffic conditions in time and space. The method we present in this paper tackles all these aspects, and we show its performance on a synthetic dataset and a real case.
Source: Mobile Data Management. IEEE 14th International Conference, pp. 136–141, Milano, Italy, 2 June 2013
DOI: 10.1109/mdm.2013.83
Project(s): SEEK
See at:
doi.org | ieeexplore.ieee.org | CNR ExploRA
2019
Journal article
Open Access
Computational modelling and data-driven techniques for systems analysis
Matwin S., Tesei L., Trasarti R.
This JIIS Special Issue aimed at bringing together contributions from academia, industry and research institutions interested in the combined application of computational modelling methods with data-driven techniques from the areas of knowledge management, data mining and machine learning. Modelling methodologies of interest included automata, agents, Petri nets, process algebras and rewriting systems. Application domains included social systems, ecology, biology, medicine, smart cities, governance, education, software engineering, and any other field that deals with complex systems and large amounts of data.
Source: Journal of intelligent information systems 52 (2019): 473–475. doi:10.1007/s10844-019-00554-z
DOI: 10.1007/s10844-019-00554-z
Project(s): SoBigData
See at:
Journal of Intelligent Information Systems | Journal of Intelligent Information Systems | ISTI Repository | CNR ExploRA
2019
Journal article
Open Access
Finding roles of players in football using automatic particle swarm optimization-clustering algorithm
Behravan I., Zahiri S. H., Razavi S. M., Trasarti R.
Recently, professional team sport organizations have invested their resources to analyze their own and opponents' performance. As a result, developing methods and algorithms for analyzing team sports has become one of the most popular topics among data scientists. Analyzing football is hard because of its complexity, the number of events in each match, and the constant circulation of the ball. Finding the roles of players, with the purpose of analyzing the performance of a team or making a meaningful comparison between players, is crucial. In this article, an automatic big data clustering method, based on a swarm intelligence algorithm, is proposed to automatically cluster the data set of players' performance centers in different matches and extract different kinds of roles in football. The proposed method, built on the particle swarm optimization algorithm, has two phases. In the first phase, the algorithm searches the solution space to find the number of clusters and, in the second phase, it finds the positions of the centroids. To show the effectiveness of the algorithm, it is tested on six synthetic data sets and its performance is compared with two other conventional clustering methods. After that, the algorithm is used to find clusters of a data set containing 93,000 objects, which are the centers of players' performance in about 4900 matches in different European leagues.
Source: Big data (Online) 7 (2019): 35–56. doi:10.1089/big.2018.0069
DOI: 10.1089/big.2018.0069
See at:
ISTI Repository | Big Data | www.liebertpub.com | CNR ExploRA
2023
Conference article
Open Access
Dataspaces: concepts, architectures and initiatives
Atzori M., Ciaramella A., Diamantini C., Di Martino B., Distefano S., Facchinetti T., Montecchiani F., Nocera A., Ruffo G., Trasarti R.
Despite not being a new concept, dataspaces have become a prominent topic due to
the increasing availability of data and the need for efficient management and utilization
of diverse data sources. In simple terms, a dataspace refers to an environment where
data from various sources, formats, and domains can be integrated, shared, and
analyzed. It aims to provide a unified view of heterogeneous data by bridging the gap
between different data silos, enabling interoperability. The concept of dataspaces
promotes the idea that data should be treated as a cohesive entity, rather than being
fragmented across different systems and applications.
Dataspaces often involve the integration of structured and unstructured data, including
databases, documents, sensor data, social media feeds, and more. The goal is to
enable organizations to harness the full potential of their data assets by facilitating
data discovery, access, and analysis. By bringing together diverse data sources,
dataspaces can offer new insights, support decision-making processes, and drive
innovation.
In the context of European Commission-funded research projects, dataspaces are
often explored as part of initiatives focused on data management, data sharing, and
the development of data-driven technologies. These projects aim to address
challenges related to data integration, data privacy, data governance, and scalability.
The goal is to advance the state of the art in data management and enable
organizations to leverage data more effectively for societal, economic, and scientific
advancements.
It is important to note that while dataspaces offer potential benefits, they also come
with challenges. These challenges include data quality assurance, data privacy and
security, semantic interoperability, scalability, and the need for appropriate data
governance frameworks.
Overall, dataspaces represent an approach to managing and utilizing data that
emphasizes integration, interoperability, and accessibility. The concept is being
explored and researched to develop innovative solutions that can unlock the value of
data in various domains and sectors.
Source: ITADATA 2023 - 2nd Italian Conference on Big Data and Data Science, Naples, Italy, 11-13 September 2023
Project(s): SoBigData
See at:
ceur-ws.org | ISTI Repository | CNR ExploRA
2009
Conference article
Restricted
K-BestMatch reconstruction and comparison of trajectory data
Nanni M., Trasarti R.
In this paper we propose a map matching method to overcome the limitations of standard best-match reconstruction strategies. We use a more flexible approach which considers the k-optimal alternative paths to reconstruct the trajectories from the raw GPS data. The preliminary results, obtained on a real dataset of car users in the Milan area, suggest that our method has beneficial effects on subsequent analyses such as KNN and clustering.
Source: International Workshop on Spatial and Spatiotemporal Data Mining (SSTDM-09) In Cooperation with IEEE ICDM 2009, pp. 610, Miami, Florida, USA, 6 December 2009
DOI: 10.1109/icdmw.2009.62
See at:
doi.org | ieeexplore.ieee.org | CNR ExploRA
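The idea of keeping the k best alternative paths between two observed positions, instead of a single best match, can be sketched as follows. This is a minimal illustration and not the authors' K-BestMatch method: it assumes the two GPS points have already been matched to road nodes, uses a hypothetical toy road graph with made-up edge costs, and enumerates all simple paths by depth-first search, which is only viable on very small graphs.

```python
def k_shortest_simple_paths(graph, source, target, k):
    """Return the k cheapest loop-free paths from source to target.

    `graph` maps each node to a dict {neighbour: edge_cost}.
    Exhaustive enumeration; intended for illustration only.
    """
    paths = []

    def extend(node, path, length):
        if node == target:
            paths.append((length, path))
            return
        for neighbour, cost in graph.get(node, {}).items():
            if neighbour not in path:  # keep paths simple (no repeated nodes)
                extend(neighbour, path + [neighbour], length + cost)

    extend(source, [source], 0.0)
    return sorted(paths)[:k]


# Hypothetical directed road graph; costs could stand for travel lengths.
roads = {
    "A": {"B": 1.0, "C": 2.5},
    "B": {"C": 1.0, "D": 3.0},
    "C": {"D": 1.0},
    "D": {},
}

# Two consecutive GPS fixes assumed to be matched to nodes A and D.
alternatives = k_shortest_simple_paths(roads, "A", "D", k=2)
for length, path in alternatives:
    print(length, " -> ".join(path))
```

Keeping several candidate reconstructions for each gap lets later analysis steps compare trajectories over a set of plausible paths rather than committing to one.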
2009
Conference article
Open Access
Mobility, data mining and privacy: the GeoPKDD paradigm
Trasarti R., Giannotti F.
The technologies of mobile communications and ubiquitous computing pervade our society, and wireless networks sense the movement of people and vehicles, generating large volumes of mobility data. Miniaturization, wearability and pervasiveness are producing traces of our mobile activity, with increasing positioning accuracy and semantic richness: location data from mobile phones (GSM cell positions), GPS tracks from mobile devices receiving geo-positions from satellites, etc. The objective of the GeoPKDD (Geographic Privacy-aware Knowledge Discovery and Delivery) project is to discover useful knowledge about human movement behaviour from mobility data, while preserving the privacy of the people under observation. In pursuing this ambitious objective, the GeoPKDD project has started a new, exciting multidisciplinary research area at the crossroads of mobility, data mining, and privacy. This paper gives a short overview of the envisaged research challenges and the project achievements.
Source: SIAM Conference Mathematics for Industry: Challenges and Frontiers (MI09), pp. 10–18, San Francisco, California, 9-10 October 2009
See at:
www.siam.org | CNR ExploRA
2009
Conference article
Unknown
A new technique for sequential pattern mining under regular expressions
Trasarti R., Bonchi F., Goethals B.
In this paper we study the problem of mining frequent sequences satisfying a given regular expression. Previous approaches to this problem focused on the search space, pushing (in some way) the given regular expression to prune unpromising candidate patterns. In contrast, we focus entirely on the given input data and regular expression. We introduce Sequence Mining Automata (SMA), a specialized kind of Petri Net that, while reading input sequences, produces for each sequence all and only the patterns that are contained in the sequence and satisfy the given regular expression. Based on this automaton, we develop a family of algorithms. Our thorough experimentation on different datasets and application domains confirms that in many cases our methods outperform the current state of the art in frequent sequence mining algorithms using regular expressions (in some cases by orders of magnitude).
Source: 17th Italian Symposium on Advanced Database Systems, Camogli, Genova, Italy, 21-24 June 2009
See at:
CNR ExploRA
2009
Conference article
Restricted
DAMSEL: a system for progressive querying and reasoning on movement data
Trasarti R., Baglioni M., Renso C.
In this paper we present DAMSEL, a querying and reasoning system to support the knowledge discovery process over movement data. DAMSEL is the integration of a data mining query language system, called Daedalus, with a reasoning system, called Athena. The synergic integration of the two produces an advanced, flexible querying system specialized on movement data. The query language of the integrated system allows for mixing data mining and semantically enhanced queries. Furthermore, the support for the iterative knowledge discovery process provides the user with a powerful reasoning tool. The architecture of DAMSEL and its query language are outlined in the paper, along with an application scenario that illustrates the advanced querying capability of the system.
Source: 20th International Workshop on Database and Expert Systems Application, pp. 452–456, Linz, Austria, 31 August - 4 September 2009
DOI: 10.1109/dexa.2009.27
See at:
doi.org | ieeexplore.ieee.org | CNR ExploRA
2009
Conference article
Restricted
Towards semantic interpretation of movement behavior
Baglioni M., Macedo J., Renso C., Trasarti R., Wachowicz M.
In this paper we aim at providing a model for the conceptual representation and deductive reasoning of trajectory patterns obtained from mining raw trajectories. This has been achieved by means of a semantic enrichment process, where raw trajectories are enhanced with semantic information and integrated with geographical knowledge encoded in an ontology. The reasoning mechanisms provided by the chosen ontology formalism are exploited to accomplish a further semantic enrichment step that gives a possible interpretation of discovered patterns in terms of movement behaviour. A sketch of the realised system, called Athena, is given, along with some examples to demonstrate the feasibility of the approach.
Source: Advances in GIScience. 12th AGILE Conference, pp. 271–288, Hannover, Germany, 3 June 2009
DOI: 10.1007/978-3-642-00318-9_14
See at:
NARCIS | link.springer.com | NARCIS | CNR ExploRA
2010
Conference article
Unknown
Querying and mining trajectories with gaps: a multi-path reconstruction approach (Extended Abstract)
Nanni M., Trasarti R.
In this paper we propose a map matching method to overcome the limitations of standard best-match reconstruction strategies. We use a more flexible approach which considers the k-optimal alternative paths to reconstruct the trajectories from the raw GPS data. The preliminary results, obtained on a real dataset of car users in the Milan area, suggest that our method has beneficial effects on subsequent analyses such as KNN and clustering.
Source: 18th Italian Symposium on Advanced Database Systems, pp. 126–133, Rimini, Italy, 20-23 June 2010
See at:
CNR ExploRA
2011
Journal article
Restricted
C-safety: a framework for the anonymization of semantic trajectories
Monreale Anna, Trasarti Roberto, Pedreschi Dino, Renso Chiara, Bogorny Vania
The increasing abundance of data about the trajectories of personal movement is opening new opportunities for analyzing and mining human mobility. However, new risks emerge, since it also opens new ways of intruding into personal privacy. Representing personal movements as sequences of places visited by a person (a semantic trajectory) poses great privacy threats. In this paper we propose a privacy model defining the attack model of semantic trajectory linking and a privacy notion, called c-safety, based on a generalization of visited places according to a taxonomy. This method provides an upper bound to the probability of inferring that a given person, observed in a sequence of non-sensitive places, has also visited any sensitive location. Consistent with the privacy model, we propose an algorithm for transforming any dataset of semantic trajectories into a c-safe one. We report a study on two real-life GPS trajectory datasets to show how our algorithm preserves interesting quality/utility measures of the original trajectories when sequential pattern mining is applied to the semantic trajectories. We also empirically measure how the probability that the attacker's inference succeeds is much lower than the theoretical upper bound established.
Source: Transactions on data privacy (Internet) 4 (2011): 73–101.
See at:
www.tdp.cat | CNR ExploRA
2012
Conference article
Open Access
ComeTogether: discovering communities of places in mobility data
Ramalho Brilhante I., Berlingerio M., Trasarti R., Renso C., Fernandes De Macedo J. A., Casanova M. A.
We analyze urban mobility and public places from a new perspective: how can we characterize the places in a city based on how people move among them? To answer this question we need to combine places, such as points of interest, with mobility information, such as the trajectories of individuals moving within a city. To accomplish this, we propose a methodology based on complex network analysis: we build a network of points of interest by connecting places through the individual trajectories passing through them. From this network we compute communities, finding groups of places highly connected by the mobility of the individuals. We present a case study on a real trajectory dataset from the city of Milan, showing a complementary view of urban mobility that is not covered by the state-of-the-art techniques for mobility analysis. © 2012 IEEE.
Source: MDM 2012 - 2012 IEEE 13th International Conference on Mobile Data Management, pp. 268–273, Bengaluru, India, 23-26 July 2012
DOI: 10.1109/mdm.2012.17
See at:
www.inf.puc-rio.br | doi.org | ieeexplore.ieee.org | CNR ExploRA
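The network construction described in the abstract above (places connected by the trajectories passing through them, then grouped) can be sketched in a few lines. This is an illustrative stand-in, not the ComeTogether methodology: the edge weighting (number of trajectories visiting both places), the threshold `min_weight`, and the use of connected components in place of a proper community detection algorithm are all assumptions, as are the toy trajectories.

```python
from collections import defaultdict
from itertools import combinations


def build_poi_network(trajectories):
    """Weight each pair of places by the number of trajectories visiting both."""
    weights = defaultdict(int)
    for visited in trajectories:
        for a, b in combinations(sorted(set(visited)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def groups_of_places(weights, min_weight):
    """Group places linked by edges of weight >= min_weight.

    Connected components are used here as a simple stand-in for
    community detection.
    """
    adjacency = defaultdict(set)
    for (a, b), weight in weights.items():
        if weight >= min_weight:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


# Hypothetical trajectories, each given as the sequence of places it passes through.
trajectories = [
    ["station", "cafe", "office"],
    ["station", "office"],
    ["cafe", "office", "station"],
    ["park", "museum"],
    ["museum", "park", "cafe"],
]

network = build_poi_network(trajectories)
groups = groups_of_places(network, min_weight=2)
print(groups)
```

With a threshold of 2, the weak links created by the single trajectory joining "cafe" to "park" and "museum" are dropped, so the toy data splits into a commuting group and a leisure group.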
2013
Report
Unknown
Mob-Warehouse: a semantic approach for mobility analysis with a trajectory data warehouse
Wagner R., De Macedo J. A. F., Raffaetà A., Renso C., Roncato A., Trasarti R.
The effective analysis and understanding of huge amounts of mobility data have been a hot research topic in the last few years. Some proposals addressed the definition of Trajectory Data Warehouses (TDW) as a way to represent and aggregate mobility data, where the basic object is the trajectory. In this paper, we introduce Mob-Warehouse, a TDW which goes a step further since it models trajectories enriched with semantics. In Mob-Warehouse, the unit of movement is the (spatio-temporal) point enriched with several non spatio-temporal dimensions including the activity, the transportation means and the mobility pattern. This model allows us to answer the classical Why, Who, When, Where, What, How questions, providing an aggregated view of different aspects of the user movements, no longer limited to space and time. We briefly present an experiment of Mob-Warehouse on a real dataset.
Source: ISTI Technical reports, 2013
Project(s): SEEK
See at:
CNR ExploRA
2014
Contribution to book
Restricted
On predicting future location of moving objects: a state of art
Corona N., Giannotti F., Monreale A., Trasarti R.
The pervasiveness of mobile devices and location-based services produces, as a side effect, an increasing volume of mobility data, which in turn creates the opportunity for a novel generation of analysis methods of movement behaviors. In this chapter, the authors focus on the problem of predicting future locations, that is, predicting with a certain accuracy the next location of a moving object. In particular, they provide a classification of the proposals in the literature addressing that problem. Then the authors present the data mining method WhereNext and finally discuss possible improvements of that method.
Source: Data Science and Simulation in Transportation Research, edited by Davy Janssens, Ansar-Ul-Haque Yasar, Luk Knapen. Hershey: IGI Global, 2014
DOI: 10.4018/978-1-4666-4920-0.ch002
DOI: 10.4018/978-1-4666-9845-1.ch091
Project(s): DATA SIM
See at:
doi.org | doi.org | CNR ExploRA
2013
Conference article
Unknown
Estimating time-dependent speed functions using a gravity model over road network
Cintia P., Trasarti R., Macedo J. A., Almada L., Ferreira C.
The availability of inexpensive tracking devices, such as GPS-enabled devices, gives the opportunity to collect large amounts of trajectory data from vehicles. In this context, we are interested in the problem of generating traffic information in time-dependent networks using this kind of data. This problem is not trivial, since several works in the literature rely on strong assumptions about the error distribution. We want to drop these assumptions, and propose a gravitational model method to compute the average speed of road segments from trajectory data. Furthermore, we show how to generate travel-time functions from the computed average speeds, which are useful for routing systems on time-dependent networks. Our approach allows creating an accurate picture of the traffic conditions in time and space. The method we present in this paper tackles all these aspects, and we show its performance on a synthetic dataset and a real case.
Source: SEBD 2013 - 21st Italian Symposium on Advanced Database Systems, pp. 321–328, Roccella Jonica, Reggio Calabria, Italy, 30 June - 3 July 2013
Project(s): SEEK
See at:
CNR ExploRA
2014
Contribution to book
Restricted
Mob-warehouse: a semantic approach for mobility analysis with a trajectory data warehouse
Wagner R., De Macedo José A. F., Raffaetà A., Roncato A., Renso C., Trasarti R.
The effective analysis and understanding of huge amounts of mobility data have been a hot research topic in the last few years. In this paper, we introduce Mob-Warehouse, a Trajectory Data Warehouse which goes a step beyond the state of the art in mobility analysis, since it models trajectories enriched with semantics. The unit of movement is the (spatio-temporal) point endowed with several semantic dimensions including the activity, the transportation means and the mobility patterns. This model allows us to answer the classical Why, Who, When, Where, What, How questions, providing an aggregated view of different aspects of the user movements, no longer limited to space and time. We briefly present an experiment of Mob-Warehouse on a real dataset.
Source: Advances in Conceptual Modeling, edited by Jeffrey Parsons, Dickson Chiu, pp. 127–136, 2014
DOI: 10.1007/978-3-319-14139-8_15
Project(s): SEEK
See at:
doi.org | link.springer.com | CNR ExploRA