2008
Conference article
Restricted
Clustering of German municipalities based on mobility characteristics
Zanda A, Koerner C, Giannotti F, Schulz D, May M
This paper presents a clustering approach which groups German municipalities according to mobility characteristics. As the number of measurements for nationwide mobility studies is usually restricted, this clustering provides a means to infer mobility information for locations without measurements based on values of their respective cluster representatives. Our approach considers local and global information, i.e. characteristics of municipalities as well as relationships between municipalities. We realize previous findings in urban geography by using techniques from graph theory and computer vision. Our clustering consists of a two-step model, which first extracts and condenses single mobility characteristics and subsequently combines the various features. We apply our model to all German municipalities between 10,000 and 50,000 inhabitants. The clustering has been successfully applied in practice for the inference of traffic frequencies.
DOI: 10.1145/1463434.1463514
See at:
doi.org
| CNR IRIS
2003
Other
Metadata Only Access
REVIGIS
Giannotti F
Not available.
See at:
CNR IRIS
2006
Software
Metadata Only Access
k-Privacy
Carfì D, Atzori M, Giannotti F
k-PRIVACY is a Java implementation of the well-known Datafly anonymization algorithm. It enforces anonymity of tuples in a given (private) table by generalizing and suppressing some tuples. k-PRIVACY protects against linking attacks by providing a k-anonymous (public) table that can be safely shared. It comes with both a graphical user interface and a command-line console. Please refer to work on k-anonymity for details on this privacy-preserving technology.
See at:
CNR IRIS
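As a rough illustration of the k-anonymity property this entry refers to, the following Python sketch generalizes one attribute step by step and suppresses the records that still fall into groups smaller than k. It is not the k-PRIVACY code (which is written in Java) and only loosely follows the spirit of Datafly; the column names, the `generalize_zip` helper, the stopping rule, and the sample table are all assumptions made for illustration.

```python
from collections import Counter

QUASI_IDENTIFIERS = ["zip", "age_band"]  # hypothetical quasi-identifier columns


def generalize_zip(zip_code, level):
    """Replace the last `level` characters of a ZIP code with '*'."""
    if level <= 0:
        return zip_code
    keep = max(len(zip_code) - level, 0)
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(c >= k for c in counts.values())


def anonymize(rows, k, max_level=5):
    """Generalize ZIP codes level by level, then suppress rows in small groups."""
    for level in range(max_level + 1):
        candidate = [dict(r, zip=generalize_zip(r["zip"], level)) for r in rows]
        counts = Counter(tuple(r[q] for q in QUASI_IDENTIFIERS) for r in candidate)
        kept = [r for r in candidate
                if counts[tuple(r[q] for q in QUASI_IDENTIFIERS)] >= k]
        # Assumed stopping rule: accept once at most k rows must be suppressed.
        if len(candidate) - len(kept) <= k:
            return kept
    return []


# Hypothetical private table.
rows = [
    {"zip": "56121", "age_band": "30-39", "diagnosis": "flu"},
    {"zip": "56124", "age_band": "30-39", "diagnosis": "asthma"},
    {"zip": "56127", "age_band": "30-39", "diagnosis": "flu"},
    {"zip": "56100", "age_band": "40-49", "diagnosis": "diabetes"},
    {"zip": "56103", "age_band": "40-49", "diagnosis": "flu"},
]
public = anonymize(rows, k=2)
```

With this sample, one level of ZIP generalization is enough: every record is kept and each quasi-identifier combination in `public` occurs at least twice.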
2007
Other
Metadata Only Access
TOCAI
Giannotti F
Tecnologie Orientate alla Conoscenza per Aggregazioni di Imprese in Internet (Knowledge-Oriented Technologies for Enterprise Aggregations on the Internet)
See at:
CNR IRIS
2007
Software
Metadata Only Access
K-Privacy
Carfì D, Atzori M, Giannotti F
No abstract available.
See at:
CNR IRIS
2009
Conference article
Restricted
Movement data anonymity through generalization.
Andrienko G, Andrienko N, Giannotti F, Monreale A, Pedreschi D
In recent years, spatio-temporal and moving objects databases have gained considerable interest, due to the diffusion of mobile devices (e.g., mobile phones, RFID devices and GPS devices) and of new applications, where the discovery of consumable, concise, and applicable knowledge is the key step. Clearly, in these applications privacy is a concern, since models extracted from this kind of data can reveal the behavior of groups of individuals, thus compromising their privacy. Movement data present a new challenge for the privacy-preserving data mining community because of their spatial and temporal characteristics. In this position paper we briefly present an approach for the generalization of movement data that can be adopted for obtaining k-anonymity in spatio-temporal datasets; specifically, it can be used to realize a framework for publishing spatio-temporal data while preserving privacy. We ran a preliminary set of experiments on a real-world trajectory dataset, demonstrating that this method of generalization of trajectories preserves the clustering analysis results.
DOI: 10.1145/1667502.1667510
See at:
dl.acm.org
| doi.org
| CNR IRIS
2002
Other
Metadata Only Access
See at:
CNR IRIS
2004
Journal article
Open Access
Specifying Mining Algorithms with Iterative User-Defined Aggregates
Giannotti F, Manco G, Turini F
We present a way of exploiting domain knowledge in the design and implementation of data mining algorithms, with special attention to frequent pattern discovery, within a deductive framework. In our framework, domain knowledge is represented by way of deductive rules, and data mining algorithms are specified by means of iterative user-defined aggregates and implemented by means of user-defined predicates. This choice allows us to exploit the full expressive power of deductive rules without losing performance. Iterative user-defined aggregates have a fixed scheme, in which user-defined predicates are to be added. This feature allows the modularization of data mining algorithms, thus providing a way to integrate the proper exploitation of domain knowledge at the right point. As a case study, the paper presents how user-defined aggregates can be exploited to specify and implement a version of the Apriori algorithm. Some performance analyses and comparisons are discussed in order to show the effectiveness of the approach.
Source: IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (PRINT), vol. 16 (issue 10), pp. 1232-1246
DOI: 10.1109/tkde.2004.64
See at:
IEEE Transactions on Knowledge and Data Engineering
| CNR IRIS
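For readers unfamiliar with the Apriori algorithm named in this abstract, the following is a minimal level-wise sketch in plain Python. It is not the deductive formulation with iterative user-defined aggregates that the paper describes; the function name, the absolute `min_support` threshold, and the sample baskets are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Candidate generation: join frequent (k-1)-itemsets and prune any
        # candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Support counting: one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "beer", "eggs"},
    {"milk", "beer", "cola"},
    {"bread", "milk", "beer"},
    {"bread", "milk", "cola"},
]
frequent_itemsets = apriori(baskets, min_support=3)
```

The count-then-filter step repeated at each level is the kind of fixed iterative scheme that, according to the abstract, the authors capture with iterative user-defined aggregates.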
2004
Journal article
Restricted
Towards a logic query language for data mining
Giannotti F, Manco G, Turini F
We present a logic database language with elementary data mining mechanisms to model the relevant aspects of knowledge discovery, and to provide support for both the iterative and interactive features of the knowledge discovery process. We adopt the notion of user-defined aggregate to model typical data mining tasks as operations unveiling unseen knowledge. We illustrate the use of aggregates to model specific data mining tasks, such as frequent pattern discovery, classification, data discretization and clustering, and show how the resulting data mining query language allows the modeling of typical steps of the knowledge discovery process, which range from data preparation to knowledge extraction and evaluation.
See at:
CNR IRIS
2003
Contribution to book
Restricted
Logical Languages for Data Mining
Giannotti F, Manco G, Wijsen J
Data mining focuses on the development of methods and algorithms for such various tasks as classification, clustering, rule induction, and discovery of associations. In the database field, the view of data mining as advanced querying has recently stimulated much research into the development of data mining query languages. In the field of machine learning, inductive logic programming has broadened its scope towards extending standard data mining tasks from the usual attribute-value setting to a multi-relational setting. After a concise description of data mining, the contribution of logic to both fields is discussed. At the end, we indicate the potential use of logic for unifying different existing data mining formalisms.
See at:
CNR IRIS
2010
Journal article
Open Access
Hiding Sequential and Spatiotemporal Patterns
Giannotti F, Bonchi F, Abul O
The process of discovering relevant patterns holding in a database was first indicated as a threat to database security by O'Leary in [1]. Since then, many different approaches for knowledge hiding have emerged over the years, mainly in the context of association rules and frequent itemset mining. Following many real-world data and application demands, in this paper, we shift the problem of knowledge hiding to contexts where both the data and the extracted knowledge have a sequential structure. We define the problem of hiding sequential patterns and show its NP-hardness. Thus, we devise heuristics and a polynomial sanitization algorithm. Starting from this framework, we specialize it to the more complex case of spatiotemporal patterns extracted from moving objects databases. Finally, we discuss a possible kind of attack on our model, which exploits the knowledge of the underlying road network, and enhance our model to protect against this kind of attack. An exhaustive experimental analysis on real-world data sets shows the effectiveness of our proposal.
Source: IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (PRINT), vol. 22 (issue 12), pp. 1709-1723
DOI: 10.1109/tkde.2009.213
See at:
Aperta - TÜBİTAK Açık Arşivi
| IEEE Transactions on Knowledge and Data Engineering
| CNR IRIS
| ieeexplore.ieee.org
2011
Conference article
Open Access
Foundations of multidimensional network analysis
Berlingerio Michele, Coscia Michele, Giannotti Fosca, Monreale Anna, Pedreschi Dino
Complex networks have been receiving increasing attention from the scientific community, thanks also to the increasing availability of real-world network data. In recent years, the multidimensional nature of many real-world networks has been pointed out, i.e. many networks containing multiple connections between any pair of nodes have been analyzed. Although the importance of analyzing this kind of network was recognized by previous works, a complete framework for multidimensional network analysis is still missing. Such a framework would enable analysts to study different phenomena, which can be either the generalization to the multidimensional setting of what happens in monodimensional networks, or a new class of phenomena induced by the additional degree of complexity that multidimensionality provides in real networks. The aim of this paper is then to give the basis for multidimensional network analysis: we develop a solid repertoire of basic concepts and analytical measures, which takes into account the general structure of multidimensional networks. We tested our framework on a real-world multidimensional network, showing the validity and the meaningfulness of the measures introduced, which are able to extract important, nonrandom information about complex phenomena.
DOI: 10.1109/asonam.2011.103
See at:
www.michelecoscia.com
| doi.org
| CNR IRIS
| ieeexplore.ieee.org
2011
Conference article
Unknown
Privacy-Preserving Data Mining from Outsourced Databases
Giannotti Fosca, Lakshmanan Laks V S, Monreale Anna, Pedreschi Dino, Wang Hui
Spurred by developments such as cloud computing, there has been considerable recent interest in the paradigm of data mining-as-service: a company (data owner) lacking in expertise or computational resources can outsource its mining needs to a third-party service provider (server). However, both the outsourced database and the knowledge extracted from it by data mining are considered private property of the data owner. To protect corporate privacy, the data owner transforms its data and ships it to the server, sends mining queries to the server, and recovers the true patterns from the extracted patterns received from the server. In this paper, we study the problem of outsourcing a data mining task within a corporate privacy-preserving framework. We propose a scheme for privacy-preserving outsourced mining which offers formal protection against information disclosure, and show that the data owner can recover the correct data mining results efficiently.
DOI: 10.1007/978-94-007-0641-5_19
See at:
doi.org
| CNR IRIS
2011
Conference article
Restricted
Mobility, data mining and privacy: understanding human movement patterns from trajectory data
Giannotti Fosca
The technologies of mobile communications and ubiquitous computing pervade our society, and wireless networks sense the movement of people and vehicles, generating large volumes of mobility data, such as mobile phone call records and GPS tracks. This is a scenario of great opportunities and risks: on one side, mining this data can produce useful knowledge, supporting sustainable mobility and intelligent transportation systems; on the other side, individual privacy is at risk, as the mobility data contain sensitive personal information. A new multidisciplinary research area is emerging at this crossroads of mobility, data mining, and privacy. The talk assesses this research frontier from a data mining perspective, and illustrates the results of a European-wide research project called GeoPKDD, Geographic Privacy-Aware Knowledge Discovery and Delivery. GeoPKDD has created an integrated platform named M-ATLAS for complex analysis of mobility data, which combines spatio-temporal querying capabilities with data mining, visual analytics and semantic technologies, thus providing full support for the Mobility Knowledge Discovery process. In this talk, we focus on the key data mining models, trajectory patterns and trajectory clustering, and illustrate the analytical power of our system in unveiling the complexity of urban mobility in a large metropolitan area by means of a large-scale experiment, based on a massive real-life GPS dataset, obtained from 17,000 vehicles with on-board GPS receivers, tracked during one week of ordinary mobile activity in the urban area of the city of Milan, Italy.
DOI: 10.1109/mdm.2011.103
See at:
doi.org
| CNR IRIS
| ieeexplore.ieee.org
2011
Other
Open Access
Quality assurance of outsourced outlier mining
Zhang Yongjin, Liu Ruilin, Wang Hui Wendy, Monreale Anna, Pedreschi Dino, Giannotti Fosca, Guo Wenge
Spurred by developments such as cloud computing, there has been considerable recent interest in the paradigm of data mining-as-service. A company (data owner) lacking in expertise or computational resources can outsource its mining needs to a third-party service provider. However, as service providers may not be fully trusted, a dishonest service provider may return inaccurate mining results to the database owner. In this paper, we study the problem of providing quality assurance for outsourced outlier mining. We propose an efficient and practical auditing approach that can verify (1) whether the service provider returns outliers that originate from the hosted database, and (2) whether the service provider returns correct and complete outlier mining results. The key to our approach is to insert a small number of artificial tuples into the outsourced database; the mining results of the service provider are then audited by analyzing the inserted tuples in the returned results, with a probabilistic guarantee. Our empirical results demonstrate the effectiveness and efficiency of our method.
See at:
ISTI Repository
| CNR IRIS
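The auditing idea in this abstract (plant artificial tuples, then inspect how they show up in the returned result) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' method: the one-dimensional data, the distance-based outlier definition, the use of both planted outliers and planted non-outliers, and the names `find_outliers` and `audit` are all hypothetical, and the sketch ignores the effect planted tuples may have on the outlier status of real tuples.

```python
def find_outliers(points, radius, min_neighbors):
    """Distance-based outliers: points with fewer than `min_neighbors`
    other points within `radius` (assumed definition for this sketch)."""
    outliers = set()
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points) if i != j and abs(p - q) <= radius
        )
        if neighbors < min_neighbors:
            outliers.add(p)
    return outliers


def audit(returned, planted_outliers, planted_inliers):
    """Accept a result only if it contains every planted outlier
    and none of the planted non-outliers."""
    return planted_outliers <= returned and not (planted_inliers & returned)


# Hypothetical owner-side data and planted tuples.
real_data = [10.0, 10.5, 11.0, 11.2, 9.8, 10.1, 55.0]
planted_outliers = {-40.0, 120.0}   # artificial tuples far from all real data
planted_inliers = {10.3, 10.7}      # artificial tuples inside the dense region
outsourced = real_data + sorted(planted_outliers) + sorted(planted_inliers)

# An honest server runs the mining task on the full outsourced database.
honest_result = find_outliers(outsourced, radius=1.0, min_neighbors=2)
# A lazy server (hypothetical) returns an incomplete answer.
lazy_result = {55.0}

honest_ok = audit(honest_result, planted_outliers, planted_inliers)
lazy_ok = audit(lazy_result, planted_outliers, planted_inliers)
true_outliers = honest_result - planted_outliers  # owner strips planted tuples
```

Because the server cannot tell planted tuples from real ones, a server that drops or fabricates part of the result risks mishandling a planted tuple; the more tuples are planted, the less likely such behavior goes unnoticed, which is the intuition behind a probabilistic guarantee.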