167 result(s)
2022 Journal article Open Access

MAT-Index: an index for fast multiple aspect trajectory similarity measuring
De Souza A. P. R., Renso C., Perego R., Bogorny V.
The semantic enrichment of mobility data with several information sources has led to a new type of movement data, the so-called multiple aspect trajectories. Comparing multiple aspect trajectories is crucial for several analysis tasks such as querying, clustering, similarity, and classification. Multiple aspect trajectory similarity measurement is more complex and computationally expensive, because of the large number and heterogeneous aspects of space, time, and semantics that require a different treatment. Only a few works in the literature focus on optimizing all these dimensions in a single solution, and, to the best of our knowledge, none of them proposes a fast point-to-point comparison. In this article we propose the Multiple Aspect Trajectory Index, an index data structure for optimizing the point-to-point comparison of multiple aspect trajectories, considering their three basic dimensions of space, time, and semantics. Quantitative and qualitative evaluations show a processing time reduction of up to 98.1%.
Source: Transactions in GIS (Print) (2022). doi:10.1111/tgis.12889
DOI: 10.1111/tgis.12889
Project(s): MASTER via OpenAIRE

See at: onlinelibrary.wiley.com Open Access | ISTI Repository Open Access | CNR ExploRA Open Access


2021 Contribution to conference Open Access

Cloud and Data Federation in MobiDataLab
Carlini E., Dazzi P., Lettich F., Perego R., Renso C.
Today's innovative digital services dealing with the mobility of persons and goods produce huge amounts of data. To propose advanced and efficient mobility services, the collection and aggregation of new sources of data from various producers are necessary. The overall objective of the MobiDataLab H2020 project is to propose to the mobility stakeholders (transport organising authorities, operators, industry, government and innovators) reproducible methodologies and sustainable tools that foster the development of a data-sharing culture in Europe and beyond. This short paper introduces the key concepts driving the design and definition of the Cloud and Data Federation that stands at the basis of MobiDataLab.
Source: FRAME'21 - 1st Workshop on Flexible Resource and Application Management on the Edge, Virtual Event, Sweden, 25/06/2021
DOI: 10.1145/3452369.3463819
Project(s): ACCORDION via OpenAIRE

See at: ISTI Repository Open Access | CNR ExploRA Open Access


2021 Conference article Open Access

A novel similarity measure for multiple aspect trajectory clustering
Varlamis I., Sardianos C., Bogorny V., Alvares L. O., Carvalho J. T., Renso C., Perego R., Violos J.
Multiple aspect trajectories (MATs) are an emerging concept in the domain of Geographical Information Systems, where the basic view of semantic trajectories is enhanced with the notion of multiple heterogeneous aspects, characterizing different semantic dimensions related to the pure movement data. Many applications benefit from the analysis of multiple aspect trajectories, ranging from the analysis of people's trajectories and the extraction of daily habits to the monitoring of vessel trajectories and the detection of outlying behaviors. This work proposes a novel MAT similarity measure as the core component in a hierarchical clustering algorithm. Despite the many clustering methods in the literature and the recent works on MAT similarity, there are still no works that dig deeper into the MAT clustering task. The current article copes with this issue by introducing TraFoS, a new similarity measure that defines a novel method for comparing MATs. TraFoS includes a multi-vector representation of MATs that improves their similarity comparison. TraFoS allows us to compare MATs across each aspect and then combine similarities in a single measure. We compared TraFoS with other state-of-the-art similarity metrics in Agglomerative clustering. The experimental results show that TraFoS outperforms other similarity metrics in terms of internal and external clustering metrics and training time.
Source: SAC 21 - 36th Annual ACM Symposium on Applied Computing, pp. 551–558, Online Conference, 22-26/03/2021
DOI: 10.1145/3412841.3441935
Project(s): MASTER via OpenAIRE

See at: ISTI Repository Open Access | CNR ExploRA Open Access | dl.acm.org Restricted


2021 Report Open Access

Predicting vehicles parking behaviour in shared premises for aggregated EV electricity demand response programs
Monteiro De Lira V., Pallonetto F., Gabrielli L., Renso C.
Global electric car sales in 2020 continued to exceed expectations, climbing to over 3 million and reaching a market share of over 4%. However, uncertainty of generation caused by higher penetration of renewable energies and the advent of Electric Vehicles (EVs) with their additional electricity demand could cause strains to the power system, both at distribution and transmission levels. Demand response aggregation and load control will enable greater grid stability and greater penetration of renewable energies into the grid. The present work fits this context in supporting charging optimization for EVs in parking premises, assuming an incumbent high penetration of EVs in the system. We propose a methodology to predict the parking duration in shared parking premises with the objective of estimating the energy requirement of a specific parking lot, evaluating the optimal EV charging schedule, and integrating the scheduling into a smart controller. We formalize the prediction problem as a supervised machine learning task to predict the duration of the parking event before the car leaves the slot. This predicted duration feeds the energy management system that will allocate the power over the duration, reducing the overall peak electricity demand. We structure our experiments around two research questions aiming to discover the accuracy of the proposed machine learning approach and the most relevant features for the prediction models. We experiment with different algorithms and feature combinations on 4 datasets from 2 different campus facilities in Italy and Brazil. Using both contextual and time-of-day features, the overall results of the models show a higher accuracy compared to a statistical analysis based on frequency, indicating a viable route for the development of accurate predictors for shared parking premises energy management systems.
Source: ISTI Research reports, 2021

See at: arxiv.org Open Access | ISTI Repository Open Access | CNR ExploRA Open Access


2021 Conference article Open Access

Dependency rule modeling for multiple aspects trajectories
Dos Santos Mello R., Schreiner G. A., Alchini C. A., Dos Santos G. G., Bogorny V., Renso C.
Trajectories of moving objects are usually modeled as sequences of space-time points or, in case of semantic trajectories, as labelled stops and moves. Data analytics methods on these kinds of trajectories tend to discover geometrical and temporal patterns, or simple semantic patterns based on the labels of stops and moves. A recent extension of semantic trajectories is called multiple aspects trajectory, i.e., a trajectory associated with different semantic dimensions called aspects. This kind of trajectory greatly increases the number of discovered patterns. This paper introduces the concept of dependency rule to represent patterns discovered from the analysis of trajectories with multiple aspects. They include patterns related to a trajectory, trajectory points, or the moving object. These rules are conceptually represented as an extension of a conceptual model for multiple aspects trajectories. A case study shows that our proposal is relevant as it represents the discovered rules with a concise but expressive conceptual model. Additionally, a performance evaluation shows the feasibility of our conceptual model designed over relational-based database management technologies.
Source: ER 2021 - 40th International Conference on Conceptual Modeling, pp. 123–132, Online Conference, 18-21/10/2021
DOI: 10.1007/978-3-030-89022-3_11
Project(s): MASTER via OpenAIRE

See at: ISTI Repository Open Access | ZENODO Open Access | CNR ExploRA Restricted


2021 Contribution to journal Open Access

Multiple-aspect analysis of semantic trajectories (MASTER)
Renso C., Bogorny V., Tserpes K., Matwin S., Fernandes De Macedo J. A.
Source: International journal of geographical information science (Print) 35 (2021): 763–766. doi:10.1080/13658816.2020.1870982
DOI: 10.1080/13658816.2020.1870982
Project(s): MASTER via OpenAIRE

See at: ISTI Repository Open Access | CNR ExploRA Open Access | International Journal of Geographical Information Science Open Access | International Journal of Geographical Information Science Restricted


2021 Conference article Open Access

RMkNN and KNORA-IU: combining imbalanced dynamic selection techniques for credit scoring
Melo Junior L., Macedo J. F., Nardini F. M., Renso C.
Credit scoring has become a critical tool for financial institutions to discriminate "bad" applicants from "good" ones. One common characteristic of the credit datasets is the imbalance between good and bad applicants, with few defaults (unpaid loans). Ensemble classification methodology is widely used in this field. However, dynamic ensemble selection approaches to imbalanced datasets have drawn little consideration. This study aims to measure the performance of the combination of two recent dynamic selection techniques for imbalanced credit scoring datasets, Reduced Minority k-NN (RMkNN) and KNORA-Imbalanced Union (KNORA-IU). We comprehensively evaluate the proposed combination against state-of-the-art competitors on six real-world public datasets and one private one. Experiments show that this combination improves the classification performance on the evaluated datasets in terms of AUC, balanced accuracy, H-measure, G-mean, F-measure, and Recall.
Source: ICTAI 2021 - IEEE 33rd International Conference on Tools with Artificial Intelligence, pp. 823–830, Washington, DC, USA, 1-3/11/2021
DOI: 10.1109/ictai52525.2021.00131

See at: ISTI Repository Open Access | doi.org Restricted | CNR ExploRA Restricted


2021 Journal article Open Access

Towards the semantic enrichment of trajectories using spatial data infrastructures
Vidal-Filho J. N., Cesário Times V., Lisboa-Filho J., Renso C.
The term Semantic Trajectories of Moving Objects (STMO) corresponds to a sequence of spatial-temporal points with associated semantic information (for example, annotations about locations visited by the user or types of transportation used). However, the growth of Big Data generated by users, such as data produced by social networks or collected by electronic equipment with embedded sensors, causes the STMO to require services and standards for enabling data documentation and ensuring the quality of STMOs. Spatial Data Infrastructures (SDI), on the other hand, provide a shared interoperable and integrated environment for data documentation. The main challenge is how to lead traditional SDIs to evolve to an STMO document due to the lack of specific metadata standards and services for semantic annotation. This paper presents a new concept of SDI for STMO, named SDI4Trajectory, which supports the documentation of different types of STMO (holistic trajectories, for example). The SDI4Trajectory allows us to propose semi-automatic and manual semantic enrichment processes, which are efficient in supporting semantic annotations and STMO documentation as well. These processes are hardly found in traditional SDIs and have been developed through Web and semantic micro-services. To validate the SDI4Trajectory, we used a dataset collected by voluntary users through the MyTracks application for the following purposes: (i) comparing the semi-automatic and manual semantic enrichment processes in the SDI4Trajectory; (ii) investigating the viability of the documentation processes carried out by the SDI4Trajectory, which was able to document all the collected trajectories.
Source: ISPRS international journal of geo-information 10 (2021). doi:10.3390/ijgi10120825
DOI: 10.3390/ijgi10120825

See at: ISPRS International Journal of Geo-Information Open Access | ISTI Repository Open Access | CNR ExploRA Open Access


2021 Conference article Open Access

A co-occurrence based approach for mining overlapped co-clusters in binary data
Santa Rosa Nassar Dos Santos Y., De Santiago R., Perego R., Schaly M. H., Alvares L. O., Renso C., Bogorny V.
Co-clustering is a specific type of clustering that addresses the problem of simultaneously clustering objects and attributes of a data matrix. Although general clustering techniques find non-overlapping co-clusters, finding possible overlaps between co-clusters can reveal embedded patterns in the data that the disjoint clusters cannot discover. The overlapping co-clustering approaches proposed in the literature focus on finding global overlapped co-clusters and they might overlook interesting local patterns that are not necessarily identified as global co-clusters. Discovering such local co-clusters increases the granularity of the analysis, and therefore more specific patterns can be captured. This is the objective of the present paper, which proposes the new Overlapped Co-Clustering (OCoClus) method for finding overlapped co-clusters on binary data, including both global and local patterns. This is a non-exhaustive method based on the co-occurrence of attributes and objects in the data. Another novelty of this method is that it is driven by an objective cost function that can automatically determine the number of co-clusters. We evaluate the proposed approach on publicly available datasets, both real and synthetic data, and compare the results with a number of baselines. Our approach shows better results than the baseline methods on synthetic data and demonstrates its efficacy in real data.
Source: BRACIS 2021 - 10th Brazilian Conference on Intelligent Systems, pp. 375–389, Online Conference, 29/11/2021 - 3/12/2021
DOI: 10.1007/978-3-030-91702-9_25
Project(s): MASTER via OpenAIRE

See at: ISTI Repository Open Access | ZENODO Open Access | doi.org Restricted | link.springer.com Restricted | CNR ExploRA Restricted


2021 Conference article Open Access

Towards a federated learning approach for privacy-aware analysis of semantically enriched mobility data
Dagdia Z. C., Renso C., Zeitouni K., Agoulmine N.
Today, Artificial Intelligence still faces a major challenge: handling and strengthening data privacy. This challenge arises from the collected data, which are associated with the fast development of mobile technologies, the huge capacities of high performance computing, and the large-scale storage in the cloud. In this paper, we focus on a possible solution to this challenge, namely the use and application of federated learning. Specifically, beyond the federated learning based approaches proposed in different application domains, we mainly focus on and discuss a federated learning approach for privacy-aware analysis of semantically enriched mobility data. We introduce the main motivation and opportunities of applying federated learning to mobility data, and highlight the main concepts and basics of our approach by describing our objectives and our approach's requirements. We also describe our workplan, which will permit achieving our predefined objectives via the setup of several research questions.
Source: FRAME '21: 1st Workshop on Flexible Resource and Application Management on the Edge, pp. 17–20, Online Conference, 25/06/2021
DOI: 10.1145/3452369.3463823
Project(s): MobiDataLab via OpenAIRE

See at: ISTI Repository Open Access | doi.org Restricted | HAL Evry Restricted | CNR ExploRA Restricted


2021 Report Open Access

PTRAIL - A python package for parallel trajectory data preprocessing
Haidri S., Haranwala Y. J., Bogorny V., Renso C., Prado Da Fonseca V., Soares A.
Trajectory data represent a trace of an object that changes its position in space over time. This kind of data is complex to handle and analyze, since it is generally produced in huge quantities and is often prone to errors generated by the geolocation device, human mishandling, or area coverage limitation. Therefore, there is a need for software specifically tailored to preprocess trajectory data. In this work we propose PTRAIL, a Python package offering several trajectory preprocessing steps, including filtering, feature extraction, and interpolation. PTRAIL uses parallel computation and vectorization, making it suitable for large datasets and fast compared to other Python libraries.
Source: ISTI Research reports, 2021

See at: arxiv.org Open Access | CNR ExploRA Open Access


2020 Journal article Restricted

Leveraging feature selection to detect potential tax fraudsters
Matos T., Macedo J. A., Lettich F., Monteiro J. M., Renso C., Perego R., Nardini F. M.
Tax evasion is any act that knowingly or unknowingly, legally or unlawfully, leads to non-payment or underpayment of tax due. Enforcing the correct payment of taxes by taxpayers is fundamental in maintaining investments that are necessary and benefit society as a whole. Indeed, without taxes it is not possible to guarantee basic services such as health-care, education, sanitation, transportation, infrastructure, among other services essential to the population. This issue is especially relevant in developing countries such as Brazil. In this work we consider a real-world case study involving the Treasury Office of the State of Ceará (SEFAZ-CE, Brazil), the agency in charge of supervising more than 300,000 active taxpayer companies. SEFAZ-CE maintains a very large database containing vast amounts of information concerning such companies. Its enforcement team struggles to perform thorough inspections on taxpayers' accounts as the underlying traditional human-based inspection processes involve the evaluation of countless fraud indicators (i.e., binary features), thus requiring burdensome amounts of time and being potentially prone to human errors. On the other hand, the vast amount of taxpayer information collected by fiscal agencies opens up the possibility of devising novel techniques able to tackle fiscal evasion much more effectively than traditional approaches. In this work we address the problem of using feature selection to select the most relevant binary features to improve the classification of potential tax fraudsters. Finding out possible fraudsters from taxpayer data with binary features presents several challenges. First, taxpayer data typically have features with low linear correlation between themselves. Also, tax frauds may originate from intricate illicit tactics, which in turn requires uncovering non-linear relationships between multiple features. Finally, few features may be correlated with the targeted class.
In this work we propose Alicia, a new feature selection method based on association rules and propositional logic with a carefully crafted graph centrality measure that attempts to tackle the above challenges while, at the same time, being agnostic to specific classification techniques. Alicia is structured in three phases: first, it generates a set of relevant association rules from a set of fraud indicators (features). Subsequently, from such association rules Alicia builds a graph, whose structure is then used to determine the most relevant features. To achieve this Alicia applies a novel centrality measure we call the Feature Topological Importance. We perform an extensive experimental evaluation to assess the validity of our proposal on four different real-world datasets, where we compare our solution with eight other feature selection methods. The results show that Alicia achieves F-measure scores up to 76.88%, and consistently outperforms its competitors.
Source: Expert systems with applications 145 (2020). doi:10.1016/j.eswa.2019.113128
DOI: 10.1016/j.eswa.2019.113128

See at: Expert Systems with Applications Restricted | CNR ExploRA Restricted


2020 Journal article Open Access

MARC: a robust method for multiple-aspect trajectory classification via space, time, and semantic embeddings
May Petry L., Leite Da Silva C., Esuli A., Renso C., Bogorny V.
The increasing popularity of Location-Based Social Networks (LBSNs) and the semantic enrichment of mobility data in several contexts in recent years have led to the generation of large volumes of trajectory data. In contrast to GPS-based trajectories, LBSN and context-aware trajectories are more complex data, having several semantic textual dimensions besides space and time, which may reveal interesting mobility patterns. For instance, people may visit different places or perform different activities depending on the weather conditions. These new semantically rich data, known as multiple-aspect trajectories, pose new challenges in trajectory classification, which is the problem that we address in this paper. Existing methods for trajectory classification cannot deal with the complexity of heterogeneous data dimensions or the sequential aspect that characterizes movement. In this paper we propose MARC, an approach based on attribute embedding and Recurrent Neural Networks (RNNs) for classifying multiple-aspect trajectories, that tackles all trajectory properties: space, time, semantics, and sequence. We highlight that MARC exhibits good performance especially when trajectories are described by several textual/categorical attributes. Experiments performed over four publicly available datasets considering the Trajectory-User Linking (TUL) problem show that MARC outperformed all competitors, with respect to accuracy, precision, recall, and F1-score.
Source: International journal of geographical information science (Print) 34 (2020): 1428–1450. doi:10.1080/13658816.2019.1707835
DOI: 10.1080/13658816.2019.1707835
Project(s): MASTER via OpenAIRE

See at: ISTI Repository Open Access | ZENODO Open Access | International Journal of Geographical Information Science Open Access | International Journal of Geographical Information Science Restricted | CNR ExploRA Restricted | www.tandfonline.com Restricted


2020 Contribution to conference Open Access

Preface
Tserpes K., Renso C., Matwin S.
Preface of the proceedings of the First International Workshop, MASTER 2019, held in conjunction with ECML-PKDD 2019, Würzburg, Germany, September 16, 2019.
Project(s): MASTER via OpenAIRE

See at: link.springer.com Open Access | ISTI Repository Open Access | CNR ExploRA Open Access


2020 Journal article Restricted

A novel approach to define the local region of dynamic selection techniques in imbalanced credit scoring problems
Melo Junior L., Nardini F. M., Renso C., Trani R., Macedo J. A.
Lenders, such as banks and credit card companies, use credit scoring models to evaluate the potential risk posed by lending money to customers, and therefore to mitigate losses due to bad credit. The profitability of the banks thus highly depends on the models used to decide on the customer's loans. State-of-the-art credit scoring models are based on machine learning and statistical methods. One of the major problems of this field is that lenders often deal with imbalanced datasets that usually contain many paid loans but very few unpaid ones (called defaults). Recently, dynamic selection methods combined with ensemble methods and preprocessing techniques have been evaluated to improve classification models in imbalanced datasets, presenting advantages over the static machine learning methods. In a dynamic selection technique, samples in the neighborhood of each query sample are used to compute the local competence of each base classifier. Then, the technique selects only competent classifiers to predict the query sample. In this paper, we evaluate the suitability of dynamic selection techniques for the credit scoring problem, and we present Reduced Minority k-Nearest Neighbors (RMkNN), an approach that enhances the state of the art in defining the local region of dynamic selection techniques for imbalanced credit scoring datasets. This proposed technique has a superior prediction performance in imbalanced credit scoring datasets compared to the state of the art. Furthermore, RMkNN does not need any preprocessing or sampling method to generate the dynamic selection dataset (called DSEL). Additionally, we observe an equivalence between dynamic selection and static selection classification. We conduct a comprehensive evaluation of the proposed technique against state-of-the-art competitors on six real-world public datasets and one private one.
Experiments show that RMkNN improves the classification performance of the evaluated datasets regarding AUC, balanced accuracy, H-measure, G-mean, F-measure, and Recall.
Source: Expert systems with applications 152 (2020). doi:10.1016/j.eswa.2020.113351
DOI: 10.1016/j.eswa.2020.113351
Project(s): MC2020 via OpenAIRE, BigDataGrapes via OpenAIRE, MASTER via OpenAIRE

See at: Expert Systems with Applications Restricted | CNR ExploRA Restricted
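
The generic dynamic selection procedure summarized in the abstract above (take the neighbourhood of the query sample in DSEL, score each base classifier's local competence, keep only the competent ones) can be sketched roughly as follows. This is a minimal illustration and not the RMkNN method itself: the voting rule is a simplified KNORA-Union-style rule, and the toy classifiers, data, and all function names are hypothetical.

```python
from collections import Counter
from math import dist


def k_nearest(query, dsel_X, k):
    """Indices of the k DSEL samples closest to the query (Euclidean distance)."""
    order = sorted(range(len(dsel_X)), key=lambda i: dist(query, dsel_X[i]))
    return order[:k]


def dynamic_select_predict(query, classifiers, dsel_X, dsel_y, k=3):
    """Predict `query` using only the locally competent classifiers.

    Assumption: each classifier earns one vote per neighbour it labels
    correctly (a KNORA-Union-style rule); classifiers with no correct
    neighbour are left out of the vote.
    """
    neighbours = k_nearest(query, dsel_X, k)
    votes = Counter()
    for clf in classifiers:
        competence = sum(clf(dsel_X[i]) == dsel_y[i] for i in neighbours)
        if competence > 0:
            votes[clf(query)] += competence
    if not votes:  # nobody is competent locally: fall back to a plain vote
        votes.update(clf(query) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Hypothetical base classifiers: simple threshold rules on 2-D points.
def clf_a(x):
    return int(x[0] > 0.5)


def clf_b(x):
    return int(x[1] > 0.5)


def clf_c(x):
    return 0  # always predicts class 0


# Hypothetical dynamic selection dataset (DSEL).
dsel_X = [(0.1, 0.9), (0.2, 0.8), (0.9, 0.1), (0.8, 0.2), (0.7, 0.9)]
dsel_y = [0, 0, 1, 1, 1]

prediction = dynamic_select_predict(
    (0.85, 0.15), [clf_a, clf_b, clf_c], dsel_X, dsel_y, k=2
)
```

In this toy run only `clf_a` labels the two nearest DSEL samples correctly, so it alone decides the prediction for the query point.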


2019 Journal article Open Access

Analytics Everywhere: Generating Insights from the Internet of Things
Cao H., Wachowicz M., Renso C., Carlini E.
The Internet of Things is expected to generate an unprecedented number of unbounded data streams that will produce a paradigm shift when it comes to data analytics. We are moving away from performing analytics in a public or private cloud to performing analytics locally at the fog and edge resources. In this paper, we propose a network of tasks utilizing edge, fog, and cloud computing that are designed to support an Analytics Everywhere framework. The aim is to integrate a variety of computational resources and analytical capabilities according to a data life-cycle. We demonstrate the proposed framework using an application in smart transit.
Source: IEEE access 7 (2019): 71749–71769. doi:10.1109/ACCESS.2019.2919514
DOI: 10.1109/access.2019.2919514

See at: IEEE Access Open Access | ISTI Repository Open Access | CNR ExploRA Open Access


2019 Conference article Open Access

On combining dynamic selection, sampling, and pool generators for credit scoring
Melo Junior L., Nardini F. M., Renso C., Fernandes De Macedo J. A.
The profitability of the banks highly depends on the models used to decide on the customer's loans. State-of-the-art credit scoring models are based on machine learning methods. These methods need to cope with the problem of imbalanced classes since credit scoring datasets usually contain many paid loans and few unpaid ones (defaults). Recently, dynamic selection approaches combined with pre-processing techniques have been evaluated for imbalanced datasets. However, previous works only evaluate oversampling techniques combined with bagging pool generator ensembles. For this reason, we propose to combine different dynamic selection, preprocessing and pool generation techniques. We assess the prediction performance by using four public real-world credit scoring datasets with different levels of imbalance ratio and four evaluation measures. Experimental results show that the KNORA-Union dynamic selection technique combined with Balanced Random Forest improves the classification performance with respect to the static ensemble for all levels of imbalance ratio.
Source: Machine Learning and Data Mining in Pattern Recognition, 15th International Conference on Machine Learning and Data Mining, MLDM, pp. 443–457, New York, USA, 18/07/2019, 23/07/2019

See at: ISTI Repository Open Access | CNR ExploRA Restricted


2019 Journal article Embargo

Speed prediction in large and dynamic traffic sensor networks
Magalhaes R. P., Lettich F., Macedo J. A., Nardini F. M., Perego R., Renso C., Trani R.
Smart cities are nowadays equipped with pervasive networks of sensors that monitor traffic in real-time and record huge volumes of traffic data. These datasets constitute a rich source of information that can be used to extract knowledge useful for municipalities and citizens. In this paper we are interested in exploiting such data to estimate future speed in traffic sensor networks, as accurate predictions have the potential to enhance decision making capabilities of traffic management systems. Building effective speed prediction models in large cities poses important challenges that stem from the complexity of traffic patterns, the number of traffic sensors typically deployed, and the evolving nature of sensor networks. Indeed, sensors are frequently added to monitor new road segments or replaced/removed due to different reasons (e.g., maintenance). Exploiting a large number of sensors for effective speed prediction thus requires smart solutions to collect vast volumes of data and train effective prediction models. Furthermore, the dynamic nature of real-world sensor networks calls for solutions that are resilient not only to changes in traffic behavior, but also to changes in the network structure, where the cold start problem represents an important challenge. We study three different approaches in the context of large and dynamic sensor networks: local, global, and cluster-based. The local approach builds a specific prediction model for each sensor of the network. Conversely, the global approach builds a single prediction model for the whole sensor network. Finally, the cluster-based approach groups sensors into homogeneous clusters and generates a model for each cluster. We provide a large dataset, generated from ~1.3 billion records collected by up to 272 sensors deployed in Fortaleza, Brazil, and use it to experimentally assess the effectiveness and resilience of prediction models built according to the three aforementioned approaches. 
The results show that the global and cluster-based approaches provide very accurate prediction models that prove to be robust to changes in traffic behavior and in the structure of sensor networks.
Source: Information systems (Oxf.) 98 (2019). doi:10.1016/j.is.2019.101444
DOI: 10.1016/j.is.2019.101444
Project(s): BigDataGrapes via OpenAIRE, MASTER via OpenAIRE

See at: Information Systems Restricted | CNR ExploRA Restricted
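
The difference between the local, global, and cluster-based approaches named in the abstract above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the "model" is just a mean speed, and the sensor identifiers, readings, and cluster assignment are hypothetical; the paper's actual prediction models are not reproduced.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical readings: (sensor_id, speed in km/h).
readings = [("s1", 40.0), ("s1", 44.0), ("s2", 60.0), ("s2", 64.0), ("s3", 20.0)]

# Hypothetical cluster assignment; "s4" is a newly added sensor with no history.
cluster_of = {"s1": "urban", "s3": "urban", "s2": "highway", "s4": "highway"}


def fit_mean_models(readings, key):
    """Fit one mean-speed 'model' per group, where key(sensor_id) names the group."""
    groups = defaultdict(list)
    for sensor, speed in readings:
        groups[key(sensor)].append(speed)
    return {group: mean(speeds) for group, speeds in groups.items()}


# Local: one model per sensor. Global: one model for the whole network.
# Cluster-based: one model per group of similar sensors.
local_models = fit_mean_models(readings, key=lambda s: s)
global_model = fit_mean_models(readings, key=lambda s: "all")["all"]
cluster_models = fit_mean_models(readings, key=lambda s: cluster_of[s])


def predict(sensor):
    """Return each approach's prediction; None when no model covers the sensor."""
    return {
        "local": local_models.get(sensor),
        "global": global_model,
        "cluster": cluster_models.get(cluster_of.get(sensor)),
    }


new_sensor_prediction = predict("s4")
```

For the new sensor `s4` the local approach has no model to offer, while the global and cluster-based models can still produce a prediction, which is one way to picture the cold start problem the abstract mentions.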


2019 Journal article Open Access

Event attendance classification in social media
De Lira V. M., Macdonald C., Ounis I., Perego R., Renso C., Cesario Times V.
Popular events are well reflected on social media, where people share their feelings and discuss their experiences. In this paper, we investigate the novel problem of exploiting the content of non-geotagged posts on social media to infer the users' attendance of large events in three temporal periods: before, during and after an event. We detail the features used to train event attendance classifiers and report on experiments conducted on data from two large music festivals in the UK, namely the VFestival and Creamfields events. Our classifiers attain very high accuracy with the highest result observed for the Creamfields festival (~91% accuracy at classifying users that will participate in the event). We study the most informative features for the tasks addressed and the generalization of the learned models across different events. Finally, we discuss an illustrative application of the methodology in the field of transportation.
Source: Information processing & management 56 (2019): 687–703. doi:10.1016/j.ipm.2018.11.001
DOI: 10.1016/j.ipm.2018.11.001

See at: Information Processing & Management Open Access | ISTI Repository Open Access | Information Processing & Management Restricted | CNR ExploRA Restricted | www.sciencedirect.com Restricted


2019 Journal article Open Access

A comprehensive reputation assessment framework for volunteered geographic information in crowdsensing applications
Jabeur N., Karam R., Melchiori M., Renso C.
Volunteered geographic information (VGI) is the result of activities where individuals, supported by enabling technologies, behave like physical sensors by harvesting and organizing georeferenced content, usually in their surroundings. Both researchers and organizations have recognized the value of VGI content; however, this content is typically heterogeneous in quality and spatial coverage. As a consequence, in order for applications to benefit from it, its quality and reliability need to be assessed in advance. This may not be easy since, typically, it is unknown how the process of collecting and organizing the VGI content has been conducted and by whom. In the literature, various proposals focus on an indirect process of quality assessment based on reputation scores. Following this perspective, the present paper provides as main contributions: (i) a multi-layer architecture for VGI which supports a process of reputation evaluation; (ii) a new comprehensive model for computing reputation scores for both VGI data and contributors, based on direct and indirect evaluations expressed by users, and including the concept of data aging; (iii) a variety of experiments evaluating the accuracy of the model. Finally, the relevance of adopting this framework is discussed via an applicative scenario for recommending tourist itineraries.
Source: Personal and ubiquitous computing (Print) 23 (2019): 669–685. doi:10.1007/s00779-018-1122-9
DOI: 10.1007/s00779-018-1122-9

See at: Personal and Ubiquitous Computing Open Access | Personal and Ubiquitous Computing Restricted | CNR ExploRA Restricted