2022
Conference article
Open Access
The 2nd Workshop on Mixed-Initiative ConveRsatiOnal Systems (MICROS)
Mele I., Muntean C. I., Aliannejadi M., Voskarides N.
The Mixed-Initiative ConveRsatiOnal Systems workshop (MICROS) aims at bringing novel ideas and investigating new solutions on conversational assistant systems. The increasing popularity of personal assistant systems, as well as smartphones, has changed the way users access online information, posing new challenges for information seeking and filtering. MICROS has a particular focus on mixed-initiative conversational systems, namely, systems that can provide answers in a proactive way (e.g., asking for clarification or proposing possible interpretations for ambiguous and vague requests). We invite people working on conversational systems or interested in the workshop topics to send us their position and research manuscripts.
Source: CIKM '22 - 31st ACM International Conference on Information & Knowledge Management, pp. 5173–5174, Atlanta, USA, 17-21/10/2022
DOI: 10.1145/3511808.3557938
See at:
ISTI Repository
| dl.acm.org
| doi.org
| CNR ExploRA
2021
Conference article
Open Access
MICROS: Mixed-Initiative ConveRsatiOnal Systems Workshop
Mele I., Muntean C. I., Aliannejadi M., Voskarides N.
The 1st edition of the workshop on Mixed-Initiative ConveRsatiOnal Systems (MICROS@ECIR2021) aims at investigating and collecting novel ideas and contributions in the field of conversational systems. Oftentimes, the users fulfill their information need using smartphones and home assistants. This has revolutionized the way users access online information, thus posing new challenges compared to traditional search and recommendation. The first edition of MICROS will have a particular focus on mixed-initiative conversational systems. Indeed, conversational systems need to be proactive, proposing not only answers but also possible interpretations for ambiguous or vague requests.
Source: ECIR 2021 - 43rd European Conference on IR Research, pp. 710–713, Online Conference, March 28 - April 1, 2021
DOI: 10.1007/978-3-030-72240-1_86
DOI: 10.48550/arxiv.2101.10219
See at:
arXiv.org e-Print Archive
| arxiv.org
| ISTI Repository
| doi.org
| doi.org
| link.springer.com
| CNR ExploRA
2021
Journal article
Restricted
Adaptive utterance rewriting for conversational search
Mele I., Muntean C. I., Nardini F. M., Perego R., Tonellotto N., Frieder O.
In a conversational context, a user converses with a system through a sequence of natural-language questions, i.e., utterances. Starting from a given subject, the conversation evolves through sequences of user utterances and system replies. The retrieval of documents relevant to an utterance is difficult due to informal use of natural language in speech and the complexity of understanding the semantic context coming from previous utterances. We adopt the 2019 TREC Conversational Assistant Track (CAsT) framework to experiment with a modular architecture performing in order: (i) automatic utterance understanding and rewriting, (ii) first-stage retrieval of candidate passages for the rewritten utterances, and (iii) neural re-ranking of candidate passages. By understanding the conversational context, we propose adaptive utterance rewriting strategies based on the current utterance and the dialogue evolution of the user with the system. A classifier identifies those utterances lacking context information as well as the dependencies on the previous utterances. Experimentally, we evaluate the proposed architecture in terms of traditional information retrieval metrics at small cutoffs. Results demonstrate the effectiveness of our techniques, achieving an improvement up to 0.6512 for P@1 and 0.4484 for nDCG@3 w.r.t. the CAsT baseline.
Source: Information processing & management 58 (2021). doi:10.1016/j.ipm.2021.102682
DOI: 10.1016/j.ipm.2021.102682
Project(s): BigDataGrapes
See at:
Information Processing & Management
| CNR ExploRA
2020
Journal article
Open Access
Crime and its fear in social media
Prieto Curiel R., Cresci S., Muntean C. I., Bishop S. R.
Social media posts incorporate real-time information that has, elsewhere, been exploited to predict social trends. This paper considers whether such information can be useful in relation to crime and fear of crime. A large number of tweets were collected from the 18 largest Spanish-speaking countries in Latin America, over a period of 70 days. These tweets are then classified as being crime-related or not and additional information is extracted, including the type of crime and where possible, any geo-location at a city level. From the analysis of collected data, it is established that around 15 out of every 1000 tweets have text related to a crime, or fear of crime. The frequency of tweets related to crime is then compared against the number of murders, the murder rate, or the level of fear of crime as recorded in surveys. Results show that, like mass media, such as newspapers, social media suffer from a strong bias towards violent or sexual crimes. Furthermore, social media messages are not highly correlated with crime. Thus, social media is shown not to be highly useful for detecting trends in crime itself, but what they do demonstrate is rather a reflection of the level of the fear of crime.
Source: Palgrave communications 6 (2020). doi:10.1057/s41599-020-0430-7
DOI: 10.1057/s41599-020-0430-7
Project(s): CIMPLEX, SoBigData
See at:
Palgrave Communications
| ISTI Repository
| CNR ExploRA
| www.nature.com
2020
Journal article
Open Access
(So) Big Data and the transformation of the city
Andrienko G., Andrienko N., Boldrini C., Caldarelli G., Cintia P., Cresci S., Facchini A., Giannotti F., Gionis A., Guidotti R., Mathioudakis M., Muntean C. I., Pappalardo L., Pedreschi D., Pournaras E., Pratesi F., Tesconi M., Trasarti R.
The exponential increase in the availability of large-scale mobility data has fueled the vision of smart cities that will transform our lives. The truth is that we have just scratched the surface of the research challenges that should be tackled in order to make this vision a reality. Consequently, there is an increasing interest among different research communities (ranging from civil engineering to computer science) and industrial stakeholders in building knowledge discovery pipelines over such data sources. At the same time, this widespread data availability also raises privacy issues that must be considered by both industrial and academic stakeholders. In this paper, we provide a wide perspective on the role that big data have in reshaping cities. The paper covers the main aspects of urban data analytics, focusing on privacy issues, algorithms, applications and services, and georeferenced data from social media. In discussing these aspects, we leverage, as concrete examples and case studies of urban data science tools, the results obtained in the "City of Citizens" thematic area of the Horizon 2020 SoBigData initiative, which includes a virtual research environment with mobility datasets and urban analytics methods developed by several institutions around Europe. We conclude the paper outlining the main research challenges that urban data science has yet to address in order to help make the smart city vision a reality.
Source: International Journal of Data Science and Analytics (Print) 1 (2020). doi:10.1007/s41060-020-00207-3
DOI: 10.1007/s41060-020-00207-3
Project(s): SoBigData
See at:
Aaltodoc Publication Archive
| International Journal of Data Science and Analytics
| White Rose Research Online
| HELDA - Digital Repository of the University of Helsinki
| Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari
| link.springer.com
| City Research Online
| ISTI Repository
| CNR ExploRA
| Fraunhofer-ePrints
2020
Journal article
Open Access
Human migration: the big data perspective
Sîrbu A., Andrienko G., Andrienko N., Boldrini C., Conti M., Giannotti F., Guidotti R., Bertoli S., Kim J., Muntean C. I., Pappalardo L., Passarella A., Pedreschi D., Pollacci L., Pratesi F., Sharma R.
How can big data help to understand the migration phenomenon? In this paper, we try to answer this question through an analysis of various phases of migration, comparing traditional and novel data sources and models at each phase. We concentrate on three phases of migration, at each phase describing the state of the art and recent developments and ideas. The first phase includes the journey, and we study migration flows and stocks, providing examples where big data can have an impact. The second phase discusses the stay, i.e. migrant integration in the destination country. We explore various data sets and models that can be used to quantify and understand migrant integration, with the final aim of providing the basis for the construction of a novel multi-level integration index. The last phase is related to the effects of migration on the source countries and the return of migrants.
Source: International Journal of Data Science and Analytics (Online) (2020). doi:10.1007/s41060-020-00213-5
DOI: 10.1007/s41060-020-00213-5
Project(s): SoBigData
See at:
International Journal of Data Science and Analytics
| HAL Clermont Université
| Fraunhofer-ePrints
| link.springer.com | CNR ExploRA
2020
Journal article
Restricted
Weighting passages enhances accuracy
Muntean C. I., Nardini F. M., Perego R., Tonellotto N., Frieder O.
We observe that in curated documents the distribution of the occurrences of salient terms, e.g., terms with a high Inverse Document Frequency, is not uniform, and such terms are primarily concentrated towards the beginning and the end of the document. Exploiting this observation, we propose a novel version of the classical BM25 weighting model, called BM25 Passage (BM25P), which scores query results by computing a linear combination of term statistics in the different portions of the document. We study a multiplicity of partitioning schemes of document content into passages and compute the collection-dependent weights associated with them on the basis of the distribution of occurrences of salient terms in documents. Moreover, we tune BM25P hyperparameters and investigate their impact on ad hoc document retrieval through fully reproducible experiments conducted using four publicly available datasets. Our findings demonstrate that our BM25P weighting model markedly and consistently outperforms BM25 in terms of effectiveness by up to 17.44% in NDCG@5 and 85% in NDCG@1, and up to 21% in MRR.
Source: ACM transactions on information systems 39 (2020). doi:10.1145/3428687
DOI: 10.1145/3428687
See at:
ACM Transactions on Information Systems
| CNR ExploRA
2020
Conference article
Open Access
Topic propagation in conversational search
Mele I., Muntean C. I., Nardini F. M., Perego R., Tonellotto N., Frieder O.
In a conversational context, a user expresses her multi-faceted information need as a sequence of natural-language questions, i.e., utterances. Starting from a given topic, the conversation evolves through user utterances and system replies. The retrieval of documents relevant to a given utterance in a conversation is challenging due to ambiguity of natural language and to the difficulty of detecting possible topic shifts and semantic relationships among utterances. We adopt the 2019 TREC Conversational Assistant Track (CAsT) framework to experiment with a modular architecture performing: (i) topic-aware utterance rewriting, (ii) retrieval of candidate passages for the rewritten utterances, and (iii) neural-based re-ranking of candidate passages. We present a comprehensive experimental evaluation of the architecture assessed in terms of traditional IR metrics at small cutoffs. Experimental results show the effectiveness of our techniques that achieve an improvement of up to 0.28 (+93%) for P@1 and 0.19 (+89.9%) for nDCG@3 w.r.t. the CAsT baseline.
Source: SIGIR 2020 - 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2057–2060, Online Conference, July 25-30, 2020
DOI: 10.1145/3397271.3401268
DOI: 10.48550/arxiv.2004.14054
Project(s): BigDataGrapes
See at:
arXiv.org e-Print Archive
| arxiv.org
| dl.acm.org
| doi.org
| doi.org
| CNR ExploRA
2020
Conference article
Embargo
High-quality prediction of tourist movements using temporal trajectories in graphs
Moghtasedi S., Muntean C. I., Nardini F. M., Grossi R., Marino A.
In this paper, we study the problem of predicting the next position of a tourist given his history. In particular, we propose a model to identify the next point of interest that a tourist will visit in the future, by making use of similarity between trajectories on a graph and taking into account the spatial-temporal aspect of trajectories. We compare our method with a well-known machine learning-based technique, as well as with a popularity baseline, using three public real-world datasets. Our experimental results show that our technique outperforms state-of-the-art machine learning-based methods, providing results that are at least twice as accurate.
Source: ASONAM 2020 - The 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 348–352, Online conference, 7-10/12/2020
DOI: 10.1109/asonam49781.2020.9381450
See at:
ieeexplore.ieee.org
| CNR ExploRA
| xplorestaging.ieee.org
2019
Conference article
Open Access
Enhanced news retrieval: passages lead the way!
Catena M., Nardini F. M., Frieder O., Perego R., Muntean C. I., Tonellotto N.
We observe that most relevant terms in unstructured news articles are primarily concentrated towards the beginning and the end of the document. Exploiting this observation, we propose a novel version of the classical BM25 weighting model, called BM25 Passage (BM25P), which scores query results by computing a linear combination of term statistics in the different portions of news articles. Our experimentation, conducted using three publicly available news datasets, demonstrates that BM25P markedly outperforms BM25 in terms of effectiveness by up to 17.44% in NDCG@5 and 85% in NDCG@1.
Source: 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1269–1272, Paris, France, 21-25 July 2019
DOI: 10.1145/3331184.3331373
See at:
dl.acm.org
| CNR ExploRA
| doi.org
2018
Report
Open Access
BASMATI - D3.5 Server- and Client-side Applications Adaptation and Reconfiguration: Design and Specification
Dazzi P., Carlini E., De Lira V. M., Munteanu C.
This report provides a description of the mechanisms, tools, and algorithms used to support application adaptation and reconfiguration in the BASMATI brokerage platform. At the core of this support lies the BASMATI Enriched Application Model (BEAM), which is the XML-based language in which an application is modelled and represented in BASMATI. The design principles behind the BEAM (namely: compatibility, extensibility, decomposability) are the prerequisites to provide efficient and effective geo-placement of services and applications on top of federated Cloud resources. The BEAM is made available to all the components of the platform by the Application Repository, which works as a centralization point for the BEAMs of all the applications. The decomposability of BEAM is exploited by the Decision Maker that has the task to proactively and reactively adapt the application according to the behaviour of users and resources, by means of advanced placement algorithms.
Source: Project report, BASMATI, Deliverable D3.5, 2018
Project(s): BASMATI
See at:
ISTI Repository
| CNR ExploRA
2017
Conference article
Restricted
Social Media Image Recognition for Food Trend Analysis
Amato G., Bolettieri P., Monteiro De Lira V., Muntean C. I., Perego R., Renso C.
An increasing number of people share their thoughts and the images of their lives on social media platforms. People are exposed to food in their everyday lives and share online what they are eating by means of photos of their dishes. The hashtag #foodporn is constantly among the popular hashtags on Twitter and food photos are the second most popular subject on Instagram after selfies. The system that we propose, WorldFoodMap, captures the stream of food photos from social media and, thanks to a CNN food image classifier, identifies the categories of food that people are sharing. By collecting food images from the Twitter stream and associating food category and location to them, WorldFoodMap makes it possible to investigate and interactively visualize the popularity and trends of the shared food all over the world.
Source: SIGIR 2017 - 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1333–1336, Tokyo, Japan, 7 - 11 August, 2017
DOI: 10.1145/3077136.3084142
Project(s): SoBigData
See at:
dl.acm.org
| doi.org
| CNR ExploRA
2017
Conference article
Restricted
RankEval: an evaluation and analysis framework for learning-to-rank solutions
Lucchese C., Muntean C. I., Nardini F. M., Perego R., Trani S.
In this demo paper we propose RankEval, an open-source tool for the analysis and evaluation of Learning-to-Rank (LtR) models based on ensembles of regression trees. Gradient Boosted Regression Trees (GBRT) is a flexible statistical learning technique for classification and regression at the state of the art for training effective LtR solutions. Indeed, the success of GBRT fostered the development of several open-source LtR libraries targeting efficiency of the learning phase and effectiveness of the resulting models. However, these libraries offer only very limited help for the tuning and evaluation of the trained models. In addition, the implementations provided for even the most traditional IR evaluation metrics differ from library to library, thus making the objective evaluation and comparison between trained models a difficult task. RankEval addresses these issues by providing a common ground for LtR libraries that offers useful and interoperable tools for a comprehensive comparison and in-depth analysis of ranking models.
Source: SIGIR '17 - 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1281–1284, Tokyo, Japan, 9-11 August 2017
DOI: 10.1145/3077136.3084140
Project(s): SoBigData
See at:
dl.acm.org
| doi.org
| CNR ExploRA
2017
Journal article
Open Access
Perception of social phenomena through the multidimensional analysis of online social networks
Coletto M., Esuli A., Lucchese C., Muntean C. I., Nardini F. M., Perego R., Renso C.
We propose an analytical framework aimed at investigating different views of the discussions regarding polarized topics which occur in Online Social Networks (OSNs). The framework supports the analysis along multiple dimensions, i.e., time, space and sentiment of the opposite views about a controversial topic emerging in an OSN. To assess its usefulness in mining insights about social phenomena, we apply it to two different Twitter case studies: the discussions about the refugee crisis and the United Kingdom European Union membership referendum. These complex and contended topics are very important issues for EU citizens and stimulated a multitude of Twitter users to take sides and actively participate in the discussions. Our framework makes it possible to monitor in a scalable way the raw stream of relevant tweets and to automatically enrich them with location information (user and mentioned locations), and sentiment polarity (positive vs. negative). The analyses we conducted show how the framework captures the differences in positive and negative user sentiment over time and space. The resulting knowledge can support the understanding of complex dynamics by identifying variations in the perception of specific events and locations.
Source: Online social networks and media 1 (2017): 14–32. doi:10.1016/j.osnem.2017.03.001
DOI: 10.1016/j.osnem.2017.03.001
Project(s): SoBigData
See at:
ISTI Repository
| Online Social Networks and Media
| CNR ExploRA
| www.sciencedirect.com
2017
Conference article
Restricted
Sentiment spreading: an epidemic model for lexicon-based sentiment analysis on Twitter
Pollacci L., Sirbu A., Giannotti F., Pedreschi D., Lucchese C., Muntean C. I.
While sentiment analysis has received significant attention in the last years, problems still exist when tools need to be applied to microblogging content. This is because, typically, the text to be analysed consists of very short messages lacking in structure and semantic context. At the same time, the amount of text produced by online platforms is enormous. So, one needs simple, fast and effective methods in order to be able to efficiently study sentiment in these data. Lexicon-based methods, which use a predefined dictionary of terms tagged with sentiment valences to evaluate sentiment in longer sentences, can be a valid approach. Here we present a method based on epidemic spreading to automatically extend the dictionary used in lexicon-based sentiment analysis, starting from a reduced dictionary and large amounts of Twitter data. The resulting dictionary is shown to contain valences that correlate well with human-annotated sentiment, and to produce tweet sentiment classifications comparable to the original dictionary, with the advantage of being able to tag more tweets than the original. The method is easily extensible to various languages and applicable to large amounts of data.
Source: AI*IA Conference of the Italian Association for Artificial Intelligence, pp. 114–127, Bari, Italy, 14-17 November 2017
DOI: 10.1007/978-3-319-70169-1_9
Project(s): SoBigData
See at:
Lecture Notes in Computer Science
| Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari
| link.springer.com
| CNR ExploRA
2016
Contribution to conference
Open Access
Understanding human mobility during events in foursquare
Muntean C. I., Nardini F. M., Noulas A.
Social events can generate high influxes of people transitioning various locations in a city. They can be considered to have a considerable impact on the local economy, whether they are sport events, concerts or festivals. These events are capable of generating sudden changes in the activity landscape of a city, with the neighborhoods that host events becoming unusually busy and active compared to times of regular citizen activity. While event and anomaly detection more generally has been a topic of study in recent years, as also has been event recommendation for mobile users, progress has been slower towards building systems that are able to capture the sudden shift appropriately in this setting. In this work we exploit data from the location-based service Foursquare to study mobility during events in Chicago, and later expand our study to other cities as well. Our aim is to identify what differences emerge in terms of user mobility during events versus regular periods of human activity.
Source: 7th Italian Information Retrieval Workshop, Venezia, Italy, 30-31 May 2016
See at:
ceur-ws.org
| CNR ExploRA
2016
Conference article
Open Access
Sentiment-enhanced multidimensional analysis of online social networks: perception of the mediterranean refugees crisis
Coletto M., Esuli A., Lucchese C., Muntean C. I., Nardini F. M., Perego R., Renso C.
We propose an analytical framework able to investigate discussions about polarized topics in online social networks from many different angles. The framework supports the analysis of social networks along several dimensions: time, space and sentiment. We show that the proposed analytical framework and the methodology can be used to mine knowledge about the perception of complex social phenomena. We selected the refugee crisis discussions over Twitter as a case study. This difficult and controversial topic is an increasingly important issue for the EU. The raw stream of tweets is enriched with space information (user and mentioned locations), and sentiment (positive vs. negative) w.r.t. refugees. Our study shows differences in positive and negative sentiment in EU countries, in particular in the UK, and by matching events, locations and perception, it underlines opinion dynamics and common prejudices regarding the refugees.
Source: IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 1270–1277, San Francisco, CA, USA, 18-21 August 2016
DOI: 10.1109/asonam.2016.7752401
DOI: 10.48550/arxiv.1605.01895
Project(s): SoBigData
See at:
arXiv.org e-Print Archive
| arxiv.org
| ISTI Repository
| doi.org
| doi.org
| gateway.webofknowledge.com
| ieeexplore.ieee.org
| CNR ExploRA
2015
Conference article
Restricted
MUSETS: Diversity-aware web query suggestions for shortening user sessions
Sydow M., Muntean C. I., Nardini F. M., Matwin S., Silvestri F.
We propose MUSETS (multi-session total shortening) - a novel formulation of the query suggestion task, specified as an optimization problem. Given an ambiguous user query, the goal is to propose to the user a set of query suggestions that optimizes a diversity-aware objective function. The function models the expected number of query reformulations that a user would save until reaching a satisfactory query formulation. The function is diversity-aware, as it naturally enforces high coverage of different alternative continuations of the user session. For modeling the topics covered by the queries, we also use an extended query representation based on entities extracted from Wikipedia. We apply a machine learning approach to learn the model on a set of user sessions to be subsequently used for queries that are under-represented in historical query logs and present an evaluation of the approach.
Source: Foundations of Intelligent Systems. 22nd International Symposium, pp. 237–247, Lyon, France, 21-23/10/2015
DOI: 10.1007/978-3-319-25252-0_26
See at:
doi.org
| link.springer.com
| CNR ExploRA
2015
Conference article
Open Access
Gamification in information retrieval: State of the art, challenges and opportunities
Muntean C. I., Nardini F. M.
Gamification aims at applying game design principles and elements, such as points, badges, feedback or leaderboards in non-gaming environments. An interesting goal of gamification is to combine and exploit the fun factor for targeting other aspects like achieving more accurate work, more cost effective solutions and better retention rates. The application of gamification techniques to IR tasks poses interesting research challenges. In this paper, we propose an analysis of the state of the art in this field and we summarize interesting challenges and opportunities for the near future.
Source: 6th Italian Information Retrieval Workshop, Cagliari, Italy, 25-26/05/2015
See at:
ceur-ws.org
| CNR ExploRA