2021 Journal article Open Access

Re-ranking via local embeddings: A use case with permutation-based indexing and the nSimplex projection
Vadicamo L, Gennaro C, Falchi F, Chavez E, Connor R, Amato G
Approximate Nearest Neighbor (ANN) search is a prevalent paradigm for searching intrinsically high dimensional objects in large-scale data sets. Recently, the permutation-based approach for ANN has attracted a lot of interest due to its versatility in being used in the more general class of metric spaces. In this approach, the entire database is ranked by a permutation distance to the query. Typically, permutations allow the efficient selection of a candidate set of results, but to achieve high recall or precision this set has to be reviewed using the original metric and data. This can lead to a sizeable percentage of the database being recalled, along with many expensive distance calculations. To reduce the number of metric computations and the number of database elements accessed, we propose here a re-ranking based on a local embedding using the nSimplex projection. The nSimplex projection produces Euclidean vectors from objects in metric spaces which possess the n-point property. The mapping is obtained from the distances to a set of reference objects, and the original metric can be lower bounded and upper bounded by the Euclidean distance of objects sharing the same set of references. Our approach is particularly advantageous for extensive databases or expensive metric functions. We reuse the distances computed in the permutations in the first stage, and hence the memory footprint of the index is not increased. An extensive experimental evaluation of our approach is presented, demonstrating excellent results even on a set of hundreds of millions of objects.
Source: INFORMATION SYSTEMS, vol. 95
Project(s): AI4EU via OpenAIRE

See at: CNR IRIS Open Access | ISTI Repository | www.sciencedirect.com | CNR IRIS Restricted | CNR IRIS
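
As a rough illustration of the first-stage candidate selection summarised in the abstract above, the following sketch ranks a toy database by a permutation distance to the query. The Spearman footrule, the Euclidean metric, the toy data, and all function names are assumptions made for illustration; the abstract does not specify them, and the nSimplex re-ranking stage is not shown.

```python
import math

def permutation(obj, pivots, dist):
    """Return the pivot indices ordered by increasing distance from obj."""
    return sorted(range(len(pivots)), key=lambda i: dist(obj, pivots[i]))

def footrule(perm_a, perm_b):
    """Spearman footrule: sum of absolute differences in pivot positions."""
    pos_b = {pivot: rank for rank, pivot in enumerate(perm_b)}
    return sum(abs(rank - pos_b[pivot]) for rank, pivot in enumerate(perm_a))

def candidates(query, database, pivots, dist, k):
    """Rank the database by permutation distance to the query; keep the top k."""
    q_perm = permutation(query, pivots, dist)
    ranked = sorted(
        database,
        key=lambda obj: footrule(q_perm, permutation(obj, pivots, dist)),
    )
    return ranked[:k]

# Hypothetical reference objects (pivots) and data in the Euclidean plane.
pivots = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
database = [(3.0, 1.0), (9.0, 2.0), (1.0, 8.0), (4.0, 6.0)]
result = candidates((2.0, 1.0), database, pivots, math.dist, k=2)
print(result)
```

In a real index the permutations of the database objects would be computed once and stored, so that only the query permutation is computed at search time.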

2022 Other Open Access

SSHOC - D5.5 'Archive in a Box' repository software and proof of concept of centralised installation in the cloud
Wittenberg M, Tykhonov V, Indarto E, Steinhoff W, Huis In 't Veld L, Kasberger S, Conzett P, Concordia C, Kiraly P, Parkola T
Within task 5.2 (Hosting and sharing data repositories) of the SSHOC project, repository software is being developed based on Dataverse, for the sharing and publication of research data within the Social Science and Humanities (SSH) domain. Dataverse is open-source research data repository software developed by the Institute for Quantitative Social Science (IQSS), Harvard University. This document describes the work done by task 5.2, for the development of 'Archive in a Box' repository software and proof of concept of centralised installation in the cloud. The 'Archive in a Box' makes the installation of Dataverse repository software easier for institutes with a lack of technical staff. This document describes the advantages of such a package. Additionally, task 5.2 worked on a proof of concept of a centralised cloud installation of the Dataverse software on the Google cloud infrastructure of CESSDA ERIC. A cloud installation makes it possible to automate the installation and keep the application up and running, for instance by scaling resources up or down when needed. Another advantage of a cloud orchestrator is the ability to start a new component or part of the application if it should fail for some reason. Furthermore, task 5.2 developed several additional functionalities for the Dataverse software to make it more compliant with the needs of the SSH communities in Europe. This document describes the results of the accomplished work, and refers to technical details published in GitHub repositories. Many of the results of task 5.2 are already used by the European and global Dataverse community, and some functionalities are integrated in the new versions of the Dataverse master branch of Harvard.
DOI: 10.5281/zenodo.6676391
Project(s): SSHOC via OpenAIRE

See at: CNR IRIS Open Access | ISTI Repository | CNR IRIS Restricted

2021 Software Metadata Only Access

HDN Annotation Tool
Bartalesi Lenzi V, Pratelli N, Metilli D, Meghini C
To facilitate the process of populating the ontology developed within the Hypermedia Dante Network (HDN) project (PRIN 2020-2023), we implemented a semi-automatic tool called HDN Annotation Tool. The tool helps scholars build a knowledge base of the primary sources of Dante Alighieri's Divine Comedy. The tool was developed using a Python backend with the Django framework, and a frontend built with HTML5, JavaScript, and the Bootstrap library. It takes as input the JSON file that stores the knowledge automatically extracted from the corpus of the Dartmouth Dante Project (DDP), and shows the relevant information in the corresponding fields of the tool interface. After analyzing the commentaries of the DDP, scholars use the interface of the tool to insert knowledge about primary sources. The tool is accessible through the HDN-Lab, which is the Virtual Research Environment (VRE) of the project, hosted on the D4Science infrastructure.

See at: dante.d4science.org Restricted | CNR IRIS

2021 Journal article Open Access

TweepFake: about detecting deepfake tweets
Fagni T, Falchi F, Gambini M, Martella A, Tesconi M
The recent advances in language modeling significantly improved the generative capabilities of deep neural models: in 2019 OpenAI released GPT-2, a pre-trained language model that can autonomously generate coherent, non-trivial and human-like text samples. Since then, ever more powerful text generative models have been developed. Adversaries can exploit these tremendous generative capabilities to enhance social bots that will have the ability to write plausible deepfake messages, hoping to contaminate public debate. To prevent this, it is crucial to develop systems for detecting deepfake social media messages. However, to the best of our knowledge no one has ever addressed the detection of machine-generated texts on social networks like Twitter or Facebook. With the aim of helping the research in this detection field, we collected the first dataset of real deepfake tweets, TweepFake. It is real in the sense that each deepfake tweet was actually posted on Twitter. We collected tweets from a total of 23 bots, imitating 17 human accounts. The bots are based on various generation techniques, i.e., Markov Chains, RNN, RNN+Markov, LSTM, GPT-2. We also randomly selected tweets from the humans imitated by the bots to have an overall balanced dataset of 25,572 tweets (half human and half bot generated). The dataset is publicly available on Kaggle. Lastly, we evaluated 13 deepfake text detection methods (based on various state-of-the-art approaches) to both demonstrate the challenges that TweepFake poses and create a solid baseline of detection techniques. We hope that TweepFake can offer the opportunity to tackle deepfake detection on social media messages as well.
Source: PLOS ONE, vol. 16
Project(s): AI4Media via OpenAIRE, SoBigData-PlusPlus via OpenAIRE

2021 Conference article Open Access

Re-assessing the "Classify and Count" quantification method
Moreo A, Sebastiani F
Learning to quantify (a.k.a. quantification) is a task concerned with training unbiased estimators of class prevalence via supervised learning. This task originated with the observation that "Classify and Count" (CC), the trivial method of obtaining class prevalence estimates, is often a biased estimator, and thus delivers suboptimal quantification accuracy. Following this observation, several methods for learning to quantify have been proposed and have been shown to outperform CC. In this work we contend that previous works have failed to use properly optimised versions of CC. We thus reassess the real merits of CC and its variants, and argue that, while still inferior to some cutting-edge methods, they deliver near-state-of-the-art accuracy once (a) hyperparameter optimisation is performed, and (b) this optimisation is performed by using a truly quantification-oriented evaluation protocol. Experiments on three publicly available binary sentiment classification datasets support these conclusions.

See at: CNR IRIS Open Access | link.springer.com | ISTI Repository | CNR IRIS Restricted | CNR IRIS
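
To make the bias of "Classify and Count" mentioned in the abstract above concrete, the sketch below computes the CC estimate and its expected value for a binary classifier. The function names and the fixed true-positive and false-positive rates are illustrative assumptions, not values or code from the paper.

```python
def classify_and_count(predictions):
    """CC estimate: the fraction of items the classifier labels positive."""
    return sum(predictions) / len(predictions)

def expected_cc(true_prevalence, tpr, fpr):
    """Expected CC estimate for a classifier with fixed TPR and FPR."""
    return true_prevalence * tpr + (1 - true_prevalence) * fpr

# Hypothetical binary predictions (1 = positive) on ten unlabelled items.
predictions = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
estimate = classify_and_count(predictions)

# With an assumed TPR of 0.8 and FPR of 0.1, the expected estimate differs
# from the true prevalence, and the gap changes as the prevalence changes.
for prevalence in (0.1, 0.5, 0.9):
    bias = expected_cc(prevalence, tpr=0.8, fpr=0.1) - prevalence
    print(prevalence, round(bias, 3))
```

Under these assumptions CC overestimates low prevalences and underestimates high ones, which is the kind of systematic error that quantification-specific methods try to correct.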

2021 Journal article Open Access

Query filtering using two-dimensional local embeddings
Vadicamo L, Connor R, Chávez E
In high dimensional data sets, exact indexes are ineffective for proximity queries, and a sequential scan over the entire data set is unavoidable. Accepting this, here we present a new approach employing two-dimensional embeddings. Each database element is mapped to the XY plane using the four-point property. The caveat is that the mapping is local: in other words, each object is mapped using a different mapping. The idea is that each element of the data is associated with a pair of reference objects that is well-suited to filter that particular object, in cases where it is not relevant to a query. This maximises the probability of excluding that object from a search. At query time, a query is compared with a pool of reference objects which allow its mapping to all the planes used by data objects. Then, for each query/object pair, a lower bound of the actual distance is obtained. The technique can be applied to any metric space that possesses the four-point property, therefore including Euclidean, Cosine, Triangular, Jensen-Shannon, and Quadratic Form distances. Our experiments show that for all the data sets tested, of varying dimensionality, our approach can filter more objects than a standard metric indexing approach. For low dimensional data this does not make a good search mechanism in its own right, as it does not scale with the size of the data: that is, its cost is linear with respect to the data size. However, we also show that it can be added as a post-filter to other mechanisms, increasing efficiency with little extra cost in space or time. For high-dimensional data, we show related approximate techniques which, we believe, give the best known compromise for speeding up the essential sequential scan. The potential uses of our filtering technique include pure GPU searching, taking advantage of the tiny memory footprint of the mapping.
Source: INFORMATION SYSTEMS, vol. 101

2021 Conference article Open Access

Transformer reasoning network for image-text matching and retrieval
Messina N, Falchi F, Esuli A, Amato G
Image-text matching is an interesting and fascinating task in modern AI research. Despite the evolution of deep-learning-based image and text processing systems, multi-modal matching remains a challenging problem. In this work, we consider the problem of accurate image-text matching for the task of multi-modal large-scale information retrieval. State-of-the-art results in image-text matching are achieved by inter-playing image and text features from the two different processing pipelines, usually using mutual attention mechanisms. However, this invalidates any chance to extract separate visual and textual features needed for later indexing steps in large-scale retrieval systems. In this regard, we introduce the Transformer Encoder Reasoning Network (TERN), an architecture built upon one of the modern relationship-aware self-attentive architectures, the Transformer Encoder (TE). This architecture is able to separately reason on the two different modalities and to enforce a final common abstract concept space by sharing the weights of the deeper transformer layers. Thanks to this design, the implemented network is able to produce compact and very rich visual and textual features available for the successive indexing step. Experiments are conducted on the MS-COCO dataset, and we evaluate the results using a discounted cumulative gain metric with relevance computed exploiting caption similarities, in order to assess possibly non-exact but relevant search results. We demonstrate that on this metric we are able to achieve state-of-the-art results in the image retrieval task. Our code is freely available at https://github.com/mesnico/TERN.
Source: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, pp. 5222-5229. Online conference, 10-15/01/2021
Project(s): AI4EU via OpenAIRE, AI4Media via OpenAIRE

See at: CNR IRIS Open Access | ieeexplore.ieee.org | ISTI Repository | CNR IRIS Restricted | CNR IRIS

2021 Journal article Open Access

Hebbian semi-supervised learning in a sample efficiency setting
Lagani G, Falchi F, Gennaro C, Amato G
We propose to address the issue of sample efficiency, in Deep Convolutional Neural Networks (DCNN), with a semi-supervised training strategy that combines Hebbian learning with gradient descent: all internal layers (both convolutional and fully connected) are pre-trained using an unsupervised approach based on Hebbian learning, and the last fully connected layer (the classification layer) is trained using Stochastic Gradient Descent (SGD). In fact, as Hebbian learning is an unsupervised learning method, its potential lies in the possibility of training the internal layers of a DCNN without labels. Only the final fully connected layer has to be trained with labeled examples. We performed experiments on various object recognition datasets, in different regimes of sample efficiency, comparing our semi-supervised (Hebbian for internal layers + SGD for the final fully connected layer) approach with end-to-end supervised backprop training, and with semi-supervised learning based on Variational Auto-Encoder (VAE). The results show that, in regimes where the number of available labeled samples is low, our semi-supervised approach outperforms the other approaches in almost all the cases.
Source: NEURAL NETWORKS, vol. 143, pp. 719-731
Project(s): AI4EU via OpenAIRE, AI4Media via OpenAIRE

See at: CNR IRIS Open Access | ISTI Repository | www.sciencedirect.com | CNR IRIS Restricted | CNR IRIS

2021 Conference article Open Access

AIMH at SemEval-2021 - Task 6: multimodal classification using an ensemble of transformer models
Messina N, Falchi F, Gennaro C, Amato G
This paper describes the system used by the AIMH Team to approach the SemEval Task 6. We propose an approach that relies on an architecture based on the transformer model to process multimodal content (text and images) in memes. Our architecture, called DVTT (Double Visual Textual Transformer), approaches Subtasks 1 and 3 of Task 6 as multi-label classification problems, where the text and/or images of the meme are processed, and the probabilities of the presence of each possible persuasion technique are returned as a result. DVTT uses two complete networks of transformers that work on text and images that are mutually conditioned. One of the two modalities acts as the main one and the second one intervenes to enrich the first one, thus obtaining two distinct ways of operation. The outputs of the two transformers are merged by averaging the inferred probabilities for each possible label, and the overall network is trained end-to-end with a binary cross-entropy loss.
Source: PROCEEDINGS OF THE CONFERENCE - ASSOCIATION FOR COMPUTATIONAL LINGUISTICS. MEETING, pp. 1020-1026. Bangkok, Thailand, 5-6/08/2021
Project(s): AI4EU via OpenAIRE, AI4Media via OpenAIRE
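
The merging step described in the abstract above (averaging the per-label probabilities of two branches and scoring them with binary cross-entropy) can be sketched in plain Python as follows. The function names and the toy probabilities are assumptions for illustration only; this is not the DVTT implementation, and the transformer branches themselves are not modelled.

```python
import math

def average_probabilities(text_probs, image_probs):
    """Merge two branches by averaging their per-label probabilities."""
    return [(t + v) / 2 for t, v in zip(text_probs, image_probs)]

def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy over all labels (multi-label setting)."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(probs)

# Hypothetical outputs of the two branches for two persuasion-technique labels.
text_probs = [0.9, 0.2]
image_probs = [0.7, 0.4]
merged = average_probabilities(text_probs, image_probs)
loss = binary_cross_entropy(merged, targets=[1, 0])
print(merged, loss)
```

Because each label is scored independently, several techniques can be predicted for the same meme, which is what the multi-label formulation requires.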

2021 Conference article Open Access

Towards efficient cross-modal visual textual retrieval using transformer-encoder deep features
Messina N, Amato G, Falchi F, Gennaro C, Marchandmaillet S
Cross-modal retrieval is an important functionality in modern search engines, as it improves the user experience by allowing queries and retrieved objects to pertain to different modalities. In this paper, we focus on the image-sentence retrieval task, where the objective is to efficiently find relevant images for a given sentence (image-retrieval) or the relevant sentences for a given image (sentence-retrieval). Computer vision literature reports the best results on the image-sentence matching task using deep neural networks equipped with attention and self-attention mechanisms. They evaluate the matching performance on the retrieval task by performing sequential scans of the whole dataset. This method does not scale well with an increasing number of images or captions. In this work, we explore different preprocessing techniques to produce sparsified deep multi-modal features, extracting them from state-of-the-art deep-learning architectures for image-text matching. Our main objective is to lay down the paths for efficient indexing of complex multi-modal descriptions. We use the recently introduced TERN architecture as an image-sentence feature extractor. It is designed for producing fixed-size 1024-d vectors describing whole images and sentences, as well as variable-length sets of 1024-d vectors describing the various building components of the two modalities (image regions and sentence words respectively). All these vectors are enforced by the TERN design to lie in the same common space. Our experiments show interesting preliminary results on the explored methods and suggest further experimentation in this important research direction.
Project(s): AI4EU via OpenAIRE

See at: CNR IRIS Open Access | ieeexplore.ieee.org | ISTI Repository | CNR IRIS Restricted | CNR IRIS

2021 Software Open Access

TwiGet
Esuli A
TwiGet is a Python package for managing queries on the filtered stream of the Twitter API and for collecting tweets from it. It can be used as a command line tool (twiget-cli) or as a Python class (TwiGet).
Project(s): AI4Media via OpenAIRE

See at: github.com Open Access | CNR IRIS | CNR IRIS Restricted

2021 Other Open Access

SSHOC - D3.5 Report on citation enabled SSH catalogues and SSH citation exploitation
Larrousse N, Gray E, Broeder D, Concordia C, Brase J, Papadopoulou A
Citation is a pillar for the construction of knowledge. By creating proper citations in a standardized way researchers can constitute a mesh of linked information for various purposes (from credit to reuse). This becomes increasingly important as the SSHOC Task 3.4 team confronts the realities of Social Sciences and Humanities Research in a digital age, when machine actionability takes on a renewed and vital importance. After conducting an inventory of data citation practices (SSHOC D3.2 "Inventory of SSH citation practices, and choice for SSHOC citation formats and implementation planning") and analysing the citation of data in DH 2019 conference abstracts in order to build specifications for the citation prototype, the team discovered a very diverse landscape of data repositories. As a result, the team developed recommendations for citation in coordination with SSHOC Work Package 2 (Communication, Dissemination, and Impact), validated by external reviewers. These recommendations were used to guide a deeper analysis of citation practices in various SSH repositories and how they correspond to these recommendations in order to have a better idea of the current situation. This analysis was carried out in both a quantitative and qualitative fashion. For the qualitative part, the main goal was to describe, in detail, a selection of repositories representative of the SSH domain. The choice of repositories was made in collaboration with the SSHOC network in order to have a good representation of the very diverse contexts in SSH. This qualitative analysis focused on how these repositories were constructed to provide data citation services in detail. For the quantitative part, a list of repositories was already established by SSHOC Task 8.2 "Trust & Quality Assurance" and the team took this opportunity to establish synergies and extract a list of repositories to be checked according to defined criteria regarding data citation.
The analysis checked 85 repositories from a list of 125 against a set of 7 criteria. In order to facilitate the work, the team used the citation viewer which is part of the prototype mentioned above. The main result of this quick study is that while there are positive signs, especially with respect to the use of landing pages and Persistent Identifiers (PIDs), there is quite a bit of room for improvement as a lot of repositories do not provide machine actionable information. This makes the prototype the Task 3.4 team is currently developing to create actionable citations all the more useful. It also appears from this work that it will be necessary to manually curate some citations in order to enrich them and make them actionable as the information is not always directly available (e.g., a landing page provides a link to a page which contains metadata expressed in another format). The result of this study will feed the development of the citation prototype developed in Task 3.4 and also liaise with SSHOC WP7 "Creating the SSH Open Marketplace" to integrate citations in SSH Open Marketplace with a "Cite As" property in the backend and Cite As box in the frontend interface. Another link exists with the very similar work currently being carried out in CLARIN for the Digital Object Gateway.
DOI: 10.5281/zenodo.5603306
DOI: 10.5281/zenodo.5603305
Project(s): SSHOC via OpenAIRE

See at: ZENODO Open Access | ZENODO | CNR IRIS | ISTI Repository | CNR IRIS Restricted

2021 Other Open Access

Dataverse: switchboard plugin
Concordia C
Overall description of the 'dataverse-lrs' plugin, a plugin enabling the integration of Dataverse repositories with the CLARIN Language Resource Switchboard (LRS) software, developed in the "Social Sciences and Humanities Open Cloud (SSHOC)" project. The software is available in the SSHOC GitHub repository: https://github.com/SSHOC/dataverse-lrs
Project(s): SSHOC via OpenAIRE

See at: CNR IRIS Open Access | ISTI Repository | www.clarin.eu | CNR IRIS Restricted

2021 Software Open Access

Dataverse-LRS: integrating Dataverse with the CLARIN Language Resource Switchboard
Concordia C
The "dataverse-lrs" software is a plugin that can be installed in a Dataverse repository to integrate the CLARIN Language Resource Switchboard (LRS) into the Dataverse GUI. A Dataverse repository is implemented using an open-source software platform developed and maintained by the Harvard Institute for Quantitative Social Science (IQSS) (https://dataverse.harvard.edu/). The CLARIN LRS (https://switchboard.clarin.eu) can be seen as a Virtual Tool Registry: for a given resource, it identifies all tools that can process the resource, sorts the tools in terms of the tasks they perform, and presents a task-oriented list of those tools. Each tool in the list can be invoked, and the resource is then automatically processed. The 'dataverse-lrs' software has been developed in the Social Sciences & Humanities Open Cloud (SSHOC) project and is described in the deliverable "3.8 Implementation report and available SSHOC Switchboard and VCR services". The software has also been presented during the CLARIN Centre Meeting 2021 (https://www.clarin.eu/event/2021/centre-meeting-2021).
Project(s): SSHOC via OpenAIRE

See at: github.com Open Access | CNR IRIS | ISTI Repository | CNR IRIS Restricted

2021 Other Open Access

Enhancing the computational representation of narrative and its extraction from text
Metilli D
Narratives are a fundamental part of human life. Every human being encounters countless stories during their life, and these stories contribute to form a common understanding of reality. This is reflected in the current digital landscape, and especially on the Web, where narratives are published and shared every day. However, the current digital representation of narratives is limited by the fact that each narrative is generally expressed as natural language text or other media, in an unstructured way that is neither standardized nor machine-readable. These limitations hinder the manageability of narratives by automated systems. One way to solve this problem would be to create an ontology of narrative, i.e., a formal model of what a narrative is, then develop semi-automated methods to extract narratives from natural language text, and use the extracted data to populate the ontology. However, the feasibility of this approach remains an open question. This thesis attempts to investigate this research question, starting from the state of the art in the fields of Computational Narratology, Semantic Web, and Natural Language Processing. Based on this analysis, we have identified a set of requirements, and we have developed a methodology for our research work. Then, we have developed an informal conceptualization of narrative, and we have expressed it in a formal way using First-Order Logic. The result of this work is the Narrative Ontology (NOnt), a formal model of narrative that also includes a representation of its textual structure and textual semantics. To ensure interoperability, the ontology is based on the CIDOC CRM and FRBRoo standards, and it has been expressed using the OWL and SWRL languages of the Semantic Web. Based on the ontology, we have developed NarraNext, a semi-automatic tool that is able to extract the main elements of narrative from natural language text.
The tool allows the user to create a complete narrative based on a text, using the extracted knowledge to populate the ontology. NarraNext is based on recent advancements in the Natural Language Processing field, including deep neural networks, and is integrated with the Wikidata knowledge base. The validation of our work is being carried out in three different scenarios: (i) a case study on biographies of historical figures found in Wikipedia; (ii) the Mingei project, which applies NOnt to the representation and preservation of Heritage Crafts; (iii) the Hypermedia Dante Network project, where NOnt has been integrated with a citation ontology to represent the content of Dante's Comedy. All three applications have served to validate the representational adequacy of NOnt and the satisfaction of the requirements we defined. The case study on biographies has also evaluated the effectiveness of the NarraNext tool.
Project(s): Mingei via OpenAIRE

See at: etd.adm.unipi.it Open Access | CNR IRIS | ISTI Repository | CNR IRIS Restricted

2021 Journal article Open Access

Mapping the knowledge of Dante commentaries in the digital context: a web ontology approach
Meghini C, Tavoni M, Zaccarello M
With digital repositories and databases available since the 1990s, Dante scholarship has always been at the forefront of the digital humanities and the digitization of medieval texts and manuscripts. However, the amount of information available about such aspects is imposing, and its location subject to the extreme dispersion of traditional scholarly publications: commentaries first but also academic journals, miscellanies, and so forth. Rather than being based on traditional word searches, a true advancement of knowledge needs to overcome the rigidity of text-based queries (and in-line markup embedded in text). Such paramount evolution is now made possible by the Semantic Web, an extension of the current web by description standards that help machines to understand and connect the information already available on the web. To achieve this, the latter is mapped using formal description and classification patterns, called ontologies. Ontologies are a key factor in managing meaningful search/data extraction, publishing relevant results on the web, searching existing web resources, and offering answers to more sophisticated queries. Due to its vastness and complexity, Dante scholarship calls for an ontology-based mapping, and specific tools have been designed to express the most difficult and articulate aspects of Dante's literary production, such as its use of biblical, classical, and medieval sources. This paper introduces the aims and scope of a new digital library of Dante commentaries, built according to the aforementioned standards and aiming to refine and extend the ontologies developed for Dante's minor works to the more complex world of the Commedia.
Source: ROMANIC REVIEW, vol. 112 (issue 1), pp. 138-157

See at: CNR IRIS Open Access | ISTI Repository | read.dukeupress.edu | CNR IRIS Restricted | CNR IRIS

2021 Conference article Open Access

A multi-resolution training for expression recognition in the wild
Massoli F V, Cafarelli D, Amato G, Falchi F
Facial expressions play a fundamental role in human communication, and their study, which represents a multidisciplinary subject, embraces a great variety of research fields, e.g., from psychology to computer science, among others. Concerning Deep Learning, the recognition of facial expressions is a task named Facial Expression Recognition (FER). With such an objective, the goal of a learning model is to classify human emotions starting from a facial image of a given subject. Typically, face images are acquired by cameras that have, by nature, different characteristics, such as the output resolution. Moreover, other circumstances might involve cameras placed far from the observed scene, thus obtaining faces with very low resolutions. Therefore, since the FER task might involve analyzing face images that can be acquired with heterogeneous sources, it is plausible to expect that resolution plays a vital role. In such a context, we propose a multi-resolution training approach to solve the FER task. We ground our intuition on the observation that, often, face images are acquired at different resolutions. Thus, directly considering such property while training a model can help achieve higher performance on recognizing facial expressions. To this end, we use a ResNet-like architecture, equipped with Squeeze-and-Excitation blocks, trained on the Affect-in-the-Wild 2 dataset. Since no test set is available, we conduct tests and model selection using only the validation set, on which we achieve more than 90% accuracy in classifying the seven expressions that the dataset comprises.
Source: CEUR WORKSHOP PROCEEDINGS, pp. 427-433. Pizzo Calabro, 5-9/9/2021
Project(s): AI4EU via OpenAIRE

See at: ceur-ws.org Open Access | CNR IRIS | ISTI Repository | CNR IRIS Restricted

2022 Journal article Open Access

Comparing the performance of Hebbian against backpropagation learning using convolutional neural networks
Lagani G, Falchi F, Gennaro C, Amato G
In this paper, we investigate Hebbian learning strategies applied to Convolutional Neural Network (CNN) training. We consider two unsupervised learning approaches, Hebbian Winner-Takes-All (HWTA), and Hebbian Principal Component Analysis (HPCA). The Hebbian learning rules are used to train the layers of a CNN in order to extract features that are then used for classification, without requiring backpropagation (backprop). Experimental comparisons are made with state-of-the-art unsupervised (but backprop-based) Variational Auto-Encoder (VAE) training. For completeness, we consider two supervised Hebbian learning variants (Supervised Hebbian Classifiers--SHC, and Contrastive Hebbian Learning--CHL), for training the final classification layer, which are compared to Stochastic Gradient Descent training. We also investigate hybrid learning methodologies, where some network layers are trained following the Hebbian approach, and others are trained by backprop. We tested our approaches on MNIST, CIFAR10, and CIFAR100 datasets. Our results suggest that Hebbian learning is generally suitable for training early feature extraction layers, or to retrain higher network layers in fewer training epochs than backprop. Moreover, our experiments show that Hebbian learning outperforms VAE training, with HPCA performing generally better than HWTA.
Source: NEURAL COMPUTING & APPLICATIONS
Project(s): AI4EU via OpenAIRE, AI4Media via OpenAIRE

2022 Conference article Open Access

Counting or localizing? Evaluating cell counting and detection in microscopy images
Ciampi L, Carrara F, Amato G, Gennaro C
Image-based automatic cell counting is an essential yet challenging task, crucial for the diagnosis of many diseases. Current solutions rely on Convolutional Neural Networks and provide astonishing results. However, their performance is often measured only by counting errors, which can mask mistaken estimations: a low counting error can be obtained with a high but equal number of false positives and false negatives. Consequently, it is hard to determine which solution truly performs best. In this work, we investigate three general counting approaches that have been successfully adopted in the literature for counting several different categories of objects. Through an experimental evaluation over three public collections of microscopy images containing marked cells, we assess not only their counting performance compared to several state-of-the-art methods but also their ability to correctly localize the counted cells. We show that commonly adopted counting metrics do not always agree with the localization performance of the tested models, and thus we suggest integrating the proposed evaluation protocol when developing novel cell counting solutions.
Project(s): AI4Media via OpenAIRE

See at: CNR IRIS Open Access | ISTI Repository | www.scitepress.org | CNR IRIS Restricted | CNR IRIS
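
The point made in the abstract above, that equal numbers of false positives and false negatives cancel out in a counting error, can be checked with a small worked example. The function names, the toy numbers, and the use of precision, recall, and F1 as localization measures are assumptions for illustration; the abstract does not name the metrics used in the paper.

```python
def counting_error(true_count, tp, fp):
    """Absolute difference between predicted and true number of cells."""
    predicted_count = tp + fp
    return abs(predicted_count - true_count)

def precision_recall_f1(tp, fp, fn):
    """Detection-style metrics that account for where the cells are."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical image with 100 cells: 60 found, 40 missed, 40 spurious detections.
tp, fp, fn = 60, 40, 40
error = counting_error(true_count=tp + fn, tp=tp, fp=fp)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(error, precision, recall, f1)
```

Here the counting error is zero even though 40% of the detections are wrong, so a counting metric alone would rate this hypothetical model as perfect.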

2022 Journal article Open Access

Interactive video retrieval evaluation at a distance: comparing sixteen interactive video search systems in a remote setting at the 10th Video Browser Showdown
Heller S, Gsteiger V, Bailer W, Gurrin C, Jonsson Bt, Lokoc J, Leibetseder A, Mejzlik F, Peska L, Rossetto L, Schall K, Schoeffmann K, Schuldt H, Spiess F, Tran Ld, Vadicamo L, Vesely P, Vrochidis S, Wu J
The Video Browser Showdown addresses difficult video search challenges through an annual interactive evaluation campaign attracting research teams focusing on interactive video retrieval. The campaign aims to provide insights into the performance of participating interactive video retrieval systems, tested by selected search tasks on large video collections. For the first time in its ten-year history, the Video Browser Showdown 2021 was organized in a fully remote setting and hosted a record number of sixteen scoring systems. In this paper, we describe the competition setting, tasks and results and give an overview of state-of-the-art methods used by the competing systems. By looking at query result logs provided by ten systems, we analyze differences in retrieval model performances and browsing times before a correct submission. Through advances in data gathering methodology and tools, we provide a comprehensive analysis of ad-hoc video search tasks, discuss results, task design and methodological challenges. We highlight that almost all top performing systems utilize some sort of joint embedding for text-image retrieval and enable specification of temporal context in queries for known-item search. Whereas a combination of these techniques drives the currently top performing systems, we identify several future challenges for interactive video search engines and the Video Browser Showdown competition itself.
Source: INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, vol. 11 (issue 1)
Project(s): AI4Media via OpenAIRE