2022
Doctoral thesis  Open Access

Relational Learning in computer vision

Messina N.

Deep Learning, Relational learning, Information Retrieval, Computer vision, Natural Language Processing, Abstract reasoning

The growing interest in social networks, smart cities, and Industry 4.0 is driving the development of techniques for processing, understanding, and organizing vast amounts of data. Recent advances in Artificial Intelligence gave rise to Deep Learning, a subfield of Machine Learning that learns common patterns directly from raw data, without relying on manual feature selection. This framework has transformed many areas of computer science, such as Computer Vision and Natural Language Processing, with remarkable results. Nevertheless, many challenges remain open. Although deep neural networks achieve impressive results on many tasks, they cannot perform non-local processing by explicitly relating potentially interconnected visual or textual entities. This relational aspect is fundamental for capturing high-level semantic interconnections in multimedia data and for understanding the relationships between spatially distant objects in an image. This thesis tackles the relational understanding problem in deep neural networks through three different yet related tasks: Relational Content-based Image Retrieval (R-CBIR), Visual-Textual Retrieval, and Same-Different tasks. We use state-of-the-art deep learning methods for relational learning, such as Relation Networks and Transformer networks, to relate the different entities in an image or in a text.



BibTeX entry
@phdthesis{oai:it.cnr:prodotti:466811,
	title = {Relational Learning in computer vision},
	author = {Messina N.},
	year = {2022}
}
CNR ExploRA: Bibliographic record

ISTI Repository: Deposited version (Open Access)

Also available from: etd.adm.unipi.it (Open Access)