2019
Conference article  Open Access

Learning relationship-aware visual features

Messina N., Amato G., Carrara F., Falchi F., Gennaro C.

deep learning, computer vision, content-based image retrieval, relational learning, multimedia information retrieval

Relational reasoning in Computer Vision has recently shown impressive results on visual question answering tasks. On the challenging CLEVR dataset, the recently proposed Relation Network (RN), a simple plug-and-play module and one of the state-of-the-art approaches, has obtained 95.5% accuracy in answering relational questions. In this paper, we define a sub-field of Content-Based Image Retrieval (CBIR) called Relational-CBIR (R-CBIR), in which we are interested in retrieving images with given relationships among objects. To this aim, we employ the RN architecture to extract relation-aware features from CLEVR images. To prove the effectiveness of these features, we extended both the CLEVR and Sort-of-CLEVR datasets, generating a ground truth for R-CBIR by exploiting the relational data embedded in scene graphs. Furthermore, we propose a modification of the RN module, a two-stage Relation Network (2S-RN), that enables us to extract relation-aware features by using a preprocessing stage that focuses on the image content, leaving the question aside. Experiments show that our RN features, especially the 2S-RN ones, outperform the state-of-the-art RMAC features on this new challenging task.

Source: ECCV 2018 - European Conference on Computer Vision, pp. 486–501, Munich, Germany, 8-14 September 2018

Publisher: Springer, Cham, Heidelberg, New York, Dordrecht, London, CHE


BibTeX entry
@inproceedings{oai:it.cnr:prodotti:402682,
	title = {Learning relationship-aware visual features},
	author = {Messina N. and Amato G. and Carrara F. and Falchi F. and Gennaro C.},
	publisher = {Springer, Cham, Heidelberg, New York, Dordrecht, London, CHE},
	doi = {10.1007/978-3-030-11018-5_40},
	booktitle = {ECCV 2018 - European Conference on Computer Vision, pp. 486–501, Munich, Germany, 8-14 September 2018},
	year = {2019}
}