2019
Conference article  Open Access

Adversarial Examples Detection in Features Distance Spaces

Carrara F., Becarelli R., Caldelli R., Falchi F., Amato G.

deep learning, adversarial machine learning

Maliciously manipulated inputs for attacking machine learning methods, in particular deep neural networks, are emerging as a relevant issue for the security of recent artificial intelligence technologies, especially in computer vision. In this paper, we focus on attacks targeting image classifiers implemented with deep neural networks, and we propose a method for detecting adversarial images that focuses on the trajectory of internal representations (i.e. the activations of hidden-layer neurons, also known as deep features) from the very first layer up to the last. We argue that the representations of adversarial inputs follow a different evolution from those of genuine inputs, and we define a distance-based embedding of features to efficiently encode this information. We train an LSTM network that analyzes the sequence of deep features embedded in a distance space to detect adversarial examples. The results of our preliminary experiments are encouraging: our detection scheme is able to detect adversarial inputs that target the ResNet-50 classifier pre-trained on the ILSVRC'12 dataset and that are generated by a variety of crafting algorithms.
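
The distance-based embedding mentioned in the abstract can be illustrated with a minimal sketch. It assumes each layer has a small set of reference points (called pivots here, a hypothetical name) and that Euclidean distance is used; the abstract specifies neither, and the layer sizes and random values below are placeholders rather than real ResNet-50 features. The resulting fixed-width sequence, one step per layer, is the kind of input a sequence model such as an LSTM could consume; the LSTM itself is not shown.

```python
import math
import random


def distance_embedding(layer_features, layer_pivots):
    """Map each layer's feature vector to its distances from that layer's pivots.

    layer_features: one feature vector per layer (dimensions may differ per layer).
    layer_pivots:   one list of pivot vectors per layer, matching that layer's dimension.
    Returns a sequence with one step per layer and one distance per pivot.
    """
    sequence = []
    for features, pivots in zip(layer_features, layer_pivots):
        # Euclidean distance is an assumption made for this sketch.
        sequence.append([math.dist(features, pivot) for pivot in pivots])
    return sequence


# Placeholder data: three layers of different widths, three pivots per layer.
random.seed(0)
layer_dims = [8, 16, 4]   # hypothetical layer sizes
num_pivots = 3            # hypothetical number of reference points

layer_pivots = [
    [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_pivots)]
    for dim in layer_dims
]
layer_features = [[random.gauss(0, 1) for _ in range(dim)] for dim in layer_dims]

embedded = distance_embedding(layer_features, layer_pivots)
for step in embedded:
    print([round(d, 3) for d in step])
```

Whatever the width of each layer, every step of the sequence has the same length (the number of pivots), which is what makes the trajectory across layers usable as a uniform sequence.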

Source: ECCV: European Conference on Computer Vision, pp. 313–327, Munich, Germany, 8-14 September 2018


BibTeX entry
@inproceedings{oai:it.cnr:prodotti:402662,
	title = {Adversarial Examples Detection in Features Distance Spaces},
	author = {Carrara F. and Becarelli R. and Caldelli R. and Falchi F. and Amato G.},
	doi = {10.1007/978-3-030-11012-3_26},
	booktitle = {ECCV: European Conference on Computer Vision, pp. 313–327, Munich, Germany, 8-14 September 2018},
	year = {2019}
}