2017

Conference article, Open Access

Detecting adversarial example attacks to deep neural networks

Carrara F., Falchi F., Caldelli R., Amato G., Fumarola R., Becarelli R.

Keywords: Machine Learning Security, Adversarial Images Detection, Deep Convolutional Neural Network

Deep learning has recently become the state of the art in many computer vision applications, and in image classification in particular. However, recent work has shown that it is quite easy to create adversarial examples, i.e., images intentionally created or modified to cause the deep neural network to make a mistake. They act like optical illusions for machines, containing changes that are unnoticeable to the human eye. This represents a serious threat to machine learning methods. In this paper, we investigate the robustness of the representations learned by the fooled neural network by analyzing the activations of its hidden layers. Specifically, we tested scoring approaches used for kNN classification in order to distinguish between correctly classified authentic images and adversarial examples. The results show that hidden-layer activations can be used to detect incorrect classifications caused by adversarial attacks.
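
The idea of scoring a prediction with a kNN search over hidden-layer activations can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `knn_score` and `is_suspicious`, the Euclidean distance, the value of `k`, the 0.5 threshold, and the toy feature vectors are all assumptions made for illustration. The sketch assumes that activations have already been extracted from some hidden layer for a set of labelled training images, and it scores a prediction by the fraction of nearest neighbours that share the predicted class.

```python
import numpy as np

def knn_score(query, train_feats, train_labels, predicted_label, k=5):
    """Fraction of the k nearest training features (Euclidean distance)
    whose label agrees with the class predicted by the network.
    Hypothetical helper, one of several possible kNN scoring schemes."""
    dists = np.linalg.norm(train_feats - query, axis=1)
    nearest = np.argsort(dists)[:k]
    return float(np.mean(train_labels[nearest] == predicted_label))

def is_suspicious(score, threshold=0.5):
    """Flag a prediction whose kNN agreement falls below an assumed threshold."""
    return score < threshold

# Toy data: two well-separated clusters standing in for hidden-layer activations.
rng = np.random.default_rng(0)
train_feats = np.vstack([
    rng.normal(0.0, 0.1, (20, 4)),   # class 0
    rng.normal(5.0, 0.1, (20, 4)),   # class 1
])
train_labels = np.array([0] * 20 + [1] * 20)

query = np.zeros(4)  # activation vector lying inside the class-0 cluster

# Prediction consistent with the neighbourhood: high agreement.
agree = knn_score(query, train_feats, train_labels, predicted_label=0)
# Prediction inconsistent with the neighbourhood, as a fooled network might produce.
disagree = knn_score(query, train_feats, train_labels, predicted_label=1)

print(agree, is_suspicious(agree))
print(disagree, is_suspicious(disagree))
```

In this toy setting, a low score means the network's output class disagrees with the classes of the images whose internal representations are closest, which is the kind of signal the abstract describes using for detection.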

Source: CBMI '17 - 15th International Workshop on Content-Based Multimedia Indexing, Firenze, Italy, 19-21 June 2017

Publisher: ACM Press, New York, USA

Cite as

BibTeX entry

@inproceedings{oai:it.cnr:prodotti:384736,
	title = {Detecting adversarial example attacks to deep neural networks},
	author = {Carrara F. and Falchi F. and Caldelli R. and Amato G. and Fumarola R. and Becarelli R.},
	publisher = {ACM Press, New York, USA},
	doi = {10.1145/3095713.3095753},
	booktitle = {CBMI '17 - 15th International Workshop on Content-Based Multimedia Indexing, Firenze, Italy, 19-21 June 2017},
	year = {2017}
}