Journal article  Open Access

DeepFlash: turning a flash selfie into a studio portrait

Capece N., Banterle F., Cignoni P., Ganovelli F., Scopigno R., Erra U.

Computer Science - Machine Learning; Image Enhancement; Computer Vision and Pattern Recognition; Machine Learning; Deep Learning; Signal Processing; Electrical and Electronic Engineering; Computer Vision and Pattern Recognition (cs.CV); Computational Photography; FOS: Computer and information sciences; Software; Machine Learning (cs.LG); Computer Science - Computer Vision and Pattern Recognition

We present a method for turning a flash selfie taken with a smartphone into a photograph that looks as if it had been taken in a studio setting with uniform lighting. Our method uses a convolutional neural network trained on pairs of photographs acquired in an ad hoc acquisition campaign. Each pair consists of one photograph of a subject's face taken with the camera flash enabled and another of the same subject in the same pose, illuminated with a photographic studio-lighting setup. We show how our method can correct defects introduced by a close-up camera flash, such as specular highlights, shadows, skin shine, and a flattened appearance.

Source: Signal processing. Image communication 77 (2019): 28–39. doi:10.1016/j.image.2019.05.013

Publisher: Elsevier, Oxford; Netherlands



BibTeX entry
	title = {DeepFlash: turning a flash selfie into a studio portrait},
	author = {Capece N. and Banterle F. and Cignoni P. and Ganovelli F. and Scopigno R. and Erra U.},
	publisher = {Elsevier, Oxford; Netherlands},
	doi = {10.1016/j.image.2019.05.013 and 10.48550/arxiv.1901.04252},
	journal = {Signal processing. Image communication},
	volume = {77},
	pages = {28--39},
	year = {2019}