2024
Conference article · Open Access

Increasing biases can be more efficient than increasing weights

Carlo Metta, Marco Fantozzi, Andrea Papini, Gianluca Amato, Matteo Bergamaschi, Silvia Giulia Galfrè, Alessandro Marchetti, Michelangelo Vegliò, Maurizio Parton, Francesco Morandin

Machine Learning (cs.LG) · Neural and Evolutionary Computing (cs.NE) · FOS: Computer and information sciences · Artificial Neural Network · Machine learning architectures · Computer Vision · Deep Learning · I.2.6

We introduce a novel computational unit for neural networks that features multiple biases, challenging the traditional perceptron structure. This unit preserves uncorrupted information as it is passed from one unit to the next, applying the activation function later in the process with a specialized bias for each unit. Through both empirical and theoretical analyses, we show that focusing on increasing biases rather than weights can significantly improve a neural network model's performance. This approach offers an alternative perspective on optimizing information flow within neural networks. Commented source code at https://github.com/CuriosAI/dac-dev.
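The abstract describes the unit only at a high level; the authors' actual implementation lives in the linked repository. Purely as an illustration of the idea, delaying the activation so that the next layer receives the raw pre-activation signal, and giving each incoming connection its own bias, the following minimal PyTorch sketch may be useful. The class name MultiBiasLinear and all implementation choices (dense bias matrix, ReLU default, initialization) are assumptions for illustration, not the paper's method.

import torch
import torch.nn as nn

class MultiBiasLinear(nn.Module):
    """Sketch of a layer with one bias per connection: output unit j
    computes y_j = sum_i w_ji * act(x_i + b_ji), where x holds the
    previous layer's pre-activation values. Illustrative only; see the
    repository above for the authors' implementation."""

    def __init__(self, in_features: int, out_features: int, activation=torch.relu):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        # One bias per (output, input) connection instead of one per unit.
        self.bias = nn.Parameter(torch.zeros(out_features, in_features))
        self.activation = activation
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features), the uncorrupted pre-activation signal.
        # Shift each copy of x by a connection-specific bias, apply the
        # activation, then take the weighted sum over the inputs.
        shifted = x.unsqueeze(-2) + self.bias            # (batch, out, in)
        return (self.weight * self.activation(shifted)).sum(dim=-1)

layer = MultiBiasLinear(128, 64)
y = layer(torch.randn(32, 128))   # y.shape == torch.Size([32, 64])

Compared with a standard nn.Linear followed by an activation, this layer trades one bias per unit for one bias per connection; the abstract's claim is that spending extra parameters on biases in this way can be more efficient than spending them on weights.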



BibTeX entry
@inproceedings{oai:it.cnr:prodotti:492323,
	title = {Increasing biases can be more efficient than increasing weights},
	author = {Carlo Metta and Marco Fantozzi and Andrea Papini and Gianluca Amato and Matteo Bergamaschi and Silvia Giulia Galfrè and Alessandro Marchetti and Michelangelo Vegliò and Maurizio Parton and Francesco Morandin},
	booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
	doi = {10.1109/wacv57701.2024.00279},
	eprint = {2301.00924},
	archiveprefix = {arXiv},
	year = {2024}
}

Funding: SoBigData++ (European Integrated Infrastructure for Social Mining and Big Data Analytics)

