
Large and moderate deviations for Gaussian neural networks

Published online by Cambridge University Press:  01 October 2025

Claudio Macci*
Affiliation: Università di Roma Tor Vergata

Barbara Pacchiarotti*
Affiliation: Università di Roma Tor Vergata

Giovanni Luca Torrisi**
Affiliation: Consiglio Nazionale delle Ricerche

*Postal address: Dipartimento di Matematica, Università di Roma Tor Vergata, Via della Ricerca Scientifica, I-00133 Rome, Italy.
**Postal address: Istituto per le Applicazioni del Calcolo (IAC), Consiglio Nazionale delle Ricerche, Via dei Taurini 19, I-00185 Rome, Italy. Email: giovanniluca.torrisi@cnr.it

Abstract

We prove large and moderate deviation principles for the output of Gaussian fully connected neural networks. The main results concern deep neural networks (i.e. models with more than one hidden layer) and hold for bounded and continuous pre-activation functions. However, for deep neural networks fed by a single input, we obtain results even when the pre-activation is the ReLU function. When the network is shallow (i.e. there is exactly one hidden layer), the large and moderate deviation principles hold for quite general pre-activation functions.
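The abstract does not spell out the model. For orientation, a standard formulation of a fully connected network with Gaussian initialization, as commonly used in this literature, is sketched below; the notation (input dimension $d$, depth $L$, layer widths $n_1,\dots,n_L$, pre-activation $\sigma$, variance parameters $C_b, C_W$) is ours and not necessarily that of the paper.

\[
z_i^{(1)}(x) = b_i^{(1)} + \sum_{j=1}^{d} W_{ij}^{(1)} x_j,
\qquad
z_i^{(\ell)}(x) = b_i^{(\ell)} + \sum_{j=1}^{n_{\ell-1}} W_{ij}^{(\ell)}\,\sigma\bigl(z_j^{(\ell-1)}(x)\bigr),
\quad \ell = 2,\dots,L,
\]

where the biases $b_i^{(\ell)} \sim \mathcal{N}(0, C_b)$ and the weights $W_{ij}^{(\ell)} \sim \mathcal{N}(0, C_W/n_{\ell-1})$ (with $n_0 = d$) are independent Gaussian random variables, and $\sigma$ is the pre-activation function. The network output is $z^{(L)}(x)$, and the asymptotic regime in this setting is the one in which the hidden-layer widths $n_1,\dots,n_{L-1}$ grow large; a shallow network corresponds to $L = 2$.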

Information

Type
Original Article
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Applied Probability Trust

