
Acceleration methods for fixed-point iterations

Published online by Cambridge University Press:  01 July 2025

Yousef Saad*
Affiliation:
Department of Computer Science and Engineering, University of Minnesota, Twin Cities, MN 55455, USA. E-mail: saad@umn.edu

Abstract


A pervasive approach in scientific computing is to express the solution to a given problem as the limit of a sequence of vectors or other mathematical objects. In many situations these sequences are generated by slowly converging iterative procedures, and this led practitioners to seek faster alternatives to reach the limit. ‘Acceleration techniques’ comprise a broad array of methods specifically designed with this goal in mind. They started as a means of improving the convergence of general scalar sequences by various forms of ‘extrapolation to the limit’, i.e. by extrapolating the most recent iterates to the limit via linear combinations. Extrapolation methods of this type, the best-known of which is Aitken’s delta-squared process, require only the sequence of vectors as input.
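
To make the idea concrete, here is a minimal Python sketch (an illustration added for this overview, not code from the article) of Aitken's delta-squared formula, x'_k = x_k - (x_{k+1} - x_k)^2 / (x_{k+2} - 2 x_{k+1} + x_k), applied to a linearly converging scalar fixed-point iteration:

```python
import numpy as np

def aitken_delta_squared(x):
    """Aitken's delta-squared extrapolation of a scalar sequence.

    x : iterates x_0, ..., x_{n-1} (n >= 3); returns the accelerated
    sequence of length n - 2. No safeguard against a vanishing second
    difference is included in this sketch.
    """
    x = np.asarray(x, dtype=float)
    dx = x[1:] - x[:-1]        # forward differences  (Delta x_k)
    d2x = dx[1:] - dx[:-1]     # second differences   (Delta^2 x_k)
    return x[:-2] - dx[:-1] ** 2 / d2x

# Example: x_{k+1} = cos(x_k) converges linearly to ~0.7390851.
x = [0.5]
for _ in range(10):
    x.append(np.cos(x[-1]))
print(x[-1], aitken_delta_squared(x)[-1])  # the extrapolated value is far closer
```

Note that only the sequence itself is used: the map generating the iterates never appears, which is the defining feature of pure extrapolation methods.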

However, limiting methods to use only the iterates is too restrictive. Accelerating sequences generated by fixed-point iterations by utilizing both the iterates and the fixed-point mapping itself has proved highly successful across various areas of physics. A notable example of these fixed-point accelerators (FP-accelerators) is a method developed by Donald Anderson in 1965 and now widely known as Anderson acceleration (AA). Furthermore, quasi-Newton and inexact Newton methods can also be placed in this category since they can be invoked to find limits of fixed-point iteration sequences by employing exactly the same ingredients as those of the FP-accelerators.
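
As a rough illustration of these shared ingredients, the following Python sketch implements a basic version of AA in the form often attributed to Walker and Ni (2011); the window size m, the unregularized least-squares solve and the function names are choices made here, not prescriptions from the article:

```python
import numpy as np

def anderson(g, x0, m=5, tol=1e-10, maxiter=100):
    """Basic Anderson acceleration for the fixed-point problem x = g(x).

    Keeps the last m differences of iterates (dX) and residuals (dF) and
    combines them through a small least-squares problem at every step.
    """
    x = np.asarray(x0, dtype=float)
    dX, dF = [], []                 # histories of iterate/residual differences
    gx = g(x)
    f = gx - x                      # residual of the fixed-point map
    for _ in range(maxiter):
        if np.linalg.norm(f) < tol:
            break
        x_new = gx.copy()           # plain fixed-point step when no history yet
        if dX:
            X = np.column_stack(dX[-m:])
            F = np.column_stack(dF[-m:])
            gamma, *_ = np.linalg.lstsq(F, f, rcond=None)
            x_new -= (X + F) @ gamma   # x_{k+1} = g(x_k) - (X_k + F_k) gamma_k
        gx_new = g(x_new)
        f_new = gx_new - x_new
        dX.append(x_new - x)
        dF.append(f_new - f)
        x, gx, f = x_new, gx_new, f_new
    return x

# Example: solve x = cos(x) componentwise in R^3.
print(anderson(np.cos, np.zeros(3)))
```

Both the iterates and evaluations of the fixed-point map g enter the update, which is precisely what distinguishes FP-accelerators from extrapolation methods that see only the sequence.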

This paper presents an overview of these methods – with an emphasis on those, such as AA, that are geared toward accelerating fixed-point iterations. We will navigate through existing variants of accelerators, their implementations and their applications, to unravel the close connections between them. These connections were often not recognized by the originators of certain methods, who sometimes stumbled on slight variations of already established ideas. Furthermore, even though new accelerators were invented in different corners of science, the underlying principles behind them are strikingly similar or identical.

The plan of this article will approximately follow the historical trajectory of extrapolation and acceleration methods, beginning with a brief description of extrapolation ideas, followed by the special case of linear systems, the application to self-consistent field (SCF) iterations, and a detailed view of Anderson acceleration. The last part of the paper is concerned with more recent developments, including theoretical aspects, and a few thoughts on accelerating machine learning algorithms.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

References

Aitken, A. (1926), On Bernoulli's numerical solution of algebraic equations, Proc. Roy. Soc. Edinburgh 46, 289–305.
Anderson, D. G. (1965), Iterative procedures for non-linear integral equations, J. Assoc. Comput. Mach. 12, 547–560.
Axelsson, O. (1980), Conjugate gradient type methods for unsymmetric and inconsistent systems of linear equations, Linear Algebra Appl. 29, 1–16.
Blair, A., Metropolis, N., von Neumann, J., Taub, A. H. and Tsingou, M. (1959), A study of a numerical solution to a two-dimensional hydrodynamical problem, Math. Comp. 13, 145–184.
Bottou, L. and Le Cun, Y. (2005), On-line learning for very large data sets, Appl. Stoch. Models Bus. Ind. 21, 137–151.
Bottou, L., Curtis, F. and Nocedal, J. (2018), Optimization methods for large-scale machine learning, SIAM Rev. 60, 223–311.
Brezinski, C. (1975), Généralisation de la transformation de Shanks, de la table de Padé et de l'ε-algorithme, Calcolo 12, 317–360.
Brezinski, C. (1977), Accélération de la Convergence en Analyse Numérique, Vol. 584 of Lecture Notes in Mathematics, Springer.
Brezinski, C. (1980), Padé-Type Approximation and General Orthogonal Polynomials, Birkhäuser.
Brezinski, C. (2000), Convergence acceleration during the 20th century, J. Comput. Appl. Math. 122, 1–21.
Brezinski, C. and Redivo-Zaglia, M. (1991), Extrapolation Methods: Theory and Practice, North-Holland.
Brezinski, C. and Redivo-Zaglia, M. (2019), The genesis and early developments of Aitken's process, Shanks' transformation, the ε-algorithm, and related fixed point methods, Numer. Algorithms 80, 11–133.
Brezinski, C. and Redivo-Zaglia, M. (2020), Extrapolation and Rational Approximation: The Works of the Main Contributors, Springer.
Brezinski, C., Cipolla, S., Redivo-Zaglia, M. and Saad, Y. (2022), Shanks and Anderson-type acceleration techniques for systems of nonlinear equations, IMA J. Numer. Anal. 42, 3058–3093.
Brezinski, C., Redivo-Zaglia, M. and Salam, A. (2023), On the kernel of vector ε-algorithm and related topics, Numer. Algorithms 92, 207–221.
Brown, P. N. and Saad, Y. (1990), Hybrid Krylov methods for nonlinear systems of equations, SIAM J. Sci. Statist. Comput. 11, 450–481.
Brown, P. N. and Saad, Y. (1994), Convergence theory of nonlinear Newton–Krylov algorithms, SIAM J. Optim. 4, 297–330.
Bubeck, S. (2015), Convex optimization: Algorithms and complexity, Found. Trends Mach. Learn. 8, 231–357.
Cabay, S. and Jackson, L. W. (1976), A polynomial extrapolation method for finding limits and antilimits of vector sequences, SIAM J. Numer. Anal. 13, 734–752.
Cauchy, A. L. (1847), Méthode générale pour la résolution des systèmes d'équations simultanées, Comp. Rend. Acad. Sci. Paris 25, 536–538.
Chupin, M., Dupuy, M.-S., Legendre, G. and Séré, E. (2021), Convergence analysis of adaptive DIIS algorithms with application to electronic ground state calculations, ESAIM Math. Model. Numer. Anal. 55, 2785–2825.
Degroote, J., Bathe, K.-J. and Vierendeels, J. (2009), Performance of a new partitioned procedure versus a monolithic procedure in fluid–structure interaction, Computers & Structures 87, 793–801.
Dembo, R. S., Eisenstat, S. C. and Steihaug, T. (1982), Inexact Newton methods, SIAM J. Numer. Anal. 19, 400–408.
Dirac, P. A. M. (1929), Quantum mechanics of many-electron systems, Proc. Roy. Soc. Lond. A 123, 714–733.
Eddy, R. P. (1979), Extrapolating to the limit of a vector sequence, in Information Linkage Between Applied Mathematics and Industry (Wang, P. C. C., ed.), Academic Press, pp. 387–396.
Eisenstat, S. C. and Walker, H. F. (1994), Globally convergent inexact Newton methods, SIAM J. Optim. 4, 393–422.
Eisenstat, S. C., Elman, H. C. and Schultz, M. H. (1983), Variational iterative methods for nonsymmetric systems of linear equations, SIAM J. Numer. Anal. 20, 345–357.
Evans, C., Pollock, S., Rebholz, L. G. and Xiao, M. (2020), A proof that Anderson acceleration improves the convergence rate in linearly converging fixed-point methods (but not in those converging quadratically), SIAM J. Numer. Anal. 58, 788–810.
Eyert, V. (1996), A comparative study on methods for convergence acceleration of iterative vector sequences, J. Comput. Phys. 124, 271–285.
Fang, H. and Saad, Y. (2009), Two classes of multisecant methods for nonlinear acceleration, Numer. Linear Algebra Appl. 16, 197–221.
Forsythe, G. E. (1951), Gauss to Gerling on relaxation, Mathematical Tables and Other Aids to Computation 5, 255–258.
Forsythe, G. E. (1953), Solving linear algebraic equations can be interesting, Bull. Amer. Math. Soc. 59, 299–329.
Gamow, G. (1966), Thirty Years That Shook Physics: The Story of Quantum Theory, Dover.
Germain-Bonne, B. (1978), Estimation de la limite de suites et formalisation de procédés d'accélération de convergence. PhD thesis, Université des Sciences et Techniques de Lille.
Golub, G. H. and Van Loan, C. F. (2013), Matrix Computations, fourth edition, Johns Hopkins University Press.
Golub, G. H. and Varga, R. S. (1961), Chebyshev semi-iterative methods, successive overrelaxation iterative methods, and second order Richardson iterative methods, Numer. Math. 3, 157–168.
Gower, R. M., Loizou, N., Qian, X., Sailanbayev, A., Shulgin, E. and Richtárik, P. (2019), SGD: General analysis and improved rates, in International Conference on Machine Learning, PMLR, pp. 5200–5209.
Haelterman, R., Degroote, J., Heule, D. V. and Vierendeels, J. (2010), On the similarities between the quasi-Newton inverse least squares method and GMRES, SIAM J. Numer. Anal. 47, 4660–4679.
Hardt, M., Recht, B. and Singer, Y. (2016), Train faster, generalize better: Stability of stochastic gradient descent, in Proceedings of the 33rd International Conference on Machine Learning (Balcan, M. F. and Weinberger, K. Q., eds), Vol. 48 of Proceedings of Machine Learning Research, PMLR, pp. 1225–1234.
He, H., Tang, Z., Zhao, S., Saad, Y. and Xi, Y. (2024), nlTGCR: A class of nonlinear acceleration procedures based on conjugate residuals, SIAM J. Matrix Anal. Appl. 45, 712–743.
Hestenes, M. R. and Stiefel, E. L. (1952), Methods of conjugate gradients for solving linear systems, J. Res. Nat. Bur. Standards 49, 409–436.
Hestenes, M. R. and Todd, J. (1991), Mathematicians learning to use computers: NBS-INA, the Institute for Numerical Analysis, UCLA 1947–1954. NIST Special Publication 730.
Higham, N. J. and Strabić, N. (2016), Anderson acceleration of the alternating projections method for computing the nearest correlation matrix, Numer. Algorithms 72, 1021–1042.
Jacobsen, J. and Schmitt, K. (2002), The Liouville–Bratu–Gelfand problem for radial operators, J. Differential Equations 184, 283–298.
Jbilou, K. (1988), Méthodes d'extrapolation et de projection: Applications aux suites de vecteurs. PhD thesis, Université des Sciences et Techniques de Lille.
Jbilou, K. and Sadok, H. (1991), Some results about vector extrapolation methods and related fixed point iterations, J. Comput. Appl. Math. 36, 385–398.
Jbilou, K. and Sadok, H. (2000), Vector extrapolation methods: Applications and numerical comparison, J. Comput. Appl. Math. 122, 149–165.
Jea, K. C. and Young, D. M. (1980), Generalized conjugate gradient acceleration of nonsymmetrizable iterative methods, Linear Algebra Appl. 34, 159–194.
Kaniel, S. and Stein, J. (1974), Least-square acceleration of iterative methods for linear equations, J. Optim. Theory Appl. 14, 431–437.
Kelley, C. T. (1995), Iterative Methods for Linear and Nonlinear Equations, Vol. 16 of Frontiers in Applied Mathematics, SIAM.
Kerkhoven, T. and Saad, Y. (1992), Acceleration techniques for decoupling algorithms in semiconductor simulation, Numer. Math. 60, 525–548.
Kingma, D. P. and Ba, J. (2015), Adam: A method for stochastic optimization, in 3rd International Conference on Learning Representations (ICLR 2015), Conference Track Proceedings.
Kittel, C. (1986), Introduction to Solid State Physics, Wiley.
Kohn, W. and Sham, L. J. (1965), Self-consistent equations including exchange and correlation effects, Phys. Rev. 140, A1133–A1138.
Lanczos, C. (1950), An iteration method for the solution of the eigenvalue problem of linear differential and integral operators, J. Res. Nat. Bur. Standards 45, 255–282.
Lanczos, C. (1952), Solution of systems of linear equations by minimized iterations, J. Res. Nat. Bur. Standards 49, 33–53.
Li, H., Xu, Z., Taylor, G., Studer, C. and Goldstein, T. (2018), Visualizing the loss landscape of neural nets, in Advances in Neural Information Processing Systems 31 (Bengio, S. et al., eds), Curran Associates, pp. 6391–6401.
Mešina, M. (1977), Convergence acceleration for the iterative solution of the equations X = AX + f, Comput. Methods Appl. Mech. Engrg 10, 165–173.
Meurant, G. and Tebbens, J. D. (2020), Krylov Methods for Nonsymmetric Linear Systems: From Theory to Computations, Vol. 57 of Springer Series in Computational Mathematics, Springer.
Meyer, L., Barrett, C. and Haasen, P. (1964), New crystalline phase in solid argon and its solid solutions, J. Chem. Phys. 40, 2744–2745.
Mohsen, A. (2014), A simple solution of the Bratu problem, Comput. Math. Appl. 67, 26–33.
Murphy, K. P. (2022), Probabilistic Machine Learning: An Introduction, MIT Press.
Nesterov, Y. (2014), Introductory Lectures on Convex Optimization: A Basic Course, first edition, Springer.
Neyshabur, B., Tomioka, R. and Srebro, N. (2015), In search of the real inductive bias: On the role of implicit regularization in deep learning, in 3rd International Conference on Learning Representations (ICLR), Workshop Track Proceedings (Bengio, Y. and LeCun, Y., eds).
Ortega, J. M. and Rheinboldt, W. C. (1970), Iterative Solution of Nonlinear Equations in Several Variables, Academic Press.
Paige, C. C. (1971), The computation of eigenvalues and eigenvectors of very large sparse matrices. PhD thesis, Institute of Computer Science, University of London.
Paige, C. C. (1980), Accuracy and effectiveness of the Lanczos algorithm for the symmetric eigenproblem, Linear Algebra Appl. 34, 235–258.
Pasini, M. L., Yin, J., Reshniak, V. and Stoyanov, M. (2021), Stable Anderson acceleration for deep learning. Available at https://dblp.org/rec/journals/corr/abs-2110-14813.bib.
Pasini, M. L., Yin, J., Reshniak, V. and Stoyanov, M. K. (2022), Anderson acceleration for distributed training of deep learning models, in SoutheastCon 2022, IEEE, pp. 289–295.
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L. et al. (2019), PyTorch: An imperative style, high-performance deep learning library, in Advances in Neural Information Processing Systems 32 (Wallach, H. et al., eds), Curran Associates.
Petrova, S. S. and Solov'ev, A. D. (1997), The origin of the method of steepest descent, Hist. Math. 24, 361–375.
Pollock, S. and Rebholz, L. G. (2021), Anderson acceleration for contractive and noncontractive operators, IMA J. Numer. Anal. 41, 2841–2872.
Pugachëv, B. (1977), Acceleration of the convergence of iterative processes and a method of solving systems of non-linear equations, USSR Comput. Math. Math. Phys. 17, 199–207.
Pulay, P. (1980), Convergence acceleration of iterative sequences: The case of SCF iteration, Chem. Phys. Lett. 73, 393–398.
Pulay, P. (1982), Improved SCF convergence acceleration, J. Comput. Chem. 3, 556–560.
Ramière, I. and Helfer, T. (2015), Iterative residual-based vector methods to accelerate fixed point iterations, Comput. Math. Appl. 70, 2210–2226.
Reid, J. K. (1971), On the method of conjugate gradients for the solution of large sparse systems of linear equations, in Large Sparse Sets of Linear Equations (Reid, J. K., ed.), Academic Press, pp. 231–254.
Richardson, L. F. (1910), The approximate arithmetical solution by finite differences of physical problems involving differential equations, with an application to the stresses in a masonry dam, Philos. Trans. Roy. Soc. A 210, 307–357.
Robbins, H. and Monro, S. (1951), A stochastic approximation method, Ann. Math. Statist. 22, 400–407.
Rohwedder, T. (2010), An analysis for some methods and algorithms of quantum chemistry. PhD thesis, Technische Universität Berlin, Fakultät II – Mathematik und Naturwissenschaften.
Romberg, W. (1955), Vereinfachte numerische Integration, Norske Vid. Selsk. Forh. 28, 30–36.
Saad, Y. (2003), Iterative Methods for Sparse Linear Systems, second edition, SIAM.
Saad, Y. (2011), Numerical Methods for Large Eigenvalue Problems, Vol. 66 of Classics in Applied Mathematics, SIAM.
Saad, Y. (2022), The origin and development of Krylov subspace methods, Comput. Sci. Engrg 24, 28–39.
Saad, Y. and Schultz, M. H. (1986), GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Statist. Comput. 7, 856–869.
Saad, Y., Chelikowsky, J. and Shontz, S. (2009), Numerical methods for electronic structure calculations of materials, SIAM Rev. 52, 3–54.
Shanks, D. (1955), Non-linear transformations of divergent and slowly convergent sequences, J. Math. Phys. 34, 1–42.
Sheldon, J. W. (1955), On the numerical solution of elliptic difference equations, Mathematical Tables and Other Aids to Computation 9, 101–112.
Shi, W., Song, S., Wu, H., Hsu, Y.-C., Wu, C. and Huang, G. (2019), Regularized Anderson acceleration for off-policy deep reinforcement learning, in Advances in Neural Information Processing Systems 32 (Wallach, H. et al., eds), Curran Associates.
Shortley, G. H. (1953), Use of Tschebyscheff-polynomial operators in the solution of boundary value problems, J. Appl. Phys. 24, 392–396.
Sidi, A. (2012), Review of two vector extrapolation methods of polynomial type with applications to large-scale problems, J. Comput. Sci. 3, 92–101.
Sidi, A., Ford, W. F. and Smith, D. A. (1986), Acceleration of convergence of vector sequences, SIAM J. Numer. Anal. 23, 178–196.
Smith, D. A., Ford, W. F. and Sidi, A. (1987), Extrapolation methods for vector sequences, SIAM Rev. 29, 199–233.
Sun, K., Wang, Y., Liu, Y., Pan, B., Jui, S., Jiang, B., Kong, L. et al. (2021), Damped Anderson mixing for deep reinforcement learning: Acceleration, convergence, and stabilization, in Advances in Neural Information Processing Systems 34 (Ranzato, M. et al., eds), Curran Associates, pp. 3732–3743.
Tang, Z., Xu, T., He, H., Saad, Y. and Xi, Y. (2024), Anderson acceleration with truncated Gram–Schmidt, SIAM J. Matrix Anal. Appl. 45, 1850–1872.
Toth, A. and Kelley, C. T. (2015), Convergence analysis for Anderson acceleration, SIAM J. Numer. Anal. 53, 805–819.
Vanderbilt, D. and Louie, S. G. (1984), Total energies of diamond (111) surface reconstructions by a linear combination of atomic orbitals method, Phys. Rev. B 30, 6118–6130.
Vinsome, P. K. W. (1976), ORTHOMIN: An iterative method for solving sparse sets of simultaneous linear equations, in Proceedings of the Fourth Symposium on Reservoir Simulation, Society of Petroleum Engineers of AIME, pp. 149–159.
Walker, H. F. and Ni, P. (2011), Anderson acceleration for fixed-point iterations, SIAM J. Numer. Anal. 49, 1715–1735.
Wu, L., Zhu, Z. and E, W. (2017), Towards understanding generalization of deep learning: Perspective of loss landscapes. Available at https://dblp.org/rec/journals/corr/WuZE17.bib.
Wynn, P. (1956), On a device for computing the e_m(S_n) transformation, Mathematical Tables and Other Aids to Computation 10, 91–96.
Wynn, P. (1962), Acceleration techniques for iterated vector and matrix problems, Math. Comp. 16, 301–322.
Young, D. (1954), On Richardson's method for solving linear systems with positive definite matrices, J. Math. Phys. 32, 243–255.
Zhang, C., Bengio, S., Hardt, M., Recht, B. and Vinyals, O. (2021), Understanding deep learning (still) requires rethinking generalization, Commun. Assoc. Comput. Mach. 64, 107–115.
Zhang, J., O'Donoghue, B. and Boyd, S. (2020), Globally convergent type-I Anderson acceleration for nonsmooth fixed-point iterations, SIAM J. Optim. 30, 3170–3197.
Zhou, P., Feng, J., Ma, C., Xiong, C., Hoi, S. C. H. and E, W. (2020), Towards theoretically understanding why SGD generalizes better than Adam in deep learning, in Advances in Neural Information Processing Systems 33 (Larochelle, H. et al., eds), Curran Associates, pp. 21285–21296.