On the convergence of stochastic gradient descent in low-precision number formats

Matteo Cacciola, Antonio Frangioni, Masoud Asgharian, Alireza Ghaffari and Vahid Partovi Nia

Paper (2023)

Open Access document in PolyPublie and at official publisher
Open Access to the full text of this document
Published Version
Terms of Use: Creative Commons Attribution Non-commercial No Derivatives

Abstract

Deep learning models dominate almost all artificial intelligence tasks, such as vision, text, and speech processing. Stochastic Gradient Descent (SGD) is the main tool for training such models, and the computations are usually performed in single-precision floating-point number format. The convergence of single-precision SGD normally agrees with the theoretical results derived for real numbers, since single-precision computations exhibit negligible numerical error. However, the numerical error increases when the computations are performed in low-precision number formats. This provides compelling reasons to study SGD convergence adapted to low-precision computations. We present both deterministic and stochastic analyses of the SGD algorithm, obtaining bounds that show the effect of the number format. Such bounds can provide guidelines as to how SGD convergence is affected when constraints make high-precision computations impractical.
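The effect described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the objective f(x) = 0.5 (x - 1)^2, the step size, the iteration count, and the helper names `to_fp16` and `descend` are all assumptions chosen for illustration, and the gradient is noise-free so that the run is reproducible. Every intermediate value is rounded to IEEE 754 half precision to mimic a low-precision format.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # The struct format character "e" is the 16-bit binary16 format.
    return struct.unpack("e", struct.pack("e", x))[0]


def descend(steps: int, lr: float, rounder) -> float:
    """Gradient descent on f(x) = 0.5 * (x - 1)**2, rounding after each operation.

    `rounder` models the number format: the identity keeps Python's
    double precision, `to_fp16` emulates half precision.
    """
    x = rounder(0.0)
    for _ in range(steps):
        grad = rounder(x - 1.0)      # exact gradient of the toy objective
        x = rounder(x - lr * grad)   # the update is rounded to the format
    return x


# Hypothetical settings chosen only for this illustration.
x_full = descend(10_000, 1e-3, lambda v: v)
x_half = descend(10_000, 1e-3, to_fp16)

print("double precision:", x_full)
print("half precision:  ", x_half)
```

In this toy setting the double-precision iterate approaches the minimizer x = 1, while the half-precision iterate stalls well short of it: once the update `lr * grad` drops below half the spacing between adjacent half-precision values, rounding discards it entirely.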

Uncontrolled Keywords

convergence analysis; floating point arithmetic; low-precision number format; optimization; quasi-convex function; stochastic gradient descent

Subjects: 2950 Applied mathematics > 2950 Applied mathematics
Department: Department of Mathematics and Industrial Engineering
Research Center: Other
PolyPublie URL: https://publications.polymtl.ca/54349/
Conference Title: 12th International Conference on Pattern Recognition Applications and Methods (ICPRAM 2023)
Conference Location: Lisbon, Portugal
Conference Date(s): 2023-02-22 - 2023-02-24
Publisher: SciTePress
DOI: 10.5220/0011795500003411
Official URL: https://doi.org/10.5220/0011795500003411
Date Deposited: 13 Nov 2023 11:25
Last Modified: 28 Sep 2024 23:12
Cite in APA 7: Cacciola, M., Frangioni, A., Asgharian, M., Ghaffari, A., & Nia, V. P. (2023, February). On the convergence of stochastic gradient descent in low-precision number formats [Paper]. 12th International Conference on Pattern Recognition Applications and Methods (ICPRAM 2023), Lisbon, Portugal. https://doi.org/10.5220/0011795500003411
