The Backpropagation algorithm for a math student


Detailed bibliography
Published in: Proceedings of ... International Joint Conference on Neural Networks, pp. 01-09
Main authors: Damadi, Saeed; Moharrer, Golnaz; Cham, Mostafa; Shen, Jinglai
Format: Conference paper
Language: English
Published: IEEE, 18 June 2023
Subjects:
ISSN: 2161-4407
Online access: Get full text
Abstract: A Deep Neural Network (DNN) is a composite function of vector-valued functions, and in order to train a DNN, it is necessary to calculate the gradient of the loss function with respect to all parameters. This calculation can be a non-trivial task because the loss function of a DNN is a composition of several nonlinear functions, each with numerous parameters. The Backpropagation (BP) algorithm leverages the composite structure of the DNN to efficiently compute the gradient. As a result, the number of layers in the network does not significantly impact the complexity of the calculation. The objective of this paper is to express the gradient of the loss function in terms of a matrix multiplication using the Jacobian operator. This can be achieved by considering the total derivative of each layer with respect to its parameters and expressing it as a Jacobian matrix. The gradient can then be represented as the matrix product of these Jacobian matrices. This approach is valid because the chain rule can be applied to a composition of vector-valued functions, and the use of Jacobian matrices allows for the incorporation of multiple inputs and outputs. By providing concise mathematical justifications, the results can be made understandable and useful to a broad audience from various disciplines.
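The Jacobian-product view of the gradient described in the abstract can be sketched numerically. The following is a minimal NumPy illustration, not code from the paper: the two-layer network, tanh nonlinearity, squared-error loss, and all dimensions are illustrative assumptions. It forms the gradient as an explicit product of Jacobian matrices and checks one entry against a finite-difference approximation.

```python
import numpy as np

# Illustrative two-layer network (sizes are arbitrary assumptions):
#   out = W2 @ tanh(W1 @ x),  loss = 0.5 * ||out - y||^2
rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 4))
W2 = rng.standard_normal((2, 3))
x = rng.standard_normal(4)
y = rng.standard_normal(2)

sigma = np.tanh
dsigma = lambda z: 1.0 - np.tanh(z) ** 2  # derivative of tanh

# Forward pass, keeping each intermediate vector
z1 = W1 @ x          # pre-activation, shape (3,)
a1 = sigma(z1)       # activation, shape (3,)
out = W2 @ a1        # network output, shape (2,)

# Jacobian of each stage; the chain rule is a matrix product of these
J_loss = (out - y).reshape(1, 2)   # d(loss)/d(out), 1x2 row vector
J_out_a1 = W2                      # d(out)/d(a1), 2x3
J_a1_z1 = np.diag(dsigma(z1))      # d(a1)/d(z1), 3x3 diagonal

# d(loss)/d(z1) as the 1x3 product of the Jacobians above,
# then d(loss)/d(W1) via the outer product with the input x
g_z1 = (J_loss @ J_out_a1 @ J_a1_z1).ravel()
grad_W1 = np.outer(g_z1, x)        # same shape as W1

# Sanity check: central finite difference on one weight entry
def loss(W):
    return 0.5 * np.sum((W2 @ sigma(W @ x) - y) ** 2)

eps = 1e-6
Wp, Wm = W1.copy(), W1.copy()
Wp[0, 0] += eps
Wm[0, 0] -= eps
fd = (loss(Wp) - loss(Wm)) / (2 * eps)
assert abs(grad_W1[0, 0] - fd) < 1e-4
```

Because each layer's Jacobian is formed once and multiplied, adding layers only appends one more factor to the product, which is the efficiency point the abstract makes about network depth.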
Authors:
1. Saeed Damadi, sdamadi1@umbc.edu, University of Maryland, Baltimore County (UMBC), Department of Mathematics and Statistics, Baltimore, USA
2. Golnaz Moharrer, golnazm1@umbc.edu, University of Maryland, Baltimore County (UMBC), Department of Information Systems, Baltimore, USA
3. Mostafa Cham, mcham2@umbc.edu, University of Maryland, Baltimore County (UMBC), Department of Information Systems, Baltimore, USA
4. Jinglai Shen, shenj@umbc.edu, University of Maryland, Baltimore County (UMBC), Department of Mathematics and Statistics, Baltimore, USA
ContentType Conference Proceeding
DOI 10.1109/IJCNN54540.2023.10191596
Discipline Computer Science
EISBN 1665488670
9781665488679
EISSN 2161-4407
EndPage 09
ExternalDocumentID 10191596
Genre orig-research
ISICitedReferencesCount 2
Language English
PageCount 9
PublicationCentury 2000
PublicationDate 2023-June-18
PublicationDateYYYYMMDD 2023-06-18
PublicationDecade 2020
PublicationTitle Proceedings of ... International Joint Conference on Neural Networks
PublicationTitleAbbrev IJCNN
PublicationYear 2023
Publisher IEEE
StartPage 01
SubjectTerms Complexity theory
Computer architecture
Convolutional neural networks
Iterative algorithms
Jacobian matrices
Stochastic processes
Transformers
Title The Backpropagation algorithm for a math student
URI https://ieeexplore.ieee.org/document/10191596