Annals of Mathematical Sciences and Applications
Volume 8 (2023)
Number 3
Special Issue Dedicated to the Memory of Professor Roland Glowinski
Guest Editors: Annalisa Quaini, Xiaolong Qin, Xuecheng Tai, and Enrique Zuazua
An optimal time variable learning framework for Deep Neural Networks
Pages: 501 – 543
DOI: https://dx.doi.org/10.4310/AMSA.2023.v8.n3.a4
Authors
Abstract
Feature propagation in Deep Neural Networks (DNNs) can be associated with nonlinear discrete dynamical systems. The novelty of this paper lies in letting the discretization parameter (time step-size) vary from layer to layer and learning it within an optimization framework. The proposed framework can be applied to any of the existing networks such as ResNet, DenseNet or Fractional‑DNN. This framework is shown to help overcome the vanishing and exploding gradient issues. Stability of some of the existing continuous DNNs, such as Fractional‑DNN, is also studied. The proposed approach is applied to an ill-posed 3D‑Maxwell’s equation.
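To illustrate the idea described in the abstract, the sketch below shows a residual update of the form x_{k+1} = x_k + τ_k σ(W_k x_k + b_k), where each layer's step-size τ_k is a trainable parameter rather than a fixed constant. This is a minimal, hypothetical PyTorch illustration of the general concept, not the authors' implementation; the class name, width/depth parameters, and the choice of tanh activation are assumptions made for the example.

```python
import torch
import torch.nn as nn

class LearnableStepResNet(nn.Module):
    """Sketch of a ResNet-style network whose per-layer time step tau_k is learned.
    Hypothetical example; not the architecture or code from the paper."""

    def __init__(self, width: int, depth: int):
        super().__init__()
        # one affine map per layer (stand-in for the layer's feature transformation)
        self.layers = nn.ModuleList(nn.Linear(width, width) for _ in range(depth))
        # one learnable step-size per layer, initialized uniformly as 1/depth
        self.tau = nn.Parameter(torch.full((depth,), 1.0 / depth))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # forward Euler-like update with a layer-dependent, trainable step size
        for k, layer in enumerate(self.layers):
            x = x + self.tau[k] * torch.tanh(layer(x))
        return x
```

Because the τ_k enter the forward pass as ordinary parameters, they receive gradients from the training loss and are optimized jointly with the weights, which is one plausible way to realize the "optimal time variable" learning the abstract describes.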
Keywords
deep learning, Deep Neural Network, fractional time derivatives, fractional neural network, residual neural network, optimal network architecture, exploding gradients, vanishing gradients
2010 Mathematics Subject Classification
34A08, 49J15, 68T05, 82C32
This work is partially supported by NSF grants DMS-2110263, DMS-1913004, and DMS-2111315; by the Air Force Office of Scientific Research (AFOSR) under Award No. FA9550-22-1-0248; and by the Department of the Navy, Naval Postgraduate School, under Award No. N00244-20-1-0005.
Received 21 July 2022
Accepted 14 August 2023
Published 14 November 2023