MonolithNet: Training monolithic deep neural networks via a partitioned training strategy

  • Rene Bidart
  • Alexander Wong

Abstract

In this study, we explore how to train monolithic deep neural networks effectively. One of the biggest challenges in training such networks to a desired level of accuracy is the difficulty of converging to a good solution with iterative optimization methods such as stochastic gradient descent, due to the enormous number of parameters that must be learned. To address this challenge, we introduce a partitioned training strategy, in which proxy layers are connected to different partitions of a deep neural network so that a much smaller number of parameters can be trained to convergence in isolation. To illustrate the efficacy of this training strategy, we introduce MonolithNet, a massive residual deep neural network consisting of 437 million parameters. The trained MonolithNet achieved a top-1 accuracy of 97% on the CIFAR10 image classification dataset, demonstrating the feasibility of the proposed strategy for training monolithic deep neural networks to high accuracy.
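The core idea in the abstract — splitting a large network into partitions and training each one in isolation through a small disposable proxy head — can be sketched in miniature. The layer sizes, proxy-head design, and training details below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

# Hypothetical sketch of partitioned training: each Partition stands in for
# one block of a much larger network. A small "proxy" classifier head is
# attached so the partition's parameters can be optimized on their own;
# earlier partitions are then frozen and used as fixed feature extractors.
# All sizes and hyperparameters here are toy values for illustration.

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

class Partition:
    """One dense layer (stand-in for a residual stage) plus a proxy head."""
    def __init__(self, d_in, d_out, n_classes):
        self.W = rng.normal(0, 0.1, (d_in, d_out))       # partition weights
        self.P = rng.normal(0, 0.1, (d_out, n_classes))  # disposable proxy head

    def forward(self, x):
        self.x, self.h = x, relu(x @ self.W)
        return self.h

    def train_step(self, x, y_onehot, lr=0.1):
        # Forward through this partition and its proxy head only.
        h = self.forward(x)
        p = softmax(h @ self.P)
        # Backprop is confined to this partition's parameters.
        dz = (p - y_onehot) / len(x)
        dP = h.T @ dz
        dh = (dz @ self.P.T) * (h > 0)   # ReLU gradient
        dW = self.x.T @ dh
        self.P -= lr * dP
        self.W -= lr * dW
        # Mean cross-entropy loss for monitoring.
        return -np.log(p[np.arange(len(x)), y_onehot.argmax(1)] + 1e-12).mean()

# Toy data: 64 samples, 8 features, 2 linearly separable classes.
X = rng.normal(size=(64, 8))
Y = np.eye(2)[(X[:, 0] > 0).astype(int)]

# Train partition 1 in isolation via its proxy head, then freeze it and
# train partition 2 on partition 1's fixed features.
p1 = Partition(8, 16, 2)
losses1 = [p1.train_step(X, Y) for _ in range(200)]

feats = p1.forward(X)  # frozen features from the converged partition 1
p2 = Partition(16, 16, 2)
losses2 = [p2.train_step(feats, Y) for _ in range(200)]
```

In this toy version, each `train_step` touches only one partition's weights plus its proxy head, so the number of parameters being optimized at any moment stays small regardless of how many partitions the full network contains.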

Published
2018-12-24
How to Cite
Bidart, R., & Wong, A. (2018). MonolithNet: Training monolithic deep neural networks via a partitioned training strategy. Journal of Computational Vision and Imaging Systems, 4(1), 3. https://doi.org/10.15353/jcvis.v4i1.340