Annals of Mathematical Sciences and Applications
Volume 7 (2022)
On Breiman’s dilemma in neural networks: phase transitions of margin dynamics
Pages: 221–258
Enlarging margins over training data has been an important strategy in machine learning since the perceptron, aimed at boosting the robustness of classifiers toward good generalization. Yet Breiman exhibited a dilemma: a uniform improvement of the margin distribution does not necessarily reduce generalization error. In this paper, we revisit Breiman’s dilemma in deep neural networks using recently proposed spectrally normalized margins. A novel perspective is provided to explain Breiman’s dilemma, based on phase transitions in the evolution of normalized margin distributions during training, which reflect the trade-off between the expressive power of models and the complexity of data. When data complexity is comparable to model expressiveness, in the sense that training and test data share similar phase transitions in normalized margin dynamics, two efficient ways are derived to predict the trend of generalization (test) error via classical margin-based generalization bounds with restricted Rademacher complexities. On the other hand, over-expressive models that exhibit uniform improvements of training margins, a phase transition distinct from that of the test margin dynamics, may lose such predictive power and fail to prevent overfitting. Experiments with basic convolutional networks, AlexNet, VGG-16, and ResNet-18 on several datasets, including CIFAR-10/100 and mini-ImageNet, demonstrate the validity of the proposed method.
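For reference, a common form of the spectrally normalized margin, in the spirit of Bartlett, Foster, and Telgarsky (2017), rescales the raw margin of a depth-$L$ network $f$ with layer weight matrices $W_1, \dots, W_L$ by the product of their spectral norms:
$$
\bar{\gamma}(x_i, y_i) \;=\; \frac{f(x_i)_{y_i} - \max_{j \neq y_i} f(x_i)_j}{\prod_{l=1}^{L} \|W_l\|_{\sigma}},
$$
where $\|\cdot\|_{\sigma}$ denotes the largest singular value. The denominator upper-bounds the Lipschitz constant of the network, making margin distributions comparable across training epochs. This is only a sketch of the general idea; the paper’s exact normalizer may include additional correction factors.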
Keywords: Breiman's Dilemma, Convolutional Neural Networks, Rademacher Complexity, Margin, Phase Transition
Part of this work was presented at ICCM 2019 in Beijing.
Received 14 July 2022
Accepted 15 July 2022
Published 12 September 2022