Annals of Mathematical Sciences and Applications
Volume 7 (2022)
On Breiman’s dilemma in neural networks: phase transitions of margin dynamics
Pages: 221–258
Enlarging margins over training data has been an important strategy in machine learning since the perceptron, aimed at boosting the robustness of classifiers toward good generalization. Yet Breiman exhibited a dilemma: a uniform improvement of the margin distribution does not necessarily reduce generalization error. In this paper, we revisit Breiman’s dilemma in deep neural networks using recently proposed spectrally normalized margins. A novel perspective is provided to explain Breiman’s dilemma, based on phase transitions in the evolution of normalized margin distributions during training, which reflect the trade-off between the expressive power of models and the complexity of data. When data complexity is comparable to model expressiveness, in the sense that training and test data share similar phase transitions in normalized margin dynamics, two efficient ways are derived to predict the trend of generalization (test) error via classical margin-based generalization bounds with restricted Rademacher complexities. On the other hand, over-expressive models that exhibit uniform improvements of training margins, a phase transition distinct from that of the test margin dynamics, may lose such predictive power and fail to prevent overfitting. Experiments with basic convolutional networks, AlexNet, VGG-16, and ResNet-18 on several datasets, including CIFAR-10/100 and mini-ImageNet, demonstrate the validity of the proposed method.
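For reference, a common form of the spectrally normalized margin, in the spirit of Bartlett, Foster, and Telgarsky (2017), rescales the raw margin of a depth-$L$ network $f$ with layer weight matrices $W_1, \dots, W_L$ by the product of their spectral norms:
$$
\bar{\gamma}(x_i, y_i) \;=\; \frac{f(x_i)_{y_i} - \max_{j \neq y_i} f(x_i)_j}{\prod_{l=1}^{L} \|W_l\|_{\sigma}},
$$
where $\|\cdot\|_{\sigma}$ denotes the largest singular value. The denominator upper-bounds the Lipschitz constant of the network, making margin distributions comparable across training epochs. This is only a sketch of the general idea; the paper’s exact normalizer may include additional correction factors.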
Keywords: Breiman's Dilemma, Convolutional Neural Networks, Rademacher Complexity, Margin, Phase Transition
Part of this work was presented at ICCM 2019 in Beijing.
Received 14 July 2022
Accepted 15 July 2022
Published 12 September 2022