Institute seminar
Regularizing the gradient norm of the output of a neural network
with respect to its inputs is a powerful technique which has been
independently rediscovered several times, most often with the goal
of making models robust against adversarial examples. The aim of
this presentation is to demonstrate that gradient regularization
can consistently and significantly improve classification accuracy
on vision tasks, especially when the amount of training data is small.
We introduce our regularizers as members of a broader class of
Jacobian-based regularizers, and compare the members of this class.
Minimizing the gradient norm only at the training points could, in
principle, yield solutions whose gradients are small at those points
but which change sharply in other regions. We demonstrate
through experiments that stochastic gradient descent tends to avoid
these pathological optima. Instead, we obtain solutions that generalize
well.
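
As a rough illustration of the kind of penalty discussed above, the sketch below adds the squared norm of the input gradient to an ordinary classification loss. It is a minimal example under stated assumptions, not the regularizers presented in the talk: the logistic model, the function names (`model_output`, `input_gradient`, `regularized_loss`) and the weight `lam` are all hypothetical, and a toy model is used so that the input gradient can be written in closed form instead of being obtained by automatic differentiation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model_output(w, b, x):
    # Toy stand-in for a network (an assumption): f(x) = sigmoid(w . x + b).
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def input_gradient(w, b, x):
    # Analytic gradient of the output with respect to the input x.
    # For this model, df/dx = f(x) * (1 - f(x)) * w.
    p = model_output(w, b, x)
    return [p * (1.0 - p) * wi for wi in w]


def cross_entropy(p, y):
    # Binary cross-entropy for a label y in {0, 1}.
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def regularized_loss(w, b, x, y, lam):
    # Task loss plus lam times the squared L2 norm of the input gradient.
    # lam is a hypothetical regularization weight.
    p = model_output(w, b, x)
    grad = input_gradient(w, b, x)
    penalty = sum(g * g for g in grad)
    return cross_entropy(p, y) + lam * penalty


# Example call on a single training point.
loss = regularized_loss(w=[2.0, -1.0], b=0.0, x=[0.5, 1.0], y=1, lam=0.1)
```

With `lam = 0` this reduces to the plain cross-entropy loss. In a real network the input gradient would normally come from an automatic differentiation framework, and the penalty is evaluated only at the training points, which is exactly why the behaviour away from those points is the question the experiments address.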