deep learning - How L2 Regularization changes backpropagation formulas

May 15, 2011

I am going through an online deep learning book and trying to recreate the neural network written there with a slightly different class design. However, I've run into a problem: when using L2 regularization, I can't see its impact on the backpropagation formulas. Here's what I mean. The only formula in backpropagation that uses the loss function derivative is the one defining the error of the output layer, and it is defined as follows:

error = c'(a) * a'(z)

where c'(a) is the loss function derivative with respect to the activation and a'(z) is the activation function derivative with respect to the weighted input. I don't see how any part of this equation changes when adding L2 regularization. I believe it should be the derivative of the loss function with respect to the activation that changes, but since we're adding squared weights, that term should disappear when calculating the derivative (since it is taken with respect to the activation, not the weights). If something is wrong with my logic, please tell me what it is.

Edit: To be more specific, suppose I use a quadratic loss function with L2 regularization. Is the following true, and if not, why?

c'(a) = a - y

where a is the activation and y is the desired output.

For the cost function, if you use L2 regularization, then besides the regular loss you need to add an additional loss caused by large weights: a penalty term on the squared weights is added to the loss function. Lambda is the hyperparameter that controls the L2 regularization. When it equals 0, there is no regularization at all. m is the number of instances.
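As a sketch, one common way to write this regularized cost, assuming the conventional 1/(2m) scaling (the factor 1/2 and the symbols C_0 and w are assumptions for illustration):

```latex
C = C_0 + \frac{\lambda}{2m} \sum_{w} w^2
```

Here C_0 is the unregularized loss and the sum runs over all weights in the network. With lambda = 0 the second term vanishes.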

Now, when you do backpropagation and calculate the derivative, you need to calculate the additional cost's derivative too.
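To make this concrete, assume the penalty has the common form (lambda / (2m)) * sum of w^2, added to an unregularized loss C_0 (this scaling and the symbol names are assumptions). The penalty does not depend on the activation, so the output-layer error is untouched, and only the weight gradient gains a term:

```latex
\begin{aligned}
\frac{\partial C}{\partial a} &= \frac{\partial C_0}{\partial a}, \\
\frac{\partial C}{\partial w} &= \frac{\partial C_0}{\partial w} + \frac{\lambda}{m}\, w .
\end{aligned}
```

So for a quadratic loss, the derivative with respect to the activation is the same with or without regularization.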

When you update the weights, you also need to subtract learning rate * additional derivative. This pushes the weights lower, which is why it is called weight decay.
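The update can be sketched in NumPy for a single sigmoid layer with quadratic loss. This is a minimal illustration, not the book's code: the function name `l2_step`, the (lam / (2m)) * sum(W**2) penalty, the 1/m averaging of the loss, and leaving the biases unregularized are all assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def l2_step(W, b, x, y, lr=0.1, lam=0.5):
    """One gradient step for a sigmoid layer with quadratic loss plus L2.

    x has shape (n_in, m), y has shape (n_out, m); m is the number of instances.
    """
    m = x.shape[1]
    z = W @ x + b                      # weighted input
    a = sigmoid(z)                     # activation
    delta = (a - y) * a * (1.0 - a)    # output error c'(a) * a'(z); lambda does not appear
    grad_W = delta @ x.T / m           # gradient of the unregularized loss
    grad_b = delta.sum(axis=1, keepdims=True) / m
    decay = (lam / m) * W              # derivative of the assumed penalty (lam / (2m)) * sum(W**2)
    W_new = W - lr * (grad_W + decay)  # the extra term shrinks the weights
    b_new = b - lr * grad_b            # assumption: biases are not regularized
    return W_new, b_new, delta

# Compare a regularized step with an unregularized one on random data.
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 3))
b = np.zeros((2, 1))
x = rng.normal(size=(3, 4))
y = rng.uniform(size=(2, 4))
W_reg, _, delta_reg = l2_step(W, b, x, y, lr=0.1, lam=0.5)
W_plain, _, delta_plain = l2_step(W, b, x, y, lr=0.1, lam=0.0)
```

In this sketch the two runs produce the same output error, and the updated weights differ only by lr * (lam / m) * W, which is the weight decay term.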
