Backpropagation: Hessian matrix not defined in stochastic training
I am learning the basics of backpropagation algorithms, and I have found that writing code during the process helps me better grasp the theoretical concepts.
At the moment I am stuck on the following statement, which I cannot understand:
"Conjugate gradient requires batch training, because the Hessian matrix is defined on the full training set."
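For concreteness, the quoted statement refers to the Hessian of the total error summed over the training set. A sketch of the usual definitions (the symbols \(E_p\), \(P\), and \(\mathbf{w}\) are my notation, not from the quoted source):

\[
E(\mathbf{w}) = \sum_{p=1}^{P} E_p(\mathbf{w}),
\qquad
H(\mathbf{w}) = \nabla^2 E(\mathbf{w}) = \sum_{p=1}^{P} \nabla^2 E_p(\mathbf{w}),
\]

where \(E_p\) is the error on pattern \(p\) and every term is evaluated at the same fixed weight vector \(\mathbf{w}\).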
Let me define w as the network's weights, dw as the weight adjustment that gets calculated after each pattern presentation, and bdw as the sum of the dw's across all patterns of the same epoch.
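Under these definitions, the two update schemes can be sketched in a few lines of code. The per-pattern loss and its gradient below are hypothetical toy choices (a linear unit with squared error), purely for illustration:

```python
import numpy as np

# Hypothetical per-pattern loss for illustration:
# E_p(w) = 0.5 * (w . x_p - t_p)^2, whose gradient w.r.t. w
# is (w . x_p - t_p) * x_p.
def grad(w, x_p, t_p):
    return (w @ x_p - t_p) * x_p

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))   # 4 training patterns, 3 weights
T = rng.normal(size=4)        # 4 target values
lr = 0.1                      # learning rate

# Stochastic (on-line) training: apply dw after every pattern,
# so each gradient is taken at a *different*, already-moved w.
w_stoch = np.zeros(3)
for x_p, t_p in zip(X, T):
    dw = -lr * grad(w_stoch, x_p, t_p)
    w_stoch += dw

# Batch training: accumulate bdw over the whole epoch and apply it
# once, so every gradient is taken at the *same* w.
w_batch = np.zeros(3)
bdw = np.zeros_like(w_batch)
for x_p, t_p in zip(X, T):
    bdw += -lr * grad(w_batch, x_p, t_p)
w_batch += bdw

print("stochastic epoch result:", w_stoch)
print("batch epoch result:     ", w_batch)
```

Running one epoch of each scheme on the same data generally yields different weights, since the stochastic updates evaluate each per-pattern gradient at a weight vector that has already moved.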
My problem is that I don't see how the Hessian matrix can fail to be valid in a stochastic setting, since I have to calculate it at every pattern presentation in order to determine dw. Whether I use dw to update w after each pattern presentation (stochastic), or sum the dw's and update w after all patterns have been presented (batch), seems irrelevant to me.
What am I missing?