python - Keras Cost Function Error when trying to round predicted tensor to nearest integer
I am using a sigmoid activation in the second-to-last layer and resize the output with tf.images.resize_images() in the last layer.
The target tensor has a maximum value of 1.0. This is my Dice error cost function:
from keras import backend as K

def dice(y_true, y_pred):
    return 1.0 - dice_coef(y_true, y_pred, 1e-5, 0.5)

def dice_coef(y_true, y_pred, smooth, thresh, axis=[1, 2, 3]):
    y_pred = K.round(y_pred)
    inse = K.sum(K.dot(y_true, K.transpose(y_pred)), axis=axis)
    l = K.sum(y_pred, axis=axis)
    r = K.sum(y_true, axis=axis)
    hard_dice = (2. * inse + smooth) / (l + r + smooth)
    hard_dice = K.mean(hard_dice)
    return hard_dice
When I run the code I get the error below. However, the error goes away when I remove K.round(y_pred). Any idea how to solve this problem?
loss, acc, err = final_model.train_on_batch(train_image, label)
  File "c:\local\anaconda3-4.1.1-windows-x86_64\envs\tensorflow-cpu\lib\site-packages\keras\engine\training.py", line 1761, in train_on_batch
    self._make_train_function()
  File "c:\local\anaconda3-4.1.1-windows-x86_64\envs\tensorflow-cpu\lib\site-packages\keras\engine\training.py", line 960, in _make_train_function
    loss=self.total_loss)
  File "c:\local\anaconda3-4.1.1-windows-x86_64\envs\tensorflow-cpu\lib\site-packages\keras\legacy\interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "c:\local\anaconda3-4.1.1-windows-x86_64\envs\tensorflow-cpu\lib\site-packages\keras\optimizers.py", line 358, in get_updates
    new_a = self.rho * a + (1. - self.rho) * K.square(g)
  File "c:\local\anaconda3-4.1.1-windows-x86_64\envs\tensorflow-cpu\lib\site-packages\keras\backend\tensorflow_backend.py", line 1358, in square
    return tf.square(x)
  File "c:\local\anaconda3-4.1.1-windows-x86_64\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\math_ops.py", line 447, in square
    return gen_math_ops.square(x, name=name)
  File "c:\local\anaconda3-4.1.1-windows-x86_64\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 2591, in square
    result = _op_def_lib.apply_op("Square", x=x, name=name)
  File "c:\local\anaconda3-4.1.1-windows-x86_64\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 508, in apply_op
    (input_name, err))
ValueError: Tried to convert 'x' to a tensor and failed. Error: None values not supported
Neural networks are trained with gradient descent: in a high-dimensional parameter space, the weights are adjusted in the direction of the steepest negative gradient in order to find a minimum. For that, the loss function has to be differentiable. The rounding function, however, is not (image source):
As you can see, the gradient of round is undefined at the jumps between two integers and 0 everywhere else. Thus, even if you defined the gradient at the discontinuities manually, the backpropagated gradient would still be 0 due to the chain rule.
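You can see this directly with the TensorFlow 1.x graph API (a minimal sketch, assuming the same Keras/TensorFlow setup as in the traceback): no usable gradient is registered for the Round op, so the gradient comes back as None, and that None is what the RMSprop update later tries to square, producing the "None values not supported" error.

import tensorflow as tf

x = tf.constant([0.4, 1.7])
y = tf.round(x)

# Round has no usable registered gradient, so tf.gradients returns [None].
# In the question's setup this None eventually reaches the optimizer,
# where K.square(None) raises "None values not supported".
print(tf.gradients(y, x))  # -> [None]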
I do not know the exact purpose of your network. However, it might be worth trying to turn the network from a regression problem (where you predict a continuous number) into a classification problem, where you predict a class score for each possible integer instead of rounding.
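As a purely illustrative sketch of that idea (the layer sizes, input shape and the integer range 0..9 are made up, not taken from the question), the last layer outputs one score per integer and the rounding disappears entirely:

from keras.models import Sequential
from keras.layers import Dense

n_integers = 10  # hypothetical: predict one of the integers 0..9
model = Sequential([
    Dense(64, activation='relu', input_shape=(16,)),
    Dense(n_integers, activation='softmax'),  # one class score per integer
])
# Integer targets are handled by sparse categorical cross-entropy,
# so no rounding of the prediction is needed during training.
model.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy')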
Update:

If you are doing masking or segmentation, the real-valued output gives you a sort of 'probability' (at least when using a softmax in the last layer) that a pixel or voxel belongs to the region you want to mask. If you round the result, you lose important detail for training the network: a pixel with a score of 0.4 gets the same rounded score as one with 0.1. Thus, a small weight change does not change the loss of the network and gradient descent cannot work. The original paper introducing the Dice loss for segmentation does not use rounding either. If you want to map each pixel to foreground/background for visualization purposes, you should do that after computing the loss.
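A minimal sketch of such a soft Dice loss in the same Keras backend style (assuming the predictions come straight from the sigmoid with shape (batch, height, width, channels), and using the usual element-wise product as the intersection term; this is not the exact code from the question or the paper):

from keras import backend as K

def soft_dice_coef(y_true, y_pred, smooth=1e-5, axis=(1, 2, 3)):
    # Work directly with the raw sigmoid outputs; no K.round,
    # so the loss stays differentiable end to end.
    intersection = K.sum(y_true * y_pred, axis=axis)
    denom = K.sum(y_true, axis=axis) + K.sum(y_pred, axis=axis)
    return K.mean((2. * intersection + smooth) / (denom + smooth))

def soft_dice_loss(y_true, y_pred):
    return 1.0 - soft_dice_coef(y_true, y_pred)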
However, you also have the possibility of defining your own 'gradient', since gradient descent is not the only way to optimize; there are derivative-free optimization techniques as well. But be careful.
Without having tried whether it works in practice, here is an approach for the case where you do not want to give up the round function (no guarantee that it yields sensible results in any way): using distribution theory, you can define the derivative of the round function as the sum of the derivatives of many Heaviside functions, which leaves a Dirac comb. If you replace the delta distributions with normal distributions of small standard deviation, the effect is that the gradients between integers push values in the direction of the nearest integer (with the exception of the point exactly in between, where the derivative of the normal distribution is 0).
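One possible way to experiment with this in TensorFlow (a sketch only, assuming a version that provides tf.custom_gradient; using a Gaussian bump around each half-integer jump of round as the surrogate gradient is my reading of the idea, not code from the answer):

import tensorflow as tf

def round_with_surrogate_gradient(x, sigma=0.1):
    # Forward pass: ordinary rounding. Backward pass: instead of the
    # Dirac comb (the true distributional derivative of round), use a
    # normal distribution with small standard deviation centred on the
    # nearest half-integer, i.e. on the nearest jump of round.
    @tf.custom_gradient
    def _round(x):
        def grad(dy):
            dist_to_jump = x - tf.floor(x) - 0.5  # signed distance to nearest half-integer
            bump = tf.exp(-tf.square(dist_to_jump) / (2.0 * sigma ** 2))
            return dy * bump
        return tf.round(x), grad
    return _round(x)

The choice of sigma controls how far away from a jump the surrogate gradient remains noticeably non-zero.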
Disclaimer: I have never seen this used anywhere, and the best solution is still to abandon the round function, but if you feel like experimenting a bit, you can try this. If anyone has arguments for why it is plainly wrong, please tell me!