I have two sub-questions:
1) How can I detect vanishing or exploding gradients with TensorBoard, given that write_grads=True is currently deprecated in the TensorBoard callback, as per “un-deprecate write_grads for fit #31173”?
2) I figured I can probably tell whether my model suffers from vanishing gradients from the weight distributions and histograms in the Distributions and Histograms tabs in TensorBoard. My problem is that I have no frame of reference to compare against. Currently, my biases seem to be “moving”, but I can’t tell whether my kernel weights (Conv2D layers) are changing “enough”. Can someone give me a rule of thumb for assessing this visually in TensorBoard? For example, if only the kernel weights below the 25th percentile are moving, is that good enough or not? Or perhaps someone can post two reference images from TensorBoard, one with vanishing gradients and one without.
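To make “moving enough” concrete, here is a minimal sketch of the kind of per-layer number I could log instead of eyeballing histograms: the norm of the weight change divided by the norm of the weights. The function name `relative_update` and the sample values are hypothetical, and I am assuming the kernel has been flattened to a plain list (for a Keras layer, something like `layer.get_weights()[0].ravel().tolist()` taken before and after a training step). I have seen roughly 1e-3 per update quoted as a heuristic for this ratio (for example in the CS231n course notes), but I don’t know whether that applies to my setup.

```python
import math


def relative_update(before, after):
    """Return ||after - before|| / ||before|| for two flat lists of weights."""
    if len(before) != len(after):
        raise ValueError("weight snapshots must have the same length")
    # L2 norm of the change between the two snapshots.
    diff_norm = math.sqrt(sum((a - b) ** 2 for a, b in zip(after, before)))
    # L2 norm of the weights before the update.
    weight_norm = math.sqrt(sum(b ** 2 for b in before))
    # Small epsilon avoids division by zero for an all-zero layer.
    return diff_norm / (weight_norm + 1e-12)


# Hypothetical flattened Conv2D kernel values before and after one update.
kernel_before = [0.50, -0.25, 0.10, 0.80]
kernel_after = [0.50, -0.25, 0.10, 0.81]

ratio = relative_update(kernel_before, kernel_after)
print(f"relative update: {ratio:.2e}")
```

A ratio that stays many orders of magnitude smaller for the early layers than for the later ones would be what I would expect vanishing gradients to look like, but that is my assumption, not something I have confirmed.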
Here are my histograms and distributions. Is it possible to tell from them whether my model suffers from vanishing gradients? (Some layers are omitted for brevity.) Thanks in advance.