Categories:

Updated:

In practice, we usually rely on a framework's built-in backprop, so we don't need to check whether the gradients are calculated correctly.

However, when we want to implement backprop from scratch ourselves, we need to check our gradients.

How do we do that? We approximate gradients and compare them with our implementation.

1. Take all of the weights and biases and reshape them into a big vector $\Theta$.
2. Take all of the gradients of weights and biases and reshape them into a big vector $d\Theta$.
3. For each $i$, approximate the gradient with a centered difference: $d\Theta_{approx}^i = \frac{J(\Theta_1,\Theta_2,\dots,\Theta_i+\epsilon,\dots)-J(\Theta_1,\Theta_2,\dots,\Theta_i-\epsilon,\dots)}{2\epsilon}$. Then compare the two full vectors with the relative error $E = \frac{\|d\Theta_{approx}-d\Theta\|_2}{\|d\Theta_{approx}\|_2+\|d\Theta\|_2}$.

4. If $E$ is sufficiently small, backprop is very likely implemented correctly. (e.g., with $\epsilon=10^{-7}$, a value of $E<10^{-7}$ indicates the gradients match.)
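
The steps above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a full neural-network check: the cost function `cost` ($J(\Theta)=\sum_i \Theta_i^3$), its hand-written gradient `analytic_grad`, the deliberately wrong `buggy_grad`, and the helper names `numerical_gradient` and `relative_error` are all assumptions made up for this example. In a real network, $\Theta$ would be the flattened weights and biases, and $d\Theta$ would come from our own backprop.

```python
import math

def numerical_gradient(J, theta, eps=1e-7):
    """Centered-difference approximation of dJ/dtheta, one component at a time."""
    approx = []
    for i in range(len(theta)):
        plus = list(theta)   # copy so the original vector is untouched
        minus = list(theta)
        plus[i] += eps       # Theta_i + epsilon
        minus[i] -= eps      # Theta_i - epsilon
        approx.append((J(plus) - J(minus)) / (2 * eps))
    return approx

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

def relative_error(d_approx, d_theta):
    """E = ||d_approx - d_theta|| / (||d_approx|| + ||d_theta||) over the full vectors."""
    diff = [a - b for a, b in zip(d_approx, d_theta)]
    denom = l2_norm(d_approx) + l2_norm(d_theta)
    return l2_norm(diff) / denom if denom > 0 else 0.0

# Hypothetical toy cost standing in for a network's loss: J = sum(theta_i^3).
def cost(theta):
    return sum(t ** 3 for t in theta)

# Stand-in for a correct backprop result: dJ/dtheta_i = 3 * theta_i^2.
def analytic_grad(theta):
    return [3 * t ** 2 for t in theta]

# Stand-in for a backprop implementation with a bug (wrong constant factor).
def buggy_grad(theta):
    return [2 * t ** 2 for t in theta]

theta = [0.5, -1.2, 2.0]
d_theta_approx = numerical_gradient(cost, theta)

E_good = relative_error(d_theta_approx, analytic_grad(theta))
E_bad = relative_error(d_theta_approx, buggy_grad(theta))

print("correct gradient, E =", E_good)
print("buggy gradient,   E =", E_bad)
```

For the correct gradient, $E$ should land well below $10^{-7}$, while the buggy one gives an $E$ that is many orders of magnitude larger. Note that the loop evaluates $J$ twice per parameter, so this check is slow and is normally run only while debugging, not during training.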
