Forward and Backward Propagation in Neural Networks

Notations

The input layer is the 0th layer, the first hidden layer is the 1st layer, and so on.

The term inside square brackets gives the dimensions of that quantity.

$L$: total number of layers excluding input layer

$m$: number of training examples

$n^{[l]}$: number of units in the $l$th layer

$W^{[l]}$: weight matrix of the linear transformation that produces the $l$th layer [$n^{[l-1]}\times n^{[l]}$]

$b^{[l]}$: bias of the linear transformation that produces the $l$th layer [$1 \times n^{[l]}$]

$Z^{[l]}$: linear-transformation output of the $l$th layer [$m\times n^{[l]}$]

$A^{[l]}$: matrix of unit values (activations) in the $l$th layer [$m\times n^{[l]}$]

  • $A^{[0]}$ equals $X$, the input matrix. When $l>0$, $A^{[l]}$ is the activation of $Z^{[l]}$.
  • $A^{[L]}$ equals $\hat{Y}$, our prediction.

$g^{[l]}$: activation function of the $l$th layer

$J$: cost

$dZ$ and $dW$ are abbreviations of $\frac{\partial J}{\partial Z}$ and $\frac{\partial J}{\partial W}$, respectively.
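
As a quick sanity check on these shapes, here is a minimal NumPy sketch that allocates $W^{[l]}$ and $b^{[l]}$ with the dimensions above (the layer sizes are made up for illustration):

```python
import numpy as np

# Hypothetical layer sizes: n[0] is the input dimension,
# n[1], n[2] are the widths of the hidden and output layers.
n = [4, 5, 3]      # n[l] = number of units in layer l
L = len(n) - 1     # number of layers excluding the input layer

rng = np.random.default_rng(0)
W = {l: rng.standard_normal((n[l - 1], n[l])) * 0.01 for l in range(1, L + 1)}  # n[l-1] x n[l]
b = {l: np.zeros((1, n[l])) for l in range(1, L + 1)}                           # 1 x n[l]

for l in range(1, L + 1):
    print(f"W[{l}]: {W[l].shape}, b[{l}]: {b[l].shape}")
```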

Forward Propagation

For $l=1,2,…,L$:

$$Z^{[l]} = A^{[l-1]} W^{[l]} + b^{[l]}$$

$$A^{[l]} = g^{[l]}\left(Z^{[l]}\right)$$

Here $b^{[l]}$ ($1 \times n^{[l]}$) is broadcast across the $m$ rows of $A^{[l-1]} W^{[l]}$.

Remember that $A^{[0]}$ is the input matrix $X$ and $A^{[L]}$ is our prediction $\hat{Y}$.
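
A minimal NumPy sketch of this loop, assuming ReLU hidden activations and a sigmoid output (both hypothetical choices, since $g^{[l]}$ is left generic here):

```python
import numpy as np

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def relu(Z):
    return np.maximum(0.0, Z)

def forward(X, W, b, g):
    """X: m x n[0] inputs; W[l]: n[l-1] x n[l]; b[l]: 1 x n[l]; g[l]: activation of layer l.
    Returns the caches Z[l] and A[l] needed later for backward propagation."""
    L = len(W)
    Z, A = {}, {0: X}                    # A[0] = X
    for l in range(1, L + 1):
        Z[l] = A[l - 1] @ W[l] + b[l]    # m x n[l]; b[l] broadcasts over the m rows
        A[l] = g[l](Z[l])                # A[l] = g[l](Z[l])
    return Z, A                          # A[L] is the prediction Y_hat
```

With the shape sketch above, `g = {1: relu, 2: sigmoid}` completes a two-layer network, and `Z, A = forward(X, W, b, g)` runs the pass.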

Backward Propagation

First compute $dZ^{[L]}$ from the cost; for example, with a sigmoid output and cross-entropy cost, $dZ^{[L]} = \frac{1}{m}\left(A^{[L]} - Y\right)$. Then take $dW^{[L]} = \left(A^{[L-1]}\right)^{T} dZ^{[L]}$ and $db^{[L]} = \sum_{i=1}^{m} dZ^{[L]}_{i,:}$.

For $l=L-1,L-2,…,1$:

$$dZ^{[l]} = \left(dZ^{[l+1]} \left(W^{[l+1]}\right)^{T}\right) \odot g^{[l]\prime}\left(Z^{[l]}\right)$$

$$dW^{[l]} = \left(A^{[l-1]}\right)^{T} dZ^{[l]}$$

$$db^{[l]} = \sum_{i=1}^{m} dZ^{[l]}_{i,:}$$

where $\odot$ is element-wise multiplication and the sums for $db^{[l]}$ run over the $m$ training examples.
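
A matching NumPy sketch of the backward pass, under the same assumed setup (ReLU hidden layers, sigmoid output, cross-entropy cost, so $dZ^{[L]} = \frac{1}{m}(A^{[L]} - Y)$; these choices are assumptions, not fixed by the post):

```python
import numpy as np

def relu_grad(Z):
    # Derivative of ReLU, applied element-wise.
    return (Z > 0).astype(Z.dtype)

def backward(Y, Z, A, W):
    """Y: m x n[L] labels; Z, A: caches from forward propagation; W[l]: n[l-1] x n[l].
    Returns gradients dW[l] (n[l-1] x n[l]) and db[l] (1 x n[l])."""
    L, m = len(W), Y.shape[0]
    dW, db = {}, {}
    dZ = (A[L] - Y) / m                      # dZ[L] for sigmoid output + cross-entropy cost
    dW[L] = A[L - 1].T @ dZ                  # dW[l] = (A[l-1])^T dZ[l]
    db[L] = dZ.sum(axis=0, keepdims=True)    # sum over the m training examples
    for l in range(L - 1, 0, -1):
        dZ = (dZ @ W[l + 1].T) * relu_grad(Z[l])   # chain rule through layer l+1 and g[l]
        dW[l] = A[l - 1].T @ dZ
        db[l] = dZ.sum(axis=0, keepdims=True)
    return dW, db
```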
