# Forward and Backward Propagation in Binary Logistic Regression

We use forward propagation to compute the prediction and the cost, and backward propagation (‘backprop’) to compute the derivatives needed for gradient descent.

# Notations

$X$: training examples stacked top to bottom ($R^{m\times n}$)

$x^{(i)}, a^{(i)}, z^{(i)}$: the row vector ($x^{(i)}$) or scalar ($a^{(i)}$, $z^{(i)}$) corresponding to example $i$

$w$: weight vector ($R^{n\times 1}$)

$b$: bias ($R$)

$z$: output of linear transformation ($R^{m\times 1}$)

$a$: prediction ($R^{m\times 1}$)

$J$: cost ($R$)
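To make the notation concrete, here is a minimal sketch of the forward pass in plain Python. It assumes the usual logistic regression setup, $z = Xw + b$ followed by a sigmoid, $a = \sigma(z)$, with the cross-entropy cost. The function names `sigmoid` and `forward` and the toy data are made up for illustration.

```python
import math

def sigmoid(z):
    """Logistic function, mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(X, y, w, b):
    """Return the predictions a (length m) and the mean cross-entropy cost J.

    X is a list of m rows with n features each, y a list of m labels (0 or 1),
    w a list of n weights, and b a scalar bias.
    """
    m = len(X)
    # Linear transformation: z_i = x_i . w + b for each example
    z = [sum(x_j * w_j for x_j, w_j in zip(x, w)) + b for x in X]
    # Sigmoid activation (assumed) gives the predicted probability per example
    a = [sigmoid(z_i) for z_i in z]
    # Cost: mean of the per-example cross-entropy losses
    J = -sum(y_i * math.log(a_i) + (1 - y_i) * math.log(1 - a_i)
             for y_i, a_i in zip(y, a)) / m
    return a, J

# Toy data (hypothetical): m = 3 examples, n = 2 features
X = [[1.0, 2.0], [0.5, -1.0], [-1.5, 0.3]]
y = [1, 0, 1]
a, J = forward(X, y, w=[0.0, 0.0], b=0.0)
print(a, J)
```

With all-zero parameters every prediction is $0.5$, so the cost equals $\log 2$ regardless of the labels, which is a handy sanity check.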

# Backward Propagation

We calculate $dw$ and $db$, the derivatives of the cost with respect to $w$ and $b$, with backpropagation.

## When $m=1$

We first look at backprop for the case of a single example, then generalize it to cases where $m\geq1$.

When $m=1$, the cost is called the ‘loss’: $L(w,b)=-y\log(a)-(1-y)\log(1-a)$
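The first two steps of the chain rule can be sketched as follows, assuming the prediction is a sigmoid of the linear output, $a=\sigma(z)$ with $z = xw + b$, so that $\frac{da}{dz}=a(1-a)$. First differentiate the loss with respect to $a$, then with respect to $z$:

$$\frac{dL}{da} = -\frac{y}{a} + \frac{1-y}{1-a}$$

$$\frac{dL}{dz} = \frac{dL}{da}\frac{da}{dz} = \left(-\frac{y}{a} + \frac{1-y}{1-a}\right)a(1-a) = -y(1-a) + (1-y)a = a-y$$

Since $\frac{dz}{db}=1$, this also gives $\frac{dL}{db} = a-y$.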

### Backprop Third Step

The third step gives $\frac{dL}{dw} = x(a-y)$. Note that $x\in R^{1\times n}$ and $a-y$ is a scalar, so the resulting $\frac{dL}{dw} \in R^{1\times n}$.
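As a sanity check on the single-example result, the sketch below compares the analytic gradients $x(a-y)$ and $a-y$ against central finite differences of the loss. It assumes a sigmoid activation. The helper names (`sigmoid`, `predict`, `loss`, `gradients`) and the numbers are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """a = sigmoid(x . w + b) for one example x (list of n features)."""
    return sigmoid(sum(x_j * w_j for x_j, w_j in zip(x, w)) + b)

def loss(x, y, w, b):
    """Cross-entropy loss for a single example."""
    a = predict(x, w, b)
    return -y * math.log(a) - (1 - y) * math.log(1 - a)

def gradients(x, y, w, b):
    """Analytic gradients: dL/dw = x (a - y) and dL/db = a - y."""
    a = predict(x, w, b)
    dw = [x_j * (a - y) for x_j in x]
    db = a - y
    return dw, db

# One hypothetical example with n = 3 features
x, y = [1.0, -2.0, 0.5], 1
w, b = [0.1, 0.2, -0.3], 0.05
dw, db = gradients(x, y, w, b)

# Central finite differences as an independent numerical check
eps = 1e-6
num_dw = []
for j in range(len(w)):
    w_plus = list(w)
    w_minus = list(w)
    w_plus[j] += eps
    w_minus[j] -= eps
    num_dw.append((loss(x, y, w_plus, b) - loss(x, y, w_minus, b)) / (2 * eps))
num_db = (loss(x, y, w, b + eps) - loss(x, y, w, b - eps)) / (2 * eps)

print(dw, num_dw)
print(db, num_db)
```

The analytic and numerical values should agree to several decimal places. A mismatch usually points to a sign error or a missing factor in the derivation.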

## Generalized BackProp

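Our cost is the mean of the losses over all examples: $J(w,b)=\frac{1}{m}\sum_{i=1}^{m}L^{(i)}(w,b)$, where $L^{(i)}$ is the loss of example $i$.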

Now that we know $\frac{dL}{dw} = x(a-y)$ and $\frac{dL}{db} = a-y$, we can calculate the generalized version of backprop.
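Because the cost is the mean of the per-example losses, averaging the single-example gradients gives one way to write the general case. Here $dw$ is transposed so that it has the same shape as $w$, that is $R^{n\times 1}$:

$$dw = \frac{1}{m}X^{T}(a-y), \qquad db = \frac{1}{m}\sum_{i=1}^{m}\left(a^{(i)}-y^{(i)}\right)$$

A minimal plain-Python sketch of this averaging, assuming a sigmoid activation (the function name `backward` and the toy data are made up for illustration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def backward(X, y, w, b):
    """Return dw (length n) and db, averaged over the m rows of X."""
    m, n = len(X), len(w)
    # Forward pass: prediction for every example
    a = [sigmoid(sum(x_j * w_j for x_j, w_j in zip(x, w)) + b) for x in X]
    # Per-example error term a_i - y_i (this is dL/dz for each example)
    err = [a_i - y_i for a_i, y_i in zip(a, y)]
    # dw_j = (1/m) * sum_i X[i][j] * err[i], i.e. column j of X dotted with err
    dw = [sum(X[i][j] * err[i] for i in range(m)) / m for j in range(n)]
    # db = (1/m) * sum_i err[i]
    db = sum(err) / m
    return dw, db

# Toy data (hypothetical): m = 3 examples, n = 2 features
X = [[1.0, 2.0], [0.5, -1.0], [-1.5, 0.3]]
y = [1, 0, 1]
dw, db = backward(X, y, w=[0.0, 0.0], b=0.0)
print(dw, db)
```

The explicit loops are kept to mirror the sums in the formula. With an array library the same computation would typically be a single matrix product of $X^{T}$ with $(a-y)$ divided by $m$.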
