Logistic Regression Derivatives (Simple Explanation)
This page explains how to compute derivatives in logistic regression using a computation graph, broken down into easy-to-understand steps.
🧬 Goal​
We want to find out how to adjust the weights and bias in a logistic regression model so it gets better at making predictions. We use something called derivatives (slopes) to guide the learning process.
🔄 Forward Pass (Prediction)​
1. Compute the score z:
z = w_1 x_1 + w_2 x_2 + b
This is a weighted sum of the inputs.
2. Apply the sigmoid function:​
a = \sigma(z) = \frac{1}{1 + e^{-z}}
This turns the score into a probability a (e.g., how likely is this email to be spam?).
3. Compute the loss:​
\mathcal{L}(a, y)
This tells us how wrong our prediction a is compared to the true label y.
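To make these three steps concrete, here is a minimal NumPy sketch of the forward pass for a single example. It assumes the usual binary cross-entropy loss \mathcal{L}(a, y) = -(y \log a + (1 - y) \log(1 - a)) (the derivative used in the backward pass below corresponds to this loss), and the input and parameter values are made up purely for illustration.

```python
import numpy as np

def sigmoid(z):
    """Squash a real-valued score into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-z))

# Made-up values for a single training example with two features.
x1, x2 = 1.5, -0.3           # inputs
w1, w2, b = 0.8, -1.2, 0.1   # current weights and bias
y = 1.0                      # true label (e.g., 1 = spam, 0 = not spam)

# 1. Weighted sum of the inputs
z = w1 * x1 + w2 * x2 + b

# 2. Sigmoid turns the score into a probability
a = sigmoid(z)

# 3. Binary cross-entropy loss: how wrong is the prediction a?
loss = -(y * np.log(a) + (1 - y) * np.log(1 - a))

print(f"z = {z:.4f}, a = {a:.4f}, loss = {loss:.4f}")
```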
🔄 Backward Pass (Learning)​
We now work backwards to calculate how much each part contributed to the error.
Step 1: From Loss to Sigmoid Output​
\frac{\partial \mathcal{L}}{\partial a} = -\frac{y}{a} + \frac{1 - y}{1 - a}
This tells us how much the loss changes when a changes.
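Continuing the forward-pass sketch above (so a and y hold the values computed there), this derivative is a single line of code:

```python
# dL/da for the cross-entropy loss, using a and y from the forward-pass sketch
da = -y / a + (1 - y) / (1 - a)
```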
Step 2: From Sigmoid Output to z​
\frac{\partial a}{\partial z} = a(1 - a)
This is the derivative of the sigmoid function. So:
\frac{\partial \mathcal{L}}{\partial z} = \frac{\partial \mathcal{L}}{\partial a} \cdot \frac{\partial a}{\partial z}
Multiplying the two expressions above and simplifying gives the compact result \frac{\partial \mathcal{L}}{\partial z} = a - y.
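Still using the hypothetical values from the sketch, the chain rule is just a multiplication, and you can check numerically that the product collapses to a - y:

```python
# da/dz: the sigmoid derivative, written in terms of the already-computed a
da_dz = a * (1 - a)

# Chain rule: dL/dz = dL/da * da/dz
dz = da * da_dz

# For sigmoid + cross-entropy this simplifies to a - y
print(dz, a - y)  # both values agree up to floating-point rounding
```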
Step 3: From z to Weights and Bias​
Since z = w_1 x_1 + w_2 x_2 + b, we have:
\frac{\partial z}{\partial w_1} = x_1 \qquad \frac{\partial z}{\partial w_2} = x_2 \qquad \frac{\partial z}{\partial b} = 1
So:
\frac{\partial \mathcal{L}}{\partial w_1} = \frac{\partial \mathcal{L}}{\partial z} \cdot x_1
\frac{\partial \mathcal{L}}{\partial w_2} = \frac{\partial \mathcal{L}}{\partial z} \cdot x_2
\frac{\partial \mathcal{L}}{\partial b} = \frac{\partial \mathcal{L}}{\partial z}
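And the final links of the chain, again reusing dz, x1, and x2 from the sketches above:

```python
# Gradients with respect to the weights and the bias
dw1 = dz * x1
dw2 = dz * x2
db = dz
```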
🎓 Gradient Descent Update​
We update our weights and bias using these gradients:
w_1 := w_1 - \alpha \cdot \frac{\partial \mathcal{L}}{\partial w_1}
w_2 := w_2 - \alpha \cdot \frac{\partial \mathcal{L}}{\partial w_2}
b := b - \alpha \cdot \frac{\partial \mathcal{L}}{\partial b}
where \alpha is the learning rate (it controls the step size).
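Putting all of the pieces together, here is a self-contained sketch of the full loop: forward pass, backward pass, and update, repeated on one toy example. The data, the starting parameters, and the learning rate of 0.1 are arbitrary choices for illustration, not values from this page.

```python
import numpy as np

def sigmoid(z):
    """Squash a real-valued score into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-z))

# Toy setup: one training example with two features (all values are made up).
x1, x2, y = 1.5, -0.3, 1.0
w1, w2, b = 0.0, 0.0, 0.0    # start from zero parameters
alpha = 0.1                  # learning rate

for step in range(100):
    # Forward pass: score -> probability
    z = w1 * x1 + w2 * x2 + b
    a = sigmoid(z)

    # Backward pass: dL/dz = a - y for sigmoid + cross-entropy
    dz = a - y
    dw1, dw2, db = dz * x1, dz * x2, dz

    # Gradient descent update
    w1 -= alpha * dw1
    w2 -= alpha * dw2
    b -= alpha * db

print(f"w1 = {w1:.3f}, w2 = {w2:.3f}, b = {b:.3f}, last prediction a = {a:.3f}")
```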
✅ Summary​
- Forward pass: Make a prediction using weights and sigmoid.
- Loss: Measure how wrong the prediction was.
- Backward pass: Compute how much each weight and bias contributed to the error.
- Update: Adjust weights and bias to reduce future errors.
This is the heart of training a logistic regression model using derivatives and gradient descent.