Logistic Regression Derivatives (Simple Explanation)
This page explains how to compute derivatives in logistic regression using a computation graph, broken down into easy-to-understand steps.
🧬 Goal
We want to find out how to adjust the weights and bias in a logistic regression model so it gets better at making predictions. We use something called derivatives (slopes) to guide the learning process.
🔄 Forward Pass (Prediction)
1. Compute the score z:
z = w_1 x_1 + w_2 x_2 + b
This is a weighted sum of the inputs.
2. Apply the sigmoid function:
a = \sigma(z) = \frac{1}{1 + e^{-z}}
This turns the score into a probability a (e.g., how likely is this email to be spam?).
3. Compute the loss:
\mathcal{L}(a, y) = -\big(y \log a + (1 - y) \log(1 - a)\big)
This cross-entropy loss tells us how wrong our prediction a is, compared to the true label y.
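As a minimal sketch of this forward pass in Python (NumPy, with the cross-entropy loss above; the function name and the input values are just illustrative):

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def forward(x1, x2, w1, w2, b, y):
    z = w1 * x1 + w2 * x2 + b                           # step 1: weighted sum
    a = sigmoid(z)                                      # step 2: probability
    loss = -(y * np.log(a) + (1 - y) * np.log(1 - a))   # step 3: cross-entropy loss
    return z, a, loss

# One training example with made-up numbers (true label y = 1).
z, a, loss = forward(x1=2.0, x2=-1.0, w1=0.5, w2=0.3, b=0.1, y=1)
print(z, a, loss)  # z = 0.8, a ≈ 0.69, loss ≈ 0.37
```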
🔄 Backward Pass (Learning)
We now work backwards to calculate how much each part contributed to the error.
Step 1: From Loss to Sigmoid Output
\frac{\partial \mathcal{L}}{\partial a} = -\frac{y}{a} + \frac{1 - y}{1 - a}
This tells us how much the loss changes when a changes.
Step 2: From Sigmoid Output to z
\frac{\partial a}{\partial z} = a(1 - a)
This is the derivative of the sigmoid function. So:
\frac{\partial \mathcal{L}}{\partial z} = \frac{\partial \mathcal{L}}{\partial a} \cdot \frac{\partial a}{\partial z}
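Multiplying these two factors and simplifying gives a compact form (a standard identity when the sigmoid is paired with the cross-entropy loss):
\frac{\partial \mathcal{L}}{\partial z} = \left(-\frac{y}{a} + \frac{1 - y}{1 - a}\right) \cdot a(1 - a) = -y(1 - a) + (1 - y)a = a - y
This simple a - y form is what the code sketches below use.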
Step 3: From z to Weights and Bias
Since z = w_1 x_1 + w_2 x_2 + b, we have:
\frac{\partial z}{\partial w_1} = x_1\qquad
\frac{\partial z}{\partial w_2} = x_2\qquad
\frac{\partial z}{\partial b} = 1
So:
\frac{\partial \mathcal{L}}{\partial w_1} = \frac{\partial \mathcal{L}}{\partial z} \cdot x_1
\frac{\partial \mathcal{L}}{\partial w_2} = \frac{\partial \mathcal{L}}{\partial z} \cdot x_2
\frac{\partial \mathcal{L}}{\partial b} = \frac{\partial \mathcal{L}}{\partial z}
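A small sketch of this backward pass for a single sample (the helper name and the example numbers are illustrative; `dz` stands for ∂L/∂z):

```python
def backward(x1, x2, a, y):
    """Gradients of the cross-entropy loss w.r.t. w1, w2, b for one sample."""
    dz = a - y          # dL/dz (dL/da * da/dz simplifies to a - y)
    dw1 = dz * x1       # dL/dw1
    dw2 = dz * x2       # dL/dw2
    db = dz             # dL/db
    return dw1, dw2, db

# Example with made-up values: prediction a = 0.7 for a positive example (y = 1).
dw1, dw2, db = backward(x1=2.0, x2=-1.0, a=0.7, y=1)
print(dw1, dw2, db)  # -0.6, 0.3, -0.3
```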
🎓 Gradient Descent Update
We update our weights and bias using these gradients:
w_1 := w_1 - \alpha \cdot \frac{\partial \mathcal{L}}{\partial w_1}
w_2 := w_2 - \alpha \cdot \frac{\partial \mathcal{L}}{\partial w_2}
b := b - \alpha \cdot \frac{\partial \mathcal{L}}{\partial b}
where \alpha is the learning rate (controls the step size).
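As a sketch of this update step (the learning rate of 0.1 and the parameter and gradient values are made-up examples):

```python
alpha = 0.1                        # learning rate (example value)
w1, w2, b = 0.5, 0.3, 0.1          # current parameters (made-up values)
dw1, dw2, db = -0.6, 0.3, -0.3     # gradients from the backward pass above

# Each parameter moves a small step against its gradient.
w1 = w1 - alpha * dw1   # 0.5 - 0.1 * (-0.6) = 0.56
w2 = w2 - alpha * dw2   # 0.3 - 0.1 * ( 0.3) = 0.27
b  = b  - alpha * db    # 0.1 - 0.1 * (-0.3) = 0.13
```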
✅ Summary
- Forward pass: Make a prediction using weights and sigmoid.
- Loss: Measure how wrong the prediction was.
- Backward pass: Compute how much each weight and bias contributed to the error.
- Update: Adjust weights and bias to reduce future errors.
This is the heart of training a logistic regression model using derivatives and gradient descent.
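Putting the four bullets together, a toy training loop might look like the following self-contained sketch (one made-up sample, zero-initialized parameters, and an example learning rate, not a recipe for real data):

```python
import numpy as np

# Toy training loop on a single example: forward -> loss -> backward -> update.
x1, x2, y = 2.0, -1.0, 1     # inputs and true label (made-up numbers)
w1, w2, b = 0.0, 0.0, 0.0    # start from zero parameters
alpha = 0.1                  # learning rate

for step in range(100):
    # Forward pass: score, probability, loss.
    z = w1 * x1 + w2 * x2 + b
    a = 1.0 / (1.0 + np.exp(-z))
    loss = -(y * np.log(a) + (1 - y) * np.log(1 - a))
    # Backward pass: gradients via dL/dz = a - y.
    dz = a - y
    dw1, dw2, db = dz * x1, dz * x2, dz
    # Gradient descent update.
    w1 -= alpha * dw1
    w2 -= alpha * dw2
    b  -= alpha * db

print(a, loss)  # the prediction approaches y = 1 and the loss shrinks
```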