Logistic Regression
Motivation: why this matters in quant finance
Logistic regression is the linear model for binary events: default or no default, stress regime or calm regime, fill or no fill, positive return or not. It is often the first serious classifier because it estimates probabilities rather than only labels.
It belongs next to Linear Regression, but it solves a different problem. Linear regression predicts a numerical conditional mean. Logistic regression models the log-odds of an event and trains by likelihood.
The informal idea
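Start with a linear score $z = \theta^\top x$, where $x$ includes a leading 1 so that $\theta_0$ is the intercept. Since $z$ can be any real number, pass it through the sigmoid
$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
to obtain a probability $p = \sigma(z) \in (0, 1)$. Inverting the sigmoid gives $\log\frac{p}{1 - p} = z$, so coefficients are log-odds effects, not direct probability changes.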
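To make the log-odds reading concrete, here is a minimal sketch using hypothetical coefficients (an intercept of -0.4 and a slope of 1.2, chosen only for illustration). It suggests that a one-unit increase in the feature multiplies the odds by the same factor, `exp(slope)`, at every starting point, while the change in probability varies with the starting point.

```python
import numpy as np


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical coefficients, chosen only for illustration.
intercept, slope = -0.4, 1.2

# Three starting points for the feature, then the same points shifted by one unit.
x = np.array([-2.0, 0.0, 2.0])
p_before = sigmoid(intercept + slope * x)
p_after = sigmoid(intercept + slope * (x + 1.0))

odds_before = p_before / (1.0 - p_before)
odds_after = p_after / (1.0 - p_after)

# The odds ratio is identical at every starting point: exp(slope), about 3.32.
odds_ratio = odds_after / odds_before

# The probability change is not constant; here it is largest for the middle point.
prob_change = p_after - p_before

print(np.round(odds_ratio, 3))
print(np.round(prob_change, 3))
```

Reporting `exp(coefficient)` as an odds ratio is usually safer than quoting a single "probability effect", which only holds at one particular starting point.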
Formal statement
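For labels $y_i \in \{0, 1\}$, logistic regression minimises binary cross-entropy:
$$J(\theta) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right]$$
where $p_i = \sigma(\theta^\top x_i)$. The gradient is
$$\nabla_\theta J(\theta) = \frac{1}{n} X^\top (p - y)$$
where $X$ is the design matrix and $p$ is the vector of predicted probabilities. Setting this gradient to zero has no closed-form solution, so there is no normal equation; the model is trained iteratively.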
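One way to sanity-check the gradient of the cross-entropy loss, which takes the form `X.T @ (p - y) / n`, is to compare it with central finite differences. The sketch below does this on small synthetic data; the helper names `log_loss` and `log_loss_grad`, the data, and the test point `theta` are all illustrative assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(theta, X, y):
    """Mean binary cross-entropy for design matrix X and labels y in {0, 1}."""
    p = sigmoid(X @ theta)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))


def log_loss_grad(theta, X, y):
    """Analytical gradient of the mean cross-entropy: X^T (p - y) / n."""
    p = sigmoid(X @ theta)
    return X.T @ (p - y) / len(y)


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = np.c_[np.ones(50), rng.normal(size=(50, 2))]  # first column is the intercept
y = rng.binomial(1, 0.5, size=50).astype(float)
theta = np.array([0.1, -0.2, 0.3])

# Central finite differences, perturbing one coordinate at a time.
eps = 1e-6
numeric = np.array([
    (log_loss(theta + eps * e, X, y) - log_loss(theta - eps * e, X, y)) / (2 * eps)
    for e in np.eye(len(theta))
])

max_abs_diff = np.max(np.abs(numeric - log_loss_grad(theta, X, y)))
print(max_abs_diff)
```

A small `max_abs_diff` indicates that the analytical and numerical gradients agree at this point; it is a check on the formula, not a proof of convergence.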
Implementation
import numpy as np


class LogisticRegressionGD:
    """Binary logistic regression trained by batch gradient descent."""

    def __init__(self, learning_rate: float = 0.5, n_iter: int = 2_000):
        self.learning_rate = learning_rate
        self.n_iter = n_iter

    @staticmethod
    def _sigmoid(z: np.ndarray) -> np.ndarray:
        return 1.0 / (1.0 + np.exp(-z))

    def fit(self, X: np.ndarray, y: np.ndarray):
        # Prepend a column of ones so theta[0] acts as the intercept.
        X_design = np.c_[np.ones(len(X)), X]
        theta = np.zeros(X_design.shape[1])
        for _ in range(self.n_iter):
            p = self._sigmoid(X_design @ theta)
            # Gradient of the mean cross-entropy: X^T (p - y) / n.
            theta -= self.learning_rate * (X_design.T @ (p - y) / len(y))
        self.coef_ = theta  # [intercept, one coefficient per feature]
        return self


# Simulate labels from a known logit so the fit can be compared with the truth.
rng = np.random.default_rng(11)
vol = rng.normal(0, 1, size=300)
momentum = rng.normal(0, 1, size=300)
logit = -0.4 + 1.2 * vol - 0.7 * momentum
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

model = LogisticRegressionGD().fit(np.c_[vol, momentum], y)
print(np.round(model.coef_, 3))
# [-0.258 1.224 -0.755]

Key properties and trade-offs
| Property | Meaning | Finance consequence |
|---|---|---|
| Probability output | Estimates event probability. | Supports threshold choice and expected-loss ranking. |
| Linear boundary | The score is linear unless features are transformed. | Feature engineering matters. |
| Cross-entropy loss | Rewards calibrated probabilities. | Accuracy alone is not enough. |
| Threshold separate from training | Probability and action are different decisions. | Default cutoffs should reflect cost, capital, or risk appetite. |
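The "accuracy alone is not enough" point in the table can be illustrated with simulated labels. The sketch below assumes a hypothetical 2% event rate and a degenerate "model" that never predicts the event; both are illustrative choices, not taken from real data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical 2% event rate; labels are simulated, not real defaults.
y = rng.binomial(1, 0.02, size=n)

# A degenerate "model" that always predicts the majority class (no event).
y_pred = np.zeros(n, dtype=int)

# Accuracy: share of all cases labelled correctly.
accuracy = np.mean(y_pred == y)

# Recall: share of true events that were flagged.
recall = np.sum((y_pred == 1) & (y == 1)) / np.sum(y == 1)

print(f"accuracy: {accuracy:.3f}, recall: {recall:.3f}")
```

Accuracy lands close to 98% while recall is exactly zero, which is why rare-event problems are usually judged on log loss, recall, or cost-weighted metrics rather than accuracy.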
Worked example: default threshold
A 7% default probability with 60% loss given default implies expected loss $0.07 \times 0.60 = 0.042$, or 4.2% of exposure. The accept/reject threshold depends on economics, not on whether the probability exceeds 50%.
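The arithmetic in this example can be written out as a short sketch. The 5% margin below is a hypothetical figure that does not appear in the example, and the accept rule is a deliberately simplified one-period comparison that ignores funding costs, capital charges, and the fact that margin is not earned in full on a defaulted loan.

```python
pd_ = 0.07  # probability of default (from the worked example)
lgd = 0.60  # loss given default (from the worked example)

# Expected loss per unit of exposure.
expected_loss = pd_ * lgd

# Hypothetical: the lender earns a 5% margin on exposure.
margin = 0.05

# Simplified rule: accept when the margin covers the expected loss.
# The break-even default probability is where expected loss equals margin.
break_even_pd = margin / lgd
accept = expected_loss < margin

print(f"expected loss: {expected_loss:.3f}")  # expected loss: 0.042
print(f"break-even PD: {break_even_pd:.4f}")  # break-even PD: 0.0833
print(f"accept: {accept}")                    # accept: True
```

Under these assumptions the cutoff sits near an 8.3% default probability, far below 50%, and it moves whenever the margin or the loss given default changes.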
Common confusions and pitfalls
"Use MSE because it says regression." For binary labels, log loss is the likelihood-based objective.
"Accuracy is enough." Rare-event finance problems can look accurate while missing the costly class.
"The coefficient is a probability change." It is a log-odds change.
Where this goes next
- Regularisation: L1 vs L2: explains the penalties commonly used with logistic models.
- Support Vector Machine (SVM): learns a margin rather than calibrated probabilities.
- Decision Tree: captures nonlinear threshold rules.
- Cross-Validation: selects penalties and thresholds without using the test set.
References
- Aurélien Géron (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (2nd ed.). O'Reilly. Ch. 3 (Classification) and Ch. 4 (Training Models, section on Logistic Regression).
- Andrew Ng and Tengyu Ma (2023). CS229 Lecture Notes. Ch. 2 (Classification and Logistic Regression) and Ch. 3 (Generalized Linear Models).