Logistic Regression

Motivation: why this matters in quant finance

Logistic regression is the linear model for binary events: default or no default, stress regime or calm regime, fill or no fill, positive return or not. It is often the first serious classifier because it estimates probabilities rather than only labels.

It belongs next to Linear Regression, but it solves a different problem. Linear regression predicts a numerical conditional mean. Logistic regression models the log-odds of an event and trains by likelihood.

The informal idea

Start with a linear score $z=\beta_0+\mathbf{x}^\top\boldsymbol{\beta}$. Since $z$ can be any real number, pass it through the sigmoid

$$\sigma(z)=\frac{1}{1+e^{-z}}$$

to obtain $\mathbb{P}(Y=1\mid\mathbf{x})=\sigma(z)$. Coefficients are log-odds effects, not direct probability changes.
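To make the log-odds reading concrete, here is a minimal sketch with hypothetical parameters (an intercept of -2.0 and a single coefficient of 0.8, neither taken from the text). It suggests that a one-unit increase in the feature always adds the coefficient to the log-odds, while the resulting change in probability depends on the starting point.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def log_odds(p):
    """Inverse of the sigmoid: log(p / (1 - p))."""
    return np.log(p / (1.0 - p))

# Hypothetical parameters, chosen only for illustration.
beta0, beta1 = -2.0, 0.8

x = np.array([0.0, 2.5, 5.0])  # three starting feature values
p_before = sigmoid(beta0 + beta1 * x)
p_after = sigmoid(beta0 + beta1 * (x + 1.0))

# The log-odds shift equals beta1 at every starting point...
log_odds_shift = log_odds(p_after) - log_odds(p_before)
# ...while the probability shift varies with the starting point.
prob_shift = p_after - p_before

print(np.round(log_odds_shift, 3))
print(np.round(prob_shift, 3))
```

The probability shift is largest near a score of zero and shrinks in both tails, which is why a coefficient cannot be quoted as a fixed probability change.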

Formal statement

For labels $y_i\in\{0,1\}$, logistic regression minimises binary cross-entropy:

$$L(\boldsymbol{\theta})=-\frac{1}{n}\sum_{i=1}^n\left[y_i\log p_i+(1-y_i)\log(1-p_i)\right],$$

where $p_i=\sigma(\tilde{\mathbf{x}}_i^\top\boldsymbol{\theta})$, $\tilde{\mathbf{x}}_i=(1,\mathbf{x}_i)$ is the feature vector with a leading 1 for the intercept, and $\boldsymbol{\theta}=(\beta_0,\boldsymbol{\beta})$. Stacking the $\tilde{\mathbf{x}}_i^\top$ as the rows of $\tilde{\mathbf{X}}$, the gradient is

$$\nabla L(\boldsymbol{\theta})=\frac{1}{n}\tilde{\mathbf{X}}^\top(\mathbf{p}-\mathbf{y}).$$

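Unlike linear regression, setting this gradient to zero yields no closed-form solution, so there is no normal equation; the model is trained iteratively, for example by gradient descent.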
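One way to gain confidence in the gradient formula above is to compare it with central finite differences of the loss. The sketch below does this on small random data; the helper names (`bce_loss`, `bce_gradient`, `numerical_gradient`), the sample size, and the step size `eps` are illustrative assumptions rather than part of the original text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(theta, X_design, y):
    """Mean binary cross-entropy L(theta)."""
    p = sigmoid(X_design @ theta)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def bce_gradient(theta, X_design, y):
    """Analytic gradient: X^T (p - y) / n."""
    p = sigmoid(X_design @ theta)
    return X_design.T @ (p - y) / len(y)

def numerical_gradient(theta, X_design, y, eps=1e-6):
    """Central finite differences, one coordinate at a time."""
    grad = np.zeros_like(theta)
    for j in range(len(theta)):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (bce_loss(theta + step, X_design, y)
                   - bce_loss(theta - step, X_design, y)) / (2.0 * eps)
    return grad

# Small random problem, used only to exercise the two gradients.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
X_design = np.c_[np.ones(len(X)), X]  # leading column of ones for the intercept
y = rng.integers(0, 2, size=50).astype(float)
theta = rng.normal(scale=0.5, size=3)

analytic = bce_gradient(theta, X_design, y)
numeric = numerical_gradient(theta, X_design, y)
max_abs_diff = np.max(np.abs(analytic - numeric))
print(max_abs_diff)
```

If the two gradients agree to within finite-difference error, the analytic expression is very likely implemented correctly, which is a cheap check to run before trusting any iterative fit.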

Implementation

    import numpy as np


    class LogisticRegressionGD:
        """Binary logistic regression trained by batch gradient descent."""

        def __init__(self, learning_rate: float = 0.5, n_iter: int = 2_000):
            self.learning_rate = learning_rate
            self.n_iter = n_iter

        @staticmethod
        def _sigmoid(z: np.ndarray) -> np.ndarray:
            return 1.0 / (1.0 + np.exp(-z))

        def fit(self, X: np.ndarray, y: np.ndarray):
            # Prepend a column of ones so theta[0] acts as the intercept.
            X_design = np.c_[np.ones(len(X)), X]
            theta = np.zeros(X_design.shape[1])
            for _ in range(self.n_iter):
                p = self._sigmoid(X_design @ theta)
                # Step against the cross-entropy gradient X^T (p - y) / n.
                theta -= self.learning_rate * (X_design.T @ (p - y) / len(y))
            self.coef_ = theta
            return self


    # Simulate labels from a known logit so the fit can be compared with it.
    rng = np.random.default_rng(11)
    vol = rng.normal(0, 1, size=300)
    momentum = rng.normal(0, 1, size=300)
    logit = -0.4 + 1.2 * vol - 0.7 * momentum
    y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

    model = LogisticRegressionGD().fit(np.c_[vol, momentum], y)
    print(np.round(model.coef_, 3))
    # [-0.258 1.224 -0.755]
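The class above stores `coef_` but does not define a prediction method. As a minimal sketch, a hypothetical `predict_proba` helper (not part of the original class) could score new observations as follows. It writes the sigmoid in its algebraically equivalent tanh form so that large negative scores do not overflow. The `(vol, momentum)` rows in `X_new` are made up for illustration.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid written via tanh; identical to 1 / (1 + exp(-z)) in exact
    arithmetic, but it does not overflow for large negative z."""
    return 0.5 * (1.0 + np.tanh(0.5 * np.asarray(z, dtype=float)))

def predict_proba(coef, X):
    """Hypothetical helper: P(Y=1 | x) for each row of X.

    coef[0] is the intercept and coef[1:] are the feature coefficients,
    matching the layout of coef_ in the class above.
    """
    X = np.atleast_2d(X)
    return stable_sigmoid(coef[0] + X @ coef[1:])

# Coefficients copied from the printed output above.
coef = np.array([-0.258, 1.224, -0.755])

# Illustrative (vol, momentum) observations, not from the text.
X_new = np.array([[0.0, 0.0],
                  [2.0, -1.0],
                  [-1.5, 1.0]])
proba = predict_proba(coef, X_new)
print(np.round(proba, 3))
```

High volatility with negative momentum pushes the score up and the probability toward 1, consistent with the signs of the fitted coefficients.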

Key properties and trade-offs

| Property | Meaning | Finance consequence |
| --- | --- | --- |
| Probability output | Estimates event probability. | Supports threshold choice and expected-loss ranking. |
| Linear boundary | The score is linear unless features are transformed. | Feature engineering matters. |
| Cross-entropy loss | Rewards calibrated probabilities. | Accuracy alone is not enough. |
| Threshold separate from training | Probability and action are different decisions. | Default cutoffs should reflect cost, capital, or risk appetite. |

Worked example: default threshold

A 7% default probability with 60% loss given default implies an expected loss of $0.07\times0.60=4.2\%$ of exposure. The accept/reject threshold depends on economics, not on whether the probability exceeds 50%.
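The arithmetic can be sketched in a few lines. The 3% margin below is a hypothetical figure introduced only to show how an economic cutoff might be derived. The break-even rule (accept when expected loss is below the margin) is a deliberate simplification that ignores funding cost, capital, and recovery timing.

```python
def expected_loss_rate(pd_, lgd):
    """Expected loss per unit of exposure: PD x LGD."""
    return pd_ * lgd

def breakeven_pd(margin, lgd):
    """Default probability at which expected loss equals the margin."""
    return margin / lgd

pd_hat = 0.07   # default probability from the worked example
lgd = 0.60      # loss given default from the worked example
margin = 0.03   # hypothetical margin per unit of exposure (an assumption)

el = expected_loss_rate(pd_hat, lgd)
cutoff = breakeven_pd(margin, lgd)
accept = pd_hat < cutoff

print(f"expected loss: {el:.1%}")
print(f"breakeven PD:  {cutoff:.1%}")
print(f"accept: {accept}")
```

Under these assumed numbers the loan is rejected even though its default probability is far below 50%, because the cutoff implied by the margin sits at a much lower probability.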

Common confusions and pitfalls

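  • "Use MSE because it says regression." For binary labels, log loss (binary cross-entropy) is the likelihood-based objective; the name refers to regressing the log-odds, not to a squared-error fit.
  • "Accuracy is enough." Rare-event finance problems can look accurate while missing the costly class entirely.
  • "The coefficient is a probability change." It is a log-odds change: a one-unit increase in a feature adds its coefficient to the log-odds, and the resulting probability change depends on the starting probability.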
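The accuracy pitfall is easy to reproduce. The sketch below uses an illustrative sample of 1,000 loans with 20 defaults (made-up numbers, not from the text) and a degenerate classifier that never predicts a default. The `accuracy` and `recall` helpers are written out by hand here for self-containment.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Share of observations labelled correctly."""
    return float(np.mean(y_true == y_pred))

def recall(y_true, y_pred):
    """Share of actual events (y = 1) that were flagged."""
    positives = y_true == 1
    return float(np.mean(y_pred[positives] == 1))

# Illustrative sample: 1,000 loans with 20 defaults (a 2% event rate).
y_true = np.zeros(1000, dtype=int)
y_true[:20] = 1

# A degenerate classifier that never predicts a default.
y_pred = np.zeros(1000, dtype=int)

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy: {acc:.3f}, recall on defaults: {rec:.3f}")
```

The classifier scores high on accuracy while catching none of the defaults, which is why rare-event problems are usually judged on recall, log loss, or cost-weighted metrics instead.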

Where this goes next

Logistic Regression | q4quant.studio