
Support Vector Machine (SVM)

Motivation: why this matters in quant finance

Support vector machines are margin-based classifiers. They are useful when the dataset is medium-sized, the boundary matters more than probability calibration, and the right transformed feature space can separate regimes. Market-regime classification and anomaly screening can fit this pattern.

SVMs sit between Logistic Regression and more flexible nonlinear models. Like logistic regression, a linear SVM learns a separating hyperplane. Unlike logistic regression, it focuses on the points near the boundary and maximises a margin.

The informal idea

If two classes are separable, many lines may split them. The SVM chooses the line with the widest street between classes. The observations touching the street are support vectors; they determine the boundary.

Soft-margin SVMs allow violations. This is essential in finance because labels are noisy and regimes are rarely perfectly separable.
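To make the street picture concrete, here is a minimal sketch on synthetic data (two Gaussian blobs chosen purely for illustration, not market data). It fits a linear `SVC` at a small and a large `C` and records the street width, which for a linear SVM is 2 divided by the Euclidean norm of the weight vector, together with the number of support vectors. The names `widths` and `n_support` are this example's own.

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated Gaussian blobs (synthetic, illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(-2.0, 0.5, size=(50, 2)),
    rng.normal(+2.0, 0.5, size=(50, 2)),
])
y = np.array([-1] * 50 + [1] * 50)

widths = {}
n_support = {}
for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    w = clf.coef_[0]                           # weight vector of the hyperplane
    widths[C] = 2.0 / np.linalg.norm(w)        # distance between the two margin lines
    n_support[C] = len(clf.support_)           # points on or inside the street
    print(f"C={C}: street width={widths[C]:.2f}, support vectors={n_support[C]}")
```

A small `C` tolerates more violations, so the street should come out wider and rest on more support vectors; a large `C` approaches the hard-margin solution, where only the few points touching the street matter.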

Formal statement

For labels $y_i\in\{-1,1\}$, the soft-margin SVM solves

$$\min_{\mathbf{w},b,\boldsymbol{\xi}} \frac{1}{2}\lVert\mathbf{w}\rVert_2^2 + C\sum_{i=1}^n \xi_i$$

subject to $y_i(\mathbf{w}^\top\mathbf{x}_i+b)\geq 1-\xi_i$ and $\xi_i\geq 0$. The hinge-loss form is

$$\frac{1}{2}\lVert\mathbf{w}\rVert_2^2 + C\sum_i \max(0,1-y_i(\mathbf{w}^\top\mathbf{x}_i+b)).$$
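The hinge-loss objective can be evaluated directly for any candidate hyperplane. The sketch below does this in NumPy for four hand-picked points; the helper name `soft_margin_objective` and the toy data are assumptions made for illustration. Each hinge term equals the smallest slack that satisfies the constraints for that point, which is why the two formulations agree.

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    """Return the hinge-loss objective and the per-point slack for (w, b)."""
    margins = y * (X @ w + b)                 # y_i (w^T x_i + b)
    slack = np.maximum(0.0, 1.0 - margins)    # hinge term, equal to the minimal slack
    return 0.5 * np.dot(w, w) + C * slack.sum(), slack

# Toy data: four points on a line, labels in {-1, +1}.
X = np.array([[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0], [0.2, 0.0]])
y = np.array([1, 1, -1, -1])
w = np.array([1.0, 0.0])
b = 0.0

objective, slack = soft_margin_objective(w, b, X, y, C=1.0)
print(slack)       # one slack value per observation
print(objective)   # regulariser plus C times total slack
```

A slack of zero means the point is on or outside the street, a slack between 0 and 1 means it is inside the street but on the correct side, and a slack above 1 means it is misclassified. Here the second point sits inside the street and the fourth is on the wrong side of the boundary.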

A kernel $K(\mathbf{x},\mathbf{z})=\phi(\mathbf{x})^\top\phi(\mathbf{z})$ gives nonlinear boundaries without explicitly constructing $\phi$.
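As a small check of the kernel identity, consider the homogeneous degree-2 polynomial kernel in two dimensions, $K(\mathbf{x},\mathbf{z})=(\mathbf{x}^\top\mathbf{z})^2$, whose explicit feature map is $\phi(\mathbf{x})=(x_1^2,\sqrt{2}\,x_1x_2,\,x_2^2)$. This kernel is chosen here only because its feature map is short enough to write out; the function names `phi` and `poly2_kernel` are this example's own.

```python
import numpy as np

def phi(x):
    """Explicit feature map for the degree-2 homogeneous polynomial kernel in 2D."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

def poly2_kernel(x, z):
    """Same inner product, computed without building phi."""
    return float(np.dot(x, z)) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

explicit = float(phi(x) @ phi(z))   # inner product in the 3D feature space
implicit = poly2_kernel(x, z)       # kernel evaluated in the original 2D space
print(explicit, implicit)           # the two agree up to floating-point rounding
```

For the RBF kernel the corresponding feature space is infinite-dimensional, so the explicit route is not available at all and only the kernel evaluation is practical.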

Implementation

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic two-class data with a curved boundary.
X, y = make_moons(n_samples=400, noise=0.18, random_state=31)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=31, stratify=y
)

# Scale the features, then fit an RBF-kernel soft-margin SVM.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=5.0, gamma=1.0))
svm.fit(X_train, y_train)
print(round(svm.score(X_test, y_test), 3))  # 0.967
```

The scaler is part of the model pipeline because margin geometry depends on feature units.
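The dependence on units can be seen without fitting anything. The sketch below uses four made-up observations with two hypothetical features on very different scales (a daily volatility in decimals and a volume in shares) and measures how much each feature contributes to the squared distance between the first two rows, before and after `StandardScaler`. The data and the helper `sq_dist_share` are assumptions for illustration only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical features: column 0 = daily volatility (decimal), column 1 = volume (shares).
X = np.array([
    [0.010, 1_000_000.0],
    [0.030, 1_000_500.0],
    [0.011, 1_200_000.0],
    [0.029, 1_199_000.0],
])

def sq_dist_share(X):
    """Fraction of the squared distance between rows 0 and 1 due to each column."""
    d2 = (X[0] - X[1]) ** 2
    return d2 / d2.sum()

raw_share = sq_dist_share(X)
scaled_share = sq_dist_share(StandardScaler().fit_transform(X))
print("raw:   ", raw_share)      # volume dominates purely because of its units
print("scaled:", scaled_share)   # volatility dominates, matching the actual difference
```

Rows 0 and 1 differ threefold in volatility and hardly at all in volume, yet in raw units nearly all of their distance comes from volume. Distances and inner products drive both the margin and the RBF kernel, so fitting the scaler inside the pipeline keeps that geometry consistent between training and prediction.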

Key properties and trade-offs

| Property | Meaning | Finance consequence |
| --- | --- | --- |
| Margin maximisation | Chooses a wide boundary. | Useful for robust small-data regime boundaries. |
| Support vectors | Boundary-near points drive the solution. | Outliers near the boundary matter. |
| Kernel trick | Nonlinear boundaries via inner products. | Flexible, but hyperparameters become critical. |
| No native probabilities | Scores are margins, not calibrated probabilities. | Calibrate if probabilities drive decisions. |
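The last property can be illustrated with scikit-learn. The sketch below reuses the moons setup from the implementation section, reads the raw signed margins from `decision_function`, and then wraps an identical pipeline in `CalibratedClassifierCV` with sigmoid (Platt) scaling to obtain probabilities. Sigmoid scaling and 5-fold cross-validation are choices made for this example, not the only reasonable ones.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.18, random_state=31)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=31, stratify=y
)

# Raw SVM output: signed distances to the boundary, not bounded to [0, 1].
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=5.0, gamma=1.0))
svm.fit(X_train, y_train)
scores = svm.decision_function(X_test)

# Calibrated wrapper: fits a sigmoid on held-out margins within each CV fold.
calibrated = CalibratedClassifierCV(
    make_pipeline(StandardScaler(), SVC(kernel="rbf", C=5.0, gamma=1.0)),
    method="sigmoid",
    cv=5,
)
calibrated.fit(X_train, y_train)
proba = calibrated.predict_proba(X_test)[:, 1]

print("margin range:     ", scores.min(), scores.max())
print("probability range:", proba.min(), proba.max())
```

The sign of a margin gives the predicted class and its magnitude gives a rough sense of confidence, but only the calibrated output is on a probability scale. Whether those probabilities are well calibrated on a given dataset still needs to be checked, for example with a reliability plot.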

Worked example: regime boundary

A classifier using realised volatility and trend strength may need a curved stress boundary. An RBF SVM can carve that boundary. If the desk needs calibrated default probabilities, logistic regression or calibrated tree models may be more appropriate.
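A purely synthetic sketch of this situation follows. The two features stand in for standardised realised volatility and trend strength, and the label rule (stress when a point lies outside a circle around the calm centre) is a hypothetical choice made only to produce a curved boundary; it is not an empirical claim about markets. The comparison between a linear and an RBF SVM is the point of the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-ins for two standardised features (not real market data).
rng = np.random.default_rng(7)
n = 600
realised_vol = rng.normal(size=n)
trend_strength = rng.normal(size=n)
X = np.column_stack([realised_vol, trend_strength])

# Hypothetical label rule: "stress" outside a circle around the origin.
# The radius is chosen so that roughly half the points are labelled stress.
y = (realised_vol ** 2 + trend_strength ** 2 > 2.0 * np.log(2.0)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=7, stratify=y
)

linear_svm = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
rbf_svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=5.0, gamma=1.0))

lin_acc = linear_svm.fit(X_train, y_train).score(X_test, y_test)
rbf_acc = rbf_svm.fit(X_train, y_train).score(X_test, y_test)
print(f"linear: {lin_acc:.3f}, rbf: {rbf_acc:.3f}")
```

No straight line separates the inside of a circle from the outside, so the linear SVM should do little better than guessing while the RBF SVM can follow the curve. With real regime labels the gap would be smaller and noisier, and `C` and `gamma` would need to be tuned by cross-validation that respects time ordering.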

Common confusions and pitfalls

  • "The largest margin always generalises best." The margin must be balanced against violations through $C$: a wider street bought with many violations can underfit.
  • "The kernel removes feature engineering." It adds flexibility, not economic meaning.
  • "SVM outputs are probabilities." The raw score is a signed margin, not a calibrated probability.

Where this goes next

References

  • Aurelien Geron (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (2nd ed.). O'Reilly. Ch. 5 (Support Vector Machines).
  • Andrew Ng and Tengyu Ma (2023). CS229 Lecture Notes. Ch. 5 (Kernel Methods) and Ch. 6 (Support Vector Machines).
  • Deisenroth, Faisal, and Ong (2020). Mathematics for Machine Learning. Ch. 12 (Classification with Support Vector Machines).