
Support Vector Machine (SVM) (Detailed Version)

Definition

Support Vector Machine (SVM) is a Supervised Machine Learning algorithm used for:

Classification problems
Regression problems

SVM works by finding the best boundary (hyperplane) that separates different classes with the maximum possible margin.

The main idea is:

Find a decision boundary that separates the classes while maximizing its distance to the nearest data points of each class.

Examples:

Spam detection
Face recognition
Cancer diagnosis
Handwritten digit recognition
Image classification

Why SVM?

Suppose we have two classes:

○ = Class A
× = Class B

Data:

×      ×

×      ×

-------------------

○      ○

○      ○

Many lines can separate classes:

Line 1
Line 2
Line 3

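SVM chooses:

The line with the maximum margin

because a boundary that stays far from both classes is less sensitive to small changes in the data and tends to generalize better to new observations.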

Basic Working of SVM

Input Data
      ↓
Find separating boundary
      ↓
Maximize margin
      ↓
Create optimal hyperplane
      ↓
Classify new observations

Important Components of SVM

1. Hyperplane

A hyperplane is the decision boundary separating classes.

For two dimensions:

w_1x_1+w_2x_2+b=0

Where:

(w_1,w_2) = weights
(x_1,x_2) = input variables
(b) = bias

Example:

×    ×

------------------ Hyperplane

○    ○
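As a minimal sketch of how the equation is used: a point is assigned to a side by the sign of w_1x_1+w_2x_2+b. The weights and bias below are made-up values for illustration, not fitted from data, and the function names are hypothetical.

```python
def decision_value(w, x, b):
    """Return w·x + b; zero means the point lies on the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def side_of_hyperplane(w, x, b):
    """Classify a point by the sign of its decision value."""
    value = decision_value(w, x, b)
    if value > 0:
        return "positive side"
    if value < 0:
        return "negative side"
    return "on the hyperplane"


# Hypothetical hyperplane x_1 + x_2 - 5 = 0 (illustrative values only)
w = (1.0, 1.0)
b = -5.0

print(side_of_hyperplane(w, (4.0, 4.0), b))  # 4 + 4 - 5 = 3
print(side_of_hyperplane(w, (1.0, 2.0), b))  # 1 + 2 - 5 = -2
print(side_of_hyperplane(w, (2.0, 3.0), b))  # 2 + 3 - 5 = 0
```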

2. Support Vectors

Support vectors are the data points of each class that lie closest to the hyperplane.

Example:

×       ×

      ×

------------------

      ○

○       ○

Nearest points:

× and ○

Only these points determine the position of the hyperplane. Moving or removing points that lie farther away leaves it unchanged.
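One way to see which points are nearest: the perpendicular distance from a point x to the hyperplane is |w·x+b| / ||w||. The sketch below assumes a hypothetical hyperplane and toy points (none of these values come from the text) and picks the closest point in each class.

```python
import math


def distance_to_hyperplane(w, x, b):
    """Perpendicular distance |w·x + b| / ||w|| from point x to the hyperplane."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(dot + b) / norm


# Hypothetical horizontal hyperplane x_2 = 3, i.e. 0*x_1 + 1*x_2 - 3 = 0
w, b = (0.0, 1.0), -3.0

# Toy points as (coordinates, class); "A" plays the role of ○, "B" of ×
points = [
    ((1.0, 6.0), "B"), ((3.0, 4.0), "B"), ((5.0, 6.0), "B"),
    ((1.0, 0.0), "A"), ((3.0, 2.0), "A"), ((5.0, 0.0), "A"),
]

# Keep the nearest point of each class: these are the support-vector candidates
nearest = {}
for x, label in points:
    d = distance_to_hyperplane(w, x, b)
    if label not in nearest or d < nearest[label][1]:
        nearest[label] = (x, d)

for label, (x, d) in nearest.items():
    print(label, x, d)
```

In this toy layout the two middle points, one per class, are the ones a margin would rest on.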

3. Margin

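Margin is the distance between the hyperplane and the nearest support vectors on either side. The total margin width is the gap between the two classes' support vectors, measured perpendicular to the hyperplane.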

Support Vector
        |
        |← Margin →
        |
------------------ Hyperplane
        |
        |← Margin →
        |
Support Vector

Goal:

Maximum Margin

Mathematical Objective of SVM

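SVM tries to maximize the total margin width, that is, the distance between the two margin boundaries w\cdot x+b=+1 and w\cdot x+b=-1:

Margin=\frac{2}{||w||}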

Goal:

Maximize Margin

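Equivalent optimization (maximizing 2/||w|| is the same as minimizing ||w||):

\min_{w,b}\frac{1}{2}||w||^2

Subject to, for every training point i:

y_i(w\cdot x_i+b)\ge1

Where:

(x_i) = feature vector of training point i
(y_i) = class label, either +1 or -1

This form assumes the classes can be separated perfectly (hard margin).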
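To make the objective concrete, the sketch below compares candidate hyperplanes on a toy dataset. Among the candidates that satisfy y_i(w·x_i+b) ≥ 1, the one with the smaller ||w|| has the wider margin 2/||w||, and shrinking w too far breaks the constraints. All data and weights are hypothetical.

```python
import math


def margin_width(w):
    """Total margin 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


def satisfies_constraints(w, b, X, y):
    """Check y_i (w·x_i + b) >= 1 for every training point."""
    return all(
        yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 1
        for xi, yi in zip(X, y)
    )


# Toy 2-D data with labels in {-1, +1} (illustrative values only)
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 4.0), (1.0, 4.0)]
y = [-1, -1, +1, +1]

# Three candidates that all describe the same boundary x_2 = 2
candidates = {
    "large w": ((0.0, 1.0), -2.0),    # ||w|| = 1
    "smaller w": ((0.0, 0.5), -1.0),  # ||w|| = 0.5
    "too small": ((0.0, 0.25), -0.5), # ||w|| = 0.25
}

for name, (w, b) in candidates.items():
    print(name, margin_width(w), satisfies_constraints(w, b, X, y))
```

The second candidate is the best feasible one here: its margin of 4 equals the full gap between the two rows of points.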

Working of SVM (Step by Step)

Step 1: Collect data

Example:

Study Hours    Attendance    Result
2              50            Fail
3              55            Fail
6              85            Pass
8              90            Pass

Step 2: Plot data

Pass(○)      ○

             ○

--------------------

×

× Fail

Step 3: Find support vectors

The observations of each class that lie closest to the other class are selected as support vectors.

Step 4: Create optimal hyperplane

Choose boundary with largest margin.

Step 5: Predict new observations

Example:

Study Hours=7
Attendance=80

Predict:

Pass
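The five steps can be traced by hand on the four observations from Step 1. The sketch below assumes the rows are (2, 50, Fail), (3, 55, Fail), (6, 85, Pass) and (8, 90, Pass). It uses a shortcut that is only justified for a tiny dataset like this one: take the closest Fail/Pass pair as support vectors, place the hyperplane as their perpendicular bisector, and check the constraint y_i(w·x_i+b) ≥ 1 on every point. If the check passes, no wider margin is possible, because the margin can never exceed the distance between the closest opposite-class points. This is not a general SVM solver, and the helper names are illustrative.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def sq_dist(u, v):
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v))


# Step 1: (study hours, attendance) with labels -1 = Fail, +1 = Pass
X = [(2.0, 50.0), (3.0, 55.0), (6.0, 85.0), (8.0, 90.0)]
y = [-1, -1, +1, +1]

fails = [x for x, label in zip(X, y) if label == -1]
passes = [x for x, label in zip(X, y) if label == +1]

# Step 3: the closest opposite-class pair is the support-vector candidate
x_neg, x_pos = min(
    ((f, p) for f in fails for p in passes),
    key=lambda pair: sq_dist(*pair),
)

# Step 4: scale w so that w·x_pos + b = +1 and w·x_neg + b = -1
d = [p - n for p, n in zip(x_pos, x_neg)]
scale = 2.0 / dot(d, d)
w = [scale * di for di in d]
b = 1.0 - dot(w, x_pos)

# Sanity check: every training point satisfies y_i (w·x_i + b) >= 1
tol = 1e-9  # tolerance for floating-point rounding
feasible = all(yi * (dot(w, xi) + b) >= 1 - tol for xi, yi in zip(X, y))


# Step 5: classify a new observation by the sign of w·x + b
def predict(x):
    return "Pass" if dot(w, x) + b > 0 else "Fail"


print(x_neg, x_pos, feasible, predict((7.0, 80.0)))
```

Under these assumptions the support vectors are (3, 55) and (6, 85), and the new observation (7, 80) falls on the Pass side, matching the prediction above.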

What if data is not linearly separable?

Real-world data often looks like:

○ ○ ○

× ○ ×

○ × ○

No straight line can separate this.

SVM uses Kernel Functions.

Kernel Trick

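The kernel trick lets SVM behave as if the data had been mapped into a higher-dimensional space, where a separating hyperplane may exist, without computing the mapping explicitly. A kernel function K(x_i,x_j) returns the inner product of the two mapped points directly from the original inputs.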

Original Data
       ↓
Transform dimensions
       ↓
Find hyperplane
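A small numerical check can make this concrete. For 2-D inputs, the degree-2 polynomial kernel (x·z)^2 gives the same number as an ordinary dot product after mapping each point to three dimensions with φ(x) = (x_1^2, √2·x_1x_2, x_2^2). The mapping and the sample points are chosen for illustration only.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def phi(x):
    """Explicit map from 2-D to 3-D that corresponds to the kernel (x·z)^2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly2_kernel(x, z):
    """Degree-2 polynomial kernel with c = 0, computed in the original 2-D space."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, 4.0)

explicit = dot(phi(x), phi(z))  # dot product after mapping to 3-D
implicit = poly2_kernel(x, z)   # same quantity, computed without the mapping

print(explicit, implicit)
print(math.isclose(explicit, implicit))
```

The two values agree up to floating-point rounding, which is why the kernel can stand in for the explicit transformation.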

Types of Kernels

1. Linear Kernel

Used for linearly separable data.

Formula:

K(x_i,x_j)=x_i^Tx_j

Applications:

Text classification
Spam detection

2. Polynomial Kernel

Used when relationships are curved.

Formula:

K(x_i,x_j)=(x_i^Tx_j+c)^d

Where:

(c) = constant offset
(d) = degree of the polynomial

Applications:

Image processing

3. Radial Basis Function (RBF) Kernel

Most commonly used kernel.

Formula:

K(x_i,x_j)=e^{-\gamma||x_i-x_j||^2}

Applications:

Complex datasets

4. Sigmoid Kernel

Formula:

K(x_i,x_j)=\tanh(\alpha x_i^Tx_j+c)

Applications:

Neural-network-like behavior
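The four formulas translate directly into code. This is a plain-Python sketch for a single pair of vectors. The default parameter values (c, d, gamma, alpha) are arbitrary choices for illustration, not recommendations.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def linear_kernel(xi, xj):
    return dot(xi, xj)


def polynomial_kernel(xi, xj, c=1.0, d=2):
    return (dot(xi, xj) + c) ** d


def rbf_kernel(xi, xj, gamma=0.5):
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(xi, xj, alpha=0.1, c=0.0):
    return math.tanh(alpha * dot(xi, xj) + c)


xi, xj = (1.0, 2.0), (2.0, 0.0)

print(linear_kernel(xi, xj))      # 1*2 + 2*0 = 2
print(polynomial_kernel(xi, xj))  # (2 + 1)^2 = 9
print(rbf_kernel(xi, xj))         # exp(-0.5 * 5), squared distance is 5
print(sigmoid_kernel(xi, xj))     # tanh(0.1 * 2)
```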

Types of SVM

1. Linear SVM

Uses a straight-line hyperplane.

Example:

Spam / Not Spam

2. Nonlinear SVM

Uses kernels.

Example:

Image Classification

3. Support Vector Regression (SVR)

Used for regression problems.

Example:

House Price Prediction

Hyperparameters in SVM

1. C Parameter

Controls penalty for incorrect classification.

Example:

C=100

High C:

Low training error
Higher overfitting risk

Low C:

Wider margin that tolerates some misclassified training points
Less overfitting, but possible underfitting if C is too low
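One common way to write the role of C is the soft-margin objective, (1/2)||w||^2 + C·Σ max(0, 1 − y_i(w·x_i+b)), where the second term penalizes points that violate the margin. That formulation is not spelled out in this text, so treat the sketch as an assumption about what "penalty" means here. It compares two hand-picked boundaries on hypothetical 1-D data containing one outlier. They are candidates only, not the true optimum for either value of C.

```python
def soft_margin_objective(w, b, X, y, C):
    """0.5 * w^2 + C * (sum of hinge losses), for 1-D inputs."""
    hinge = sum(max(0.0, 1.0 - yi * (w * xi + b)) for xi, yi in zip(X, y))
    return 0.5 * w * w + C * hinge


# Toy 1-D data; the point at 2.5 is a class -1 outlier close to class +1
X = [0.0, 1.0, 2.5, 3.0, 4.0]
y = [-1, -1, -1, +1, +1]

wide = (1.0, -2.0)     # boundary at x = 2: wide margin, the outlier violates it
narrow = (4.0, -11.0)  # boundary at x = 2.75: narrow margin, no violations

preferred = {}
for C in (1.0, 100.0):
    obj_wide = soft_margin_objective(*wide, X, y, C)
    obj_narrow = soft_margin_objective(*narrow, X, y, C)
    preferred[C] = "wide" if obj_wide < obj_narrow else "narrow"
    print(C, obj_wide, obj_narrow, preferred[C])
```

With C=1 the wide boundary has the lower objective even though it misclassifies the outlier. With C=100 the violation becomes so expensive that the narrow boundary wins.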

2. Gamma

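Controls how far the influence of a single training point reaches (the \gamma in the RBF kernel formula). A larger value means a shorter reach.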

Example:

gamma=0.1

High Gamma:

Complex boundaries

Low Gamma:

Smooth boundaries
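The effect of gamma can be read off the RBF formula K = e^{-\gamma||x_i-x_j||^2}: the larger gamma is, the faster similarity falls off with distance. The sketch evaluates the kernel at a few distances for gamma=0.1 (the example value above) and for a larger, arbitrarily chosen gamma=10.

```python
import math


def rbf_similarity(distance, gamma):
    """RBF kernel value for two points that are `distance` apart."""
    return math.exp(-gamma * distance ** 2)


for gamma in (0.1, 10.0):
    values = [round(rbf_similarity(d, gamma), 4) for d in (0.0, 0.5, 1.0, 2.0)]
    print(gamma, values)
```

With gamma=10, two points one unit apart already have a similarity below 0.001, so each training point affects only its immediate neighborhood and the boundary can bend around individual points. With gamma=0.1 the same pair still has a similarity above 0.9, which gives smoother boundaries.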

Performance Metrics for Classification SVM

Accuracy

Accuracy=\frac{TP+TN}{TP+TN+FP+FN}

Precision

Precision=\frac{TP}{TP+FP}

Recall

Recall=\frac{TP}{TP+FN}

F1 Score

F1=2\times\frac{Precision\times Recall}{Precision+Recall}
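The four formulas above can be computed directly from confusion-matrix counts. The counts below are hypothetical, and the function does not guard against division by zero (for example, when there are no positive predictions).

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for a binary classifier evaluated on 100 test samples
m = classification_metrics(tp=40, tn=45, fp=5, fn=10)

for name, value in m.items():
    print(name, round(value, 4))
```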

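ROC-AUC

Area under the ROC curve, which plots the true positive rate against the false positive rate across all decision thresholds.

Interpretation: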

AUC = 1 → Perfect

AUC = 0.5 → Random

Performance Metrics for SVR

MAE

MAE=\frac{1}{n}\sum|y_i-\hat y_i|

MSE

MSE=\frac{1}{n}\sum(y_i-\hat y_i)^2

RMSE

RMSE=\sqrt{\frac{1}{n}\sum(y_i-\hat y_i)^2}

R² Score

R^2=1-\frac{SS_{res}}{SS_{tot}}
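The regression metrics can be checked on a small example. The prices and predictions below are made up for illustration. SS_res is taken as the sum of squared errors and SS_tot as the sum of squared deviations of the true values from their mean.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, MSE, RMSE, R^2) for paired true and predicted values."""
    n = len(y_true)
    errors = [yt - yp for yt, yp in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_y = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    r2 = 1 - ss_res / ss_tot
    return mae, mse, rmse, r2


# Hypothetical house prices (in thousands) and model predictions
y_true = [200.0, 250.0, 300.0, 350.0]
y_pred = [210.0, 240.0, 310.0, 340.0]

mae, mse, rmse, r2 = regression_metrics(y_true, y_pred)
print(mae, mse, rmse, round(r2, 3))
```

Every prediction here is off by exactly 10, so MAE and RMSE coincide. They diverge when some errors are much larger than others, because squaring weights large errors more heavily.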

Advantages

Works well with high-dimensional data
Effective with small datasets
Handles complex boundaries using kernels
Lower risk of overfitting, because maximizing the margin acts as a form of regularization

Disadvantages

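Slow to train on very large datasets, since kernel SVM training time grows faster than linearly with the number of samples
Difficult to interpret, especially with nonlinear kernels
Choosing a suitable kernel can be difficult
Requires tuning of hyperparameters such as C and gamma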

Real-world Applications

Face recognition
Text classification
Disease prediction
Spam detection
Handwriting recognition
Image classification

Complete Workflow

Collect Data
      ↓
Preprocess Data
      ↓
Select Kernel
      ↓
Find Support Vectors
      ↓
Build Hyperplane
      ↓
Classify Data
      ↓
Evaluate Performance

One-line summary

Support Vector Machine (SVM) is a supervised learning algorithm that finds the optimal hyperplane with maximum margin to separate classes and can use kernels to handle non-linear data.
