Machine Learning — Linear Regression
Linear Regression (Detailed Version)
Definition
Linear Regression is a Supervised Machine Learning algorithm used to predict continuous numerical values by finding a linear relationship between input variables (features) and an output variable (target).
The main idea is:
Find the best possible straight line that explains the relationship between input and output variables.
Examples:
- House price prediction
- Student marks prediction
- Salary prediction
- Temperature forecasting
- Sales prediction
Objective of Linear Regression
The objective of linear regression is:
To minimize prediction error and find the line that best represents the relationship between variables.
Basic Mathematical Equation
For a simple linear regression model:
\[Y=\beta_0+\beta_1X\]
Where:
- Y = Dependent variable (target/output)
- X = Independent variable (feature/input)
- β₀ = Intercept
- β₁ = Coefficient (slope)
Understanding the Equation
Suppose:
Y = 5 + 10X
Interpretation:
- Intercept = 5
- Slope = 10
Meaning:
For every increase of 1 unit in X, Y increases by 10 units.
Example:
| Study Hours (X) | Marks (Y) |
|-----------------|-----------|
| 1               | 15        |
| 2               | 25        |
| 3               | 35        |
| 4               | 45        |
| 5               | 55        |
Model:
Y = 5 + 10X
Prediction:
For X = 6:
Y = 5 + (10 × 6) = 65
Predicted marks:
65
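The prediction above can be checked with a few lines of Python. This is a minimal sketch: the function name `predict` and its default arguments are illustrative choices, with the intercept 5 and slope 10 taken from the model above.

```python
def predict(x, intercept=5.0, slope=10.0):
    """Return the predicted Y for input x using Y = intercept + slope * x."""
    return intercept + slope * x


study_hours = [1, 2, 3, 4, 5]
marks = [15, 25, 35, 45, 55]

# The model reproduces every row of the example table exactly.
fitted = [predict(x) for x in study_hours]
print(fitted)       # [15.0, 25.0, 35.0, 45.0, 55.0]

# Prediction for a value of X that is not in the table.
print(predict(6))   # 65.0
```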
Terminologies in Linear Regression
1. Independent Variable (Feature)
Input variables used for prediction.
Examples:
- Area of house
- Study hours
- Temperature
- Age
2. Dependent Variable (Target)
Variable being predicted.
Examples:
- House price
- Marks
- Salary
3. Coefficient
Shows the effect of an independent variable on the target variable.
Example:
Y=5+8X
Coefficient:
8
Meaning:
Increasing X by 1 increases Y by 8
4. Intercept
Value of the output when all inputs are zero. For example, in Y=5+8X the intercept is 5.
5. Residual (Error)
Difference between actual and predicted value.
Formula:
Residual=Actual-Predicted
Example:
Actual:
80
Predicted:
75
Residual:
80−75=5
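The same calculation as a small Python sketch. The helper name `residual` is an illustrative choice, and the second call uses a hypothetical actual value of 70 to show that a residual can be negative.

```python
def residual(actual, predicted):
    """Residual = actual value minus predicted value."""
    return actual - predicted


print(residual(80, 75))   # 5: the model under-predicted by 5
print(residual(70, 75))   # -5: hypothetical case where the model over-predicted by 5
```

A positive residual means the prediction was too low; a negative residual means it was too high.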
Working of Linear Regression
Step 1: Collect Data
Example:
Study Hours → Marks
Step 2: Preprocess Data
Tasks:
- Handle missing values
- Remove duplicates
- Scale data if needed
- Encode categorical variables
Step 3: Split Data
Typical split:
Training = 80%, Testing = 20%
or
Training = 70%, Testing = 30%
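A split like this could be sketched in plain Python as follows. The helper `split_train_test`, the fixed seed, and the synthetic data are all assumptions made for illustration; libraries such as scikit-learn provide their own splitting utilities.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and return (train, test) lists."""
    rows = list(rows)
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    rng.shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


# Synthetic (study hours, marks) pairs that follow Y = 5 + 10X.
data = [(x, 5 + 10 * x) for x in range(1, 11)]

train, test = split_train_test(data, test_fraction=0.2)
print(len(train), len(test))   # 8 2
```

Shuffling before splitting avoids a test set made up only of the last rows, which matters when the data is stored in sorted order.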
Step 4: Train Model
Algorithm learns the relationship:
Input(X)
↓
Learn coefficients
↓
Generate equation
Step 5: Predict New Data
Example:
Study Hours=7
Predict:
Marks=?
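Steps 4 and 5 can be sketched together. The notes do not say here which fitting method is meant, so this example assumes the closed-form ordinary least squares solution for a single feature; the function name `fit_simple_linear_regression` is hypothetical, and the data is the study-hours table used earlier.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


study_hours = [1, 2, 3, 4, 5]
marks = [15, 25, 35, 45, 55]

# Step 4: learn the coefficients and generate the equation.
intercept, slope = fit_simple_linear_regression(study_hours, marks)
print(intercept, slope)            # 5.0 10.0

# Step 5: predict marks for 7 study hours.
print(intercept + slope * 7)       # 75.0
```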
Step 6: Evaluate Model
Compare predicted values with actual values.
How does Linear Regression find the Best Line?
Many different lines can be drawn through a set of data points.
Linear regression chooses the line that minimizes the total squared error, that is, the sum of squared residuals over all points.
Example (illustration): a scatter of the actual data points, with two candidate lines (Possible Line 1 and Possible Line 2) and the best-fit line, which passes closest to the points overall.
Cost Function
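Linear regression uses Mean Squared Error (MSE) as its cost function. The form below divides by 2m instead of m; the extra factor of 1/2 does not change which parameters minimize the cost and simplifies the derivative used in gradient descent.
\[J(\theta)=\frac{1}{2m}\sum_{i=1}^{m}(y_i-\hat{y}_i)^2\]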
Where:
- J(θ) = Cost function
- m = Number of observations
- yᵢ = Actual value
- ŷᵢ = Predicted value
Goal:
Minimize cost function
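As a rough illustration, the cost for a simple model Y = β₀ + β₁X can be computed directly. The function name `cost` and the second, deliberately wrong set of parameters are assumptions for the example; the data is the study-hours table.

```python
def cost(xs, ys, intercept, slope):
    """J = (1 / (2m)) * sum of squared differences between actual and predicted."""
    m = len(xs)
    squared_errors = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return sum(squared_errors) / (2 * m)


study_hours = [1, 2, 3, 4, 5]
marks = [15, 25, 35, 45, 55]

print(cost(study_hours, marks, intercept=5, slope=10))   # 0.0: the line fits exactly
print(cost(study_hours, marks, intercept=0, slope=10))   # 12.5: every prediction is 5 too low
```

The second call shows how a worse line produces a larger cost, which is what the training procedure tries to drive down.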
Gradient Descent
Gradient Descent updates coefficients repeatedly to reduce error.
Steps:
Initialize parameters
↓
Calculate cost
↓
Calculate gradient
↓
Update parameters
↓
Repeat until minimum cost
Parameter update equation:
\[\theta=\theta-\alpha\frac{\partial J(\theta)}{\partial\theta}\]
Where:
- θ = Parameter
- α = Learning rate
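The update rule can be sketched for a one-feature model. This is not a definitive implementation: the learning rate 0.05, the 5000 iterations, the zero initialization, and the function name `gradient_descent` are all assumptions chosen so that the example converges on the small study-hours data set.

```python
def gradient_descent(xs, ys, learning_rate=0.05, iterations=5000):
    """Fit Y = b0 + b1 * X by gradient descent on J = (1/(2m)) * sum((y - y_hat)^2)."""
    m = len(xs)
    b0, b1 = 0.0, 0.0                      # initialize parameters
    for _ in range(iterations):
        # Prediction errors (y_hat - y) for the current parameters.
        errors = [(b0 + b1 * x) - y for x, y in zip(xs, ys)]
        # Partial derivatives of J with respect to b0 and b1.
        grad_b0 = sum(errors) / m
        grad_b1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Update step: theta = theta - alpha * dJ/dtheta.
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


study_hours = [1, 2, 3, 4, 5]
marks = [15, 25, 35, 45, 55]

b0, b1 = gradient_descent(study_hours, marks)
print(round(b0, 3), round(b1, 3))          # approximately 5.0 and 10.0
```

If the learning rate is too large the updates overshoot and the cost grows instead of shrinking; if it is too small, many more iterations are needed.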
Types of Linear Regression
1. Simple Linear Regression
Uses one independent variable.
\[Y=\beta_0+\beta_1X\]
Example:
Study Hours → Marks
2. Multiple Linear Regression
Uses multiple independent variables.
\[Y=\beta_0+\beta_1X_1+\beta_2X_2+\beta_3X_3+\dots+\beta_nX_n\]
Example:
House price prediction:
Inputs:
- Area
- Number of bedrooms
- Location
- Age of house
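A prediction with several features is just the intercept plus a weighted sum of the inputs. In the sketch below every number is hypothetical: the coefficients are invented for illustration rather than learned from data, and location is assumed to be encoded as a 0/1 city-centre flag because a categorical input must be turned into numbers first.

```python
def predict_price(features, intercept, coefficients):
    """Y = b0 + b1*X1 + b2*X2 + ... + bn*Xn."""
    return intercept + sum(b * x for b, x in zip(coefficients, features))


# Hypothetical coefficients, in the same order as the feature list:
# area (sq ft), number of bedrooms, city-centre flag (0 or 1), age of house (years).
intercept = 50_000.0
coefficients = [120.0, 10_000.0, 25_000.0, -500.0]

house = [1500, 3, 1, 10]
print(predict_price(house, intercept, coefficients))   # 280000.0
```

Each coefficient is read the same way as in the simple case: holding the other inputs fixed, one extra unit of that feature changes the prediction by the coefficient's value.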
Performance Metrics for Linear Regression
1. Mean Absolute Error (MAE)
Measures average absolute difference between actual and predicted values.
\[MAE=\frac{1}{n}\sum_{i=1}^{n}|y_i-\hat{y}_i|\]
Interpretation:
- Lower MAE → Better model
Example:
Actual:
[10,20,30]
Predicted:
[12,18,33]
Absolute errors:
[2,2,3]
MAE:
(2+2+3)/3 ≈ 2.33
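The worked example can be reproduced in Python. The function name `mean_absolute_error` is an illustrative choice for this sketch.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all observations."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [10, 20, 30]
predicted = [12, 18, 33]

print(round(mean_absolute_error(actual, predicted), 2))   # 2.33
```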
2. Mean Squared Error (MSE)
Squares errors before averaging.
\[MSE=\frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2\]
Properties:
- Large errors receive greater penalty
- Sensitive to outliers
3. Root Mean Squared Error (RMSE)
Square root of MSE.
\[RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}\]
Advantages:
- Same unit as target variable
- Easier interpretation
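MSE and RMSE can be computed together, since RMSE is just the square root of MSE. The sketch below reuses the actual and predicted values from the MAE example; the function names are illustrative.

```python
import math


def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of MSE, expressed in the same unit as the target."""
    return math.sqrt(mean_squared_error(actual, predicted))


actual = [10, 20, 30]
predicted = [12, 18, 33]

# Squared errors are 4, 4 and 9, so MSE = 17 / 3.
print(round(mean_squared_error(actual, predicted), 2))        # 5.67
print(round(root_mean_squared_error(actual, predicted), 2))   # 2.38
```

The single error of 3 contributes 9 of the 17 squared-error units, which illustrates why large errors dominate MSE and RMSE more than they dominate MAE.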
4. R² Score (Coefficient of Determination)
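Measures the proportion of variance in the target that is explained by the model.
\[R^2=1-\frac{SS_{res}}{SS_{tot}}\]
Here SS_res = Σ(yᵢ − ŷᵢ)² is the residual sum of squares and SS_tot = Σ(yᵢ − ȳ)² is the total sum of squares around the mean ȳ of the actual values.
Range:
- R² = 1 → Perfect model
- R² = 0 → Poor model (no better than always predicting the mean)
- R² < 0 → Very poor model (worse than always predicting the mean)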
Interpretation:
R²=0.85
means:
Model explains 85% of variability in data
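A small sketch of the R² calculation, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations of the actual values from their mean. The function name `r_squared` is illustrative, and the data reuses the values from the MAE example.

```python
def r_squared(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


actual = [10, 20, 30]

# SS_res = 17 and SS_tot = 200, so R^2 is about 0.915.
print(round(r_squared(actual, [12, 18, 33]), 3))   # 0.915

# Always predicting the mean of the actual values gives R^2 = 0.
print(r_squared(actual, [20, 20, 20]))             # 0.0
```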
5. Adjusted R²
Used in Multiple Linear Regression.
\[\text{Adjusted } R^2=1-(1-R^2)\frac{n-1}{n-p-1}\]
Where:
- n = number of observations
- p = number of features
Purpose:
- Penalizes unnecessary variables
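The penalty can be seen by holding R² fixed and changing the number of features. In this sketch the function name `adjusted_r_squared` and the values n = 100, p = 5 and p = 20 are hypothetical, chosen only to illustrate the formula.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Same R^2 of 0.85 on 100 observations, with different numbers of features.
few_features = adjusted_r_squared(0.85, n=100, p=5)
many_features = adjusted_r_squared(0.85, n=100, p=20)

print(round(few_features, 3))    # 0.842
print(round(many_features, 3))   # 0.812
```

With the same R², the model that needed 20 features scores lower than the one that needed 5, so an extra feature only raises Adjusted R² if it improves R² by enough to offset the penalty.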
Assumptions of Linear Regression
- Linear relationship exists
- Observations (and their errors) are independent of one another
- Errors are normally distributed
- Constant variance exists (Homoscedasticity)
- No multicollinearity among features
Advantages
- Easy to understand and implement
- Fast to train and cheap to compute
- Coefficients are easy to interpret
- Works well for linear relationships
Disadvantages
- Cannot model non-linear data
- Sensitive to outliers
- Assumes linear relationship
- Performance decreases with noisy data
Real-world Applications
- House price prediction
- Sales forecasting
- Demand prediction
- Salary prediction
- Temperature forecasting
- Risk analysis
Complete Workflow
Collect Data
↓
Preprocess Data
↓
Split Data
↓
Train Linear Regression Model
↓
Predict Values
↓
Evaluate Performance
↓
Optimize Model
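The workflow above can be sketched end to end in plain Python. Everything specific here is an assumption made for illustration: the data is synthetic (points around Y = 5 + 10X with bounded random noise), the seed is fixed so the run is repeatable, training uses the closed-form least-squares solution for one feature, and the final "Optimize Model" step is not shown.

```python
import math
import random

# 1. Collect data: synthetic points around Y = 5 + 10X with noise in [-5, 5].
rng = random.Random(0)
data = [(x, 5 + 10 * x + rng.uniform(-5, 5)) for x in range(1, 51)]

# 2. Preprocess: drop exact duplicate rows (none are expected in this synthetic set).
data = list(dict.fromkeys(data))

# 3. Split: shuffle, then hold out 20% of the rows for testing.
rng.shuffle(data)
n_test = round(len(data) * 0.2)
test, train = data[:n_test], data[n_test:]

# 4. Train: closed-form least squares for a single feature.
xs = [x for x, _ in train]
ys = [y for _, y in train]
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# 5. Predict values for the held-out rows.
predictions = [intercept + slope * x for x, _ in test]

# 6. Evaluate performance with RMSE and R^2 on the test set.
actuals = [y for _, y in test]
ss_res = sum((a - p) ** 2 for a, p in zip(actuals, predictions))
mean_actual = sum(actuals) / len(actuals)
ss_tot = sum((a - mean_actual) ** 2 for a in actuals)
rmse = math.sqrt(ss_res / len(actuals))
r2 = 1 - ss_res / ss_tot

print(f"slope={slope:.2f} intercept={intercept:.2f} rmse={rmse:.2f} r2={r2:.3f}")
```

Evaluating on rows the model never saw during training is what makes the reported RMSE and R² an estimate of performance on new data rather than a measure of how well the line memorized the training set.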