Why Assumptions Matter More Than the Equation

Linear Regression is one of the first topics we encounter when we start learning Machine Learning. Because of that, many people treat it as something to quickly "get through" before moving on to more complex models. Often, it is labeled as a basic model to learn and then move beyond.

Is that a mistake? Yes. Absolutely. Because if you truly understand Linear Regression, you already understand the core of Machine Learning.

Why So?

To understand this properly, we need to step back and examine what Linear Regression really is, how it's used in practice, and the intuition behind the mathematics.

What Linear Regression Really Is

In theory, Linear Regression is a supervised machine learning algorithm used to predict a continuous numerical value (e.g., price, age, temperature) based on the linear relationship between a dependent variable and one or more independent variables.

That definition is correct. But it doesn't tell us much. At its core, Linear Regression is not about prediction first. It's about describing a relationship between variables in the simplest possible way.

Dependent and Independent Variables (Without the Jargon)

The variable we want to predict is called the dependent variable. The variables we use to make that prediction are called independent variables.

In a house price prediction problem:

- The price of the house is the dependent variable
- Features like size, number of bedrooms, and location are independent variables

So far, nothing surprising. But the real insight comes when we visualize the data.

What the Model Is Actually Trying to Do

If we plot the data points on a scatter plot, Linear Regression tries to draw a straight line that best represents the overall trend in the data. Not a perfect line. Not a line that passes through every point. A line that is "wrong in the least possible way."

That immediately raises an important question: how do we decide which line is "best"?

Can We Fit Any Line We Want?

No. The line is not chosen by intuition or convenience. It is chosen by error.

For every data point:

- The model makes a prediction
- We measure how far that prediction is from the actual value

This difference is called the error. For example, if a house actually sold for $300,000 and the line predicts $280,000, the error at that point is $20,000. When we calculate this error for all data points and combine them, we get the total error of the model.

Linear Regression simply asks: which line produces the smallest total error across all data points?

That's it. No magic. No intelligence. Just optimization.

Where Linear Regression Came From (And Why That Matters)

Linear Regression is not a pure Machine Learning concept. Long before Machine Learning existed, it was a workhorse of statistics, and it mattered there far more than its simplicity suggests.

Originally Used to Explain, Not to Predict

In statistics, the main purpose of linear regression was to understand the relationships in the data; better predictions were a byproduct of that understanding. Statisticians used Linear Regression to answer many questions about their data. The model forces us to think about assumptions, uncertainty, and interpretability, the very concepts we tend to ignore in modern machine learning.

Designed for a Simpler World

Linear Regression is a model that makes strong assumptions in exchange for clarity. It was introduced as a simple, explainable, and mathematically tractable model for the small, noisy, easily collected datasets of its time. Because of this, the model makes assumptions such as: relationships are linear and stable, noise behaves nicely (errors are independent with roughly constant variance), and inputs don't fight each other (no strong multicollinearity).
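To make those assumptions less abstract, here is a minimal sketch, entirely my own illustration on synthetic data rather than anything from the original article, that probes two of them: whether the residuals look like well-behaved noise, and whether two inputs are "fighting each other." All variable names and the data-generating process are made up for the example.

```python
import numpy as np

# Hypothetical synthetic data: bedrooms is deliberately derived from size,
# so the two features are strongly correlated.
rng = np.random.default_rng(42)
n = 200
size = rng.uniform(50, 200, n)                  # house size (m^2)
bedrooms = size / 40 + rng.normal(0, 0.5, n)    # correlated with size
X = np.c_[size, bedrooms]
y = 3.0 * size + 10.0 * bedrooms + rng.normal(0, 20, n)

# Ordinary least squares with an intercept column.
Xb = np.c_[np.ones(n), X]
coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
preds = Xb @ coef
residuals = y - preds

# "Noise behaves nicely": residuals should center on zero and have
# similar spread for small and large predictions (constant variance).
order = np.argsort(preds)
low_half = residuals[order[: n // 2]]
high_half = residuals[order[n // 2:]]
print("residual mean:", residuals.mean())
print("residual std (low preds): ", low_half.std())
print("residual std (high preds):", high_half.std())

# "Inputs don't fight each other": a feature correlation near +/-1
# means the individual coefficients are hard to trust.
print("corr(size, bedrooms):", np.corrcoef(size, bedrooms)[0, 1])
```

A residual mean near zero and similar spreads in both halves are consistent with well-behaved noise; a feature correlation near 1, which this data deliberately has, warns that the coefficients on size and bedrooms are not individually meaningful even when the predictions themselves look fine.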
Even with these strong assumptions, Linear Regression remains a powerful model.

Why Linear Regression Still Matters in Modern ML

Nowadays we work with far more powerful models, so the obvious question comes to everyone's mind: why do we still need Linear Regression? What does it bring to the table that complex models can't?

Linear Regression is preferred because it's an honest model. A linear model:

- exposes data leakage quickly
- highlights weak or irrelevant features
- sets a performance baseline that keeps expectations realistic

If a complex model performs only marginally better than a linear one, that's a signal, not a success.

Linear Regression also remains valuable when:

- interpretability matters more than raw accuracy
- the data is limited or noisy
- decisions need to be explained, not just optimized

In many real systems, a slightly less accurate but explainable model is the correct choice.

The Math: Only What You Need to Think Clearly

When we think about the math behind algorithms, we expect to be presented with big formulas and equations. In the case of Linear Regression, that isn't necessary. Linear Regression makes predictions using a weighted sum of inputs, and those weights are adjusted to reduce the error:

$\hat{y} = w_1 x_1 + w_2 x_2 + \dots + w_k x_k + b$

The loss function matters the most in Linear Regression. For every prediction, the model measures how wrong it is, and those errors are combined into a single number, the mean squared error, that represents how well the model is performing:

$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$

where $y_i$ is the actual value and $\hat{y}_i$ is the model's prediction for the $i$-th data point.

Linear Regression From Scratch

Implementing Linear Regression from scratch helps you see what actually happens under the hood. There are two main ways to do it:

Closed-Form Solution

The simplest way is the closed-form solution, which calculates the best-fit line directly using a formula from linear algebra.

```python
import numpy as np

class LinearRegressionClosed:
    def __init__(self):
        self.coef_ = None
        self.intercept_ = None

    def fit(self, X, y):
        X = np.asarray(X)
        y = np.asarray(y)
        # Prepend a column of ones so the intercept is learned as a weight.
        Xb = np.c_[np.ones((X.shape[0], 1)), X]
        # Normal equation: the weights that minimize the squared error.
        A = np.linalg.inv(Xb.T @ Xb) @ Xb.T @ y
        self.intercept_ = A[0]
        self.coef_ = A[1:]

    def predict(self, X):
        X = np.asarray(X)
        return X @ self.coef_ + self.intercept_
```

This method solves for the exact line that minimizes error. It's fast and elegant for small datasets, but it doesn't scale well to very large data. If you want to see this tested on real data, check my GitHub: linear-regression from scratch closed form

Gradient Descent

The other approach is iterative: using gradient descent to gradually find the best line.

```python
import pandas as pd
from sklearn.datasets import fetch_california_housing

def gradient_descent(m_now, b_now, points, alpha):
    x = points['MedInc'].values
    y = points['MedHouseVal'].values
    n = len(x)
    y_pred = m_now * x + b_now
    # Gradients of the mean squared error with respect to slope and intercept.
    m_gradient = -(2 / n) * sum(x * (y - y_pred))
    b_gradient = -(2 / n) * sum(y - y_pred)
    # Step against the gradient to reduce the error.
    m = m_now - alpha * m_gradient
    b = b_now - alpha * b_gradient
    return m, b

# California housing data; its column names match those used above.
df = fetch_california_housing(as_frame=True).frame

m = 0
b = 0
alpha = 0.01
epochs = 1000

for i in range(epochs):
    m, b = gradient_descent(m, b, df, alpha)
    if i % 100 == 0:
        print(f"Epoch {i}: m={m:.4f}, b={b:.4f}")

print("\nFinal values:")
print(f"m = {m}")
print(f"b = {b}")
```

Gradient descent teaches you the process of learning: start with a guess → measure error → adjust gradually until it fits. This is exactly what all Machine Learning models do, even deep neural networks. If you want to see this tested on real data, check my GitHub: linear-regression-from-scratch-in-python
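As a quick sanity check, and this comparison is my own addition rather than part of the original post, you can verify a from-scratch implementation against scikit-learn's LinearRegression; on the same data, the normal equation above and the library should agree to numerical precision.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical synthetic data: y = 2*x1 - 3*x2 + 5 plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 2 * X[:, 0] - 3 * X[:, 1] + 5 + rng.normal(0, 0.1, 100)

# From-scratch closed form (same normal equation as above).
Xb = np.c_[np.ones((X.shape[0], 1)), X]
A = np.linalg.inv(Xb.T @ Xb) @ Xb.T @ y

# Library reference implementation.
ref = LinearRegression().fit(X, y)

print("scratch intercept, coefs:", A[0], A[1:])
print("sklearn intercept, coefs:", ref.intercept_, ref.coef_)
```

If the two ever disagree on real data, the usual suspects are a missing intercept column or a nearly singular Xᵀ X, in which case np.linalg.pinv or np.linalg.lstsq is a safer choice than a plain matrix inverse.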
That's a Wrap!

Linear Regression is not important because it's simple. It's important because it teaches you how models think, how data behaves, and where assumptions quietly shape results.

In this newsletter, I've tried to explain Linear Regression in my own way: why it's useful, why it's interesting, and why understanding it well changes the way you approach Machine Learning. Once you see that, Machine Learning stops feeling like a collection of algorithms and starts feeling like a system you can reason about.

Resources & Further Reading

If you want to explore Linear Regression in more depth, here are some helpful resources:

Tutorials & Courses:

- Google ML Crash Course: Linear Regression. Great for intuition and interactive exercises.
- YouTube: Linear Regression Explained. Concise, visual explanation.
- Linear Regression (Wikipedia)

Research & Articles:

- Fisher, R.A. (1922). On the Interpretation of χ² from Contingency Tables, and the Calculation of P. A classic paper from the statistical tradition that regression grew out of.
- Seber, G.A.F., & Lee, A.J. (2012). Linear Regression Analysis. A comprehensive reference for the assumptions and theory of regression.

Pro Tip: Even if you don't read the papers in full, skimming the introductions and conclusions can give you insight into why Linear Regression works the way it does and why it's still relevant in modern ML.

Cheers,
Samith Chimminiyan