Some months ago, I built a machine learning model to predict diabetes risk using the Pima Indians Diabetes dataset. It was a standard student project: a Jupyter notebook, some Scikit-learn code, and a Random Forest Classifier. It worked perfectly on my laptop.
But when I wanted to deploy it so a friend could actually use it, I hit a wall.
The standard advice on the internet is usually "Just Dockerize it and run it on AWS Fargate" or "Spin up an EC2 instance and run a Flask server."
For a student developer in Nigeria, that advice is dangerous.
The "Always-On" Problem

An EC2 instance is like a generator you leave running 24/7 just in case someone wants to flip a light switch once a week. You pay for every second it runs.
If I deployed my Diabetes Predictor on an EC2 t3.medium, it would cost me money even at 3:00 AM when nobody was using it; on-demand, that instance works out to roughly $30 a month. For a student operating on a budget with unpredictable exchange rates, "low cost" isn't just a preference; it's a requirement.
I needed an architecture that was "lazy"—one that only wakes up when it has work to do.
Step 1: The "Easy" Part (Training the Model)

First, I trained the model using a Random Forest Classifier. This part was straightforward. I used the standard dataset from the UCI Machine Learning Repository.
Here is the core of the training logic I used:
# Import Libraries
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score
import joblib

# Load Dataset & Preprocess
data_url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
columns = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness',
           'Insulin', 'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome']
df = pd.read_csv(data_url, header=None, names=columns)

X = df.drop('Outcome', axis=1)
y = df['Outcome']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Model Training
# Bundling the scaler and the classifier into one Pipeline means the saved
# .pkl scales incoming features itself, so the deployed API can't accidentally
# feed the model unscaled inputs.
model = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=100, random_state=42)
)
model.fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
The most important part for deployment:

joblib.dump(model, "diabetes_predictor.pkl")

Most tutorials stop here. They tell you to save the model and... good luck. But a .pkl file on my hard drive doesn't help anyone.
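It does let you sanity-check the artifact before shipping it, though. A minimal sketch that reloads the pickle and scores one made-up patient record:

import joblib
import pandas as pd

# Reload the artifact exactly the way the server eventually will
model = joblib.load("diabetes_predictor.pkl")

# One made-up patient record, in the same column order used for training
sample = pd.DataFrame(
    [[2, 138, 62, 35, 0, 33.6, 0.127, 47]],
    columns=['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness',
             'Insulin', 'BMI', 'DiabetesPedigreeFunction', 'Age'],
)
print(model.predict(sample))  # array([0]) or array([1])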
Step 2: The Pivot to Serverless

I decided to rip out the idea of a dedicated server. I didn't need an OS; I just needed a place to run model.predict().
Here is the architecture I designed to keep costs at effectively $0.00:
The Storage: I uploaded my diabetes_predictor.pkl file to an Amazon S3 bucket. (Cost: Pennies).
The Compute: I wrote a Python function using AWS Lambda.
The Front Door: I connected it to Amazon API Gateway to give it a public URL.
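Getting the model into S3 is a one-time step. A minimal sketch with boto3; the bucket name is a placeholder (bucket names are globally unique), and create_bucket as written assumes the default us-east-1 region:

import boto3

s3 = boto3.client("s3")

# Placeholder bucket name -- must be globally unique across all of S3
s3.create_bucket(Bucket="my-diabetes-model-bucket")
s3.upload_file("diabetes_predictor.pkl",
               "my-diabetes-model-bucket",
               "diabetes_predictor.pkl")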
Step 3: The Deployment Code

This was the tricky part. You can't just copy-paste your Jupyter Notebook into AWS Lambda. You have to write a "Handler" that knows how to talk to S3.
I had to write a script that pulls the model from S3 into the Lambda’s temporary storage (/tmp) before it can make a prediction.
Here is the actual code running in my Lambda function (which is also in my GitHub repo):
import json
import boto3
import joblib
import os
import numpy as np

# Initialize S3 client
s3 = boto3.client('s3')
BUCKET_NAME = os.environ.get('BUCKET_NAME', 'my-diabetes-model-bucket')
MODEL_FILE_NAME = os.environ.get('MODEL_FILE_NAME', 'diabetes_predictor.pkl')

def load_model_from_s3():
    """Downloads the model from S3 to the /tmp directory"""
    download_path = f'/tmp/{MODEL_FILE_NAME}'
    if not os.path.exists(download_path):
        s3.download_file(BUCKET_NAME, MODEL_FILE_NAME, download_path)
    return joblib.load(download_path)

# Cached at module level so warm invocations skip the S3 download entirely
model = None

def lambda_handler(event, context):
    global model

    # 1. Load the model if it's not ready (i.e. on a cold start)
    if model is None:
        model = load_model_from_s3()

    try:
        # 2. Parse the incoming JSON body
        body = json.loads(event['body'])

        # Extract features matching training columns
        features = np.array([[
            body['Pregnancies'], body['Glucose'], body['BloodPressure'],
            body['SkinThickness'], body['Insulin'], body['BMI'],
            body['DiabetesPedigreeFunction'], body['Age']
        ]])

        # 3. Predict
        prediction = model.predict(features)
        result = int(prediction[0])

        return {
            'statusCode': 200,
            'body': json.dumps({'prediction': result})
        }
    except Exception as e:
        return {
            'statusCode': 500,
            'body': json.dumps({'error': str(e)})
        }
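With API Gateway in front, anyone with the URL can get a prediction. A quick smoke test; the invoke URL below is a placeholder for the one API Gateway actually hands you:

import requests

# Placeholder -- substitute the invoke URL from your API Gateway stage
url = "https://abc123.execute-api.us-east-1.amazonaws.com/prod/predict"

payload = {
    "Pregnancies": 2, "Glucose": 138, "BloodPressure": 62,
    "SkinThickness": 35, "Insulin": 0, "BMI": 33.6,
    "DiabetesPedigreeFunction": 0.127, "Age": 47
}

response = requests.post(url, json=payload)
print(response.status_code, response.json())  # e.g. 200 {'prediction': 1}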
The "Ibadan" Constraint: Battling Latency

The hardest part wasn't the code; it was the latency.
When you use Serverless, the function "goes cold" if nobody uses it for a while. The next time someone clicks "Predict," AWS has to spin up a fresh environment, load the Python dependencies, and pull the model down from S3.
On my first test, it took 4 seconds to get a result. On a slow 4G network, that felt like an eternity.
The Fix: I learned a counter-intuitive trick from the AWS docs: Increase the memory to save time.
I bumped the Lambda memory from 128MB to 512MB. I wasn’t just getting more RAM; I was getting more CPU power. The function started loading in under 1.5 seconds.
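In the console this is a single slider; in code it's one API call. A sketch with boto3, where the function name is a placeholder for whatever you called your Lambda:

import boto3

lam = boto3.client("lambda")

# "diabetes-predictor" is a placeholder function name
lam.update_function_configuration(
    FunctionName="diabetes-predictor",
    MemorySize=512,  # MB -- vCPU share scales in proportion to memory
)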
Because AWS bills Lambda by the millisecond in GB-seconds (memory multiplied by duration), a bigger memory setting only costs more if it doesn't speed you up proportionally. In my case, the faster function actually cost slightly less than the slower, weaker one because it finished the job so much quicker.
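To see why, run the back-of-envelope math. The durations below are illustrative (not my measured numbers), and the price is Lambda's published per-GB-second rate at the time of writing:

# Illustrative durations; price is Lambda's us-east-1 rate per GB-second
# at the time of writing.
PRICE_PER_GB_SECOND = 0.0000166667

def cost_per_million_requests(memory_mb, duration_ms):
    # Lambda cost per request = memory (GB) * duration (s) * price
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * PRICE_PER_GB_SECOND * 1_000_000

print(cost_per_million_requests(128, 800))  # small but slow: ~$1.67
print(cost_per_million_requests(512, 180))  # bigger but fast: ~$1.50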
Conclusion

This project taught me that Cloud Engineering isn't just about making things work. It's about making them viable for your specific constraints.
By moving to Serverless, I built an app that can scale to thousands of users but costs me nothing when it’s idle. For developers in emerging markets, mastering these "frugal" architectures is a superpower.
Check out the full code on my GitHub: https://github.com/OAKVISUALZ/Prediction-of-Diabetes/