When you think of “machine learning,” you probably think of GPU clusters crunching terabytes of data, or at the bare minimum, the likes of a singular humble RTX 5090 being used for data processing. But the truth is, ML doesn’t have to be heavy, or even power-hungry. Enter TinyML: a branch of machine learning designed to run directly on microcontrollers like the ESP32.
TinyML models are trained elsewhere, typically on your computer or in the cloud, and then compressed into a lightweight format that fits into the ESP32’s flash memory. Once deployed, the microcontroller runs inference locally, meaning it can make decisions instantly without needing an internet connection. For example, you can teach an ESP32 to recognize sound patterns, detect gestures, or monitor environmental changes.
TinyML is an incredibly powerful technology, and you can easily train your own model and deploy it on an ESP32.
Combining TinyML with an external data source
Identifying temperature and humidity anomalies
Instead of wiring up a DHT11 sensor directly, I wanted to show how the ESP32 could act as a secondary brain within an existing home setup, so I configured it to process data from my smart home, where readings are reported to an MQTT server. Every few seconds, the ESP32 pulls the current temperature and humidity values from one of my Zigbee reporters.
The magic happens in the TinyML model I built, which I trained by exporting every timestamped reading from the past two weeks. The ESP32 compares the current readings against a short sliding window of previous readings, using the model’s prediction of what “normal” conditions look like. When the actual readings deviate significantly from what the model predicts, the ESP32 flags it as an anomaly.
That might sound abstract, but it’s actually incredibly useful, and can be adapted in a lot of different ways. Here are just some of the ways you can interpret the data:
- A spike in humidity could indicate a window left open before rain
- A sudden drop in temperature might mean a heating failure or a window left open
- Even subtle changes can suggest a room’s airflow or insulation is off in some way
All of this happens locally, without streaming data to the cloud or even waking another device, and it consumes very little power while doing it. As for how it works, it’s pretty basic:
- Normalizes the incoming readings using the pre-calculated mean (MU) and standard deviation (STD)
- Runs inference on the TinyML model to predict the next likely pair of values
- Calculates the mean squared error (MSE) between the predicted and actual values
- Flags anomalies when the error exceeds a trained threshold (THR)
Because the entire model is quantized to int8, it’s fast enough to run hundreds of inferences per second, even on the dual-core ESP32-S3. Even better, using my power meter, I measured that the ESP32-S3 running this model continuously was pulling about 0.13A at 5.06V, which amounts to 0.66W of power at any given time. That’s incredibly efficient.
Here’s a simplified version of the inference loop that runs on the ESP32, written in C++ and using TensorFlow Lite Micro.
```cpp
// Interpret the model's int8 output using its quantization parameters
const int8_t* out = output->data.int8;
float os = output->params.scale;
int oz = output->params.zero_point;

// Calculate mean squared error between predicted and actual data
float mse = 0.f;
for (int i = 0; i < WIN; i++) {
  int idx = i * 2;
  // Convert predicted values from quantized space to normalized float
  float y_t = (out[idx + 0] - oz) * os;
  float y_h = (out[idx + 1] - oz) * os;
  // Normalize incoming sensor values
  float x_t = (buf[i].t - MU[0]) / STD[0];
  float x_h = (buf[i].h - MU[1]) / STD[1];
  // Calculate distance between prediction and actual values
  float dt = x_t - y_t;
  float dh = x_h - y_h;
  mse += (dt * dt + dh * dh) * 0.5f;
}
mse /= WIN;

// Compare against trained threshold to detect anomalies
bool isAnomaly = (mse > THR);
```
The constants MU, STD, and THR come from the training process: MU is the mean, STD is the standard deviation, and THR is the threshold. MU and STD define how your training data was normalized, while THR determines what counts as “too far” from expected behavior, and is typically tuned from validation results. To sum it up: when the model predicts future values that don’t match reality within that range, it signals that something unusual is happening.
In practice, the model has essentially learned subtle environmental rhythms, like daily temperature cycles, humidity drifts, and even how long it takes for a room to heat up after sunrise. When those rhythms break, the ESP32 knows it, and it’s consistent too, as the data source it was trained on is the same as the data source it actively compares against.
Training your own TinyML model
It’s surprisingly easy
I’ll give a high-level overview of how I prepared my own TinyML model, as the process will look different for everyone depending on your goals. What we’ve built, in this case, is a Python script that trains a simple neural network (specifically, an autoencoder) to learn what “normal” temperature and humidity readings look like over a 60-sample (so, in this case, 60-minute) window. It then converts that network into a tiny, super-fast, integer-only model that can run on the ESP32.
In the script, we prepare our data by centering each input feature around 0 with a standard deviation of 1. This is called Z-score normalization, and it’s a standard preprocessing step that helps a neural network train reliably. I also added a tiny 1e-6 value to the standard deviation as a precaution against division by zero, as some values were static for stretches at a time. We then slice the 2D array of shape (samples, features) into a 3D array of shape (windows, samples per window, features), where each window is a 60-minute segment.
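In Python, that preparation step looks something like the sketch below. Treat it as a minimal example rather than my exact script: it assumes the readings have already been loaded into a NumPy array called data, with one [temperature, humidity] row per minute.

```python
import numpy as np

# Assumed input: `data` is a (num_samples, 2) float array of
# [temperature, humidity] readings, one row per minute, e.g.:
# data = np.load("readings.npy")  # however you load your exported readings
WIN = 60  # samples per window (60 minutes here)

# Z-score normalization: center each feature at 0 with a std dev of 1
mu = data.mean(axis=0)
std = data.std(axis=0) + 1e-6  # tiny epsilon guards against division by zero
norm = (data - mu) / std

# Slice the 2D (samples, features) array into a 3D (windows, 60, 2)
# array of sliding 60-minute segments
windows = np.stack([norm[i:i + WIN] for i in range(len(norm) - WIN + 1)])
```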
An autoencoder has two parts: an encoder and a decoder. We flatten each 60-minute segment into a vector of 120 values, compress those 120 values down to 32, compress them further down to just 8 latent values, and then attempt to reconstruct the original 120 from those 8. The idea is that, by forcing the model to represent complex sensor behavior through such a tiny bottleneck, it effectively learns the “essence” of what normal conditions look like. Any deviation from that learned representation will naturally produce a higher reconstruction error later on, which is exactly what we use to flag anomalies. Finally, I dump the MU, STD, and THR values at the end, as they’re necessary for TinyML on the ESP32 to use the model correctly.
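In Keras, that architecture can be sketched like this. The layer sizes match the 120 → 32 → 8 → 32 → 120 bottleneck described above, while the activations, epoch count, and the percentile used to pick THR are assumptions on my part rather than settled choices:

```python
import numpy as np
import tensorflow as tf

# Autoencoder sketch: flatten 60x2 windows, squeeze them through an
# 8-value bottleneck, then reconstruct the original 120 values
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(60, 2)),
    tf.keras.layers.Flatten(),                     # 60 samples x 2 features = 120
    tf.keras.layers.Dense(32, activation="relu"),  # compress to 32
    tf.keras.layers.Dense(8, activation="relu"),   # 8 latent values
    tf.keras.layers.Dense(32, activation="relu"),  # start reconstructing
    tf.keras.layers.Dense(120),                    # rebuild all 120 values
    tf.keras.layers.Reshape((60, 2)),
])
model.compile(optimizer="adam", loss="mse")

# An autoencoder trains against its own input
model.fit(windows, windows, epochs=20, batch_size=32)

# Derive THR from reconstruction error; the 99th percentile is an
# assumed cutoff, typically tuned against validation data
errors = ((model.predict(windows) - windows) ** 2).mean(axis=(1, 2))
thr = float(np.percentile(errors, 99))
```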
This small network trains surprisingly quickly, taking less than a minute on my PC and on my MacBook M4 Pro. After training, you’ll have a model that can take a short rolling sequence of readings and predict what the next few values should look like if everything is behaving normally. Once the autoencoder was trained, I exported the model using TensorFlow Lite and quantized it down to INT8. This step is essential for deploying on the ESP32; not only does it shrink the model to a few kilobytes, but it also makes inference significantly faster.
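The export step follows TensorFlow Lite’s standard full-integer quantization workflow, roughly like the sketch below; the representative dataset size and the output file name are assumptions:

```python
import tensorflow as tf

# INT8 quantization needs a representative dataset so the converter
# can calibrate real value ranges from actual windows
def representative_data():
    for sample in windows[:200]:
        yield [sample[None, ...].astype("float32")]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

# Write the quantized model, now just a few kilobytes, to disk
with open("model.tflite", "wb") as f:
    f.write(converter.convert())
```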
Finally, we use Linux’s “xxd” command (for example, xxd -i model.tflite model.h) to dump the resulting .tflite file into a C array we can reference in our code, saving it as a C++ header file with the “.h” extension. There are many ways of building a TFLite model for deployment on an ESP32, and this is just one way! I highly recommend looking for ESP32 and Arduino code samples that align more closely with what you want to do when it comes to training your own model for deployment.
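If xxd isn’t available on your system, a few lines of Python can produce an equivalent header; this is just a convenience sketch, with model.tflite and model.h as assumed file names:

```python
# Mimics `xxd -i`: dump the .tflite file as a C array in a header
with open("model.tflite", "rb") as f:
    data = f.read()

with open("model.h", "w") as f:
    f.write("const unsigned char model_tflite[] = {\n")
    for i in range(0, len(data), 12):
        f.write("  " + ", ".join(f"0x{b:02x}" for b in data[i:i + 12]) + ",\n")
    f.write("};\n")
    f.write(f"const unsigned int model_tflite_len = {len(data)};\n")
```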
The end result of this experiment is an ESP32 that quietly monitors the environment and understands what “normal” looks like, all without internet access, cloud inference, or even a server. It’s a complete, self-contained system running on less than a watt of power on average. That’s the real beauty of TinyML: teaching small devices to understand their own data and spot when something’s off. If you have an ESP32 lying around and a lot of data to work with, give it a try. You might find some fun use cases.