Things used in this project

Hardware components

| Component | Qty |
| --- | --- |
| MentorPi M1 Chassis | ×1 |
| Bracket Set | ×1 |
| Raspberry Pi 5 | ×1 |
| 64 GB SD Card | ×1 |
| Cooling Fan | ×1 |
| Raspberry Pi Power Supply C | ×1 |
| RPC Lite Controller + RPC Data Cable | ×1 |
| Battery Cable | ×1 |
| Lidar + 4PIN Wire | ×1 |
| Lidar Data Cable | ×1 |
| 8.4V 2A Charger (DC5.5*2.5 Male Connector) | ×1 |
| 3D Depth Camera | ×1 |
| Depth Data Cable | ×1 |
| Wireless Controller | ×1 |
| Controller Receiver | ×1 |
| EVA Ball (40mm) | ×1 |
| Card Reader | ×1 |
| 3PIN Wire (100mm) | ×1 |
| WonderEcho Pro AI Voice Interaction Box + Type C Cable | ×1 |
| Accessory Bag | ×1 |
| User Manual | ×1 |
Story
Imagine your MentorPi M1 robot car can already navigate autonomously and avoid obstacles in a room. But what if you want it to truly understand more complex rules, like stopping at a red light or recognizing a stop sign? General-purpose AI models often struggle to accurately identify these specific, sometimes tiny, targets.
This guide will take you through a complete, hands-on project: teaching your robot car to "read" traffic signals. We will use the MentorPi M1 itself to complete the entire workflow—from data collection and model training to deployment and application. The most compelling aspect of this project is that you can build a full closed loop from the physical world to artificial intelligence using essentially this single device.
Part 1: Why is This an Ideal Use Case for the MentorPi M1?
A. The Hardware Closed Loop: Perfect Unity of Data and Application
Typically, developing a custom vision model for a robot is a cumbersome process: you collect data with one device, train the model on another more powerful computer, and then struggle to deploy it back onto the robot. This process is often plagued with compatibility issues.
The MentorPi M1 elegantly solves this pain point. Its built-in 3D depth camera is the perfect tool for capturing high-quality, first-person perspective road images. We can build a simple "mock street" on a desk or floor using printed pictures of traffic lights and road signs. As the car patrols this environment, what it "sees" is the exact scenario it will need to understand and respond to. This high consistency between the data and the application scenario is crucial for the model’s ultimate success.
👉 Check out the MentorPi tutorials on Hiwonder's GitHub for more repositories.
B. Feasible Compute: Training at the Edge
You might wonder: is it feasible to train an AI model on a Raspberry Pi 5? The answer is yes, with the right approach. We are not trying to train a massive model from scratch—that’s a job for high-performance GPUs. Our strategy is "transfer learning": we start with a lightweight model (like the nano version of YOLOv11) that has already learned basic object recognition skills from a massive general dataset (like COCO). Then, we "fine-tune" it using a few hundred images of traffic signs that we collect ourselves.
This process is akin to teaching a university student who already knows how to recognize "objects" to specifically memorize a few types of "road signs." The computational power of the Raspberry Pi 5 is fully capable of handling this targeted learning task. This means that every step—from data collection and model training to final deployment and verification—can be performed seamlessly on the MentorPi M1, enabling truly efficient development iteration.
Part 2: A Four-Act Play—The Making of Intelligence from Scratch
Act 1: Building the World and Teaching the Robot to "Observe"
It all starts with data. First, we need to create a miniature traffic environment. Print clear pictures of common signs like traffic lights (red, yellow, green), stop, turn, and speed limit signs, and attach them to cardboard or walls. Then, use a simple script to make the MentorPi M1’s camera take timed, automatic photos, or manually drive the car through the scene while saving images.
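The timed-capture step can be scripted in a few lines. This is a sketch under stated assumptions: camera index 0, an output folder of `data/raw`, and a two-second interval are all placeholders to adjust for your setup; `cv2` is imported inside the loop function so the filename helper stays importable even where OpenCV is absent.

```python
import time
from pathlib import Path

def make_filename(out_dir, index):
    """Zero-padded path such as data/raw/img_0007.jpg, so files sort in capture order."""
    return Path(out_dir) / f"img_{index:04d}.jpg"

def capture_loop(out_dir="data/raw", interval_s=2.0, count=300):
    """Grab `count` frames from the camera, one every `interval_s` seconds."""
    import cv2  # deferred so this module loads even without OpenCV installed
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    cap = cv2.VideoCapture(0)  # camera index 0 is an assumption
    try:
        for i in range(count):
            ok, frame = cap.read()
            if ok:
                cv2.imwrite(str(make_filename(out_dir, i)), frame)
            time.sleep(interval_s)
    finally:
        cap.release()

# On the robot: call capture_loop() while driving the car through the mock street.
```

Driving the car manually during capture gives you varied viewpoints for free, which pays off at training time.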
After collecting a few hundred images, the critical "annotation" work begins. We use a graphical tool like LabelImg to draw precise boxes around every traffic sign in each image and assign names like traffic_light_red, stop_sign, or speed_limit_30. This step provides the "answer key" for the model and is vital. In the end, you’ll have a dataset containing images and their corresponding annotation files—this is the "textbook" for training our model.
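The images and annotation files are usually tied together by a small dataset config that tells the trainer where everything lives. A plausible Ultralytics-style YAML is shown below; the folder layout and the exact class list are assumptions based on the labels named above.

```yaml
# traffic_signs.yaml -- train/val paths are relative to `path`
path: datasets/traffic_signs
train: images/train
val: images/val

names:
  0: traffic_light_red
  1: traffic_light_yellow
  2: traffic_light_green
  3: stop_sign
  4: speed_limit_30
```

Keeping a held-out `val` split, even a small one, is what lets the trainer report honest accuracy numbers rather than memorization.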
Act 2: Conducting a Model "Boot Camp" On-Device
With the textbook ready, training can begin. We set up the necessary AI development environment (Python, PyTorch, etc.) on the MentorPi M1’s Raspberry Pi 5 system. The key is using a library called Ultralytics, which makes training YOLO models remarkably straightforward.
Before training starts, we need to specify a few key parameters for the program: where our data is located, which pre-trained model to use as a starting point (we choose the lightweight nano weights, which Ultralytics ships as yolo11n.pt), and how many rounds (epochs) to train for. Then, we launch the training command. You'll see feedback in the terminal as the loss value gradually decreases. This process might take several hours; it's like watching the AI repeatedly practice and correct its mistakes, eventually memorizing the features of all the traffic signs you taught it. When the loss value stabilizes at a low level, it indicates the model has "graduated."
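A minimal training sketch with the Ultralytics library follows. The hyperparameter values are illustrative starting points, not tuned settings, and the `traffic_signs.yaml` filename is an assumption; the heavyweight import is deferred so the config helper can be inspected anywhere.

```python
def training_config(data_yaml="traffic_signs.yaml", epochs=100, imgsz=640, batch=8):
    """Collect the hyperparameters handed to Ultralytics' train()."""
    return {"data": data_yaml, "epochs": epochs, "imgsz": imgsz, "batch": batch}

def run_training(weights="yolo11n.pt", **overrides):
    from ultralytics import YOLO  # deferred: heavyweight import
    model = YOLO(weights)         # starts from COCO-pretrained nano weights
    return model.train(**training_config(**overrides))

# On the Pi: run_training() and watch the loss column shrink epoch by epoch.
```

A small batch size is deliberate here: the Raspberry Pi 5 has limited RAM, and transfer learning converges fine with modest batches.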
Act 3: Integrating the Model into the Robot’s "Nervous System"
Training yields a .pt model file, but it can't directly talk to the robot's brain, the ROS2 system. We need to convert this model into a more universal format (like ONNX) and then write a ROS2 node for it.
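The conversion itself is a one-liner with Ultralytics' export API; the weights path below is the library's default output location and may differ in your run.

```python
from pathlib import Path

def onnx_path(pt_path):
    """Where the exported model lands: same folder, .onnx suffix."""
    return Path(pt_path).with_suffix(".onnx")

def export_to_onnx(pt_path="runs/detect/train/weights/best.pt"):
    from ultralytics import YOLO  # deferred: only needed where training ran
    YOLO(pt_path).export(format="onnx", imgsz=640)
    return onnx_path(pt_path)
```

ONNX also opens the door to runtime optimizations later, since many inference engines consume it directly.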
Think of this node as the robot’s dedicated "Visual Processing Center." Once launched, it continuously subscribes to the real-time image stream from the depth camera, invokes our trained YOLOv11 model to analyze each frame, and finally publishes the analysis results (e.g., "Stop sign detected at center, 95% confidence") as a standard ROS2 message. Now, other parts of the robot can learn about the environment by "listening" to this message.
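One way such a node could look with rclpy is sketched below. The topic names, the model path, and the use of a plain `String` message are all assumptions for readability; a production node would publish a structured message carrying box coordinates and confidence. The ROS2 imports live inside `main()` so the formatting helper stays testable on any machine.

```python
def format_detection(label, conf, position="center"):
    """Text payload published per detection, e.g. 'stop_sign detected at center, 95% confidence'."""
    return f"{label} detected at {position}, {round(conf * 100)}% confidence"

def main():
    # Heavy imports kept inside main() so format_detection is importable anywhere.
    import rclpy
    from rclpy.node import Node
    from sensor_msgs.msg import Image
    from std_msgs.msg import String
    from cv_bridge import CvBridge
    from ultralytics import YOLO

    class VisionNode(Node):
        def __init__(self):
            super().__init__("traffic_sign_vision")
            self.bridge = CvBridge()
            self.model = YOLO("best.onnx")  # hypothetical path to the exported model
            self.pub = self.create_publisher(String, "/traffic_signs", 10)
            # "/camera/image_raw" is an assumed topic; check with `ros2 topic list`
            self.create_subscription(Image, "/camera/image_raw", self.on_image, 10)

        def on_image(self, msg):
            frame = self.bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
            result = self.model(frame, verbose=False)[0]
            for box in result.boxes:
                label = result.names[int(box.cls)]
                out = String()
                out.data = format_detection(label, float(box.conf))
                self.pub.publish(out)

    rclpy.init()
    rclpy.spin(VisionNode())
```

Running inference inside the image callback is the simplest design; if frames arrive faster than the model can process them, a real node would drop stale frames rather than queue them.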
Act 4: From "Seeing" to "Acting," Closing the Decision Loop
Recognizing a signal is only the first step; reacting correctly is the goal. Therefore, we need to create a second ROS2 Decision Node. This node specializes in "listening" to the recognition results published by the vision node.
Here, we implement some intuitive rule-based logic:
- When a stop_sign is detected, the decision node immediately issues a "set speed to zero" command to the chassis for 2-3 seconds.
- When traffic_light_red is detected, it also sends a stop command and keeps monitoring; only when the vision node reports traffic_light_green does it issue a move-forward command again.
- When speed_limit_30 is detected (assuming we assign it a "low speed" meaning), the decision node limits the car’s maximum cruising speed to a lower value.
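The rules above amount to a tiny state machine. This plain-Python sketch omits the ROS2 plumbing and the 2-3 second stop timer; the speed values in m/s are arbitrary placeholders.

```python
class SignDecision:
    """Rule-based decision logic for the signs named above; ROS2 publishing omitted."""

    def __init__(self, cruise=0.3, limited=0.1):  # speeds in m/s, arbitrary values
        self.limited = limited
        self.max_speed = cruise
        self.halted = False

    def on_detection(self, label):
        """Return the target speed after seeing `label`."""
        if label in ("stop_sign", "traffic_light_red"):
            self.halted = True        # a real node clears a stop_sign halt on a timer
        elif label == "traffic_light_green":
            self.halted = False       # the red-light latch is released only by green
        elif label == "speed_limit_30":
            self.max_speed = self.limited
        return 0.0 if self.halted else self.max_speed
```

Separating this logic from the ROS2 node, as done here, also makes it trivial to unit-test the rules without a running robot.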
Now, launch all nodes: the vision recognition node, the decision node, and the chassis control node. Place the MentorPi M1 back into our miniature street setup. You’ll observe it no longer navigates blindly by avoiding obstacles alone. Instead, it starts behaving like an intelligent agent obeying traffic rules—halting at red lights and yielding at stop signs. At this moment, you have successfully built a complete robotic intelligence loop of "perception-decision-control."
Part 3: Review, Optimization, and Infinite Extension
Through this project, we have demonstrated the feasibility of full-cycle embedded AI development using an integrated platform like the MentorPi M1. It doesn’t just provide a robotic body; it offers a complete sandbox for perception from the real world, algorithm iteration, and behavioral output.
Of course, the first model is just a starting point. If you want to enhance its capabilities further, you can try:
- Enrich the "Textbook": Collect more images from different angles and lighting conditions, and use image augmentation techniques (like rotation, brightness adjustment) to automatically expand the dataset, making the model more robust.
- Optimize the "Student": Experiment with training a slightly larger model like yolo11s to trade off between accuracy and speed; or perform "quantization" on the trained model to significantly improve inference speed with minimal accuracy loss.
- Design More Complex "Traffic Rules": Combine LiDAR data to achieve higher-level semantic understanding, such as "only need to check traffic lights at intersections."
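To give a taste of what augmentation involves, here is a NumPy sketch of two of the transforms mentioned above. Note that a horizontal flip must also mirror the YOLO-normalized box centers; the `(cls, cx, cy, w, h)` tuple convention is an assumption matching the usual YOLO label format.

```python
import numpy as np

def flip_horizontal(image, boxes):
    """Mirror an image and its YOLO-normalized boxes (cls, cx, cy, w, h)."""
    flipped = np.fliplr(image).copy()
    mirrored = [(c, 1.0 - cx, cy, w, h) for (c, cx, cy, w, h) in boxes]
    return flipped, mirrored

def adjust_brightness(image, factor):
    """Scale pixel intensities; factor < 1 darkens, factor > 1 brightens."""
    return np.clip(image.astype(np.float32) * factor, 0, 255).astype(np.uint8)
```

In practice the Ultralytics trainer applies many such augmentations automatically, but doing a few by hand makes clear why labels must be transformed along with pixels.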
The true value of this project lies in providing a clear blueprint. You can apply this knowledge to teach your robot to recognize the TV remote in the living room, specific toolboxes in a warehouse, or different types of weeds in a field. The MentorPi M1 lowers the barrier to exploring and applying cutting-edge computer vision technology in the real physical world, allowing your creativity to run free, no longer hindered by complex toolchains.