About six months ago, we started experimenting with an idea for an approximate nearest neighbor (ANN) engine inside SmartEco.
Not a benchmark-chasing project. Not a "let’s beat FAISS" announcement. Just a practical question we kept running into:
Can we design an ANN system that is predictable, recall-safe, and easy to reason about from an engineering point of view?
This post is not a launch, not a promise, and not a comparison. It’s just a progress update.
Where This Started
This project started as an internal exploration rather than a product goal.
We were working with nearest-neighbor search regularly and kept running into the same questions:
- how to reason about recall behavior,
- how to keep latency predictable, and
- how to make the system easy to debug as it grows.
Instead of trying to optimize an existing solution, we decided to step back and explore a different design space - one where:
- recall behavior is explicit and measurable,
- latency is bounded by design rather than by chance, and
- the system remains understandable at the implementation level.
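To make "explicit and measurable" concrete, here is a minimal sketch of how recall@k is commonly measured against a brute-force ground truth. It is not code from our engine: the function names, the squared Euclidean metric, and the "search only half the data" stand-in for an approximate result are all assumptions made for illustration.

```python
import random


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_top_k(query, vectors, k):
    """Brute-force ground truth: ids of the k vectors closest to the query."""
    order = sorted(range(len(vectors)), key=lambda i: squared_l2(query, vectors[i]))
    return order[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k that the approximate result recovered."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Tiny demo on random data (seeded so runs are repeatable).
rng = random.Random(0)
vectors = [[rng.random() for _ in range(8)] for _ in range(200)]
query = [rng.random() for _ in range(8)]

truth = exact_top_k(query, vectors, k=10)
# Hypothetical stand-in for an ANN result: only look at the first half of the data.
approx = exact_top_k(query, vectors[:100], k=10)

print(f"recall@10 = {recall_at_k(approx, truth):.2f}")
```

Averaging this number over many queries, and tracking how it moves as a search parameter changes, is one way to turn recall from a hope into something you can read off a table.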
What Exists Today
As of now:
- The core design is complete
- A full Python implementation exists
- We’ve run internal experiments on recall and speed
- The results look encouraging, not miraculous
- A native C++ implementation is underway
The system is already usable as a reference implementation, but it is not something we’re ready to put in users’ hands yet.
Importantly: we are not claiming production readiness, and we are not publishing benchmark charts yet.
Why Python First?
We intentionally built a full Python version before touching C++.
Why?
- To validate the design
- To test recall behavior
- To understand where time is actually spent
For the hot paths, we used Numba to remove Python overhead and get closer to native execution. That helped us answer a very practical question early:
Will rewriting this in C++ meaningfully help, or is the design itself the bottleneck?
The answer so far: yes - a native implementation should help, especially for routing and scoring loops - but only because the algorithmic structure is already stable.
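For readers who have not used Numba this way, a scoring loop of the kind it handles well might look like the sketch below. This is illustrative only, not our engine's code: `score_candidates`, its arguments, and the squared Euclidean metric are hypothetical choices, and the fallback decorator exists only so the example still runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback: run as plain Python if Numba is unavailable (much slower).
    def njit(func):
        return func


@njit
def score_candidates(query, vectors, candidate_ids):
    """Squared L2 distance from the query to each candidate row."""
    n = candidate_ids.shape[0]
    dim = query.shape[0]
    out = np.empty(n, dtype=np.float32)
    for i in range(n):
        row = vectors[candidate_ids[i]]
        acc = 0.0
        # Explicit inner loop: this is the part Numba compiles to native code.
        for j in range(dim):
            d = query[j] - row[j]
            acc += d * d
        out[i] = acc
    return out


# Small usage example with made-up data.
query = np.array([0.0, 0.0], dtype=np.float32)
vectors = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]], dtype=np.float32)
candidate_ids = np.array([2, 0], dtype=np.int64)
print(score_candidates(query, vectors, candidate_ids))
```

Timing a function like this with and without the decorator is a cheap way to estimate how much of the total cost is interpreter overhead and how much is inherent in the algorithm.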
Note: this is not an existing ANN method with a few tweaks. It’s an attempt to build something clean, controllable, and explainable, even if that means slower progress.
From our internal testing so far:
- Recall behavior is stable and tunable
- Speed is reasonable for a Python + Numba system
- Latency behavior is predictable
- The design maps cleanly to native code
That’s it.
Anything stronger than that would be premature.
Why This Takes Time
ANN engines are deceptive.
They look simple until you care about:
- tail latency,
- recall under different distributions,
- memory layout,
- build-time vs query-time trade-offs.
We’d rather move slowly than publish something we later have to walk back.
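As an example of why tail latency needs its own measurement, here is a minimal sketch that reports percentiles rather than an average. It is a generic harness under stated assumptions, not our benchmark code: `latency_percentiles` and `search_fn` are hypothetical names, and `search_fn` stands for any callable that takes one query.

```python
import statistics
import time


def latency_percentiles(search_fn, queries, warmup=5):
    """Time search_fn on each query and return latency percentiles in milliseconds."""
    # Warm-up calls are not timed (caches, lazy initialization, JIT compilation).
    for q in queries[:warmup]:
        search_fn(q)

    samples_ms = []
    for q in queries:
        start = time.perf_counter()
        search_fn(q)
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    # n=100 yields 99 cut points; index 49 is the median, index 98 is p99.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
        "max": max(samples_ms),
    }


# Usage with a trivial stand-in for a search call.
stats = latency_percentiles(lambda q: sum(q), [[1, 2, 3]] * 50)
print(stats)
```

A mean can look healthy while p99 is several times worse, which is why the two are worth reporting separately.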
What’s Next
- Continue native C++ implementation
- Validate behavior matches the Python reference
- Run broader benchmarks
- Clean up APIs and internals
- Prepare for open sourcing
This will take time. Possibly more than we expect.
This post isn’t an announcement. It’s a checkpoint.
(And yes, the name starts with Smart. Of course it does.)
If you’re interested in ANN systems from a design and engineering perspective (not just leaderboard numbers), you’ll probably find what we’re building interesting when it’s ready.
Until then, back to writing code.
- SmartEco Team