Lanturn 🔦 (Work in Progress)

A hackathon project connecting the Gemini Live API to an ESP32 Atoms3r-CAM device for voice + vision conversations on embedded hardware.

Overview

Lanturn demonstrates real-time AI voice conversations with vision running on an ESP32 microcontroller. It uses:

  • ESP32 Atoms3r-CAM (mic, speaker, camera)
  • Pipecat (voice/media orchestration)
  • Gemini Live API (multimodal speech + vision)
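
As a rough illustration of how these three pieces might fit together, the sketch below models the data flow with plain Python `asyncio` queues: the device pushes audio and image frames upstream, an orchestrator loop (the role Pipecat plays here) forwards them to a model session, and spoken replies travel back downstream. Everything in it is hypothetical: `Frame`, `FakeLiveModel`, `run_session`, and `demo` are illustrative names, not the Pipecat or Gemini Live APIs, and the real project may be structured differently.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Frame:
    """Hypothetical media frame exchanged between device and server."""
    kind: str       # "audio" or "image"
    payload: bytes


class FakeLiveModel:
    """Hypothetical stand-in for a multimodal live session (not the real API)."""

    def __init__(self) -> None:
        self.images_seen = 0

    async def respond(self, frame: Frame) -> Optional[Frame]:
        # Images only update the model's visual context; they produce no reply.
        if frame.kind == "image":
            self.images_seen += 1
            return None
        # Audio input yields an audio reply (here just an echo-style placeholder).
        return Frame("audio", b"reply-to:" + frame.payload)


async def run_session(uplink: asyncio.Queue, downlink: asyncio.Queue,
                      model: FakeLiveModel) -> None:
    """Assumed orchestrator loop: greet on connect, then relay frames."""
    # Automatic greeting as soon as the device connects.
    await downlink.put(Frame("audio", b"greeting"))
    while True:
        frame = await uplink.get()
        if frame is None:  # sentinel: device disconnected
            break
        reply = await model.respond(frame)
        if reply is not None:
            await downlink.put(reply)


async def demo() -> Tuple[List[Frame], int]:
    """Simulate one camera frame and one utterance from the device."""
    uplink: asyncio.Queue = asyncio.Queue()
    downlink: asyncio.Queue = asyncio.Queue()
    model = FakeLiveModel()
    for item in (Frame("image", b"jpeg"), Frame("audio", b"hello"), None):
        uplink.put_nowait(item)
    await run_session(uplink, downlink, model)
    sent = []
    while not downlink.empty():
        sent.append(downlink.get_nowait())
    return sent, model.images_seen


if __name__ == "__main__":
    frames, seen = asyncio.run(demo())
    print([f.payload for f in frames], seen)
```

In the actual project the queues would be replaced by the WebRTC transport and the fake model by a Gemini Live session; the sketch only shows the direction of each stream and where the connection greeting would be triggered.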

Features

  • ✅ Real-time voice conversations
  • 🚧 Real-time vision processing with camera (work in progress)
  • ✅ Gemini Live multimodal AI integration
  • ✅ ESP32 hardware support
  • ✅ WebRTC audio + vision streaming
  • ✅ Automatic greeting on connection
  • ✅ Google Search tool call
  • 🚧 AI can see and describe what the camera shows (work in progress)

Architecture
