Spatial Lingo
Spatial Lingo is a spatialized language practice experience that guides users to identify and describe objects in their environment in a target language. It is built with Meta technologies such as Llama, Mixed Reality Utility Kit (MRUK), and the Voice SDK, and supports both hand tracking and controllers.
Follow Golly Gosh (the polyglot!) as they lead you through your own, real-world space, allowing you to practice your vocabulary using familiar objects. Grow the language tree by completing lessons from Golly Gosh, learning nouns, verbs, and adjectives along the way!
Project Overview
The Spatial Lingo project helps Unity developers understand and build with several Meta features: Passthrough Camera API (PCA), Voice SDK, Interaction SDK, Mixed Reality Utility Kit (MRUK), Llama API, and Unity Sentis. The main scene and multiple sample scenes demonstrate the implementation and usefulness of each feature.
Scene previews: Gym Scene, Word Cloud Scene, Character Scene, Camera Image Scene.
Getting Started
Getting The Code
First, ensure you have Git LFS installed by running this command:

```
git lfs install
```

Then, clone this repo using the "Code" button above, or this command:

```
git clone https://github.com/oculus-samples/Unity-SpatialLingo.git
```
Application Settings
For development, configure your Llama API key in Assets/SpatialLingo/Resources/ScriptableSettings/SpatialLingoSettings.asset.
Important: Do not ship Quest apps with embedded API keys, as they can be extracted from the app binary. For production, use LlamaRestApi.GetApiKeyAsync to implement server-side authentication. See the Llama API documentation for details.
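As a sketch of the production pattern described above: only `LlamaRestApi.GetApiKeyAsync` comes from this project; the server endpoint, response format, and awaitable request extension below are assumptions, so check the project source and the Llama API documentation for the actual signature and flow.

```csharp
using System.Threading.Tasks;
using UnityEngine.Networking;

// Hypothetical sketch: fetch the Llama API key from your own backend at
// runtime instead of embedding it in the app binary. The base-class
// signature and the /llama-key endpoint are assumptions for illustration.
public class ServerKeyLlamaApi : LlamaRestApi
{
    protected override async Task<string> GetApiKeyAsync()
    {
        // Your server holds the real key, authenticates the user, and
        // returns a short-lived key or token.
        using var request = UnityWebRequest.Get("https://example.com/llama-key");
        await request.SendWebRequest(); // assumes an awaitable UnityWebRequest extension
        return request.downloadHandler.text.Trim();
    }
}
```

This keeps the key out of the APK entirely; an attacker inspecting the binary finds only your endpoint, which can enforce its own authentication and rate limits.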
How to run the project in Unity
- Make sure you’re using Unity 6000.0.51f1 or newer
- Load the Assets/SpatialLingo/Scenes/MainScene.unity scene
- Open the Meta XR Simulator
- Start Play Mode
Showcase Features
Each of these features has been built to be accessible and scalable, so other developers can adapt and build upon them in their own projects.
Object Identification
Spatial Lingo can identify objects in the user’s environment, enabling spatial placement and dynamic generation of language lessons.
Lesson Generation
Dynamic vocabulary lessons are generated as the user progresses in growing the language tree. After objects are identified in the user’s environment, relevant verbs and adjectives for those objects are generated to add lesson variety.
Voice Synthesis
Golly Gosh can speak several different languages. Voice is dynamically synthesized from text, so they can teach users proper pronunciation during language lessons.
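Text-to-speech in the Voice SDK is driven by its `TTSSpeaker` component; a minimal sketch of synthesizing a word at runtime might look like the following (the class name and field wiring here are illustrative, not this project's actual voice pipeline):

```csharp
using Meta.WitAi.TTS.Utilities;
using UnityEngine;

// Minimal sketch: speak a single vocabulary word using the Voice SDK's
// TTSSpeaker. The speaker's voice preset (including language) is
// configured on the component in the Inspector.
public class PronunciationDemo : MonoBehaviour
{
    [SerializeField] private TTSSpeaker speaker; // assign a TTSSpeaker in the scene

    public void SayWord(string word)
    {
        // Synthesizes the text and plays it back through the speaker's
        // attached AudioSource.
        speaker.Speak(word);
    }
}
```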
Voice Transcription
When a user is presented with a word cloud, their speech is transcribed; transcription is also supported in several languages.
Lesson Evaluation
A user’s response is sent to Llama to determine if the user has responded well enough to complete a given lesson’s word cloud.
Dependencies
This project makes use of the following plugins and software:
- Unity 6000.0.51f1 or newer
- YOLO (with COCO dataset)
- See MetaSdk.md for all Meta libraries used
Project Documentation
More information about the services and systems of this project can be found in the Documentation section.
Voice Services
LLM
Object Recognition and Tracking
Systems
Sample Scenes
Sample scenes can be found at Assets/SpatialLingo/Scenes.
Voice Transcription Scene
To run, open WordCloudSample.unity and enter Play Mode with the simulator. Click the "Activate Microphone" button to start transcription through your microphone.
License
See LICENSE.md.
Contributing
See CONTRIBUTING.md.