Steve
demo.mp4
We built Cursor for Minecraft. Instead of AI that helps you write code, you get AI agents that actually play the game with you.
What It Does
Steve acts as an Agent, or a series of Agents if you choose to employ all of them. You describe what you want, and he understands the context and executes. Same concept here, except instead of code editing, you get embodied Steves that operate in your Minecraft world.
The interface is simple: press K to open a panel, type what you need. The agents handle the interpretation, planning, and execution. Say “mine some iron” and the agent reasons about where iron spawns, navigates to the appropriate depth, locates ore veins, and extracts the resources. Ask for a house and it considers the available materials, generates an …
Steve
demo.mp4
We built Cursor for Minecraft. Instead of AI that helps you write code, you get AI agents that actually play the game with you.
What It Does
Steve acts as an Agent, or a series of Agents if you choose to employ all of them. You describe what you want, and he understands the context and executes. Same concept here, except instead of code editing, you get embodied Steves that operate in your Minecraft world.
The interface is simple: press K to open a panel, type what you need. The agents handle the interpretation, planning, and execution. Say “mine some iron” and the agent reasons about where iron spawns, navigates to the appropriate depth, locates ore veins, and extracts the resources. Ask for a house and it considers the available materials, generates an appropriate structure, and builds it block by block.
What makes this interesting is the multi-agent coordination. When multiple Steves work on the same task, they don’t just independently execute, they actively coordinate to avoid conflicts and optimize workload distribution. Tell three agents to build a castle and they’ll automatically partition the structure, divide sections among themselves, and parallelize the construction.
The agents aren’t following predefined scripts. They’re operating off natural language instructions, which means:
- Resource extraction where agents determine optimal mining locations and strategies
- Autonomous building with agents planning layouts and material usage
- Combat and defense where agents assess threats and coordinate responses
- Exploration and gathering with pathfinding and resource location
- Collaborative execution with automatic workload balancing and conflict resolution
How It Works
Each Steve is basically running an agent loop. When you give a command:
- It goes to an LLM; we’re using Groq for fast inference
- The LLM breaks down your request into structured code
- Code gets executed using Minecraft’s actual game mechanics
- If something fails, the agent asks the LLM to replan
Multi-Agent Coordination
The interesting part is when you have multiple Steves working together. We built a coordination system so they don’t step on each other’s toes.
When you tell several agents to build the same structure, they:
- Automatically split it into sections
- Each take a part
- Don’t place blocks in the same spot
- Rebalance work if someone finishes early
The coordination happens server-side through a manager that tracks active builds and assigns work. It’s deterministic, so there’s no race conditions or weird conflicts.
Setup
You need:
- Minecraft 1.20.1 with Forge
- Java 17
- An OpenAI API key (or Groq/Gemini if you prefer)
Installation:
- Download the JAR from releases
- Put it in your
modsfolder - Launch Minecraft
- Copy
config/steve-common.toml.exampletoconfig/steve-common.toml - Add your API key to the config
Config looks like this:
[openai]
apiKey = "your-api-key-here"
model = "gpt-3.5-turbo"
maxTokens = 1000
temperature = 0.7
Then just spawn a Steve with /steve spawn Bob and press K to start using them.
How We Built This
Tech Stack:
- Minecraft Forge 47.2.0 for the modding framework
- Java 17
- Groq API for the agent reasoning (pluggable, also supports OpenAI and Gemini)
- Standard Minecraft pathfinding for movement
- Langchain
Architecture:
The core is in the agent package. Each Steve runs a ReAct-style loop:
- Reason about what to do
- Act by executing Java code
- Observe the results
- Repeat
For memory, each Steve maintains a conversation history and context about the world. This gets injected into every LLM call so agents can handle follow-up commands without you repeating context.
The collaborative building system was trickier. We had to build a manager that:
- Divides structures into spatial sections
- Assigns Steves to sections
- Prevents conflicts when placing blocks
- Handles reassignment when Steves finish
It’s all server-side, so there’s no synchronization issues.
Project Structure:
src/main/java/com/steve/ai/
├── entity/ # Steve entity class, spawning, lifecycle
├── ai/ # LLM clients (OpenAI, Groq, Gemini), prompt building
├── action/ # Action classes for mine, build, combat, etc
├── agent/ # Core agent loop and coordination
├── memory/ # Context management and world state
├── client/ # GUI (the Cursor-style panel)
└── command/ # Minecraft commands (/steve spawn, etc)
If you want to understand how it works, start in the agent package. That’s where the reasoning loop lives.
Building From Source
Standard Gradle stuff:
git clone https://github.com/YuvDwi/Steve.git
cd Steve
./gradlew build
Output JAR is in build/libs/.
Usage Examples
Once you’ve got Steves spawned, just press K and start talking:
"mine 20 iron ore"
"build a house near me"
"help Alex with the tower"
"defend me from zombies"
"follow me"
"gather wood from that forest"
"make a cobblestone platform here"
"attack that creeper"
The agents are pretty good at figuring out what you mean. You don’t need to be super specific.
Known Issues
The agents are only as smart as the LLM. GPT-3.5 works but makes occasional weird decisions. GPT-4 is noticeably better at multi-step planning.
No crafting yet. Agents can mine and place blocks but can’t craft tools. We’re working on it.
Actions are synchronous. If a Steve is mining, it can’t do anything else until done. Planning to add proper async execution.
Memory resets on restart. Right now context only persists during a play session. We’re adding persistent memory with a vector DB.
What’s Next
Things we’re working on:
- Crafting system so agents can make their own tools
- Voice commands via Whisper API
- Vector database for long-term memory
- Async action execution for multitasking
- More complex building templates
Goal is to make this actually useful for survival gameplay, not just a tech demo.
Contributing
If you want to add stuff:
- Fork the repo
- Make your changes
- Make sure it builds with
./gradlew build - Submit a PR
If you’re adding new actions, update the prompt template in PromptBuilder.java so the LLM knows about them.
Why We Made This
We wanted to see if the Cursor model could work outside of coding. Turns out it translates pretty well. Same principles: deep environment integration, clear action primitives, persistent context.
Minecraft is actually a good testbed for agent research. Complex enough to be interesting, constrained enough that agents can actually succeed.
Plus it’s just fun watching AIs build castles while you explore.
Credits
- OpenAI for GPT
- Minecraft Forge for the modding API
- LangChain/AutoGPT for agent architecture inspiration
License
MIT
Issues
Found a bug? Open an issue: https://github.com/YuvDwi/Steve/issues