Rhesis: Open-Source Gen AI Testing
Your team defines expectations; Rhesis generates and executes thousands of test scenarios, so you know what you ship.
Rhesis is an open-source testing platform that transforms how Gen AI teams validate their applications. Collaborative test management turns domain expertise into comprehensive automated testing: legal defines requirements, marketing sets expectations, engineers build quality, and everyone knows exactly how the Gen AI application performs before users do.
🎯 Why Rhesis?
The Gen AI Testing Challenge
Gen AI applications present unique testing challenges that traditional approaches can’t handle:
- Non-deterministic outputs: Same input, different responses
- Unexpected edge cases: Unpredictable user inputs lead to problematic outputs
- Ethical risks: Biased, harmful, or inappropriate content generation
- Compliance requirements: Industry-specific regulatory standards
Traditional testing with hand-coded scenarios can’t scale to unlimited user creativity. Rhesis addresses these challenges through collaborative test management that generates comprehensive automated coverage.
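To make the non-determinism point concrete, here is a toy illustration (not Rhesis code; ask_model is a purely hypothetical stub) of why a single exact-match assertion cannot anchor a Gen AI test suite:

import random

def ask_model(prompt: str) -> str:
    """Stand-in for a Gen AI call: the same input yields different, equally valid phrasings."""
    return random.choice([
        "You can reset your password from the account settings page.",
        "Head to account settings and choose 'Reset password'.",
    ])

# A traditional exact-match check is brittle against valid variation:
expected = "You can reset your password from the account settings page."
response = ask_model("How do I reset my password?")
print("exact-match passes:", response == expected)  # flips between True and False across runs

# Which is why Gen AI testing needs many scenarios judged against behavioral
# criteria (policy, safety, tone) rather than a handful of fixed string checks.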
Make testing a peer to development
You’ve transformed your product with Gen AI; now transform how you test it. Testing deserves the same sophistication as your development tooling.
Your whole team should define what matters
Your legal, marketing, and domain experts know what can actually go wrong. Rhesis makes testing everyone’s responsibility.
Know what you’re shipping
The best AI teams understand their system’s capabilities before release. Get complete visibility into how your Gen AI performs across thousands of real-world scenarios.
✨ Key Features
- Collaborative Test Management: Your entire team (legal, compliance, marketing, domain experts) contributes requirements, all without writing code
- Automated Test Generation: Automatically generate thousands of test scenarios from team expertise, requirements and existing knowledge sources
- Comprehensive Coverage: Scale from dozens of manual tests to thousands of automated scenarios that match your AI’s complexity
- Edge Case Discovery: Find potential failures before your users do with sophisticated scenario generation
- Compliance Validation: Ensure Gen AI systems meet regulatory and ethical standards with team-defined requirements
- Performance Analytics: Track quality metrics over time
🌐 Open Source & Community-Driven
Rhesis is built by Gen AI developers who experienced inadequate testing tools firsthand. The core platform and SDK remain MIT-licensed forever, with a clear commitment: core functionality never moves to paid tiers. All commercial code lives in dedicated ee/ folders.
Join our community calls to discuss roadmap, features, and contributions. Connect via Discord for announcements.
📑 Repository Structure
This monorepo contains the complete Rhesis ecosystem:
rhesis/
├── apps/
│ ├── backend/ # FastAPI backend service
│ ├── frontend/ # React frontend application
│ ├── worker/ # Celery worker for background tasks
│ ├── chatbot/ # Conversational testing interface
│ └── polyphemus/ # Uncensored LLM for comprehensive test generation
├── sdk/ # Python SDK for Rhesis
├── infrastructure/ # Infrastructure as code
├── scripts/ # Utility scripts
└── docs/ # Documentation
🚀 Quick Start
Option 1: Use the cloud platform (fastest)
Get started in minutes at app.rhesis.ai:
- Create a free account
- Start generating test scenarios collaboratively
- Invite your team to define requirements together
Option 2: Use the SDK
Install and configure the Python SDK:
pip install rhesis-sdk
Quick example:
import os
from pprint import pprint
from rhesis.sdk.entities import TestSet
from rhesis.sdk.synthesizers import PromptSynthesizer
os.environ["RHESIS_API_KEY"] = "rh-your-api-key" # Get from app.rhesis.ai settings
os.environ["RHESIS_BASE_URL"] = "https://api.rhesis.ai" # optional
# Browse available test sets
for test_set in TestSet().all():
    pprint(test_set)

# Generate custom test scenarios
synthesizer = PromptSynthesizer(
    prompt="Generate tests for a medical chatbot that must never provide diagnosis"
)
test_set = synthesizer.generate(num_tests=10)
pprint(test_set.tests)
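Execution of generated test sets normally happens on the Rhesis platform itself; if you just want to poke at the generated scenarios locally, a loop like the following sketch works. run_my_app is a hypothetical stand-in for your own application and is not part of the SDK:

# Hedged sketch, continuing from the example above (reuses `test_set`).
# `run_my_app` is a hypothetical stand-in for your own Gen AI application;
# it is not part of the Rhesis SDK.
def run_my_app(test) -> str:
    return "stubbed model response"  # call your chatbot / agent here instead

results = [(test, run_my_app(test)) for test in test_set.tests]
print(f"Collected {len(results)} scenario/response pairs for review")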
Option 3: Run locally with Docker (zero configuration)
Get the full platform running locally with a single command - no configuration needed:
# Clone the repository
git clone https://github.com/rhesis-ai/rhesis.git
cd rhesis
# Start all services (auto-generates .env.docker.local with encryption keys)
./rh start
That’s it! Visit http://localhost:3000 - you’ll be automatically logged in to the dashboard!
What happens automatically:
- ✅ Generates database encryption key
- ✅ Creates .env.docker.local with local configuration
- ✅ Enables auto-login (no Auth0 setup needed)
- ✅ Starts all services (backend, frontend, database, worker, docs)
- ✅ Creates default admin user and example data
Optional: To enable test generation, get your API key from app.rhesis.ai, then edit .env.docker.local and update RHESIS_API_KEY.
Useful commands:
./rh logs # View logs from all services
./rh stop # Stop all services
./rh restart # Restart all services
./rh delete # Delete everything (fresh start)
Note: This is a simplified setup for local testing only. For production deployments, see the Self-hosting Documentation.
👥 Contributing
Rhesis thrives thanks to our community. Here’s how you can contribute:
Ways to Contribute
- Code: Fix bugs, implement features, improve performance
- Test Sets: Contribute test cases for common AI failure modes
- Documentation: Enhance guides, tutorials, and API references
- Community Support: Help others in Discord or GitHub discussions
- Feedback: Report bugs, request features, share your experience
Contributing Workflow
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes with tests
- Commit with clear messages
- Push and open a pull request
We review PRs regularly and maintain a welcoming environment through our code of conduct.
Detailed guidelines: CONTRIBUTING.md
Release process: RELEASING.md
📝 License
Community Edition: MIT License - see LICENSE file for details.
Enterprise Edition: Enterprise features located in ee/ folders are subject to separate commercial licenses. Contact hello@rhesis.ai for enterprise licensing information.
🆘 Support
- Documentation: docs.rhesis.ai
- Discord Community: discord.rhesis.ai
- GitHub Discussions: Community discussions
- Email: hello@rhesis.ai
- Issues: Report bugs or request features
Made with ❤️ in Potsdam, Germany
Learn more at rhesis.ai