The Autonomous Data Infrastructure Platform
AI-powered schema intelligence, governance, and evolution.
Forge transforms nested JSON into production-ready dbt models for BigQuery, Snowflake, Databricks, and Redshift—with AI that classifies schemas, recommends governance policies, and evolves with your data.
Stop writing ETL. Deploy infrastructure that understands and manages itself.
{
  "event_id": "evt_123",
  "user": { "id": 42, "traits": { "plan": "pro" } },
  "items": [
    { "sku": "A-1", "price": 10.00 },
    { "sku": "B-2", "price": 5.50 }
  ]
}
TABLE `events_v1` (
  event_id STRING,
  _ingested_at TIMESTAMP
);
TABLE `events_v1__user` (
  _parent_id STRING,
  id_hash STRING,
  traits_plan STRING
);
TABLE `events_v1__items` (
  _parent_id STRING,
  sku STRING,
  price FLOAT64
);
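One way to picture how the sample event could decompose into the three tables above is the Python sketch below. It is an illustration under stated assumptions, not Forge's actual compiler: the function name `normalize_event`, the use of SHA-256 for `id_hash`, and the hard-coded handling of `user` and `items` are all hypothetical and inferred only from the sample output.

```python
import hashlib
from datetime import datetime, timezone


def normalize_event(event: dict, root: str = "events_v1") -> dict:
    """Split one nested event into rows for a root table and two child tables."""
    parent_id = event["event_id"]
    tables = {
        root: [{
            "event_id": parent_id,
            "_ingested_at": datetime.now(timezone.utc).isoformat(),
        }]
    }

    # Nested object -> one child row; nested keys are joined with an underscore.
    user = event.get("user", {})
    tables[f"{root}__user"] = [{
        "_parent_id": parent_id,
        # Assumption: the raw id is replaced by a SHA-256 hex digest.
        "id_hash": hashlib.sha256(str(user.get("id")).encode()).hexdigest(),
        "traits_plan": user.get("traits", {}).get("plan"),
    }]

    # Array of objects -> one child row per element, linked by _parent_id.
    tables[f"{root}__items"] = [
        {"_parent_id": parent_id, "sku": item["sku"], "price": float(item["price"])}
        for item in event.get("items", [])
    ]
    return tables


event = {
    "event_id": "evt_123",
    "user": {"id": 42, "traits": {"plan": "pro"}},
    "items": [{"sku": "A-1", "price": 10.00}, {"sku": "B-2", "price": 5.50}],
}
tables = normalize_event(event)
```

Each child row carries `_parent_id`, so the child tables can be joined back to `events_v1` on `event_id`.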
"Data infrastructure that classifies, governs, and evolves itself."
4-Layer AI Intelligence Stack
Forge’s AI classifies your data and recommends governance policies today. Schema evolution (Q2 2026) and autonomous pipeline management (Q4 2026) are on the roadmap.
⚒️ Forge Core
Production
Multi-Warehouse Compiler
The foundation. Transforms deeply nested JSON into production-ready dbt models with automatic normalization and type inference.
- BigQuery, Snowflake, Databricks, Redshift
- 5+ levels deep unnesting
- Automatic dbt docs & lineage
⚔️ Excalibur
Production
Schema Classification
Graph Neural Network that treats your schema as a graph to classify data patterns. Privacy-preserving—field names never leave your environment.
- GraphSAGE GNN + RandomForest
- 89% accuracy across 5 categories
- Privacy-first fingerprinting
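The idea of treating a schema as a graph while keeping field names local can be sketched as follows. This is a hypothetical illustration, not Excalibur's implementation: the function `structural_fingerprint`, the node features (type and depth), and the choice to sample only the first array element are all assumptions.

```python
def type_name(value) -> str:
    """Map a JSON value to a coarse type label (bool is checked before int)."""
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if value is None:
        return "null"
    return "string"


def structural_fingerprint(doc) -> dict:
    """Describe a document as a graph of (type, depth) nodes with no field names."""
    nodes, edges = [], []

    def visit(value, depth, parent):
        node_id = len(nodes)
        nodes.append({"type": type_name(value), "depth": depth})
        if parent is not None:
            edges.append((parent, node_id))
        if isinstance(value, dict):
            # Keys are used only to get a stable visiting order; they are never stored.
            for key in sorted(value):
                visit(value[key], depth + 1, node_id)
        elif isinstance(value, list) and value:
            # Assumption: the first element is representative of the array.
            visit(value[0], depth + 1, node_id)

    visit(doc, 0, None)
    return {"nodes": nodes, "edges": edges}


sample = {
    "event_id": "evt_123",
    "user": {"id": 42, "traits": {"plan": "pro"}},
    "items": [{"sku": "A-1", "price": 10.00}],
}
fingerprint = structural_fingerprint(sample)
```

The resulting node and edge lists are the kind of input a graph model could consume, and they contain only types, depths, and parent-child links.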
🛡️ Pridwen
Production
Governance ML
Hybrid 3-layer system (Rules + ML + Crowd) that detects PII and recommends transformations like hash, mask, and encrypt.
- 15 SQL transformation templates
- Day-1 intelligence out of the box
- Gets smarter with every customer
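The rules layer of such a system can be imagined as pattern matching on column names plus a SQL template per recommended action. The sketch below is an assumption-laden illustration, not Pridwen's rule set: the patterns, the `recommend` and `render` helpers, and the BigQuery-flavored templates are hypothetical. Encryption is left without a template because it depends on warehouse-specific key management.

```python
import re

# Hypothetical name-based rules: (pattern, recommended action).
RULES = [
    (re.compile(r"(^|_)(email|e_mail)($|_)"), "hash"),
    (re.compile(r"(^|_)(ssn|passport|tax_id)($|_)"), "encrypt"),
    (re.compile(r"(^|_)(phone|card_number)($|_)"), "mask"),
]

# Hypothetical BigQuery-style SQL templates for two of the actions.
TEMPLATES = {
    "hash": "TO_HEX(SHA256(CAST({col} AS STRING))) AS {col}_hash",
    "mask": "CONCAT('****', RIGHT(CAST({col} AS STRING), 4)) AS {col}_masked",
}


def recommend(columns):
    """Return {column: action or None} using the first matching rule."""
    result = {}
    for col in columns:
        result[col] = next(
            (action for pattern, action in RULES if pattern.search(col.lower())),
            None,
        )
    return result


def render(action, col):
    """Fill in the SQL template for an action, or None if no template exists."""
    template = TEMPLATES.get(action)
    return template.format(col=col) if template else None


recommendations = recommend(["user_email", "phone", "sku", "ssn"])
```

In a hybrid design, columns that no rule matches (here `sku`) would be the ones passed on to the ML and crowd layers.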
🐴 Llamrei
Q2 2026
Schema Evolution
Automatically detects legacy API versions and normalizes them to modern golden schemas. Save $200K-$500K per avoided migration.
- 50+ API golden schemas
- Stripe, Salesforce, Shopify support
- Non-destructive transformations
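Version detection and non-destructive normalization could look roughly like the sketch below. Everything in it is hypothetical: the field names are invented for illustration and do not describe the real Stripe, Salesforce, or Shopify payloads, and `detect_version` and `normalize` are not Llamrei APIs.

```python
# Hypothetical golden schema and one legacy-to-golden field mapping.
GOLDEN_FIELDS = ("id", "amount", "currency")
LEGACY_MAPPINGS = {
    "v1": {"charge_id": "id", "amount_cents": "amount", "curr": "currency"},
}


def detect_version(record: dict) -> str:
    """Identify a payload version by which known field names it contains."""
    for version, mapping in LEGACY_MAPPINGS.items():
        if set(mapping) <= set(record):
            return version
    return "golden" if set(GOLDEN_FIELDS) <= set(record) else "unknown"


def normalize(record: dict) -> dict:
    """Rename legacy fields to golden names and keep the original untouched."""
    version = detect_version(record)
    mapping = LEGACY_MAPPINGS.get(version, {})
    normalized = {mapping.get(key, key): value for key, value in record.items()}
    normalized["_source_version"] = version
    # Non-destructive: the original payload is preserved alongside the result.
    normalized["_raw"] = dict(record)
    return normalized


legacy = {"charge_id": "ch_1", "amount_cents": 500, "curr": "usd"}
golden = normalize(legacy)
```

Keeping `_raw` next to the normalized fields means a wrong mapping can be corrected later without re-ingesting the source data.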
🧙 Merlin
Q4 2026
Autonomous Agent
"Set up my data pipelines and maintain them." LLM-powered agent that understands goals, plans workflows, and self-heals failures.
- Natural language commands
- Multi-step planning & execution
- Learns from outcomes
Data Infrastructure That Manages Itself
❌ Fragile Pipelines: One upstream JSON change breaks your dbt models and crashes your dashboards.
❌ Vendor Lock-in: Custom ETL logic traps you in a single warehouse dialect.
❌ Manual Governance: Engineers waste 30% of their time on PII detection and compliance.
✅ AI-Powered Classification: Excalibur instantly understands your data patterns.
✅ Automatic Governance: Pridwen detects PII and applies transformations—no manual rules.
✅ Universal Compiler: One parse generates optimized SQL for 4 major warehouses.
Infrastructure that understands, governs, and evolves with your data.
🧠 AI Classification
Excalibur’s Graph Neural Network instantly understands your data patterns. Privacy-preserving design means field names never leave your environment.
🛡️ Automatic Governance
Pridwen detects PII and recommends transformations like hash, mask, and encrypt. Day-1 intelligence that gets smarter with every customer.
🔄 Multi-Warehouse
One JSON source compiles to optimized SQL for BigQuery, Snowflake, Databricks, and Redshift. Write once, deploy anywhere.
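Compiling one schema to several dialects largely comes down to mapping logical types to each warehouse's native type names. The sketch below shows that idea only; the `TYPE_MAP` table and `render_ddl` helper are hypothetical, and real generated models would also handle identifier quoting, schemas, and dialect-specific options.

```python
# Logical type -> native type name per warehouse.
TYPE_MAP = {
    "bigquery": {"string": "STRING", "float": "FLOAT64", "timestamp": "TIMESTAMP"},
    "snowflake": {"string": "VARCHAR", "float": "FLOAT", "timestamp": "TIMESTAMP_NTZ"},
    "databricks": {"string": "STRING", "float": "DOUBLE", "timestamp": "TIMESTAMP"},
    "redshift": {"string": "VARCHAR", "float": "DOUBLE PRECISION", "timestamp": "TIMESTAMP"},
}


def render_ddl(table: str, columns: dict, dialect: str) -> str:
    """Render a CREATE TABLE statement for one dialect from logical column types."""
    types = TYPE_MAP[dialect]
    body = ",\n".join(f"  {name} {types[kind]}" for name, kind in columns.items())
    return f"CREATE TABLE {table} (\n{body}\n);"


# One logical definition, rendered once per warehouse.
items_columns = {"_parent_id": "string", "sku": "string", "price": "float"}
ddl_by_dialect = {
    dialect: render_ddl("events_v1__items", items_columns, dialect)
    for dialect in TYPE_MAP
}
```

The logical column definition is written once, and only the lookup table differs between targets.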
One JSON Source → Four Optimized Warehouses
Forge transforms nested JSON into native structs that you traverse with dot notation. Repeated queries cost 60-90% less than parsing raw JSON on every query.
// Raw JSON (parsed every query = expensive)
get_json_object(get_json_object(root, '$.patient'), '$.name')
⬇️ Forge Rollup (native struct traversal)
// Query any warehouse with dot notation
frg.root.patient[0].name
Compute Costs
60-90% savings
Query Speed
15x faster
Nesting Depth
5+ levels auto
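The reasoning behind the savings is that raw JSON text has to be parsed every time it is queried, while a native struct is parsed once at load time. A small Python analogy, using made-up sample rows and hypothetical function names, shows the difference in where the parsing work happens. It does not reproduce the figures above.

```python
import json

# Made-up sample rows, stored as raw JSON text.
raw_rows = [
    '{"patient": [{"name": "Ada"}]}',
    '{"patient": [{"name": "Grace"}]}',
]


def names_from_raw(rows):
    """Parse the JSON text again on every call, like querying a raw JSON column."""
    return [json.loads(row)["patient"][0]["name"] for row in rows]


# Parse once at load time and keep the structured result.
struct_rows = [json.loads(row) for row in raw_rows]


def names_from_structs(rows):
    """Traverse already-parsed structures; no parsing work per query."""
    return [row["patient"][0]["name"] for row in rows]
```

Both functions return the same names. The second one skips the parsing step on each call, and the parsing step is the cost that grows with the number of repeated queries.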
How is Forge Different?
Traditional ETL tools replicate structured data. Forge understands, governs, and transforms any data—with AI intelligence built in.
| Capability | Traditional ETL (Fivetran, Stitch) | Forge |
|---|---|---|
| JSON Handling | Loads raw JSON into a single `VARIANT` column. Requires manual parsing. | Automatically unnests nested objects and arrays into clean, queryable tables with proper keys. |
| Schema Intelligence | No understanding of data semantics. Schemas are just column names. | Excalibur GNN classifies data patterns (payment, customer, inventory) to enable smart defaults. |
| Data Governance | Manual PII detection. Compliance is your problem. | Pridwen automatically detects PII and recommends hash, mask, or encrypt transformations. |
| Multi-Warehouse | Separate connectors/config per warehouse. Limited dialect support. | One source → simultaneous output for BigQuery, Snowflake, Databricks, and Redshift with optimized SQL. |
| Schema Evolution | Pipelines break when fields change. Manual intervention required. | Planned for Q2 2026 (Llamrei): automatically detects and adapts to schema changes with no downtime. |
Focus on Analytics, Not Engineering
Building a robust JSON processing pipeline is a significant engineering effort. Here’s how Forge accelerates your time to insight.
| Data Engineering Task | Manual Process (Time Estimate) | With Forge |
|---|---|---|
| Initial Schema Discovery | Write scripts to scan data, identify fields, and determine data types. (2-4 hours) | Automatic. Done in minutes during the first run. |
| Data Classification | Manually review schemas to understand data semantics. (1-2 hours) | Excalibur AI. Instant classification with 89% accuracy. |
| PII Detection & Governance | Security review, manual tagging, compliance documentation. (1-3 days) | Pridwen AI. Automatic detection and transformation recommendations. |
| Write Parsing & Unnesting Logic | Develop and debug complex SQL or Python code. (1-3 days) | Automatic. Forge handles all unnesting logic. |
| Total Time to Value | Days to Weeks | Under 15 Minutes |
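The manual schema discovery step in the table above (scan data, identify fields, determine types) can be sketched in a few lines of Python. This is a simplified illustration, not Forge's inference logic: the function names, the dotted path convention, and the widening rules (integer plus float becomes `FLOAT64`, any other mix becomes `STRING`) are assumptions.

```python
def scalar_type(value) -> str:
    """Map a Python value to a warehouse-style type name (bool before int)."""
    if isinstance(value, bool):
        return "BOOL"
    if isinstance(value, int):
        return "INT64"
    if isinstance(value, float):
        return "FLOAT64"
    if isinstance(value, list):
        return "ARRAY"
    return "STRING"


def walk(record: dict, prefix: str = ""):
    """Yield (dotted_path, type) for every non-null leaf in a nested record."""
    for key, value in record.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from walk(value, path + ".")
        elif value is not None:
            yield path, scalar_type(value)


def widen(types: set) -> str:
    """Pick one type that can hold every observed type for a field."""
    if len(types) == 1:
        return next(iter(types))
    if types <= {"INT64", "FLOAT64"}:
        return "FLOAT64"
    return "STRING"


def discover_schema(records) -> dict:
    """Scan all records and return {dotted_path: inferred_type}."""
    seen = {}
    for record in records:
        for path, kind in walk(record):
            seen.setdefault(path, set()).add(kind)
    return {path: widen(kinds) for path, kinds in sorted(seen.items())}


records = [
    {"id": 1, "price": 10, "user": {"plan": "pro"}},
    {"id": 2, "price": 5.5, "user": {"plan": None}},
]
schema = discover_schema(records)
```

Here `price` is seen as both an integer and a float, so it widens to `FLOAT64`, while the null `user.plan` in the second record is ignored rather than forcing a type change.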
Security & Compliance: Your Cloud, Your Rules
For organizations with strict data residency and security requirements, Forge Enterprise offers a unique deployment model that keeps you in control.
| Requirement | Traditional SaaS | Forge Enterprise |
|---|---|---|
| Data Location | Sent to vendor’s cloud | Stays in YOUR cloud |
| AI/ML Privacy | Your data trains their models | Privacy-first: only structural fingerprints |
| Security Review Time | 3-6 months | 2-4 weeks |
| Works with PII/PHI | Requires extensive compliance | Yes, by architecture |
From Black Box to Glass Box: Complete Transparency
Forge isn’t just a parser; it’s a fully managed, transparent data cataloging system with AI intelligence. Every run generates metadata covering documentation, generated SQL, and lineage.
📜
Automatic dbt Docs
Forge uses dbt Core to power its transformations. After every run, it generates and hosts a full dbt documentation site for your project.
🔍
Complete Code Transparency
Ever wonder what a transformation tool is actually doing? With Forge, you can inspect the exact SQL code used to generate every table.
🗺️
End-to-End Data Lineage
The generated dbt docs include a complete, interactive DAG showing how data flows from your raw JSON sources to the final tables.
🌐
Write Once, Deploy Anywhere
Forge generates native dbt models for BigQuery, Snowflake, Redshift, and Databricks simultaneously from one JSON source.
🛠️
Full dbt Infrastructure
Get production-ready dbt models executed automatically—no separate dbt Cloud subscription needed (saves $300-500/month).
Get Started with Forge
💰 Transform JSON into AI-governed, query-ready tables for BigQuery, Snowflake, Databricks, or Redshift. Most customers see 3-5x ROI in the first month through compute savings alone.
Start Your Free Trial
🎁
30-Day Free Trial
Full access to test Forge
🧠
AI Features Included
Excalibur + Pridwen in trial
💳
No Credit Card
Start immediately
Pricing Options
Monthly Starter
$299/month
Up to 100M rows, 5 job profiles, all warehouses + AI.
Monthly Professional
$999/month
Up to 1B rows, unlimited profiles, priority support.
Enterprise
Custom pricing
In-VPC deployment, unlimited usage, SLA guarantees. Contact sales.