Craft your data storage as a mythical hero.
Prologue: The Rise of Federated Architecture
Hey there and thanks for reading! As often happens, building a Data Warehouse (DWH) begins with a stakeholder’s request and the pipelines that feed it with data. Over time, the DWH grows into a central fortress: impossible to conquer, and even harder to manage.
Today let’s discuss architectural features and answer the question “When should we think about Data Architecture?”
Previously, companies had one central data team that built the entire infrastructure, reports, dashboards, and so on, and at the same time became a bottleneck. In some companies such teams are big but still inefficient, because many people are hungry for data insights and the central team does not have enough capacity to serve all of them. Teams across the business found themselves queuing up, waiting for the central team to ingest, process, and serve their data needs. “Traditional” centralized data architectures often struggle with ownership transparency, scalability, data discoverability, and delivery speed.
Nowadays, more and more companies are breaking down these giant fortresses and moving to a Federated Data Architecture implemented through the Data Mesh paradigm. This architecture is meant to match the pace of modern business: it unblocks teams and lets them work quickly, smoothly, and efficiently. The key to this independence and speed is Data Domains.
We’ll drill down into this topic shortly, but first, special thanks to Theo Gough for his inspiration and for the thoughtful discussions and debates we had during our search for truth.
Forged in Data
If the old approach was a fortress, the federated approach is Hephaestus’s forge. In Greek mythology, Hephaestus was not a warrior but a craftsman: the god of fire, metallurgy, and creation. He didn’t control Olympus with walls; he empowered it with tools, forging shields and weapons that amplified each hero’s own strength.
In a Data Mesh, Data Domains are the forges. Some of them are big and some are small. But all of them have their own specialties and they shape raw material into something purposeful. Each domain becomes responsible for the full lifecycle of the Data and Data Products within its boundary, encompassing:
- Data Ingestion: Responsibility for the initial acquisition of relevant data into the domain boundary.
- Data Transformation: Ownership of the processing, cleaning, modeling, and aggregation of raw data into consumable data assets, metrics, and KPIs.
- Data Output: Accountability for serving curated data, metrics, and KPIs to other domains and data consumers as high-quality data products.
- Data Contracts: Management of formal agreements defining the structure, quality, and semantics of data exchanged between data providers and data consumers.
Just as Hephaestus crafted intricate, beautiful works, Data Domains provide a clear and intuitive structure that makes discovery, scalability, and ownership not only possible but natural.
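To make the Data Contracts responsibility above more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `FieldSpec` and `DataContract` classes, the `customer.profile` product, and its fields are hypothetical and do not come from any particular data-contract tool or standard. It only shows the idea of a provider domain publishing an agreed structure that records can be checked against.

```python
from dataclasses import dataclass


# Hypothetical contract model: all names are illustrative assumptions,
# not part of any specific data-contract tool or standard.
@dataclass(frozen=True)
class FieldSpec:
    name: str
    dtype: type
    nullable: bool = False


@dataclass(frozen=True)
class DataContract:
    product: str       # data product name, e.g. "customer.profile"
    owner_domain: str  # domain accountable for the product
    fields: tuple

    def violations(self, record: dict) -> list:
        """Return human-readable problems found in a single record."""
        problems = []
        for spec in self.fields:
            if spec.name not in record:
                problems.append(f"missing field: {spec.name}")
                continue
            value = record[spec.name]
            if value is None:
                if not spec.nullable:
                    problems.append(f"null not allowed: {spec.name}")
            elif not isinstance(value, spec.dtype):
                problems.append(
                    f"wrong type for {spec.name}: expected {spec.dtype.__name__}"
                )
        return problems


# An assumed contract published by a hypothetical Customer domain.
customer_contract = DataContract(
    product="customer.profile",
    owner_domain="Customer",
    fields=(
        FieldSpec("customer_id", str),
        FieldSpec("signup_date", str),
        FieldSpec("lifetime_value", float, nullable=True),
    ),
)

good = {"customer_id": "c-1", "signup_date": "2024-01-05", "lifetime_value": None}
bad = {"customer_id": 42, "lifetime_value": 10.0}

print(customer_contract.violations(good))
print(customer_contract.violations(bad))
```

A real contract would usually also cover quality expectations and semantics, as described above; this sketch checks structure only.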
The Art of Data Craftsmanship
Data Governance teams often analyze several approaches when structuring domains. These can be derived from existing organizational blueprints, such as business functions, process maps, or business capability maps. Each approach has trade-offs:
- By Operation (Function or Business Unit-Based): aligns with company structure, but risks silos.
- By Data Pipeline (Flow-Based): great for simple flows, but grows messy fast.
- By Core Business Objects (Entity-Based): models real-world entities like Customer, Partner, or Payment. It is the hardest to implement but the most scalable, and it offers high cohesion within domains and low coupling across them, enabling stronger collaboration and discovery.
Among the proposed options, the Entity-Based approach often provides the strongest alignment with business reality, supporting adaptability and scalability, especially in larger, more complex businesses with multiple value streams. I would emphasize that this is a journey and every organisation has to find its own way. This is just one possible example that could be explored.
In an ideal world, Data Domains are independent, but sometimes compromises have to be made.
Even greater benefits can be achieved when both analytical and backend systems operate within the same Entity-Based domain model and share identical domain boundaries. This alignment enables the Data Governance team to establish true end-to-end data lineage — from the line of backend code that generates an event, all the way to the analytical dashboard where that event ultimately appears. Such transparency significantly improves traceability, compliance, and trust across the entire data ecosystem.
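As a rough illustration of what end-to-end lineage means in practice, the sketch below models lineage as a directed graph and walks it from a backend event to a dashboard. The asset names and the `domain.layer.asset` naming convention are assumptions invented for this example, and the `trace_path` helper is hypothetical rather than part of any lineage tool.

```python
from collections import deque

# Hypothetical lineage edges: upstream asset -> downstream assets.
# Names follow an assumed "domain.layer.asset" convention.
LINEAGE = {
    "payment.backend.PaymentCaptured": ["payment.raw.payment_events"],
    "payment.raw.payment_events": ["payment.mart.daily_revenue"],
    "payment.mart.daily_revenue": ["payment.dashboard.revenue_overview"],
    "customer.mart.active_customers": ["payment.dashboard.revenue_overview"],
}


def trace_path(source, target, edges=LINEAGE):
    """Breadth-first search for a shortest lineage path from source to target."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for downstream in edges.get(path[-1], []):
            if downstream not in seen:
                seen.add(downstream)
                queue.append(path + [downstream])
    return None  # no lineage path connects the two assets


path = trace_path(
    "payment.backend.PaymentCaptured",
    "payment.dashboard.revenue_overview",
)
print(" -> ".join(path))
```

When backend and analytical assets share the same domain prefix, as assumed here, every hop in the path can be attributed to an owning domain.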
Building Domains That Create, Not Constrain
Tools for Autonomous Teams
Organizations move away from a single central team serving every data consumer and stakeholder by adopting a Federated Data Architecture built on Data Mesh principles.
In this model, ownership and maintenance of data assets and data pipelines are distributed across Domain Owners: the teams who are the domain experts and are responsible for serving trusted, reliable information.
Building independent Data Domains requires tools. This is why a Data Platform with Self-Serve Data Infrastructure is becoming critical for equipping teams with the necessary capabilities. In this environment, teams can move at their own pace without queuing at the central gate.
As Hephaestus gave heroes the tools to act on their own quests, self-serve platforms provide teams with the instruments to craft and innovate — without waiting in line at a single workshop.
Sub-Domains: Balancing Growth and Governance
As the DWH and the Domains inside it grow, managing capacity can become a challenge. Data Domains must be flexible enough to be broken down into smaller segments, or sub-domains, for practical reasons such as:
- Managing Complexity: Breaking large entities into smaller, more focused units.
- Refining Ownership: Clarifying accountability across subsets of data.
- Supporting Scalability: Allowing independent growth without creating friction.
One of the methods that can be applied is to structure sub-domains along the journey stages of an entity (e.g., Awareness, Acquisition, Retention, Support, etc.). Each stage becomes its own forge, shaping data into products tailored for that particular context.
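The journey-stage idea above could be sketched as a small domain registry. This is only one possible shape, under stated assumptions: the `Domain` and `SubDomain` classes, the team names, and the data product names are all hypothetical. It shows a sub-domain inheriting its parent's owner until ownership is refined.

```python
from dataclasses import dataclass, field


@dataclass
class SubDomain:
    name: str
    owner: str
    data_products: list = field(default_factory=list)


@dataclass
class Domain:
    name: str
    owner: str
    sub_domains: dict = field(default_factory=dict)

    def add_sub_domain(self, name, owner=None):
        # A sub-domain inherits the domain owner unless ownership is refined.
        sub = SubDomain(name=name, owner=owner or self.owner)
        self.sub_domains[name] = sub
        return sub

    def owner_of(self, product):
        """Return the accountable owner of a data product, or None if unknown."""
        for sub in self.sub_domains.values():
            if product in sub.data_products:
                return sub.owner
        return None


# Hypothetical Customer domain split along journey stages.
customer = Domain(name="Customer", owner="customer-data-team")
for stage in ["Awareness", "Acquisition", "Retention", "Support"]:
    customer.add_sub_domain(stage)

# Refine ownership for one stage and register illustrative data products.
customer.sub_domains["Retention"].owner = "retention-analytics"
customer.sub_domains["Retention"].data_products.append("churn_risk_scores")
customer.sub_domains["Acquisition"].data_products.append("signup_funnel")

print(customer.owner_of("churn_risk_scores"))
print(customer.owner_of("signup_funnel"))
```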
Cohesion Within, Autonomy Between
Crucially, implementing domain-driven architecture doesn’t mean isolation. High cohesion inside domains means that data assets, dashboards, and pipelines work together as tightly related “friends” within the same boundary. At the same time, low coupling between domains ensures freedom and autonomy across the organization.
This balance mirrors how Hephaestus’s creations, whether armor, weapons, or tools, were distinct in purpose yet tied together within the grand mythological story.
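One rough way to check whether domain boundaries really deliver high cohesion and low coupling is to count how many dependencies stay inside a domain versus how many cross a boundary. The sketch below assumes asset names are prefixed with their owning domain (for example `customer.mart.profiles`); that convention, the sample dependencies, and the `coupling_report` helper are illustrative assumptions, not an established metric.

```python
def coupling_report(dependencies):
    """Count intra-domain vs cross-domain dependencies.

    `dependencies` is a list of (upstream, downstream) asset names, where the
    text before the first dot is the owning domain (an assumed convention).
    """
    intra, cross = 0, 0
    for upstream, downstream in dependencies:
        if upstream.split(".")[0] == downstream.split(".")[0]:
            intra += 1
        else:
            cross += 1
    total = intra + cross
    return {
        "intra": intra,
        "cross": cross,
        "cross_ratio": cross / total if total else 0.0,
    }


# Hypothetical dependencies between assets in two domains.
deps = [
    ("customer.raw.signups", "customer.mart.profiles"),
    ("customer.mart.profiles", "customer.dashboard.growth"),
    ("payment.raw.events", "payment.mart.daily_revenue"),
    ("customer.mart.profiles", "payment.mart.daily_revenue"),
]

report = coupling_report(deps)
print(report)
```

A low cross-domain ratio suggests most work happens inside domain boundaries; what counts as "low" is a judgment call for each organization.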
Epilogue: Building a Data Mesh Mindset
The transition to federated, domain-driven data architecture is a recognition that centralized control cannot meet modern enterprise demands. By defining Data Domains, enabling self-serve Data Platforms, and introducing sub-domains (if they are needed) to manage complexity, organizations move from being slow and constrained by bottlenecks to operating like distributed forges of innovation.
Just imagine how each domain becomes a workshop of expertise, shaping data into precise products that empower the business. That’s a future I’d be proud to build — and I hope you would too.
Thanks for reading and stay tuned. See you in the next article.