Crafting Generative CSS Worlds

There’s something about isometric projections that evokes a cozy, nostalgic feeling. Most likely, the culprit is the wave of ’90s pixel-art classic games that etched the aesthetic into our collective memory, from Populous to Transport Tycoon.

In this article, we’ll explore how to recreate that same charm with modern CSS. More specifically, we’ll look under the hood of the newly released Layoutit Terrain Generator to learn how stacked grids and 3D transforms can be combined to create a fully addressable 3D space in the browser.


(If you want to dive deeper into how the grid 3D structure works under the hood, the CSS Voxel Editor article explores it in detail.)

Behold! A 3D terrain built entirely with stacked grids and transformed HTML elements: no canvas, no WebGL, just CSS doing its magic.

Setting the scene

After wrapping up the CSS Voxel Editor, I wanted a new challenge, something that pushed the limits of the stacked grid technique. That’s how I landed on a terrain generator, especially because it means expanding the shape grammar: to make it possible, we’ll need to build angles and slopes. But before all that can happen, we have to set the stage properly.

The .scene element acts as our camera mount: it’s where depth begins with the perspective property. By assigning a generous value (8000px), we get an almost-isometric look with a slight, natural distortion. Every descendant of this container is given transform-style: preserve-3d, which keeps each child’s 3D transforms from being flattened into its parent’s plane.

The .floor element defines the world’s tilt. By applying transform: rotateX(65deg) rotate(45deg), we angle the entire space into view, establishing the camera’s orientation. On top of this base, multiple .z elements are stacked vertically with translateZ(25px * level). That way, each layer acts as a grid slice at a specific height (a unique Z level) while the rows and columns define the X and Y coordinates.

Examining the stacked grids in devtools highlights the coordinate system that powers our 3D layout.

Together, these elements create the 3D grid where we will position our shapes. From this foundation, our terrain can start to rise!

<div class="scene">
  <div class="floor">
    <div class="z" style="transform: translateZ(0px);"></div>
    <div class="z" style="transform: translateZ(25px);"></div>
    <div class="z" style="transform: translateZ(50px);"></div>
    <div class="z" style="transform: translateZ(75px);"></div>
  </div>
</div>

.scene { perspective: 8000px; }

.scene * { transform-style: preserve-3d; }

.floor { transform: rotateX(65deg) rotate(45deg); }

.z {
  display: grid;
  grid-template-columns: repeat(32, 50px);
  grid-template-rows: repeat(32, 50px);
}
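Writing every layer by hand gets tedious once the terrain grows taller. As a minimal sketch, the stacked .z layers could be generated from a level count. The buildLayers helper and the LAYER_STEP_PX constant are hypothetical names for illustration, not part of the generator’s code:

```typescript
// Hypothetical helper: builds the .floor element and its stacked .z layers
// as an HTML string. LAYER_STEP_PX mirrors the 25px spacing used above.
const LAYER_STEP_PX = 25;

function buildLayers(levels: number, stepPx: number = LAYER_STEP_PX): string {
  const layers: string[] = [];
  for (let level = 0; level < levels; level++) {
    // Each layer is lifted by its level index times the step.
    layers.push(
      `<div class="z" style="transform: translateZ(${level * stepPx}px);"></div>`
    );
  }
  return `<div class="floor">\n${layers.join("\n")}\n</div>`;
}
```

Calling buildLayers(4) produces the same four layers shown in the markup above, from translateZ(0px) up to translateZ(75px).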

Expanding the shape grammar

Beyond simple cubes, our world requires new primitives: we call them flats, ramps, wedges, and spikes, and they are the minimal units of our terrain generation.

Each shape tilts one or two planes to define its form. They follow a 2:1 dimetric system, where a slope climbs one unit of height for every two units of depth. In practice, this results in cells measuring 50×50×25px. The common face tilt of arctan(0.5) ≈ 26.565° keeps geometry consistent across tiles, ensuring clean shading transitions and seamless slopes between neighboring cells.
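The tilt angle falls straight out of the cell dimensions. Here is a quick check of the arithmetic; the constant names are only illustrative:

```typescript
const CELL_SIZE_PX = 50;   // footprint of a cell along X and Y
const CELL_HEIGHT_PX = 25; // rise of one Z level

// Tilt of a face that climbs one level across one cell: arctan(25 / 50).
const tiltDeg = (Math.atan2(CELL_HEIGHT_PX, CELL_SIZE_PX) * 180) / Math.PI;

// Length of that face measured along the slope (the hypotenuse).
const slopeLengthPx = Math.hypot(CELL_SIZE_PX, CELL_HEIGHT_PX);

console.log(tiltDeg.toFixed(3));       // "26.565"
console.log(slopeLengthPx.toFixed(2)); // "55.90"
```

Note that the sloped face is slightly longer than the 50px cell it covers, since it spans the hypotenuse rather than the base.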

Let’s take a closer look at how each shape comes together:

Flat shape

Flat stays horizontal; it is only a plane translated in the Z dimension by 25px, and rotated to match its cardinal orientation.

.tile.flat {
  transform: translateZ(25px) rotate(0deg);
}
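The rotate(0deg) in that rule is one of several possible orientations. As a sketch, assuming orientations are quarter turns (the article mentions four rotation states), the transform string could be built like this. The tileTransform helper is hypothetical:

```typescript
// Assumption: a tile's orientation is one of four quarter turns.
type QuarterTurns = 0 | 1 | 2 | 3;

// Hypothetical helper: lifts a tile by one level and rotates it in place.
function tileTransform(quarterTurns: QuarterTurns, liftPx: number = 25): string {
  return `translateZ(${liftPx}px) rotate(${quarterTurns * 90}deg)`;
}

console.log(tileTransform(0)); // "translateZ(25px) rotate(0deg)"
console.log(tileTransform(3)); // "translateZ(25px) rotate(270deg)"
```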

Ramp shape

Ramp reuses the same flat container, but adds one rectangular face, a pseudo-element tilted 26.565°, to create the slope.

.tile.ramp {
  transform: translateZ(25px) rotate(0deg);
}

/* The sloped face, hinged along the tile's top-left corner. */
.tile.ramp::before {
  content: "";
  position: absolute;
  inset: 0;
  transform-origin: top left;
  transform: rotateY(26.565deg);
}

Wedge shape

Wedge combines the ramp’s sloped face with a mirrored one turned 90 degrees, creating a concave junction between them.

.tile.wedge {
  transform: translateZ(25px) rotate(0deg);
}

/* Both faces share the same box and hinge point. */
.tile.wedge::before,
.tile.wedge::after {
  content: "";
  position: absolute;
  inset: 0;
  transform-origin: top left;
}

.tile.wedge::before {
  transform: rotateY(26.565deg);
}

/* Same slope, mirrored and turned 90 degrees. */
.tile.wedge::after {
  transform: rotate(-90deg) scaleX(-1) rotateY(26.565deg);
}

Spike shape

Spike mirrors the ramp to form a peak. It combines two opposing slopes: the front ramp leans inward, and a mirrored one rises until they meet in a convex ridge.

.tile.spike {
  transform: translateZ(25px) rotate(0deg);
}

.tile.spike::before,
.tile.spike::after {
  content: "";
  position: absolute;
  inset: 0;
  transform-origin: top left;
}

/* Overrides the shared origin so this face hinges at the bottom left. */
.tile.spike::before {
  transform-origin: bottom left;
  transform: rotateY(26.565deg);
}

.tile.spike::after {
  transform: translateZ(-25px) rotateX(26.565deg);
}

Textures and lighting

Since our shapes are just normal DOM elements, we can style them easily with CSS. In this case, using background-image or background-color is the best choice, as that doesn’t add new nodes (like <img> or <svg> would). As a middle ground, adding inline SVGs to select shapes can make sense when animations or interactions are needed.

Lighting in this engine is directional and baked into the textures. We fix a light source to the west (180°) and sort each visible face into one of four brightness bands based on its angle to that light. Each band maps to a light-level class (.l-1 to .l-4) applied to the shape. The result is believable shading that remains consistent even as the scene rotates.
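To make the banding concrete, here is a minimal sketch of how a face’s direction could map to a light-level class. The four equal 45° bands, the choice of l-4 as the brightest level, and the function names are all assumptions; the article doesn’t spell out the actual thresholds:

```typescript
const LIGHT_AZIMUTH_DEG = 180; // light fixed to the west, as described above

// Smallest absolute difference between two compass angles, in [0, 180].
function angularDistance(a: number, b: number): number {
  const diff = Math.abs(a - b) % 360;
  return diff > 180 ? 360 - diff : diff;
}

// Hypothetical banding: four equal 45-degree bands, where 4 is the
// brightest (facing the light) and 1 the darkest (facing away).
function lightLevel(faceAzimuthDeg: number): 1 | 2 | 3 | 4 {
  const distance = angularDistance(faceAzimuthDeg, LIGHT_AZIMUTH_DEG);
  if (distance <= 45) return 4;
  if (distance <= 90) return 3;
  if (distance <= 135) return 2;
  return 1;
}

// Class name to apply to a shape, e.g. "l-3".
function lightClass(faceAzimuthDeg: number): string {
  return `l-${lightLevel(faceAzimuthDeg)}`;
}
```

A face pointing straight at the light (180°) gets l-4 in this sketch, while one pointing directly away (0°) gets l-1.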

A closer look at Layoutit Terra’s sprites for different shapes, biomes and lighting levels.

Making some noise

A terrain is a heightmap: a 2D array of elevation values built from noise and shaped into a rough landmass. The initial raw field comes from a library like simplex-noise, followed by many refinement passes. These passes smooth out speckles, terrace steep areas, and limit how much steepness can vary. One of the golden rules of this world is that neighboring tiles can’t differ by more than one height level, which keeps slopes consistent and prevents cliffs from forming.
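The one-level rule is the easiest pass to illustrate. The sketch below is not the generator’s actual pipeline: it assumes the raw field holds values in [-1, 1] (the range a 2D noise function typically returns), quantizes them into integer levels, then lowers cells until no cell sits more than one level above any of its eight neighbors. Function names are hypothetical:

```typescript
type Heightmap = number[][];

// Map raw samples in [-1, 1] to integer levels in [0, maxLevel].
function quantize(raw: number[][], maxLevel: number): Heightmap {
  return raw.map((row) =>
    row.map((value) => {
      const clamped = Math.min(1, Math.max(-1, value));
      return Math.round(((clamped + 1) / 2) * maxLevel);
    })
  );
}

// Lower cells until no cell is more than one level above a neighbor.
// Values only ever decrease and never drop below the map's minimum,
// so the loop terminates.
function limitSlope(map: Heightmap): Heightmap {
  const out = map.map((row) => [...row]);
  let changed = true;
  while (changed) {
    changed = false;
    for (let y = 0; y < out.length; y++) {
      for (let x = 0; x < out[y].length; x++) {
        for (let dy = -1; dy <= 1; dy++) {
          for (let dx = -1; dx <= 1; dx++) {
            // Out-of-bounds lookups yield undefined and are skipped.
            const neighbor = out[y + dy]?.[x + dx];
            if (neighbor !== undefined && out[y][x] > neighbor + 1) {
              out[y][x] = neighbor + 1;
              changed = true;
            }
          }
        }
      }
    }
  }
  return out;
}
```

Lowering peaks is only one way to enforce the rule; raising valleys would satisfy it as well, and the choice changes how much land survives.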

On the user side, two main knobs are exposed: landmass coverage, which controls the percentage of water that fills the map, and terrain type, which sets the ceiling for elevation in the scene.
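One plausible way to wire up the coverage knob is to pick a sea level from the distribution of the raw values, so that a chosen fraction of cells ends up above water. This is an assumption about how such a control could work, not a description of the generator’s code, and seaLevelFor is a hypothetical name:

```typescript
// Returns a threshold such that roughly `landFraction` of the samples
// are at or above it. Ties in the field can push the share slightly higher.
function seaLevelFor(field: number[][], landFraction: number): number {
  const sorted = ([] as number[]).concat(...field).sort((a, b) => a - b);
  if (sorted.length === 0 || landFraction <= 0) return Infinity; // all water
  const waterFraction = 1 - Math.min(1, landFraction);
  const index = Math.min(
    sorted.length - 1,
    Math.floor(waterFraction * sorted.length)
  );
  return sorted[index];
}

// A cell counts as land when it reaches the sea level.
function isLand(value: number, seaLevel: number): boolean {
  return value >= seaLevel;
}
```

With eight samples and a land fraction of 0.25, the threshold lands on the seventh-smallest value, leaving two cells as land.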

A raw look at the heightmap array, where dots mark water and numbers land elevation.
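A debug view like this is cheap to produce. In the sketch below, water is represented as null, which is an assumption about the data shape, and each cell is padded to two characters so double-digit elevations stay aligned:

```typescript
// Assumption: null marks water; numbers are land elevation levels.
type Cell = number | null;

// Left-pad a label to two characters so columns line up.
function padCell(label: string): string {
  return label.length < 2 ? " " + label : label;
}

function renderHeightmap(map: Cell[][]): string {
  return map
    .map((row) =>
      row.map((cell) => padCell(cell === null ? "." : String(cell))).join(" ")
    )
    .join("\n");
}
```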

Once the heightmap is built, a classifier decides which shape fits each cell. Tiles can have up to eight possible neighbors, each with four rotation states, which quickly adds up to hundreds of combinations. To handle all that complexity, a rulebook defines how shapes should meet at every cardinal point. And when those rules still fall short (like on sharp intersections or extreme slopes) a set of manually curated overrides steps in to clean things up and keep the terrain stable.
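To give a feel for what such a classifier does, here is a deliberately tiny sketch. The mapping from neighbor patterns to shapes is hypothetical and far simpler than the rulebook described above: one higher cardinal neighbor yields a ramp, two adjacent higher cardinals yield a wedge, a lone higher diagonal yields a spike, and everything else falls back to flat:

```typescript
type Shape = "flat" | "ramp" | "wedge" | "spike";

interface Tile {
  shape: Shape;
  rotationDeg: number;
}

type Offset = [number, number];

// Offsets as [dx, dy] in clockwise order.
const CARDINALS: ReadonlyArray<Offset> = [[0, -1], [1, 0], [0, 1], [-1, 0]];   // N, E, S, W
const DIAGONALS: ReadonlyArray<Offset> = [[1, -1], [1, 1], [-1, 1], [-1, -1]]; // NE, SE, SW, NW

function isHigher(map: number[][], x: number, y: number, dx: number, dy: number): boolean {
  const neighbor = map[y + dy]?.[x + dx];
  return neighbor !== undefined && neighbor > map[y][x];
}

// Hypothetical rules; unhandled patterns fall back to "flat".
function classify(map: number[][], x: number, y: number): Tile {
  const higher = CARDINALS.map(([dx, dy]) => isHigher(map, x, y, dx, dy));
  const count = higher.filter(Boolean).length;

  if (count === 1) {
    return { shape: "ramp", rotationDeg: higher.indexOf(true) * 90 };
  }
  if (count === 2) {
    for (let i = 0; i < 4; i++) {
      if (higher[i] && higher[(i + 1) % 4]) {
        return { shape: "wedge", rotationDeg: i * 90 };
      }
    }
  }
  if (count === 0) {
    const diagonal = DIAGONALS.findIndex(([dx, dy]) => isHigher(map, x, y, dx, dy));
    if (diagonal !== -1) {
      return { shape: "spike", rotationDeg: diagonal * 90 };
    }
  }
  return { shape: "flat", rotationDeg: 0 };
}
```

Even this toy version shows why overrides become necessary: patterns such as two opposite higher neighbors have no obvious shape, and here they simply fall through to flat.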

Performance notes

One of the main bottlenecks with stacked grids is how many DOM elements they can hold. Every tile, face, and layer adds up, and by the time we render a large terrain, the browser is already juggling thousands of nodes. A 32×32×12 grid is roughly the safe limit for most modern systems; beyond that, rendering becomes unpredictable, frame rates drop, and tiles may flicker or disappear altogether.
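To put “thousands of nodes” into numbers, here is the arithmetic for that grid size. It assumes one element per cell; how many cells the generator actually fills isn’t stated, so treat the first figure as an upper bound and the second as the count for a single surface layer:

```typescript
// Upper bound: every cell in every layer holds an element.
function gridCapacity(cols: number, rows: number, levels: number): number {
  return cols * rows * levels;
}

// One tile per column of the heightmap (surface only).
function surfaceTiles(cols: number, rows: number): number {
  return cols * rows;
}

console.log(gridCapacity(32, 32, 12)); // 12288
console.log(surfaceTiles(32, 32));     // 1024
```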

The Rendering panel in DevTools can reveal layer borders, paint flashing and frame rendering stats, an invaluable toolkit for working on 3D CSS scenes.

The real pain point came from using clip-path to draw the triangular faces for wedges and spikes. It looked clean and purely CSS, but it forced the browser to repaint every time the scene rotated, dragging performance down. The fix was to switch to pre-cut PNG sprites with transparent backgrounds. Until browsers properly optimize clip-path in 3D contexts, sprites remain the most reliable choice.

Next steps

Beyond being a great technical challenge, this project proved that the stacked grid technique can go far beyond cubes. Adding slopes and angles opens up a new kind of depth: 3D volumes that actually feel shaped by light and form, even though everything is still just CSS.

Swapping a single class on the .scene container instantly changes the biome, updating the background-image textures across every voxel shape.
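A swap like that boils down to replacing one class name while leaving the others alone. The sketch below works on a plain class string so it stays independent of the DOM; the biome names are hypothetical, since the article doesn’t list the real ones:

```typescript
// Hypothetical biome class names, for illustration only.
const BIOMES: ReadonlyArray<string> = ["grass", "desert", "snow"];

// Returns the class string with any existing biome class replaced.
function withBiome(className: string, biome: string): string {
  const kept = className
    .split(/\s+/)
    .filter((name) => name !== "" && BIOMES.indexOf(name) === -1);
  return [...kept, biome].join(" ");
}

console.log(withBiome("scene grass", "snow")); // "scene snow"
```

In the browser, this could be applied with something like scene.className = withBiome(scene.className, "snow"), where scene is the .scene element.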

From here, there are plenty of paths to explore. Isometric web games are an obvious one, but also lightweight, interactive experiences that live right in the browser. The goal isn’t to replace WebGL, but to explore a different way of building 3D projects that stay simple, readable, and inspectable.

As for my next 3D grid project, it could involve turning the terrain inside out: mirroring two vertical grids, using duplicate heightmaps to form a single continuous volume. Maybe that’s how we reach a true CSS sphere.