Optimizing Rendering for a Big Procedural City
Summary
This is a long post where I describe the procedural generation algorithm we used to generate our city, then diagnose its performance problems and explain my shader-based rendering optimization.
Introduction
I have been working with a friend on a Unity game that takes place in a big procedurally generated city. This is what our city currently looks like:
In this post, I want to document some of the optimization work I've done on this project. Our city started off very small, but we ran into performance issues as we started to scale it up. I spent about two weeks working out what optimizations I could do, and it took a lot of fiddling to get there. In the end, the biggest changes I made involved adjusting the way we render buildings so that distant buildings are rendered mostly on the GPU. I would like to elaborate a bit more on this, but first, I need to explain how our city is generated.
How We Are Procedurally Generating a City
The road structure of our city is generated with a recursive division algorithm, inspired by the maze generation algorithm of the same name. We start with a grid of road and non-road tiles, with solid lines of road running horizontally and vertically across it:
This big grid is then divided into smaller regions recursively, blocking off roads at each division to create organic-looking road structures. More concretely, at each recursive step:
- a random line of road tiles across the grid is chosen as the dividing line.
- then, a random road tile on the dividing line is converted into a non-road tile.
- then, these steps are applied recursively to the regions on either side of the dividing line (unless they are too small to divide), as in the sketch below.
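As an illustration, here is a minimal C# sketch of the division step. It assumes, for simplicity, that every interior grid line can serve as a road line; our actual implementation picks among the existing solid road lines.

```csharp
using UnityEngine;

public static class RoadDivider
{
    // grid[x, y] == true means the tile is road. Called on a grid that
    // already has solid road lines running across it.
    public static void Divide(bool[,] grid, int xMin, int yMin, int xMax, int yMax)
    {
        bool wideEnough = xMax - xMin >= 2;
        bool tallEnough = yMax - yMin >= 2;
        if (!wideEnough && !tallEnough) return; // too small to divide

        // Pick a division axis at random among the valid ones.
        bool vertical = wideEnough && (!tallEnough || Random.value < 0.5f);

        if (vertical)
        {
            // Choose a random interior column as the dividing line...
            int divX = Random.Range(xMin + 1, xMax);
            // ...and convert one random road tile on it into non-road.
            grid[divX, Random.Range(yMin, yMax + 1)] = false;
            // Recurse into the regions on either side of the line.
            Divide(grid, xMin, yMin, divX - 1, yMax);
            Divide(grid, divX + 1, yMin, xMax, yMax);
        }
        else
        {
            int divY = Random.Range(yMin + 1, yMax);
            grid[Random.Range(xMin, xMax + 1), divY] = false;
            Divide(grid, xMin, yMin, xMax, divY - 1);
            Divide(grid, xMin, divY + 1, xMax, yMax);
        }
    }
}
```

Calling `Divide(grid, 0, 0, size - 1, size - 1)` on the initial grid then produces the kind of road network shown below.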
The recursive division yields a network of roads with a mix of long, straight paths and short, twisty ones:
From here, buildings are placed on non-road tiles: going along the grid horizontally and vertically, one building is placed on every other tile. Then, these buildings are stretched/scaled where possible to fill up any extra non-road tiles that were created during the recursive division process.
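A simplified sketch of that placement pass (`grid` is the road grid from the division step; the real stretching logic also handles the vertical axis):

```csharp
// Sketch: place one building on every other non-road tile, then stretch
// it sideways over free tiles left over from the recursive division.
// Returns a map of which building ID (if any) claims each tile.
int[,] PlaceBuildings(bool[,] grid, int size)
{
    int[,] claimed = new int[size, size]; // 0 = unclaimed
    int nextId = 1;

    for (int x = 0; x < size; x += 2)   // every other tile, horizontally
    for (int y = 0; y < size; y += 2)   // and vertically
    {
        if (grid[x, y] || claimed[x, y] != 0) continue; // road or taken

        int id = nextId++;
        claimed[x, y] = id;

        // Stretch horizontally while the next column over is free; the
        // building mesh is later scaled to cover all of its claimed tiles.
        int width = 1;
        while (x + width < size && !grid[x + width, y] && claimed[x + width, y] == 0)
        {
            claimed[x + width, y] = id;
            width++;
        }
    }
    return claimed;
}
```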
Building Models and Variations
Each building uses a random model, selected from the three building models we've created. Each building also has a random orientation, color, and height. To reach the desired height, multiple instances of the base model are stacked on top of each other, stretched vertically as needed.
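Since, as discussed later, all of this data is generated deterministically from the building's position, the variation roll looks something like this sketch (the hash constants and value ranges are illustrative):

```csharp
using UnityEngine;

// Sketch: derive a building's variations deterministically from its tile
// position, so the same city always regenerates identically.
void RollVariations(int x, int y)
{
    Random.InitState(x * 73856093 ^ y * 19349663); // seed from tile position

    int   modelId     = Random.Range(0, 3);      // one of our 3 base models
    int   orientation = Random.Range(0, 4) * 90; // degrees
    int   stacks      = Random.Range(1, 33);     // up to 32 stacked instances
    float stretch     = Random.Range(1f, 1.3f);  // vertical stretch per stack
    Color windowColor = Color.HSVToRGB(Random.value, 0.6f, 1f);
    // ...store these on the building object...
}
```

The resulting buildings look like this: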
To add more visually interesting landmarks, we added groups of taller buildings. We also mega-sized certain buildings. Megabuildings take up a 5x5 set of tiles.
We still had other features left to add, like big billboards, highways, and other cyberpunk elements. But first, we needed to make sure our city could scale up to tremendous sizes while maintaining good performance. We wanted a BIG city, after all!
Performance Problems in Our Procedural City
All of the screenshots above were taken with a render distance of 80 tiles. That translates to about 1200 buildings within render distance, which is far too small for our fantasy of a BIG city. Here is what the view looks like with a render distance of 360 tiles (~25k buildings within render distance):
The view looks much "richer". There is now a sea of smaller, distant buildings filling in the empty space in the background. No more void; there are only buildings until you stare far into the distance. Ideally, we want to turn the render distance up even farther, but at this point we ran into clear performance problems.
Using the profiler, I noticed that rendering performance scaled directly with the number of batches/triangles/vertices, and that the most time-consuming functions reported by the profiler were related to mesh batching and culling. From this, I concluded that the sheer number of meshes in the scene was the problem.
A few things to note:
- I am running on a Ryzen 7 3700X and Radeon RX 6600
- we are using Unity 2022.3 LTS
- we are using Unity's Universal Render Pipeline with the SRP Batcher enabled
- each building mesh is controlled by an LOD Group component, which switches between the detailed building model and a simplified building model based on distance/visibility
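For reference, the per-building LOD setup looks roughly like this (a sketch; the transition thresholds are illustrative, not our actual values):

```csharp
using UnityEngine;

// Sketch: detailed model up close, simplified model at distance,
// culled entirely beyond that.
void SetUpLods(GameObject building, Renderer detailed, Renderer simplified)
{
    var lodGroup = building.AddComponent<LODGroup>();
    lodGroup.SetLODs(new[]
    {
        new LOD(0.10f, new[] { detailed }),   // taller than ~10% of the screen
        new LOD(0.02f, new[] { simplified }), // between ~2% and ~10%
        // below ~2% of the screen, the building is culled
    });
    lodGroup.RecalculateBounds();
}
```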
In order to confirm the problem, I tried adjusting our building generation code to reduce the number of meshes in the scene. One big contributor to the mesh count was that each of the thousands of buildings in our scene could be comprised of up to 32 instances of the base building mesh, stacked on top of each other to reach the desired height. To reduce the mesh count per building, I created combined meshes made up of multiple stacks of the base building model, so that each building is a single mesh instead of up to 32.
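Here is a minimal sketch of that combining step, using Unity's `Mesh.CombineMeshes` (the stack height and stretch parameters are illustrative):

```csharp
using UnityEngine;

// Sketch: bake up to 32 stacked instances of the base building mesh
// into one combined mesh, so each building renders as a single mesh.
Mesh CombineStacks(Mesh baseMesh, int stackCount, float stackHeight, float stretch)
{
    var combine = new CombineInstance[stackCount];
    for (int i = 0; i < stackCount; i++)
    {
        combine[i].mesh = baseMesh;
        // Each instance sits on top of the previous one, stretched vertically.
        combine[i].transform = Matrix4x4.TRS(
            new Vector3(0f, i * stackHeight * stretch, 0f),
            Quaternion.identity,
            new Vector3(1f, stretch, 1f));
    }

    var combined = new Mesh();
    // 32 stacked instances can exceed 65k vertices, so use 32-bit indices.
    combined.indexFormat = UnityEngine.Rendering.IndexFormat.UInt32;
    combined.CombineMeshes(combine, mergeSubMeshes: true, useMatrices: true);
    return combined;
}
```

With this optimization, our performance improved drastically: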
The vertex and triangle counts became much higher. This was because I hadn't adjusted the LOD thresholds, so a lot of the buildings were shown in much higher detail than needed. That would get fixed later, but seeing it cause no problems on my machine confirmed that the number of meshes in the scene was the real issue.
Optimizing Our Procedural City
The plan for reducing the number of meshes is as follows:
- Our buildings are very rectangular and box-like, so when they are far away, they can be replaced with imposters. In our case, each building can be replaced with a simple box with each side textured to look like the actual building.
- The imposter would just be a textured box, which has fewer vertices and triangles than both the fully detailed and the simplified LOD versions of our building models.
- More importantly, the imposter version of each building can be rendered from the exact same box mesh. It's just the texturing on each side of the box that would be different.
Given all that, my plan was to merge the imposter boxes of many faraway buildings into a single mesh, and write a shader that textures and reshapes each box to match the building it replaces. If it works, then we can easily reduce the number of batches contributed by faraway buildings by a factor of 100 (or however many buildings the single mesh replaces).
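As a sketch of the merging step (the names and chunk layout are mine, not our actual code): each chunk's imposter mesh can be built by baking one unit box per building at its tile position, so that the shader can later recover the building a vertex belongs to from the vertex position alone.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Sketch: build one mesh containing a unit box per building in a chunk.
// The boxes are placed at their tile positions in object space, so the
// vertex shader can identify a box's building by flooring its
// object-space XZ position.
Mesh BuildImposterChunk(List<Vector2Int> buildingTiles, Mesh unitBox)
{
    var combine = new CombineInstance[buildingTiles.Count];
    for (int i = 0; i < buildingTiles.Count; i++)
    {
        combine[i].mesh = unitBox;
        combine[i].transform = Matrix4x4.Translate(
            new Vector3(buildingTiles[i].x, 0f, buildingTiles[i].y));
    }

    var mesh = new Mesh { indexFormat = UnityEngine.Rendering.IndexFormat.UInt32 };
    mesh.CombineMeshes(combine, mergeSubMeshes: true, useMatrices: true);
    return mesh;
}
```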
In order for the shader to render the boxes like the buildings they are replacing, it needs to know each building's base model, color, size, orientation, etc. We don't have this information in the shader. Most of this data is being generated on the CPU at startup, deterministically from the building's position. It would technically be possible to reproduce this data inside the shader, or get the CPU to pass it all onto the GPU at runtime, but doing so came with performance complications.
Instead, I figured we could calculate all of that data beforehand and encode it into textures that the shader can read from. Each pixel in a texture stores the data required for one in-game tile. To store all the data needed to reproduce a 2048x2048-tile city, we needed five 2048x2048 RGBA textures. The resulting textures looked like this:
The textures above contain data relating to:
- top left: structure type (red), building orientation (green), building horizontal scale (blue), building position offset (alpha).
- top right: building base model ID (red), building # of stacks (green), building height (blue), other building rotation data (alpha)
- bottom left: building window color (RGB, alpha = intensity).
- bottom right: building accent color (RGB, alpha = intensity).
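The baking step on the C# side is then straightforward. Here is a sketch for one of the textures (`TileData` and `GenerateTileData` are stand-ins for our actual generation code):

```csharp
using UnityEngine;

// Sketch: bake per-tile building data into a 2048x2048 RGBA32 texture,
// one pixel per tile. Point filtering and no mipmaps, so the shader
// reads back exact values.
struct TileData { public byte modelId, stackCount, height, rotation; }

Texture2D BakeBuildingData(int citySize) // citySize = 2048 for us
{
    var tex = new Texture2D(citySize, citySize, TextureFormat.RGBA32, mipChain: false);
    tex.filterMode = FilterMode.Point;
    tex.wrapMode = TextureWrapMode.Clamp;

    for (int x = 0; x < citySize; x++)
    for (int y = 0; y < citySize; y++)
    {
        TileData t = GenerateTileData(x, y); // hypothetical: our deterministic CPU-side generation
        tex.SetPixel(x, y, new Color32(t.modelId, t.stackCount, t.height, t.rotation));
    }

    tex.Apply(updateMipmaps: false);
    return tex;
}
```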
The main downside of this approach is that we would no longer be able to generate an infinitely sized city; it can only be as big as the textures. However, 2048 by 2048 tiles feels big enough. If we wanted to restore infinite procedural generation in the future, we could probably work around this limitation (maybe by randomly mixing together segments from these textures instead of reading them as-is).
The Shader
With that said, we can utilize the mesh of boxes and the data textures with a shader like this:
The shader is divided into these sections:
- For each vertex, finding the building it belongs to and the position of that building.
- Converting the building's position into the corresponding UV coordinates of its pixel on the data textures.
- Reading the data textures for the pixel containing the building's data. There are 5 textures being read here (4 of which were shown above).
- Sections 4-8 decode the information that was encoded in the data textures, like the building's type, orientation, base model ID, number of stacks, and colors.
- For megabuildings, moving the vertices to enlarge the box to the size of a megabuilding.
- For normal buildings, moving the vertices to account for any horizontal stretching to fill empty non-road space.
- Moving the vertices vertically according to the desired height of the building.
- Passing many of the decoded values to the fragment shader as interpolators.
- Calculating the UV coordinates of the fragment, accounting for the building's orientation and stack count.
- Sampling the imposter building textures (shown below) and mixing in the building's colors.
- This just adds some fancy textures for the windows.
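For a concrete picture, here is a heavily simplified HLSL sketch of the vertex half, following the sections above. All names, struct layouts, and channel decodings here are illustrative, not our actual shader.

```hlsl
// Illustrative URP vertex-shader sketch, not our actual shader.
Texture2D _BuildingData0;  // structure type / orientation / h-scale / offset
Texture2D _BuildingData1;  // model ID / stack count / height / rotation
SamplerState my_point_clamp_sampler;
float _TileSize;
float _CityTiles;          // 2048
float _MaxHeight;

struct Attributes { float3 positionOS : POSITION; float2 uv : TEXCOORD0; };
struct Varyings
{
    float4 positionCS : SV_POSITION;
    float2 uv         : TEXCOORD0;
    float4 data       : TEXCOORD1; // decoded values for the fragment shader
};

Varyings vert(Attributes IN)
{
    Varyings OUT;

    // (1) The merged boxes sit on the tile grid in object space, so
    //     flooring a vertex's XZ position identifies the building's tile.
    float2 tile = floor(IN.positionOS.xz / _TileSize);

    // (2) Convert the tile to the UV of its pixel in the data textures.
    float2 dataUV = (tile + 0.5) / _CityTiles;

    // (3) Read the building's data (mip 0, point-filtered).
    float4 d0 = _BuildingData0.SampleLevel(my_point_clamp_sampler, dataUV, 0);
    float4 d1 = _BuildingData1.SampleLevel(my_point_clamp_sampler, dataUV, 0);

    // (4-8) Decode the packed values, e.g. stack count and height.
    float stacks = round(d1.g * 255.0);
    float height = d1.b * _MaxHeight;

    // (9-11) Reshape the unit box: horizontal stretch around the tile
    //        origin, then vertical scaling to the building's height.
    float3 pos = IN.positionOS;
    pos.xz = tile * _TileSize + (pos.xz - tile * _TileSize) * d0.b;
    pos.y *= height;

    // (12) Pass decoded values to the fragment shader as interpolators.
    OUT.positionCS = TransformObjectToHClip(pos); // from URP's shader library
    OUT.uv = IN.uv;
    OUT.data = float4(stacks, height, d0.g, d0.r);
    return OUT;
}
```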
Imposter Building Textures:
Here is a comparison of the fully detailed models, their less detailed LOD versions, and the imposter versions rendered with the shader above:
With this, I was able to greatly reduce the mesh count of our city by replacing distant chunks of buildings with imposter versions. Here is the performance after making this optimization:
Here is the new look of the city with the render distance now turned up to 810 tiles (still with good performance!), and a screenshot of what it was like before, for comparison:
The batch count being in the thousands still felt uncomfortable, but it was no longer a bottleneck and so was left as a problem for future me.
Other Optimizations
There were a few other optimizations I did that improved the performance of our game (though not as much as the building imposters did). These were:
- flattening the hierarchy of the scene, which reduced the cost of propagating transform changes when I moved the chunks and buildings around (for object pooling/recycling).
- making similar imposters for the roads. Each road tile was also an individual mesh, so many of them were replaced with a single plane textured to replicate the same look. This loses detail when the road is viewed up close, so I only did it for distant roads. The really distant roads were not rendered at all, since they were obscured by all the buildings.
There were also a few other optimizations that I considered or tried, but did not end up using. These were:
- using fog to avoid rendering distant buildings — while we still have fog to fade out distant buildings, we wanted players to feel like they can see clearly and far. Adding heavy fog to avoid rendering distant buildings was not the artistic direction we wanted to go in.
- using the skybox or large background billboards to fake distant buildings — while having a skybox that mimics distant lights can help with our aesthetic, I found that players like to use distant buildings as landmarks to orient themselves as they run around the city. Thus, we needed those distant buildings to be real, or players would get confused.
- occlusion culling: initially, occlusion culling seemed like the ideal solution; many buildings were hidden behind other buildings and could theoretically be occlusion culled. However, the worst performance occurs when the player gets a high-angle view of the city, where a tremendous number of buildings are visible and occlusion culling is least effective. That made occlusion culling seem not worth its cost, although I will probably consider it again when more optimization is needed.
Future Work
There is plenty of room for improvement. In particular, I am interested in trying out True Imposters, which might allow for buildings with more complex shapes than the mostly rectangular ones we have now. Another thing that we would want to add later on is more building decorations and accessories, like signs, pipes, wires, antennas, etc. I suspect most of these details will be too small to be worth rendering on distant buildings, but in the case that they aren't, we'll have to figure out something for them.
That is all for now. We still have a game to make inside of our city.