With Oculus’s recent announcement of the requirements and specs for the consumer version of their HMD (https://www.oculus.com/blog/powering-the-rift/), I figured it was the perfect time to write the performance post I teased last time. Let’s see how we’re dealing with optimization on FATED! First, some math to get a clear picture of what we’re trying to achieve.
Know Your Numbers!
FATED is pretty much fillrate bound (http://en.wikipedia.org/wiki/Fillrate), meaning its frame time is dominated by the number of pixels the GPU has to shade, and it’s safe to assume that most early VR games will be too. This is why the following numbers matter.
A current generation game will generally push about 124 million pixels per second when running at 60 fps in 1080p. FATED is currently running on a GTX 970 (the recommended card for the consumer version of the Rift) at ~305 million pixels per second on the DK2:
1920x1080 upscaled by 140% on each axis = 2688x1512, at 75 Hz = ~305 million pixels per second
The CV1’s native resolution and refresh rate look like this:
2160x1200 at 90 Hz = ~233 million pixels per second
Oculus’s Atman Binstock revealed that “At the default eye-target scale, the Rift’s rendering requirements go much higher: around 400 million shaded pixels per second.” Working backwards from that figure, an upscale of roughly 130% on each axis appears to be ‘the norm’ for Oculus:
2160x1200 upscaled by 130% on each axis = 2808x1560, at 90 Hz = ~394 million pixels per second
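For anyone who wants to check where the 130% comes from, here is the math worked backwards from the 400 million target. It assumes the scale factor s applies to each axis, so the pixel count grows with s squared (the symbol s is just notation for this sketch):

```latex
\[
s = \sqrt{\frac{400 \times 10^{6}}{2160 \times 1200 \times 90}}
  = \sqrt{\frac{400 \times 10^{6}}{233.28 \times 10^{6}}}
  \approx \sqrt{1.715}
  \approx 1.31
\]
```

That lands on roughly 130%, which matches the ~394 million figure once rounded down to a clean percentage.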
While we don’t have the final hardware yet, we can estimate how much processing power it will need. The closest we can get on the DK2 is to push the screen percentage to around 160, which works out to roughly the same number of pixels per second. This is what we are aiming to achieve!
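If you want to run these numbers for your own target, here is a minimal Python sketch of the arithmetic above. The function names are made up for this example, and it assumes the screen percentage is applied to each axis, as in the figures quoted in this post.

```python
import math

def pixels_per_second(width, height, refresh_hz, screen_percentage=100):
    """Shaded pixels per second, with the screen percentage applied per axis."""
    scale = screen_percentage / 100
    return round(width * scale) * round(height * scale) * refresh_hz

def screen_percentage_for(target_pps, width, height, refresh_hz):
    """Per-axis screen percentage needed to reach a target pixel rate."""
    return 100 * math.sqrt(target_pps / (width * height * refresh_hz))

baseline = pixels_per_second(1920, 1080, 60)         # 1080p game at 60 fps
dk2 = pixels_per_second(1920, 1080, 75, 140)         # FATED on the DK2 at 140%
cv1_native = pixels_per_second(2160, 1200, 90)       # CV1 with no upscale
cv1_scaled = pixels_per_second(2160, 1200, 90, 130)  # CV1 at 130%

# Screen percentage the DK2 needs to match the scaled CV1 pixel rate.
dk2_match = screen_percentage_for(cv1_scaled, 1920, 1080, 75)

for label, value in [
    ("1080p @ 60", baseline),
    ("DK2 @ 140%", dk2),
    ("CV1 native", cv1_native),
    ("CV1 @ 130%", cv1_scaled),
]:
    print(f"{label}: {value / 1e6:.0f} million pixels/s")
print(f"DK2 screen percentage to match CV1 @ 130%: {dk2_match:.1f}")
```

The last line comes out just under 160, which is why we use 160 as our stand-in for the consumer hardware.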
Trade-offs A.K.A. Battle with the Art Team
Unreal can be pretty daunting when you first enter it, mainly because the engine cranks everything to 11 by default. The first thing you want to do is take a look at what is costing so much and turn off what you don’t need. Sometimes, the art team will hate you for asking to remove that extra post-process, but remember that a vomit-free experience is more important!
With FATED, we decided to go with a ‘stylized’ look so we could remove some of Unreal’s cost-heavy features while keeping the visuals as stunning as possible. We also made the conscious decision to keep each scene as tightly contained as possible. We want to control which objects are seen at each moment (limit draw calls!), so we design the levels accordingly. These constraints allowed us to remove some of the engine’s features without overly affecting our visual target. Here are some decisions we made early on:
- No dynamic shadows, everything is baked (with exceptions…)
- No dynamic lights either (with exceptions…)
- Limit post-processes: No DOF, motion blur, or lens flare
Console Command Options
Here are some interesting commands that we used to disable some costlier features:
r.HZBOcclusion 0: Disables hierarchical Z-buffer (HZB) occlusion culling; the engine uses hardware occlusion queries instead
r.TranslucentLightingVolume 0: Disables the translucent lighting volume
r.AmbientOcclusionLevels 0: No ambient occlusion! It’s a nice feature, but we don’t need it; remove that!
r.SSR.Quality 0: Screen space reflection is also cool, but it’s costly and we don’t need it; delete!
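Typing these into the console only lasts for the current session. One way to apply them at every launch is a config fragment like the sketch below. Note the assumptions: using ConsoleVariables.ini (in the engine’s Config folder) and its [Startup] section is just one option and not necessarily how we set things up on FATED, so check where console variables are meant to live in your engine version.

```ini
; ConsoleVariables.ini (assumed location: Engine/Config/)
; Values mirror the console commands listed above.
[Startup]
r.HZBOcclusion=0
r.TranslucentLightingVolume=0
r.AmbientOcclusionLevels=0
r.SSR.Quality=0
```

Change one variable at a time and profile after each, so you know what each one actually buys you in your own scenes.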
Profiling Limits: Dig Deeper with GPA
At some point, it becomes hard to pinpoint what is really happening GPU-side using only Unreal’s GPU Profiler. To really get a sense of what is going on, we need something that digs even deeper! We’re using a free tool called Intel GPA (Graphics Performance Analyzers).
We won’t go in depth on how to use the tool, but there is one important thing to know: it won’t work in ‘Direct to HMD’ mode. So, to start a capture, you need to be in ‘Extend to Desktop’ mode. The quickest way we found to take a capture was to open the editor, set GPA to ‘Auto-detect launched app’, and then run the game in ‘Standalone’.
Now for the Juicy Tips!
Analyze your scene: I talked about the ‘show’ commands and ‘stat scenerendering’ in my last post; this is where you want to use them to determine what your bottleneck is.
Instancing Static Meshes + Foliage: If you have too many draw calls, this could be a life saver! Foliage is especially great if you want to have a dense forest or lots of smaller meshes. But keep in mind that the level of detail in foliage can easily multiply your draw calls. Also, instancing is not always the best option, so make sure it’s really going to help. Don’t hesitate to compare using GPA!
Particle System Bounds: While profiling FATED, I found out that a lot of particle systems we were not supposed to see were being rendered. Turns out the bounds used to cull particle systems are not set by default, so make sure to set them!
Project rendering settings – Screen Clear: This is a minor optimization, but every microsecond is worth it! If you always render something on each pixel (you have a Skybox, say) this is worth setting to ‘No Clear’. Be aware that this should only be set for actual builds, since it will cause weird artifacts in the editor asset viewer viewports.
Project rendering settings – Early Z Pass: This is one of the best helpers for the fillrate. It issues more draw calls, but laying down depth first means hidden pixels are rejected before they are shaded, and that saving is well worth it. Some frames rendered as much as 25% faster with it enabled!
Disable post-processes when not using them: We have some really nice post-processes for some features in our game, but they are not always in use. Be sure to remove those from the ‘Blendables’ array when they’re not needed!
Shipping Builds: It’s good to remember that your shipping build is going to run a bit faster than your dev build.
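To put the tips above in perspective, it helps to know how little time there is per frame. Here is a small Python sketch of that arithmetic. The function names are made up for this example, and the per-pixel figure is a simplification: it ignores overdraw and every non-pixel cost, so treat it as an upper bound on what you really get.

```python
def frame_budget_ms(refresh_hz):
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / refresh_hz

def ns_per_pixel(width, height, refresh_hz, screen_percentage=100):
    """Average time available per shaded pixel, in nanoseconds.

    Assumes every pixel is shaded exactly once (no overdraw).
    """
    scale = screen_percentage / 100
    pixels = round(width * scale) * round(height * scale)
    return frame_budget_ms(refresh_hz) * 1e6 / pixels

for name, hz in [("DK2", 75), ("CV1", 90)]:
    print(f"{name}: {frame_budget_ms(hz):.2f} ms per frame")

print(f"CV1 at 130%: {ns_per_pixel(2160, 1200, 90, 130):.2f} ns per pixel")
```

With a budget that tight, even the ‘minor’ optimizations add up.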
We’re always looking for ways to improve performance, and we’re not done optimizing, but this should give you a basic idea of how we’re working on FATED: always profile, add one feature at a time, and look for more ways to make the game run ever more smoothly. There is a whole section dedicated to performance in the Unreal 4 documentation (https://docs.unrealengine.com/latest/INT/Engine/Performance/index.html); I highly recommend it to those who want further insight!
Meanwhile, if you have tips to share, or any questions or comments, send them in and I’ll be happy to address them! ‘Til next time!
Mick / Lead Programmer