About Me

Hi. I'm Josh Ols, aka n00body. Just a wannabe game developer trying to take it seriously.

Sunday
08Nov2009

Screen-Space Ambient Occlusion

SSAO seems like a must these days, especially if you want a game that has fully dynamic content. The benefits to the overall look of the game may be subtle, but they are nonetheless substantial. So right from the start of my project, I have been determined to integrate this technology into my renderer.

 

Requirements:

My current test hardware is crappy; there is no denying that. My development platform is about six years old, and has an AGP GeForce 6800 GT GPU. Needless to say, this severely limited the variety of techniques I could try. Still, I didn't want technical limitations to cost me quality, so I needed a technique that would deliver it at a reasonable performance cost.

Keeping those restrictions in mind, here are the criteria I sought:

  • Normals influence occlusion
  • Handles self-occlusion
  • Eliminates "halos"
  • Looks decent without blurring
  • Doesn't use branching
  • Requires few samples

After much research and experimentation, I settled on an approach developed by nullsquared (gamedev, ogre3d), which meets all my criteria beautifully. For the most part it follows the typical steps of any SSAO implementation, but differs in how it determines when a sample is occluded. Since I only vaguely understand how it works and don't care to explain it, I will refer you to his own write-ups of the implementation.

Links:

Ogre3D forums: SSAO Compositor

Ogre3D forums: SSAO Demo + Source

Ogre3D forums: SSGI

 

Results:

For the test scene, the geometry was illuminated using one hemisphere light + six point lights. In order to get the best results, I do the ambient pass first, then apply SSAO to the light buffer via multiplicative blending. Then, I proceed to add the point lights on top of the ambient contributions. All calculations are handled in linear space, and receive color correction before being rendered to the back-buffer.

The nature of my deferred lighting renderer allows me to apply SSAO correctly to the ambient lighting. Most games would incorrectly apply it to the final rendered image as a post-process. While this can look good enough, it is not exactly correct since it affects all lighting (direct/indirect, diffuse/specular/reflection, etc).
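To make the difference concrete, here is a minimal numeric sketch of the two orderings for a single pixel. It is plain Python rather than shader code, and the function names and sample values are hypothetical, chosen only for illustration.

```python
def shade_ambient_only(ambient, direct, ao):
    """Occlusion scales only the ambient term; direct light is added afterwards."""
    return ambient * ao + direct


def shade_post_process(ambient, direct, ao):
    """Occlusion scales the finished image, so direct light is darkened too."""
    return (ambient + direct) * ao


# Hypothetical single-channel, linear-space values for one pixel in a crease.
ambient, direct, ao = 0.2, 0.8, 0.5

correct = shade_ambient_only(ambient, direct, ao)
post = shade_post_process(ambient, direct, ao)

# The post-process version is darker because the direct light was occluded as well.
print(correct, post)
```

With these made-up numbers, the directly lit crease keeps most of its brightness in the first version and loses half of it in the second.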

So here are the results of this setup:

Figure 1. 1, SSAO buffer; 2, SSAO & ambient; 3, SSAO + ambient + point lights; 4, ambient + point lights; 5, difference

 

Performance:

My tests showed that a scene of about 180,000 polygons rendered at 1280x720 ran at roughly 20fps, while the same scene rendered at 800x600 ran at 30fps. This was without any optimizations, such as using downsampled normals/depth and render targets for the SSAO. Plus, I haven't added a smart blur to the SSAO contributions yet, partly because the results look good enough as is.

In the future, I will be looking at options for reducing the cost, such as using even fewer samples, and/or downsampled buffers. This will necessitate blurring, in order to cover up the pixelation of the occlusion, as well as good upsampling. These are areas which can also stand to benefit from optimization tricks. ;)

Monday
26Oct2009

Deferred-lighting renderer

Okay, I've officially decided to switch over to Deferred Lighting for my renderer. During my tests, I found its benefits over traditional deferred shading more than made up for the extra drawing pass for all my geometry. The reduction in VRAM, the added material/shading options, and the possibility of MSAA on SM 4.0 hardware are all far too tasty for me to pass up.

After much experimentation, my new renderer is at a point where it uses RGBA8 buffers, the native depth information, and no MRT. On SM 4.0 hardware, I will be able to make it sample the hardware depth buffer, rather than having to copy its contents. Ideally, I could also do something like what Insomniac did with Resistance 2, where they store their normals and gloss in the back-buffer, to save some extra memory.

Needless to say, I have been doing a lot of tinkering with this new approach, and will be doing far more in the coming weeks. ;)

 

Lighting pipeline:

I'm planning to use a lighting pipeline similar to the one employed by CryEngine3. The order will be ambient lighting, multiplied by SSAO, and then adding direct lighting. This ensures that SSAO only affects ambient lighting, and direct illumination will correctly show up in the occluded regions. Shadows will mask a light's contribution, so that ambient lighting will dominate the shadowed region.

Image-based lighting will be combined in the lighting buffer during the ambient phase. It will be used for ambient lighting and glossy/metallic reflections. Handling reflections here will nicely combine them with specular lighting. This also fits the philosophy of keeping lighting in the lighting phase, rather than pushing that burden onto the material phase.

 

Material Benefits:

This pipeline will keep the material shaders independent/ignorant of the lighting phase implementation. All they have to care about is that the lighting comes to them through two standard channels (diffuse, specular). So the material shaders can decide how they treat the lighting information.

Letting materials, rather than lights, decide how to combine the illumination and material properties opens up a whole host of shading possibilities. For example, they can approximate lighting models from the diffuse illumination, such as Minnaert shading and a sub-surface scattering approximation. The same goes for the specular contribution, allowing things like metallic or glassy reflections.

Ideally, I could nicely separate lighting and material shading from each other. However, I will have to make some exceptions for things like rim-lighting for "fuzzy" materials, or anisotropic highlights for things like hair. Since these are material-specific, they can't conveniently be combined with the light buffer. However, they are both quite necessary for a diverse range of common materials.

Time will tell if I have to make any other exceptions. However, the pipeline I have outlined covers the vast majority of materials I might want, and nicely separates most of the stages so that they can be optimized.

 

Issues:

One thing I felt I should mention for anyone else who tries to implement a similar renderer: be careful about how you store and recover your normals. I lost two weeks trying to fix a nasty artifact that resulted when I made the switch to low-precision buffers, an artifact that turned out to have a trivial fix.

In my case, I was storing view-space normals in an RGBA8 buffer, just packing them to the range [0, 1] for storage, then unpacking to [-1, 1] for shading. This works fine for diffuse lighting, but you will get nasty artifacts in specular lighting because quantization leaves the vectors no longer unit length. To deal with this, do not forget to normalize() the unpacked normal to correct these errors.
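The effect is easy to reproduce on the CPU. The sketch below is plain Python, not the shader code from the renderer. The helper names and the sample normal are hypothetical, and rounding to the nearest 8-bit level is an assumption about how the buffer write behaves.

```python
import math


def pack_unorm8(n):
    """Map each component from [-1, 1] to an 8-bit integer in [0, 255]."""
    return tuple(round((c * 0.5 + 0.5) * 255.0) for c in n)


def unpack_unorm8(p):
    """Map 8-bit integers back to [-1, 1]; the quantization error remains."""
    return tuple(c / 255.0 * 2.0 - 1.0 for c in p)


def length(v):
    return math.sqrt(sum(c * c for c in v))


def normalize(v):
    l = length(v)
    return tuple(c / l for c in v)


# Hypothetical unit-length view-space normal.
normal = normalize((0.3, 0.5, 0.81))

unpacked = unpack_unorm8(pack_unorm8(normal))
fixed = normalize(unpacked)

print(length(unpacked))  # close to, but not exactly, unit length
print(length(fixed))     # unit length again, up to floating-point rounding
```

The length error is small, but specular terms raise a dot product to a high power, which amplifies it into visible artifacts.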

Keep things like this in mind when transitioning from high-precision buffers to low-precision buffers.

Saturday
10Oct2009

Shaders Fixed!

Okay, I've found a workable solution to my shader situation. This will allow me to avoid switching to CgFX, and still support the GLSL profiles. Now my codebase will run on ATI machines with all features supported! =D

Basically, I read about a function called cgCombinePrograms() that allows me to combine multiple programs into a single program. This arrangement allows GLSL to work correctly, since it combines them into a single handle like a GLSL program. Admittedly, I lose the ability to mix-n-match individual vertex, geometry, and pixel shaders in the render function. However, that was never a huge deal for me, so its loss is acceptable for the sake of compatibility.

Saturday
03Oct2009

Prelighting & Log-Lights (Ver. 2)

Prelighting:

Prelighting, deferred lighting, light prepass rendering... they all refer to what is essentially the same concept. It is a form of deferred shading that defers the lighting phase, but keeps the material phase as forward rendering, all for the purpose of avoiding the fat g-buffers that are typical of deferred shading, and allowing more varied materials. Who came up with it first is a bit unclear, but to my knowledge the first public definition was developed by Wolfgang Engel (link).

When compared to the approach I am currently considering, this approach offers many appealing benefits. Firstly, it allows deferred shading without explicitly requiring MRT support. Then there is the possibility of MSAA on platforms that support explicit control of how samples are resolved. Finally, it enables more material variety than a standard deferred renderer, since it doesn't force all objects to use only the provided material channels of a g-buffer.

Sadly, it has its downsides as well. Because it is halfway in-between a deferred renderer and a z-prepass renderer, it inherits many problems of both approaches. On the z-prepass side, it forces you to divide material properties between two passes, possibly sampling the same texture multiple times. Not to mention having to draw all objects at least twice, potentially limiting the maximum number of objects that you can draw. On the deferred shading side, it forces one lighting model for all objects. There is also the age-old issue of translucency, but that is an issue for any renderer that isn't single pass.

So to set up for the second part of this post, I will outline the renderer used to generate my example images. As far as I know, the most common approach for the lighting buffer in this kind of system is to store the diffuse illumination in the RGB channels, and the specular component in the A channel. To compensate for the monochrome specular, it gets multiplied with the chromaticity of the diffuse light buffer to approximate colored specular lighting. Finally, during the material phase the light contributions are combined with the appropriate textures, and added together with other material properties before the final output.
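As a rough illustration of the chromaticity step, here is a plain-Python sketch. The post does not say how chromaticity is computed, so dividing the diffuse color by its Rec. 709 luminance is an assumption, and the function names, epsilon, and sample values are hypothetical.

```python
def luminance(rgb):
    # Rec. 709 weights; the choice of weights is an assumption.
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def chromaticity(rgb, eps=1e-4):
    """Diffuse color with its brightness divided out (eps avoids dividing by zero)."""
    lum = luminance(rgb)
    return tuple(c / (lum + eps) for c in rgb)


def colored_specular(diffuse_rgb, specular_mono):
    """Tint the monochrome specular term with the hue of the accumulated diffuse light."""
    return tuple(specular_mono * c for c in chromaticity(diffuse_rgb))


# Hypothetical light-buffer texel: diffuse in RGB, monochrome specular in A.
light_buffer = (0.6, 0.3, 0.1, 0.5)

spec = colored_specular(light_buffer[:3], light_buffer[3])
print(spec)
```

In this sketch the tinted specular keeps roughly the luminance of the monochrome value and borrows only its hue from the diffuse lighting. This also shows why the result is an approximation: overlapping lights of different colors share a single blended hue.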

Further reading:

Insomniac's Prelighting (GDC 2009)

Engel's Prepass Renderer (SIGGRAPH 2009)

Deferred Lighting Approaches

ShaderX7, Chapter 8.5

 

Figure 1. 1. Diffuse accumulation, 2. Diffuse chromaticity, 3. Specular accumulation [monochrome], 4. Specular accumulation [colored by diffuse chromaticity], 5. Material pass results (diffuse + colored specular)

 

Log Lights:

Here's the real meat of this post, directly following from my consideration of Prelighting. For the best results, we want to perform linear HDR accumulation of lights. This usually mandates that we use at least an FP16 RT, since integer formats don't have the necessary range or precision. However, FP16 RTs eat memory, are slower to render to, and can have restricted support on some platforms. So if we are limited to RGBA8 RTs, how much can we get out of them?

When used with a prelighting renderer, I have found that RGBA8 can do a decent job. Storing the raw lighting data, it handles linear light accumulation fairly well when the lights don't overlap much. However, it does suffer from some visual artifacts on dim lights. More than anything, though, it is limited to the range [0, 1]. So when lights pile up on a surface, they will soon saturate at 1.0 and begin to lose normal-mapping detail. These limitations are simply unacceptable, but how can we do better?

The concept of Log Lights was first introduced to me in the gamedev post "Light Prepass HDR" by a fellow named drillian (link). The idea is to use the properties of exponential functions to make the best use of an integer RT's precision. It allows a form of linear light accumulation with extended range, while still taking advantage of hardware blending. Plus, it does all this in the confines of individual RGBA8 components, so it doesn't have to steal another one to add precision.

This trick works by exploiting the fact that when powers of the same base are multiplied, their exponents are added together (2^a * 2^b = 2^(a+b)). This allows us to add light values by storing them in the exponent, and using multiplicative blending on the output of the light shader. After that, it is a simple matter to extract the value from the exponent via a logarithm. To make sure that the stored value stays in the range [0, 1] for storage in an RGBA8 RT, the light values are negated before being stored in the exponent. This requires that the recovered values be negated again to be correct. Easy enough, right?
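The accumulation can be simulated on the CPU to see how it behaves next to plain additive blending. This is a plain-Python sketch, not shader code. The helper names and light values are hypothetical, and modelling the RGBA8 target as clamp plus round-to-nearest is an assumption, since real blending hardware may round differently.

```python
import math


def quantize8(x):
    """Clamp to [0, 1] and round to the nearest 8-bit level, like an RGBA8 target."""
    x = min(max(x, 0.0), 1.0)
    return round(x * 255.0) / 255.0


def accumulate_additive(lights):
    """Plain RGBA8 accumulation with additive blending; the buffer starts at black."""
    dst = 0.0
    for light in lights:
        dst = quantize8(dst + quantize8(light))
    return dst


def accumulate_log(lights):
    """Log-light accumulation with multiplicative blending; the buffer starts at white."""
    dst = 1.0
    for light in lights:
        src = quantize8(2.0 ** -light)  # light shader output: exp2(-light)
        dst = quantize8(dst * src)      # blend: dst = src * dst
    return -math.log2(dst)              # material shader input: -log2(accumulated)


# Hypothetical overlapping lights whose true sum is 3.0.
lights = [0.75, 0.75, 0.75, 0.75]

print(accumulate_additive(lights))  # clamps at the top of the [0, 1] range
print(accumulate_log(lights))       # stays close to the true sum
print(math.log2(255.0))             # value decoded from the smallest nonzero 8-bit level
```

Under these assumptions the smallest nonzero level, 1/255, decodes to log2(255), or just under 8, so that is roughly the ceiling of the extended range. A buffer value of exactly zero cannot be decoded at all.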

Now, it does come with some deficiencies. For starters, darker light values show slightly worse artifacts than straight RGBA for dim lights. Then there is the issue that this doesn't provide a true HDR range of values, only really providing MDR. Finally, you are restricted in the kinds of operations you can perform on the light buffer, since it is no longer storing raw rgb data. There is also the issue that this technique only looks decent for accumulation of raw light values. I have found that multiplying each contribution by a texture will produce nasty visual artifacts. So this technique wouldn't be good for accumulating lights in a standard deferred renderer.

After much testing, I have concluded that this technique can nicely fill the niche of the light accumulation phase of a prelighting renderer. The added range may not be the best, but is substantially better than using raw RGBA values. Yet it uses the exact same storage media, and only requires a small amount of overhead to (un)pack the values. Overall, it is a useable solution to the problems I have outlined, depending on what you are trying to do.

Pseudo-code:

// Needed for multiplicative blending, since values need to start at one
ClearColor(white)
Blend(DST_COLOR, ZERO, DST_ALPHA, ZERO)

// Light shader output
outputLight = exp2(-lightContribution);

// Material shader input
recoveredLight = -log2(lightAccumulation);

 

Figure 2. 1. Diffuse accumulation [exp2], 2. Specular accumulation [monochrome, exp2]

 

Figure 3. Alternating RGBA and Log-Light results (RGBA first in each pair) at brightness values 0.1, 1.0, 2.0, and 3.0

 

Figure 4. Log-Light artifacts in a standard deferred renderer

 

Demo Update:

I almost have the detail mapping demo code to where I want it, so that part shouldn't take too much longer. Right now, I am waiting on a request I made to an artist friend to make some cool geometry/textures so it looks good, and doesn't use textures I don't own. I promise I will release it as soon as I feel everything is ready, so please bear with me.

Thursday
24Sep2009

Shader Fix?

Okay, here's where I am on the shader front. I have managed to find a potential solution, but encountered some new problems.

After tinkering with the official Nvidia examples, I found that CgFX will compile and run correctly when the profiles are set to glslv and glslf. I am still a bit concerned about the speed issues involved with CgFX, but it might still prove to be a usable solution.

Now I have a problem with the ATI GLSL compiler. Apparently it throws a fit when you say "#extension GL_ARB_draw_buffers : require", even when it claims to support that extension. Supposedly it won't complain if it is "#extension GL_ATI_draw_buffers : require". So I may have to have some ATI-specific compiler options for Cg.

I also found out that cgGLGetLatestProfile() returns arbvp, arbfp on ATI GPUs. So I would have had to force it to GLSL anyway.

I would really prefer to continue using Cg, but these problems are kind of a pain. :(

 

EDIT:

It looks like Nvidia bypassed ATI's bug with Cg 2.2, by making it so that shaders needing MRT will say "#extension GL_ATI_draw_buffers : require" instead. Since they support the extension, it will work just as well with an Nvidia GPU. While it annoys me that Nvidia had to compensate for ATI's bug, I am glad that their efforts have saved me from having to deal with it.