About Me

Hi. I'm Josh Ols. Lead Graphics Developer for RUST LTD.






OGL -> D3D

For my project, I wanted my renderer to use OpenGL only, so I'd have a mostly unified codebase for all the potential PC platforms (Win, Mac, & Linux). Sadly, I'm finding that things just aren't that simple in practice when it comes to getting the best performance. For Windows 7, in particular, I'm just not going to get the best performance unless I use Direct3D.

So I've decided to develop my codebase like I would for a cross-platform console game: a high-level platform-neutral abstraction over low-level platform-specific code. That means using the APIs that provide the best performance and feature access on each platform. Thus I will use DirectX on all things Windows, and OpenGL/OpenAL/etc. everywhere else (barring console development).
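The abstraction layer can be sketched roughly like this. The interface and class names here are hypothetical placeholders, not my actual code; the point is just that game code only ever sees the neutral interface, while a per-platform factory picks the backend:

```cpp
#include <memory>
#include <string>

// Hypothetical high-level interface; real names and methods will differ.
class RenderDevice {
public:
    virtual ~RenderDevice() {}
    virtual std::string BackendName() const = 0;
    virtual void Clear(float r, float g, float b) = 0;
};

// Platform-specific backends live behind the interface. Stubs shown here;
// a real build compiles exactly one of these per platform.
class D3DDevice : public RenderDevice {
public:
    std::string BackendName() const override { return "Direct3D"; }
    void Clear(float, float, float) override { /* would call d3d->Clear(...) */ }
};

class GLDevice : public RenderDevice {
public:
    std::string BackendName() const override { return "OpenGL"; }
    void Clear(float, float, float) override { /* would call glClear(...) */ }
};

// The game only ever talks to RenderDevice; the factory chooses per platform.
std::unique_ptr<RenderDevice> CreateDevice() {
#ifdef _WIN32
    return std::unique_ptr<RenderDevice>(new D3DDevice());
#else
    return std::unique_ptr<RenderDevice>(new GLDevice());
#endif
}
```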

Aside from platform specifics, I'm only going to use APIs that don't have a licensing fee attached to them.


A new direction?

Almost a month and a half without a post... I thought I should do something to remedy that. Sadly, I've been hard-pressed to think of what to write about next. I've been in a bit of a creative rut, since I'm not quite sure how I want to advance my project. So here's where I am, and where I plan to go...


Current Results:

My deferred lighting renderer has demonstrated the merit of the approach, and could probably do well with further optimization. Even with the restriction of RGBA8 targets and no MRT, it produces acceptable lighting and allows more material variety than a standard deferred renderer.

Sadly, I have found that lots of small lights with Blinn-Phong shading is about the best it can do. The lighting quality is also far from the best, due to limited dynamic range and banding artifacts resulting from the low-precision storage. On top of that, having a few large lights really killed my performance, which can be a pain for effect & fill lights. It was most pronounced when the camera neared an object that filled the screen and was covered by several large lights.

As the quality and limitations began to bug me, I felt it prudent to step back, and seriously evaluate what I want this renderer to do.


New Goals:


In an attempt to have the game design inform the engine technology, I stopped to evaluate what I planned to do with my game. Since I plan to do something loosely like a traditional RPG, I will have mostly static worlds. So that implies mostly self-contained set pieces that are connected by choke points that allow the rest of the level to stream in as you approach said point. It also necessitates having good quality lighting at close views, particularly for scenes with characters.

Originally, I wanted to have a completely dynamic world with destructible objects, and long draw distances where you could see everything (à la Crysis, Bad Company, etc.). Eventually, I realized that these require fancier tech and higher specs, both of which are prices I prefer not to pay, especially for features I don't strictly need. Really, I only wanted them in the first place because I thought they were the future of rendering tech for games, and that I just had to have them.



Material variety is something I have decided I need. Blinn-Phong with rim lights and clever texture usage can cover an impressive range of materials, but it is not one-size-fits-all. I would like to support custom BRDFs and diffuse lighting functions for the common materials that this combination can't cover (hair, skin, brushed metal, extremely rough surfaces, etc.). There needn't be a lot of dynamic lights, so long as the ones that are present emphasize the materials' characteristics.

Lighting quality was also an issue I kept bumping into with my deferred renderers. Banding and saturating at 1.0 due to limited dynamic range and low-precision storage are both unacceptable. So, I want to have full float precision for my light accumulation in order to support linear HDR calculations.

Pre-computed lighting is something I have eschewed up until this point more for political reasons than on actual merit. I liked the sound of "completely dynamic" lighting that deferred shading could provide. However, precomputation can provide higher quality lighting effects at much lower runtime cost. Sure the world becomes static, and the storage requirements go up, but I can live with those for better visual quality.

Low-precision storage and compression will be especially important as a result of that extra storage cost. So I will need to get better quality out of encoding schemes (RGBM, etc.) combined with standard compression formats (S3TC). At runtime, I will have to avoid using anything larger than RGBA8 for my render targets.
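As a reference for what RGBM actually does, here is a minimal CPU-side encode/decode sketch. The shared-multiplier range of 6.0 is a common choice, not a fixed part of the scheme, and real implementations would operate on texture data rather than single colors:

```cpp
#include <algorithm>
#include <cmath>

// RGBM packs an HDR color into four 8-bit channels: RGB is divided by a
// shared multiplier M, and M (relative to a fixed maximum range) is stored
// in alpha. kRGBMRange is an assumed tuning constant.
const float kRGBMRange = 6.0f;

struct RGBA8 { unsigned char r, g, b, a; };

static unsigned char ToByte(float x) {
    return (unsigned char)(std::min(std::max(x, 0.0f), 1.0f) * 255.0f + 0.5f);
}

RGBA8 EncodeRGBM(float r, float g, float b) {
    float maxc = std::max(r, std::max(g, b));
    float m = std::min(maxc / kRGBMRange, 1.0f);
    m = std::ceil(m * 255.0f) / 255.0f;  // quantize M upward so RGB stays <= 1
    float scale = 1.0f / (m * kRGBMRange);
    RGBA8 out = { ToByte(r * scale), ToByte(g * scale),
                  ToByte(b * scale), ToByte(m) };
    return out;
}

void DecodeRGBM(const RGBA8& c, float* r, float* g, float* b) {
    float m = (c.a / 255.0f) * kRGBMRange;
    *r = (c.r / 255.0f) * m;
    *g = (c.g / 255.0f) * m;
    *b = (c.b / 255.0f) * m;
}
```

The round trip is lossy, but for moderate HDR values the error stays small enough to be hidden by the scene content.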

Antialiasing is something I'd like to support, since aliasing is quite noticeable in modern games. Granted, I have often observed that HD resolution with post-processing effects like depth of field and motion blur can make aliasing far less noticeable. However, rendering at sub-HD resolutions brings it out in full force, and it starts to become distracting. To this end, I plan to include support for standard MSAA and Nvidia's CSAA.


New Restrictions:

My desire to support these new features, particularly AA on SM 3.0 hardware, has imposed more than a few limitations on the options I have for my new renderer.

The lighting approach is the first compromise, because MSAA will limit me to RGBA8 targets. Since HDR lighting is a must, I will have to use an MSAA compatible HDR encoding scheme. To my knowledge, this limits me to RGBM and NAO32, with the latter being more appealing for having quality on par with RGBA16F. However, single-pass lighting will be my only option, since encoding my output ensures that hardware blending will not be possible.

Translucency will be difficult as a result of being unable to blend into my encoded framebuffer. So I will need a workaround like compositing all my alpha into a non-encoded buffer, and combining it with the encoded buffer in a shader. Linear alpha-blending will be difficult as well, due to my restriction to RGBA8 render targets. I can only hope that the translucent nature of the object will help cover up any banding artifacts introduced by the low-precision buffer.
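That compositing workaround might look something like the following sketch: translucents accumulate premultiplied into a plain buffer, then a full-screen pass lays that over the decoded opaque HDR color. This is my own illustration of the idea, not settled code, and the decode step is a stand-in for whatever encoding (RGBM, NAO32) ends up in the opaque buffer:

```cpp
#include <cmath>

struct Color { float r, g, b, a; };

// Placeholder for decoding the opaque buffer's HDR encoding scheme.
Color DecodeOpaque(const Color& encoded) {
    return encoded;  // e.g. RGBM or NAO32 decode would go here
}

// Premultiplied "over" composite: the translucent buffer's RGB already
// includes alpha, and its alpha channel holds total coverage, so the
// opaque color is attenuated by the remaining transmittance.
Color CompositeOver(const Color& translucent, const Color& opaqueEncoded) {
    Color opaque = DecodeOpaque(opaqueEncoded);
    Color out;
    out.r = translucent.r + opaque.r * (1.0f - translucent.a);
    out.g = translucent.g + opaque.g * (1.0f - translucent.a);
    out.b = translucent.b + opaque.b * (1.0f - translucent.a);
    out.a = 1.0f;
    return out;
}
```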

Now the hard part, how do I get single pass lighting that can take a fairly arbitrary number of lights without producing absurd numbers of shader combinations?


New Approach:

I'm looking at the dynamic Spherical Harmonic lighting approach that has been employed by a variety of new AAA games (Halo 3, Gears of War 2, God of War 3). In this system, every moving object has its own SH coefficients, which the CPU uses to composite all the lights that affect the object. Then the coefficients are uploaded to the object's shader, which doesn't have to care about the exact number of lights. This drastically cuts down on shader permutations, while allowing very predictable shading performance for an arbitrary number of lights. Plus, I can extract the dominant light direction and color from the basis for use in custom lighting equations (BRDFs, SSS approximations, etc.), allowing for material variety.
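To make the idea concrete, here is a deliberately simplified sketch using only first-order SH (4 coefficients) for a single color channel. The shipped systems use more bands and per-channel coefficients, but the workflow is the same: the CPU sums every light into one small coefficient set, and the shader evaluates it without per-light cost. The dominant-direction extraction from the linear band is the trick mentioned above:

```cpp
#include <cmath>

// First-order (4 coefficient) SH for one color channel, per object.
struct SH4 { float c[4]; };  // c[0] = DC term, c[1..3] = linear (y, z, x)

const float kY0 = 0.282095f;  // standard SH basis constants
const float kY1 = 0.488603f;

// CPU side: accumulate a directional light (unit direction, scalar
// intensity) into the object's coefficient set.
void AddDirectionalLight(SH4* sh, float dx, float dy, float dz, float intensity) {
    sh->c[0] += intensity * kY0;
    sh->c[1] += intensity * kY1 * dy;
    sh->c[2] += intensity * kY1 * dz;
    sh->c[3] += intensity * kY1 * dx;
}

// Shader side (modeled here on the CPU): evaluate the summed lighting
// toward a unit direction n -- no knowledge of how many lights went in.
float EvalSH(const SH4& sh, float nx, float ny, float nz) {
    return sh.c[0] * kY0 + kY1 * (sh.c[1] * ny + sh.c[2] * nz + sh.c[3] * nx);
}

// Extract the dominant light direction from the linear band, for feeding
// one "real" light into custom BRDFs.
void DominantDirection(const SH4& sh, float* dx, float* dy, float* dz) {
    float x = sh.c[3], y = sh.c[1], z = sh.c[2];
    float len = std::sqrt(x * x + y * y + z * z);
    if (len > 0.0f) { x /= len; y /= len; z /= len; }
    *dx = x; *dy = y; *dz = z;
}
```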

Like any other approach, this one comes with more than a few downsides. Due to the low-frequency nature of SH, lighting will not be pixel-perfect. Directional lights will work well enough, but point lights will lose definition, and spot lights will probably look completely incorrect. Also, the trick to extract a directional light limits custom lighting functions to just that light, with the rest being "ambient" lighting. This light is also the only one that can be shadowed, so material definition will be lost in shadowed areas. Finally, this approach may need a beefier CPU, in order to composite all the lights that affect an object each frame.

These limitations are annoying, but survivable. If I can get better lighting quality and more material variety, that will solve most of my problems.


Wrapping up:

So I have decided to segue into this different approach, to maybe get my creative juices flowing again. This will also provide a good excuse for me to rewrite my now cluttered codebase using what I have learned from my hacky hodge-podge of abstractions. Hopefully the net result will be code that is easier to maintain, and a renderer that will do a better job of meeting my needs.


Shaders Fixed!

Okay, I've found a workable solution to my shader situation. This will allow me to avoid switching to CgFX, and still support the GLSL profiles. Now my codebase will run on ATI machines with all features supported! =D

Basically, I read about a function called cgCombinePrograms(), which allows me to combine multiple programs into one. This arrangement lets the GLSL profiles work correctly, since it merges the stages under a single handle, like a GLSL program. Admittedly, I lose the ability to mix-n-match individual vertex, geometry, and pixel shaders in the render function. However, that was never a huge deal for me, so its loss is acceptable for the sake of compatibility.
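The load path ends up looking roughly like this sketch. File names and entry points are placeholders, error checking is omitted, and I'm assuming the combined program is an independent object (so the inputs can be released after combining), per my reading of the Cg runtime docs:

```cpp
#include <Cg/cg.h>
#include <Cg/cgGL.h>

// Sketch of the cgCombinePrograms() arrangement: compile each stage with a
// GLSL profile, then merge them into one program handle before loading.
CGprogram LoadCombinedProgram(CGcontext ctx,
                              const char* vsFile, const char* psFile) {
    CGprogram vs = cgCreateProgramFromFile(ctx, CG_SOURCE, vsFile,
                                           CG_PROFILE_GLSLV, "main_vs", 0);
    CGprogram ps = cgCreateProgramFromFile(ctx, CG_SOURCE, psFile,
                                           CG_PROFILE_GLSLF, "main_ps", 0);

    // Merge the two stages into a single program, which is what lets the
    // GLSL profiles link correctly (one handle, like a GLSL program).
    CGprogram combined = cgCombinePrograms2(vs, ps);

    // The combined program is its own object; the inputs can be released.
    cgDestroyProgram(vs);
    cgDestroyProgram(ps);

    cgGLLoadProgram(combined);
    return combined;
}
```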


Shader Fix?

Okay, here's where I am on the shader front. I have managed to find a potential solution, but encountered some new problems.

After tinkering with the official Nvidia examples, I found that CgFX will compile and run correctly when the profiles are set to glslv and glslf. Now, I am still a bit concerned about the speed issues involved with CgFX, but it might still prove to be a usable solution.

Now I have a problem with ATI's GLSL compiler. Apparently it throws a fit when you say "#extension GL_ARB_draw_buffers : require", even though it claims to support that extension. Supposedly it won't complain if it is "#extension GL_ATI_draw_buffers : require". So I may have to have some ATI-specific compiler options for Cg.
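One possible workaround (my own sketch, not anything Cg provides) would be to patch the generated GLSL source before handing it to the driver, swapping the ARB extension name for the vendor-specific one ATI's compiler accepts:

```cpp
#include <string>

// Replace every GL_ARB_draw_buffers #extension directive with the ATI
// variant when targeting an ATI driver. Both strings are the same length,
// so this is a straight in-place substitution.
std::string FixDrawBuffersExtension(std::string src, bool isATI) {
    const std::string arb = "#extension GL_ARB_draw_buffers : require";
    const std::string ati = "#extension GL_ATI_draw_buffers : require";
    if (!isATI) return src;
    for (std::string::size_type pos = src.find(arb);
         pos != std::string::npos; pos = src.find(arb, pos)) {
        src.replace(pos, arb.size(), ati);
        pos += ati.size();
    }
    return src;
}
```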

I also found out that cgGLGetLatestProfile() returns arbvp1/arbfp1 on ATI GPUs. So I would have had to force it to GLSL anyway.

I would really prefer to continue using Cg, but these problems are kind of a pain. :(



It looks like Nvidia bypassed ATI's bug with Cg 2.2, by making it so that shaders needing MRT will say "#extension GL_ATI_draw_buffers : require" instead. Since they support the extension, it will work just as well with an Nvidia GPU. While it annoys me that Nvidia had to compensate for ATI's bug, I am glad that their efforts have saved me from having to deal with it.


Progress update...Sort of

Okay, that code I promised is going to be a bit delayed. Sadly, I have found that Cg is not going to work out as my shading language. It doesn't seem to work when I switch it over to the glsl profiles, and that's a big problem. After checking the same configuration against the official Cg examples, and finding they had the same behavior, I am afraid that this is beyond the scope of my own code.

If I want my code to be accessible to ATI users, I have to have GLSL compatibility. So it's looking like I will have to remove Cg entirely and start using a GLSL shader pipeline. This is going to require time to re-author shaders and change binding/loading code. ;_;