## Partial Derivative Normal Maps (Article Ver. 2)

**UPDATE:**

In the comments section, *the_best_flash* wrote a really in-depth explanation of the math that makes this technique work. Be sure to check it out!

**UPDATE2:**

I recently discovered a little trick to shave another instruction off PDN recovery and combination in the shader. This brings PDNs down to the same cost as uncompressed tangent-space maps for a single map, and makes them cheaper than uncompressed when combining maps!

Basically, all you do is use the scale bias MADD and a swizzle to assign a 1 to the z component, so that you don't need a MOV instruction later.

Example:

pdn.xy = h4tex2D(normalMap, texcoord.st).ag;

tangentN.xyz = normalize(pdn.xyy * half3(-2.0h, -2.0h, 0.0h) + half3(1.0h, 1.0h, 1.0h));
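For anyone who wants to sanity-check the arithmetic outside of a shader, here is a minimal CPU-side sketch in Python. It is not shader code: `madd_xyy` is a hypothetical helper that only emulates the swizzle and the per-component multiply-add, and the two inputs stand in for the sampled alpha and green channels.

```python
def madd_xyy(sample_x, sample_y):
    """Emulate pdn.xyy * (-2, -2, 0) + (1, 1, 1) for two stored channel values."""
    swizzled = (sample_x, sample_y, sample_y)  # the .xyy swizzle
    scale = (-2.0, -2.0, 0.0)
    bias = (1.0, 1.0, 1.0)
    # One multiply-add per component, as a MADD would do.
    return tuple(s * k + b for s, k, b in zip(swizzled, scale, bias))


print(madd_xyy(0.25, 0.75))
```

The z slot reuses the y sample, but the zero scale wipes it out and the bias leaves exactly 1, which is why no separate MOV is needed.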

**Original:**

Okay, I'm going to try this again, and hopefully not have to make any further revisions to the article afterward.

For my project, I wanted to have sufficiently crisp normal maps so that when the camera neared a surface, it would not lose too much detail. Problem is, the kinds of texture resolutions needed to accomplish this goal would quickly become a problem for storage, both offline and at runtime. Then I came across the idea of detail mapping which, in the form I am researching, involves combining a tiled detail normal map with a surface's base normal map.

This approach, while not ideal, can work surprisingly well if you break down your surfaces by material type, and share the corresponding detail texture between multiple surfaces. Depending on the resolution of the detail map and the tiling frequency, you can produce rich surface details that will look sharp even when the camera comes in for a very close inspection. Even a 2048x2048 texture would eventually start to look blurry, while a detail-mapped surface can potentially go much further.

Now the problem inherent with this tactic is that it makes the shader more complicated. You have more textures to sample, you have to use more code to recover the normals from their compressed state, and you then have to actually combine them. This can become surprisingly expensive in no time at all, since there are usually a bunch of sqrt(), normalize(), and misc simple ops when all is said and done.

So, I looked into the options on how I could reduce the complexity of recovering and combining compressed normal maps. To my surprise, I stumbled across an approach that greatly reduces the instruction counts, while not requiring drastic changes to the textures. Partial Derivative Normal maps, as they are called, were first introduced by Insomniac Games for Ratchet and Clank Future: Tools of Destruction (link, pg 27).

According to their description, PDNs use essentially the same code whether they are a single normal map, or a base map and a detail map. They also work well with the standard DXT5 compression trick, not showing any worse compression artifacts than tangent-space normal maps using the same scheme.

So let's take a look at some comparisons of this technique versus some more standard approaches, and see how it holds up.

**Authoring:**

Fortunately, PDN maps seem to suffer no real quality loss when converted from a standard tangent-space normal map. However, it is best to convert them from the floating point tangent-space data that is generated before it gets stored in an RGB8 texture. Otherwise, you will lose quality by passing low-precision integers through both the input and the output.

Formula

// C implementations need to clamp the result to [0, 1]

pdn.xy = -tangentNormal.xy / tangentNormal.z;
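A rough offline sketch of that conversion, written in Python rather than C. The scale-bias into [0, 1] is an assumption on top of the formula above (it is simply the inverse of the scale-bias used at recovery time), and `encode_pdn` is a made-up name, so treat this as an illustration rather than the exact tool code.

```python
def encode_pdn(nx, ny, nz):
    """Convert a unit tangent-space normal (nz > 0) into two stored values in [0, 1]."""
    # The formula above: negated slopes of the surface.
    dx = -nx / nz
    dy = -ny / nz

    def clamp01(v):
        return min(1.0, max(0.0, v))

    # Assumed storage step: map [-1, 1] to [0, 1], then clamp.
    return clamp01(dx * 0.5 + 0.5), clamp01(dy * 0.5 + 0.5)


print(encode_pdn(0.0, 0.0, 1.0))  # flat normal
print(encode_pdn(0.8, 0.0, 0.6))  # steeper than 45 degrees, so x gets clamped
```

With this mapping a flat normal lands in the middle of the range, and any slope steeper than 45 degrees falls outside [-1, 1] before the scale-bias and is clamped.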

**Figure 1.** *1, base bump map; 2, base tangent-space normal map; 3, base PDN map; 4, detail bump map; 5, detail tangent-space normal map; 6, detail PDN map*

**Recovery:**

As I mentioned earlier, PDNs come into play when you want to use compressed normal maps. For the sake of comparison, I will also examine the "standard" approaches using tangent-space normal maps.

Please note, I will be using 'half' types/functions extensively in my code examples, as this has proven to reduce instruction counts on Nvidia hardware. This is particularly important, since it reduces all normalize() calls from 3 instructions to 1 instruction. For comparison's sake, I have included the instruction counts of code using float types/functions as well.

As far as I know, the compressed tangent-space normal map code doesn't need a normalize(). This is a consequence of the Pythagorean theorem: the reconstructed Z value is exactly the one that makes the normal a unit vector.

Those who read the Insomniac article may notice the negation op missing from my PDN code. I actually found a way to remove that instruction by handling it in the scale-bias operation. So instead of the standard *\* 2.0 - 1.0*, I use *\* -2.0 + 1.0* to get the sample into the correct range, and perform the negation. This trick only shaves off one instruction, but every little bit helps.

So let's take a look at some pseudo-code:

Tangent-space (uncompressed)

tangentN.xyz = h3tex2D(normalMap, texcoord.st).xyz * 2.0 - 1.0;

tangentN.xyz = normalize(tangentN.xyz);

~5 instructions (float)

~3 instructions (half)

Tangent-space (compressed)

tangentN.xy = h4tex2D(normalMap, texcoord.st).ag * 2.0 - 1.0;

tangentN.z = sqrt(1.0 - dot(tangentN.xy, tangentN.xy));

~7 instructions (float)

~7 instructions (half)
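To illustrate why the compressed tangent-space path can skip the normalize(), here is a small Python sketch of the same two steps on the CPU. `recover_xy` is a hypothetical name, and the `max(0.0, ...)` guard is an addition that is not in the shader snippet; it only keeps `math.sqrt` from failing if quantized samples push x² + y² slightly above 1.

```python
import math


def recover_xy(sample_a, sample_g):
    """Rebuild a tangent-space normal from two stored channels in [0, 1]."""
    x = sample_a * 2.0 - 1.0
    y = sample_g * 2.0 - 1.0
    # z is whatever is left of the unit length after x and y.
    z = math.sqrt(max(0.0, 1.0 - (x * x + y * y)))
    return (x, y, z)


n = recover_xy(0.75, 0.5)
length = math.sqrt(sum(c * c for c in n))
print(n, length)
```

Because z is defined as the remainder of the unit length, the vector comes out unit length up to rounding without any extra work.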

Partial Derivative

pdn.xy = h4tex2D(normalMap, texcoord.st).ag * -2.0 + 1.0;

tangentN.xyz = normalize(half3(pdn.xy, 1.0));

~6 instructions (float)

~4 instructions (half)

**Figure 2.** *1, recovered from XY (tangent-space); 2, recovered from PDN (tangent-space); 3, difference between the two*

If you squint, you can see that PDNs aren't perfect, and show some distortion for normals beyond 45 degrees from the Z axis. However, this artifact has proven to be negligible during my observations.
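The 45 degree limit can be reproduced with a small round-trip test in Python. This is only a sketch: the encode step (scale-bias to [0, 1] plus clamp) is an assumed storage convention, and `encode_pdn`, `recover_pdn`, and `angle_from_z` are hypothetical helper names.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def encode_pdn(n):
    """Assumed authoring step: -n.xy / n.z, scale-biased to [0, 1] and clamped."""
    return tuple(min(1.0, max(0.0, (-c / n[2]) * 0.5 + 0.5)) for c in n[:2])


def recover_pdn(stored):
    """Mirror of the shader code: sample * -2 + 1 for xy, z = 1, then normalize."""
    return normalize((stored[0] * -2.0 + 1.0, stored[1] * -2.0 + 1.0, 1.0))


def angle_from_z(n):
    return math.degrees(math.acos(n[2]))


for tilt in (10.0, 30.0, 45.0, 60.0):
    t = math.radians(tilt)
    original = (math.sin(t), 0.0, math.cos(t))
    recovered = recover_pdn(encode_pdn(original))
    print(tilt, round(angle_from_z(recovered), 3))
```

Under these assumptions, tilts up to 45 degrees survive the round trip, while anything steeper is pulled back to 45 degrees because the stored slope saturates.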

**Detail Maps:**

Okay, here's the area that mattered most, and the primary reason I considered this technique in the first place. Now, for comparison, I will be looking at two approaches that can be used for tangent-space normal maps, as well as the one approach for partial derivative normal maps.

The more "correct" approach averages the normals, but produces slightly flattened normals that seem to lose depth when shaded. I've seen some implementations use lerp(), and this would offer more control. However, since I use a value of 0.5, I can save an instruction by just averaging them together.

The other approach, which I believe is used by Unreal Engine 3 games, seems to remedy the flattened look (link, link). This one works by preserving the XY components of both the base and detail maps, while adjusting each one's contribution to the Z. In the simplest case, it seems that they just throw away the detail map's Z entirely.

Originally, when I read the article from Insomniac, I thought that when recovering and combining PDNs, you had to set each one's Z component to 1.0. So when you added them together, it would be 2.0. However, this seemed to be producing some flattened normals as well. After some experimenting, I came to the conclusion that the 1.0 is part of the final step, where you add together all the XY components of the PDNs. So it should be *float3(pdn1.xy + pdn2.xy, 1.0)*.

Pseudo-code:

Tangent-space (uncompressed)[standard]

tangentN1.xyz = h3tex2D(normalMap1, texcoord.st).xyz * 2.0 - 1.0;

tangentN2.xyz = h3tex2D(normalMap2, texcoord.st).xyz * 2.0 - 1.0;

tangentN.xyz = normalize((tangentN1.xyz + tangentN2.xyz) * 0.5);

~8 instructions (float)

~6 instructions (half)

Tangent-space (uncompressed)[UE3]

tangentN.xyz = h3tex2D(normalMap1, texcoord.st).xyz * 2.0 - 1.0;

tangentN.xy += h2tex2D(normalMap2, texcoord.st).xy * 2.0 - 1.0;

tangentN.xyz = normalize(tangentN.xyz);

~9 instructions (float)

~7 instructions (half)

Tangent-space (compressed)[standard]

tangentN1.xy = h4tex2D(normalMap1, texcoord.st).ag * 2.0 - 1.0;

tangentN1.z = sqrt(1.0 - dot(tangentN1.xy, tangentN1.xy));

tangentN2.xy = h4tex2D(normalMap2, texcoord.st).ag * 2.0 - 1.0;

tangentN2.z = sqrt(1.0 - dot(tangentN2.xy, tangentN2.xy));

tangentN.xyz = normalize((tangentN1.xyz + tangentN2.xyz) * 0.5);

~17 instructions (float)

~15 instructions (half)

Tangent-space (compressed)[UE3]

tangentN.xy = h4tex2D(normalMap1, texcoord.st).ag * 2.0 - 1.0;

tangentN.z = sqrt(1.0 - dot(tangentN.xy, tangentN.xy));

tangentN.xy += h4tex2D(normalMap2, texcoord.st).ag * 2.0 - 1.0;

tangentN.xyz = normalize(tangentN.xyz);

~12 instructions (float)

~10 instructions (half)

Partial Derivative

pdn1.xy = h4tex2D(normalMap1, texcoord.st).ag * -2.0 + 1.0;

pdn2.xy = h4tex2D(normalMap2, texcoord.st).ag * -2.0 + 1.0;

tangentN.xyz = normalize(half3(pdn1.xy + pdn2.xy, 1.0));

~9 instructions (float)

~7 instructions (half)
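To make the point about the Z component concrete, here is a small Python comparison of the two ways of combining PDNs discussed above. It is a CPU-side sketch with hypothetical function names; the inputs are slopes that have already been through the scale-bias, not raw texture samples.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def combine_pdn(pdn1, pdn2):
    """Add the XY slopes, then append a single 1.0 as Z."""
    return normalize((pdn1[0] + pdn2[0], pdn1[1] + pdn2[1], 1.0))


def combine_pdn_z2(pdn1, pdn2):
    """The earlier misreading: Z = 1.0 per map, so the summed Z is 2.0."""
    return normalize((pdn1[0] + pdn2[0], pdn1[1] + pdn2[1], 2.0))


base = (0.5, 0.0)
detail = (0.5, 0.0)
for combine in (combine_pdn, combine_pdn_z2):
    n = combine(base, detail)
    print(combine.__name__, round(math.degrees(math.acos(n[2])), 2))
```

The version with a summed Z of 2.0 tilts noticeably less away from the Z axis for the same inputs, which matches the flattened look described earlier.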

**Figure 3.** *1, tangent-space (uncompressed)[standard]; 2, tangent-space (uncompressed)[UE3]; 3, tangent-space (compressed)[standard]; 4, tangent-space (compressed)[UE3]; 5, PDN*

**Figure 4.** *tangent-space compressed-PDN difference*

**Conclusions:**

So I think that just about covers everything. Hopefully you can see how partial derivative normal maps can be used efficiently for both single-mapped and detail-mapped surfaces, with a negligible difference in quality but a considerable difference in performance versus standard compressed tangent-space normal maps.

Considering their benefits, I'm surprised they haven't seen more wide-spread use. Suffice it to say, they have been working well for my project, thus far. I'm curious to see if anyone else will give them a try after reading this post. ;)

My current plan is to release a demo application comparing all these techniques in a simple lighting environment. I'm not sure when this will be available, but I will keep everyone posted. Please try to forgive my almost OCD attempts to revise this article. However, I wanted the most complete set of information that was available.

If you have any comments, critiques, questions, please feel free to speak up! ;)

## Reader Comments (5)

A comment on the 1.0 in the z when combining the derivatives. This comes from the basic normal map formula:

left = height of pixel to the left

right = height of pixel to the right

top = height of the pixel to the top

bottom = height of the pixel below

normal = normalize( left - right, top - bottom, 1.0 );

or more formally:

normal = normalize( dx, dy, 1.0 );

Where dx is the change in height of the height map along the x axis, and dy is the change in the height of the height map along the y axis.

Therefore when the z coordinate of the normal map is 1.0 the x and y coordinates contain the partial derivatives with respect to each axis. Since the idea behind this method is to modify the partial derivatives, the z coordinate must be 1.0 before normalizing.

This also explains your first formula: normal.xy = normal.xy / normal.z converts the normal to the (dx, dy, 1.0) form.

Secondly, this also helps to explain the idea behind this:

You are adding the derivatives together so:

normal = (dx1 + dx2, dy1 + dy2, 1.0);

also from above:

dx1 = left1 - right1

dx2 = left2 - right2

where left1 represents a pixel from height map #1 and left2 represents a pixel from height map #2

so:

dx1 + dx2 = (left1 - right1) + (left2 - right2)

after some simple algebra:

new dx = (left1 + left2) - (right1 + right2)

What is the importance of this? Well the above equation looks very similar to:

dx = left - right

if left = left1 + left2 and right = right1 + right2

Meaning that at a basic level adding together the partial derivatives is the same as generating a new normal map from a height map formed by adding two height maps together.

So, basically this method is the same as adding your two original height maps together and generating a normal map from that. Just with less math.
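The equivalence can be checked numerically with a tiny Python sketch. The two 3x3 height maps are made-up values, and `slopes` is a hypothetical helper that uses the same left - right and top - bottom convention as above.

```python
def slopes(height, x, y):
    """Return (dx, dy) at (x, y) using left - right and top - bottom."""
    dx = height[y][x - 1] - height[y][x + 1]
    dy = height[y - 1][x] - height[y + 1][x]
    return dx, dy


base = [
    [0.0, 0.5, 1.0],
    [0.25, 0.5, 0.75],
    [0.5, 0.25, 0.5],
]
detail = [
    [0.125, 0.0, 0.125],
    [0.0, 0.25, 0.5],
    [0.125, 0.125, 0.125],
]
# Height map formed by adding the two maps texel by texel.
summed = [[b + d for b, d in zip(brow, drow)] for brow, drow in zip(base, detail)]

dx1, dy1 = slopes(base, 1, 1)
dx2, dy2 = slopes(detail, 1, 1)
print((dx1 + dx2, dy1 + dy2))  # derivatives added after the fact
print(slopes(summed, 1, 1))    # derivatives of the summed height map
```

Both lines print the same pair, since the differences are linear in the heights.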

So, we could use 3 or 4 detail partial derivatives just by adding them together

normal = (dx1 + dx2 + dx3 + dx4, ... etc.)

And finally, this last example explains why averaging the two normals generated 'flatter' results.

Since adding the derivatives together is the same as adding the height maps together, then your equation:

normal = ( (dx1 + dx2) * 0.5, (dy1 + dy2) * 0.5, 1.0 );

is the same as multiplying the resulting height map by 0.5 (this could be proven using the equations above), and multiplying a height map by numbers below 1.0 will make the height map 'flatter'. Similarly, multiplying height maps by numbers greater than 1.0 will make them 'bumpier'.

So, you could modify the 'intensity' of normal maps in the following way:

normal = (dx * intensity, dy * intensity, 1.0);

So to conclude, another interesting application of all this would be to allow alpha blending of the detail normal.

For example:

new dx = (1.0 - blend) * dx1 + blend * dx2;

would blend between the two normals if 0 <= blend <= 1.0

Using equations above, this is the same as blending together the two height maps.

Thanks to http://gimp-savvy.com/BOOK/index.html for the blending operation.

I think this might be similar to lerp( dx1, dx2, blend) but I'm not too sure.

The above would only blend by "switching" between the two (think of blending from one image to another. A blend of 0 would be the original normal and a blend of 1 would be the second normal), you could also:

lerp( dx1, dx1 + dx2, blend)

or new dx = (1.0 - blend) * dx1 + blend * (dx1 + dx2)

This would allow you to control how 'intense' the effect of the detail normal would be by using an alpha channel. Similar to the alpha blending of color detail textures. (A blend of 0 would be the original texture, and a blend of 1 would be a combination of the two.) This basically is the same as blending between the original height map and the sum of the two height maps.
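The two lerp() variants can be compared side by side in a few lines of Python. This is just a sketch: `blend_switch` and `blend_additive` are hypothetical names, and `lerp` is written out by hand using the usual a + t * (b - a) definition.

```python
def lerp(a, b, t):
    """Returns a at t = 0 and b at t = 1."""
    return a + t * (b - a)


def blend_switch(dx1, dx2, blend):
    """Cross-fade from the base slope to the detail slope."""
    return lerp(dx1, dx2, blend)


def blend_additive(dx1, dx2, blend):
    """Fade the detail slope in on top of the base slope."""
    return lerp(dx1, dx1 + dx2, blend)


base_dx, detail_dx = 0.5, 0.25
for blend in (0.0, 0.5, 1.0):
    print(blend,
          blend_switch(base_dx, detail_dx, blend),
          blend_additive(base_dx, detail_dx, blend))
```

The additive form simplifies to dx1 + blend * dx2, so the alpha value simply scales the detail slope before it is added.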

I believe "Care and Feeding of Normal Vectors" by Joern Loviscach in ShaderX6 discusses looking at normal vectors as derivatives and various other applications. That is where I first heard about this idea.

Very interesting read. Hope some of this helps to explain why you were getting some of the results.

By the way, I added a mention of your comment in the main article, so that other readers may benefit. Thanks for enriching my blog. ;)

normalize(vector * scalar) = normalize(vector), for any positive scalar
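A quick way to convince yourself of this, sketched in Python with two arbitrary example vectors. It also suggests that the 0.5 in the averaging approach has no effect on the result once the sum is normalized.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


n1 = (0.3, -0.2, 0.9)
n2 = (-0.1, 0.4, 0.8)
summed = tuple(a + b for a, b in zip(n1, n2))
averaged = tuple(0.5 * c for c in summed)

# Largest per-component difference between the two normalized results.
diff = max(abs(a - b) for a, b in zip(normalize(summed), normalize(averaged)))
print(diff < 1e-12)
```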