Introduction
The presentation “The Technology behind the Unreal Engine 4 Elemental Demo” describes how SSAO is implemented in that demo. The technique can use either the depth buffer alone or the depth buffer together with per-pixel normals. I tried to implement both versions with a slight modification:
Using only the depth buffer
Ambient occlusion is defined as the visibility integral over the hemisphere above a given surface point:
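For reference, one common way to write this integral is the cosine-weighted form below. The symbols are notation assumed here, not taken from the original figure: p is the shading point, n its normal, Ω the hemisphere around n, and V(p, ω) is 1 when the ray from p in direction ω is unoccluded and 0 otherwise.

```latex
A(p) = \frac{1}{\pi} \int_{\Omega} V(p, \omega)\,(n \cdot \omega)\,\mathrm{d}\omega
```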
To approximate this in screen space, we design our sampling pattern as paired samples:
paired sample pattern 
So for each pair of samples, we can approximate how much the shading point is occluded in 2D instead of integrating over the hemisphere:
The AO term for each given pair of samples will be min((θ_left + θ_right)/π, 1). Then by averaging the AO terms of all the sample pairs (in my case, there are 6 pairs), we achieve the following result:
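The per-pair term and the averaging step can be sketched in a few lines of Python. This is only an illustration, not the shader code: the function names are hypothetical, and the angles are taken as inputs in radians under the assumption that they are measured from a common axis, so that a flat surface gives θ_left + θ_right = π and therefore a term of 1 (no occlusion).

```python
import math

def pair_ao(theta_left, theta_right):
    """AO term for one sample pair: min((theta_left + theta_right) / pi, 1)."""
    return min((theta_left + theta_right) / math.pi, 1.0)

def average_ao(pairs):
    """Average the AO terms over all (theta_left, theta_right) pairs."""
    return sum(pair_ao(left, right) for left, right in pairs) / len(pairs)

# Six hypothetical pairs: five see a flat surface, one sees a 90-degree crease.
flat = (math.pi / 2, math.pi / 2)
crease = (math.pi / 4, math.pi / 4)
ao = average_ao([flat] * 5 + [crease])
```

With these made-up inputs, the flat pairs each contribute 1 and the crease contributes 0.5, so the average lands a little below 1.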
Dealing with large depth differences
As the screenshot above shows, there are dark halos around the knight. The knight should not contribute AO to the castle because he is too far away from it. To deal with such large depth differences, I adopt the approach used in Toy Story 3. If one sample of a pair is too far away from the shading point (the red point in the following figure), it is replaced by the pink point, which lies on the same plane as the other, valid sample of the pair:
So we can interpolate between the red point and the pink point to handle the large depth difference. Now the dark halo is gone:
The above treatment only handles the case where one sample of a pair is far away from the shading point. What if both samples have large depth differences?
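A rough sketch of that interpolation follows. It rests on two assumptions that the text does not spell out: the pink point is taken to be the valid sample mirrored through the shading point, and the blend factor is a linear ramp over a hypothetical depth range (`fade_start`, `fade_end`). All names are illustrative.

```python
def lerp(a, b, t):
    """Linear interpolation between two points given as tuples."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))

def mirror_about(point, center):
    """Reflect `point` through `center`."""
    return tuple(2.0 * c - p for p, c in zip(point, center))

def resolve_far_sample(shading, far_sample, valid_sample,
                       depth_diff, fade_start, fade_end):
    """Blend the far (red) sample toward the replacement (pink) point.

    Assumption: the pink point is the valid sample mirrored through the
    shading point, so the pair then describes a flat surface.
    """
    t = (depth_diff - fade_start) / (fade_end - fade_start)
    t = max(0.0, min(1.0, t))  # 0 keeps the red point, 1 uses the pink point
    pink = mirror_about(valid_sample, shading)
    return lerp(far_sample, pink, t)
```

A sample whose depth difference is below `fade_start` is left untouched, and one beyond `fade_end` is fully replaced, which avoids a hard switch between the two cases.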


In this case, the result is the dark halo around the sword in the screenshot above. Remember that we average all the sample pairs to compute the final AO value. To deal with this artifact, we assign a weight to each sample pair and then renormalize the final result. If both samples of a pair are within a small depth difference of the shading point, the pair gets a weight of 1. If only one sample is far away, the pair gets a weight of 0.5. If both samples are far away, the weight is 0. This eliminates most (but not all) of the artifacts:
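The weighting and renormalization can be sketched as below. The function names are hypothetical, and the fallback value returned when every pair has zero weight is an assumption (treating the pixel as unoccluded), since the text does not say what happens in that case.

```python
def pair_weight(left_near, right_near):
    """1 if both samples are near, 0.5 if only one is, 0 if neither is."""
    return 0.5 * (int(left_near) + int(right_near))

def weighted_ao(pairs, fallback=1.0):
    """Weighted average of AO terms, renormalized by the total weight.

    `pairs` holds (ao_term, left_near, right_near) tuples. `fallback` is an
    assumed value used when no pair has a usable sample.
    """
    total = 0.0
    total_weight = 0.0
    for ao_term, left_near, right_near in pairs:
        w = pair_weight(left_near, right_near)
        total += w * ao_term
        total_weight += w
    if total_weight == 0.0:
        return fallback
    return total / total_weight
```

Dividing by the summed weights rather than by the pair count is what keeps a pixel with several rejected pairs from being darkened or brightened by them.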
Approximating arccos function
In this approach, the AO is calculated from the angle between the paired samples, which requires evaluating the arccos function, and that is a bit expensive. We can approximate acos(x) with a linear function: π(1 - x)/2.
And the resulting AO looks much darker with this approximation:


Note that the maximum error between the two functions is around 18.946 degrees.
This may affect the AO in areas of a curved surface with low tessellation. You may need to either increase the bias angle threshold or switch to a more accurate function. So my second attempt approximates it with a quadratic function: π(1 - sign(x) * x * x)/2.
And this approximation gives a result much closer to the one using the arccos function.


And the maximum error of this function is around 9.473 degrees.
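Both error figures can be checked numerically. The sketch below writes the two approximations as π(1 - x)/2 and π(1 - sign(x) * x * x)/2 and scans x over [-1, 1] for the largest deviation from acos(x). The function names and the step count are arbitrary choices for this illustration.

```python
import math

def acos_linear(x):
    """Linear approximation: pi * (1 - x) / 2."""
    return math.pi * (1.0 - x) / 2.0

def acos_quadratic(x):
    """Quadratic approximation: pi * (1 - sign(x) * x * x) / 2."""
    return math.pi * (1.0 - math.copysign(x * x, x)) / 2.0

def max_error_degrees(approx, steps=20000):
    """Largest |acos(x) - approx(x)| over a uniform grid on [-1, 1], in degrees."""
    worst = 0.0
    for i in range(steps + 1):
        x = -1.0 + 2.0 * i / steps
        worst = max(worst, abs(math.acos(x) - approx(x)))
    return math.degrees(worst)

linear_error = max_error_degrees(acos_linear)
quadratic_error = max_error_degrees(acos_quadratic)
```

Both approximations are exact at x = -1, 0 and 1, so the error comes entirely from the curved parts of acos in between.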
Using per-pixel normal
We can enhance the detail of the AO by making use of the per-pixel normal. The per-pixel normal further restricts the angles used to compute the AO: the angles θ_left and θ_right are clamped to the tangent plane:
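One way to picture the clamp, assuming the two angles are measured from the per-pixel normal (so the tangent plane sits at π/2), is the sketch below. The names are hypothetical and the angle convention is an assumption, since the figure that defines it is not reproduced here.

```python
import math

HALF_PI = math.pi / 2

def clamp_to_tangent_plane(theta):
    """Limit an angle measured from the normal to at most 90 degrees."""
    return min(theta, HALF_PI)

def pair_ao_with_normal(theta_left, theta_right):
    """Pair AO term with both angles clamped to the tangent plane."""
    left = clamp_to_tangent_plane(theta_left)
    right = clamp_to_tangent_plane(theta_right)
    return min((left + right) / math.pi, 1.0)

# Hypothetical pair: the left sample lies below the tangent plane (135 degrees)
# and the right sample rises above it (45 degrees).
unclamped = min((3 * math.pi / 4 + math.pi / 4) / math.pi, 1.0)
clamped = pair_ao_with_normal(3 * math.pi / 4, math.pi / 4)
```

Without the clamp this pair sums to π and reads as unoccluded; with it, the sample below the tangent plane can no longer cancel out the occluder on the other side.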
And here is the final result:
Conclusion
The result of this AO is pleasing, taking a total of 12 samples per pixel with 16 rotations in a 4×4 pixel block at half resolution. I did not apply a bilateral blur to the AO result, but applying the blur may give a softer AO look. Also, although approximating the arccos function with a linear function is not accurate, it gives a good enough result for me. Finally, more time needs to be spent on generating the sampling pattern in the future, as the pattern I currently use is nearly uniformly distributed (with some jittering).
References
[1] The Technology behind the Unreal Engine 4 Elemental Demo
[3] Image-Space Horizon-Based Ambient Occlusion
[5] The models are exported from UDK and extracted from Infinity Blade using umodel.exe