Comments on: Texture Compression Using The Discrete Cosine Transform

For us the priority is to bring down the internet bandwidth (because we pay for it). We'll re-compress the textures on load to achieve the goals you mention (on-chip memory and bus bandwidth).

By: Simon F (#comment-4893), Thu, 26 May 2011 10:11:05 +0000

We uncompress to an RGBA texture.

By: Simon F (#comment-4802), Tue, 24 May 2011 12:46:54 +0000

Looks interesting, but I'm not convinced:

1. It seems to be much slower to decode than JPEG (http://code.google.com/p/webp/issues/detail?id=53).
2. It seems to have no alpha support (http://muizelaar.blogspot.com/2011/04/webp.html).

By: GoogleFan (#comment-4666), Sat, 21 May 2011 00:56:32 +0000

By: Jonas-Norberg (#comment-4653), Fri, 20 May 2011 16:51:57 +0000

Interesting stuff. About re-compressing to DXT: an interesting goal is to see if you can produce an exactly equivalent output DXT texture to what you would have got if you had just compressed your original texture into DXT. It doesn't necessarily have to be bit-identical, but the colors should all be an exact match, except for maybe 1-bit differences in the intermediate colors of each block (the non-endpoint colors).

Why is this interesting? Well, the main drawback of "real-time" DXT compressors compared to good offline compressors is that they aren't as good at choosing optimal endpoint colors, so the real-time compressor gives slightly reduced quality. But if you can "condition" your inputs to the DCT algorithm so that it produces a set of output colors from which the real-time DXT compressor will choose the two "endpoint" colors to be exactly the same as what the offline compressor would have used, then the result will have the same quality as the offline DXT compressor. In effect you could "condition" your data to make the real-time DXT compressor give better results than it normally would! All you need to do is get the two "endpoint" colors correct in each block, and make sure that none of the other pixels looks like a more appropriate "endpoint" color to the real-time DXT algorithm. It's fine if that distorts the colors of the other pixels in the block a bit: the real-time DXT compressor will replace them with interpolated colors anyway, and as long as a pixel isn't distorted too much, it will still be assigned the correct interpolated color. Anyway, I haven't tried this, but I've been thinking about it for a while. (A sketch of the conditioning step follows below.)
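A sketch of that conditioning step, under stated assumptions: 'end0'/'end1' are whatever endpoints an offline compressor chose, the hypothetical real-time compressor derives its endpoints from the block's extremes, and the snapping shown here is the idealized target that pre-DCT conditioning would aim to survive the DCT round trip. None of this is from the article; all names are made up.

<code>
#include <array>

struct RGB { float r, g, b; };

static float dist2(const RGB& a, const RGB& b) {
    float dr = a.r - b.r, dg = a.g - b.g, db = a.b - b.b;
    return dr * dr + dg * dg + db * db;
}

static RGB lerp(const RGB& a, const RGB& b, float t) {
    return { a.r + (b.r - a.r) * t,
             a.g + (b.g - a.g) * t,
             a.b + (b.b - a.b) * t };
}

void conditionBlock(std::array<RGB, 16>& block,
                    const RGB& end0, const RGB& end1) {
    // The 4-entry DXT1 palette spanned by the desired endpoints.
    const RGB palette[4] = { end0, lerp(end0, end1, 1.0f / 3.0f),
                             lerp(end0, end1, 2.0f / 3.0f), end1 };
    // First pass: remember which pixel sits closest to each endpoint.
    int nearest0 = 0, nearest1 = 0;
    for (int p = 1; p < 16; ++p) {
        if (dist2(block[p], end0) < dist2(block[nearest0], end0)) nearest0 = p;
        if (dist2(block[p], end1) < dist2(block[nearest1], end1)) nearest1 = p;
    }
    // Second pass: snap every pixel to its nearest palette color, so no
    // pixel looks like a better "endpoint" than the desired ones.
    for (RGB& px : block) {
        int best = 0;
        for (int i = 1; i < 4; ++i)
            if (dist2(px, palette[i]) < dist2(px, palette[best])) best = i;
        px = palette[best];
    }
    // Ensure both endpoints actually occur in the block, so an
    // extent-based endpoint search recovers end0/end1 exactly.
    block[nearest0] = end0;
    block[nearest1] = end1;
}
</code>

After this, every pixel lies exactly on the segment between the two desired endpoints, so a compressor that picks endpoints from the block's extremes has nothing better to choose.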

By: Jonas-Norberg (#comment-4613), Fri, 20 May 2011 01:06:35 +0000

To be precise, you're not doing run-length encoding or Huffman encoding. Also, to clarify: the JPEG standard allows Huffman or arithmetic encoding, although not many decoders support arithmetic encoding, due to (now irrelevant) patent issues.

You are using a different colorspace, but from a practical perspective that's largely irrelevant; the core algorithm is identical, so I think Peter's summary of the article is pretty accurate. You're basically using JPEG with an alpha channel treated like a second luma channel, minus entropy encoding, and using YCoCg instead of YCbCr (which, from the encoder's perspective, is also irrelevant). Also, the zig-zag encode is really designed to be used in conjunction with RLE, so you might not be getting the most out of your entropy encode by skipping that step, unless what you're using already does RLE.

Also, it's not clear to me how you conclude that JPEG 2000 still has patent issues; a lot of the sources for that are from several years ago, and as time marches on, patents expire. This is basically the only reason JPEG is safe (note that JPEG was the subject of patent nonsense in the late 90s): it's old. It's also the reason arithmetic encoding is no longer an issue from a patent standpoint; see http://en.wikipedia.org/wiki/Arithmetic_coding#US_patents. Honestly, it's 2011. I'd like to see someone do a recent assessment of the patent issues in JPEG 2000 to conclude whether it's really still a problem.

Well, it's similar to JPEG, but we are using a different color-space. Also, we don't do Huffman encoding.

We do not do anything against the blocking artifacts. And JPEG 2000 is patented and slower (see the van Waveren paper).
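Since the colorspace point comes up repeatedly (YCoCg in place of JPEG's YCbCr), here is the standard YCoCg forward/inverse transform as a reference sketch. This is textbook math, not code from the article.

<code>
struct RGB   { float r, g, b; };
struct YCoCg { float y, co, cg; };

YCoCg toYCoCg(const RGB& c) {
    return {
        0.25f * c.r + 0.5f * c.g + 0.25f * c.b,   // luma
        0.5f  * c.r - 0.5f * c.b,                 // orange chroma
        -0.25f * c.r + 0.5f * c.g - 0.25f * c.b   // green chroma
    };
}

RGB toRGB(const YCoCg& c) {
    float tmp = c.y - c.cg;  // equals (R + B) / 2
    return { tmp + c.co, c.y + c.cg, tmp - c.co };
}
</code>

The inverse reconstructs RGB exactly (up to float rounding), which is part of YCoCg's appeal over YCbCr.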

By: Jonas-Norberg (#comment-4540), Thu, 19 May 2011 10:05:04 +0000

It's a bit of a long explanation to say "We're basically using JPEG, treating the alpha channel as a second luma channel".

Do you do anything against the typical JPEG/MPEG block artifacts? JPEG 2000's main selling point is not having those. Given that you're CPU-decoding anyway, JPEG 2000 might be a better solution because of that.

By: Kevin Gadd (#comment-4528), Thu, 19 May 2011 08:28:54 +0000

By: Jonas-Norberg (#comment-4524), Thu, 19 May 2011 07:06:52 +0000

Great write-up Jonas!

I'm confused about why the compression time is an issue for you. Obviously it's nice to have quick pack times, but a bit more time in compression that saves download and decompression time for your end users seems like a big win to me.

Just FYI, the quantization matrices in the JPEG spec come from the results of psycho-visual studies, as opposed to some mathematical derivation. So feel free to tweak them a bit if you can squeeze more compression out!

Also, the derivation of the DCT approach in JPEG came from the assumption that images are perfectly random Gaussian noise, which is a poor assumption for many textures and most real-world images, as you noted. For that reason it's fun to play with wavelets, like the Daubechies biorthogonal 7/9 pair, for example, which do a great job of picking out and preserving edges, even at high compression rates. Since you are concerned about issues of paging and cache hits, you might try using the smaller Haar wavelet filters instead of DCTs on NxN blocks in your image, if you have time to experiment! (A tiny Haar sketch follows below.)

There was even some amazing work done here (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.56.1332&rep=rep1&type=pdf) on progressive image transmission, where the order of wavelet coefficients is determined by minimizing the distortion of the output image at specific bandwidths in specific regions of interest. I suspect you're not actually trying to use the image until the transmission is complete, though!

Again, great write-up. Thanks for taking the time to share this with the world.
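Picking up the wavelet suggestion: below is a minimal 1-D Haar analysis/synthesis step as a starting point for experiments. The classic Haar pair is only two taps; the Daubechies 7/9 pair mentioned above uses longer kernels but the same split/merge structure. This is a generic sketch, not code from the article.

<code>
#include <cmath>
#include <vector>

// Split an even-length signal into low-pass (smooth) and high-pass
// (detail) halves using the orthonormal Haar filters.
void haarForward(const std::vector<float>& in,
                 std::vector<float>& low, std::vector<float>& high) {
    const float s = 1.0f / std::sqrt(2.0f);
    low.clear(); high.clear();
    for (std::size_t i = 0; i + 1 < in.size(); i += 2) {
        low.push_back((in[i] + in[i + 1]) * s);   // smooth coefficient
        high.push_back((in[i] - in[i + 1]) * s);  // detail coefficient
    }
}

// Exact inverse of haarForward.
std::vector<float> haarInverse(const std::vector<float>& low,
                               const std::vector<float>& high) {
    const float s = 1.0f / std::sqrt(2.0f);
    std::vector<float> out;
    for (std::size_t i = 0; i < low.size(); ++i) {
        out.push_back((low[i] + high[i]) * s);
        out.push_back((low[i] - high[i]) * s);
    }
    return out;
}
</code>

Recursing on the low-pass half gives the usual multi-level decomposition; quantizing the detail coefficients is where the compression would come from.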

By: Cory Bloyd (#comment-4502), Wed, 18 May 2011 23:59:39 +0000

Reading your approach and implementation was pretty interesting, and you've provided a very well explained, presented, accessible article, which is always very welcome :) It's great for someone who doesn't know much about the area, like me, to read all the reasoning in your implementation and the discussion that arises from exposing an article on #altdevblogaday. Thanks!

By: Jonas-Norberg (#comment-4489), Wed, 18 May 2011 21:04:58 +0000

By: Jonas-Norberg (#comment-4488), Wed, 18 May 2011 20:53:09 +0000

1. I did not compare with the two-JPEGs approach. It would work, but it would add complexity and some extra shuffling of data in the decompression step. I do the same quantization as JPEG and the same zig-zag reorder. After that I store all coefficients up to the last non-zero one; this provides the compression (see the sketch after this list). No Huffman or arithmetic coding. As I mention in the post, I rely on our streaming system, which does zlib compression. Since the stored file has a lot of zero coefficients, it compresses fairly well.

2. We need to have this running on the CPU for now. For a general CPU implementation, the memory access patterns of a block-based approach seem more efficient. Using a block-based compression scheme will also let us re-compress each block to the smaller (4x4) DXT blocks more naturally.

3. I have no good data on this yet. We are targeting platforms with many cores, to ensure smooth streaming of data (like textures) in the background.
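For reference, a sketch of the storage scheme described in point 1: quantized coefficients are zig-zag reordered and only the prefix up to the last non-zero coefficient is stored, leaving the rest to zlib. The one-byte count prefix and the signed-byte coefficient type are assumptions, not details from the article (the byte-sized coefficients match the clamping mentioned later in this thread).

<code>
#include <cstdint>
#include <vector>

// Standard JPEG zig-zag scan: position i in the scan reads raster
// index kZigZag[i] of the 8x8 block.
static const int kZigZag[64] = {
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63
};

// block: 64 quantized coefficients in raster order. Returns a count
// byte followed by the zig-zag prefix up to the last non-zero value.
std::vector<int8_t> encodeBlock(const int8_t block[64]) {
    int8_t zz[64];
    for (int i = 0; i < 64; ++i) zz[i] = block[kZigZag[i]];
    int last = 63;
    while (last >= 0 && zz[last] == 0) --last;  // find last non-zero
    std::vector<int8_t> out;
    out.push_back(static_cast<int8_t>(last + 1));  // stored-count prefix
    out.insert(out.end(), zz, zz + last + 1);
    return out;
}
</code>

Because high-frequency coefficients are usually quantized to zero, the prefix is short for smooth blocks, and the remaining runs of small values are exactly what zlib handles well.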

By: Cory Bloyd (#comment-4484), Wed, 18 May 2011 19:29:35 +0000

The compression section of the Lionhead GDC 2011 presentation (slides 30-36) might be of interest. It seems similar / "JPEG-style": they base the method on Rico Malvar's PTC scheme.

By: Compressor (#comment-4479), Wed, 18 May 2011 18:51:29 +0000

So, let's read up on downsampling: http://en.wikipedia.org/wiki/Downsampling

"Filter the signal to ensure that the sampling theorem is satisfied. This filter should, theoretically, be the sinc filter. Let the filtered signal be denoted g(k). Reduce the data by picking out every Mth sample: h(k) = g(Mk). Data rate reduction occurs in this step."

"My" filter may not be a sinc filter, but it's a much better approximation of mathematical downsampling than a 2x2 box filter. And it doesn't shift the image in any direction. However, I may have forgotten to take into account the fact that, depending on whether UVs represent corners or centers of texels, that will cause a shift between the mipmap levels when texturing too, and perhaps the 2x2 filter can counteract this shift. Darn it. But my mipmaps looked pretty good, with no noticeable shift, last I tried this. Anyway, I think gamma-correct filtering makes more of a difference to the perception of sharpness than the specific filter kernel does. (A small filter-then-decimate sketch follows below.)
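A minimal sketch of the quoted "filter, then decimate" recipe for M = 2, using a centered [1 2 1]/4 kernel as a cheap stand-in for the commenter's unspecified filter (the theoretical ideal would be a windowed sinc):

<code>
#include <vector>

// h(k) = filtered(2k): filter and decimate in one pass. The kernel is
// centered on the kept sample, so the kept samples keep their phase.
std::vector<float> downsampleByTwo(const std::vector<float>& g) {
    std::vector<float> h;
    for (std::size_t k = 0; 2 * k < g.size(); ++k) {
        std::size_t i = 2 * k;
        float left  = (i > 0) ? g[i - 1] : g[i];             // clamp edges
        float right = (i + 1 < g.size()) ? g[i + 1] : g[i];
        h.push_back(0.25f * left + 0.5f * g[i] + 0.25f * right);
    }
    return h;
}
</code>

Whether "no shift" holds in practice depends on the texel corner-versus-center convention the comment mentions; the sketch only shows the filter-then-pick-every-Mth-sample structure.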
I really like the ASCII visualization of the down-sampling!

The shift you describe is only toward the center of the pixel and would globally "cancel itself out". I am thinking about a shift to the left.

I can't wait for the visualized down-sampling using the "121" filter kernel you described. I expect either a shift or additional blurring.

By: Jonas-Norberg (#comment-4467), Wed, 18 May 2011 17:08:42 +0000

Judging from their success, it is definitely a task well suited for the GPU. We did not try it, though.

By: Jonas-Norberg (#comment-4462), Wed, 18 May 2011 16:42:50 +0000

In my implementation I "clamp" the values in the matrix so that a result of the quantization always fits in a byte. This is what currently keeps my implementation from reaching higher image quality. (See the sketch below.)
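One possible reading of that clamping, as a sketch (the exact scheme isn't shown in the thread): raise each quantization-matrix entry to a minimum step, so that any quantized coefficient is guaranteed to fit in a signed byte. The coarser forced steps would be the quality limiter described above.

<code>
#include <algorithm>

// For an orthonormal 8x8 DCT of inputs in [-128, 127], coefficient
// magnitudes are bounded by roughly 1024, so a step of 1024/127
// guarantees quantized values land in [-127, 127].
void clampQuantMatrix(float quant[64]) {
    const float minStep = 1024.0f / 127.0f;
    for (int i = 0; i < 64; ++i)
        quant[i] = std::max(quant[i], minStep);
}
</code>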

By: Jonas-Norberg (#comment-4460), Wed, 18 May 2011 16:36:58 +0000

Okay, good points on the sharpening; I have actually not tried that.

But about the filter: in fact, it's the 2x2 box filter that causes a shift in the image. A filter like mine introduces no shift. This can be seen by downsampling the 1-D texture "00008000" a few times with a 2x1 box filter:

<code>
0.0.0.0.8.0.0.0.
.0...0...4...0..
...0.......2.... <<
.......1........
</code>

Note that in the next-to-last level, the "center of gravity" is between the 6th and 7th texel, as seen from the highest mipmap, while it was at the 5th pixel in the first one. A centered filter like mine does not cause this issue. (A numeric check follows below.)
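The drift is easy to check numerically; here is a small sketch that tracks the impulse's center of gravity through repeated 2x1 box downsamples, mirroring the ASCII diagram above:

<code>
#include <cstdio>
#include <vector>

int main() {
    std::vector<float> level = {0, 0, 0, 0, 8, 0, 0, 0};
    for (int l = 0; !level.empty(); ++l) {
        // Centroid in level-0 coordinates: texel k of level l spans
        // level-0 texels [k*2^l, (k+1)*2^l), centered at (k + 0.5)*2^l.
        float num = 0, den = 0;
        for (std::size_t k = 0; k < level.size(); ++k) {
            num += level[k] * (k + 0.5f) * (1 << l);
            den += level[k];
        }
        std::printf("level %d centroid: %.2f\n", l, num / den);
        if (level.size() == 1) break;  // a lone texel is trivially centered
        // 2x1 box filter downsample.
        std::vector<float> next;
        for (std::size_t k = 0; k + 1 < level.size(); k += 2)
            next.push_back(0.5f * (level[k] + level[k + 1]));
        level = next;
    }
    return 0;
}
</code>

It prints centroids of 4.5, 5.0, and 6.0 for the first three levels, matching the rightward drift visible in the diagram.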

By: Jonas-Norberg (#comment-4455), Wed, 18 May 2011 16:25:20 +0000

Hi Hendrik, you make some good points.

First off, the idea of filtering + sharpening is pretty widespread: http://developer.valvesoftware.com/wiki/Photoshop_VTF_Plugin

It's true that you could build the sharpening into the downsample filter, but that would produce a larger filter kernel. It would be interesting to know how you would use the filter kernel you describe. My first guess is that you'd use it to down-sample; if so, it seems like you would introduce a shift of the image.
Good work! You might be surprised at how many products use this: http://www.videolan.org/developers/x264.html It's worth a look.

Did you try to implement this on a GPU? I would expect it to run faster there.

By: Simon F (#comment-4450), Wed, 18 May 2011 15:35:34 +0000

should be a link to the paper

By: MM (#comment-4444), Wed, 18 May 2011 14:50:37 +0000

Have you considered JPEG XR?

About the mipmap generation: as you state, simple box filtering will deliver blurry mipmaps. The correct thing to do, though, is not necessarily to sharpen, but to use a slightly better quality filter. Even a filter as simple as this little fake Gaussian, along with gamma correction, will improve the mipmaps noticeably (see the sketch below):

1 2 1
2 4 2
1 2 1
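A sketch of that suggestion: a 2:1 downsample with the separable [1 2 1] kernel, filtering in linear light and re-applying gamma afterwards. The gamma-2.2 approximation and single-channel layout are simplifying assumptions, not details from the comment.

<code>
#include <algorithm>
#include <cmath>
#include <vector>

static float toLinear(float c) { return std::pow(c, 2.2f); }
static float toGamma(float c)  { return std::pow(c, 1.0f / 2.2f); }

// src is a w x h single-channel image with values in [0, 1]; returns
// the (w/2) x (h/2) next mip level. Edges are clamped.
std::vector<float> downsample121(const std::vector<float>& src, int w, int h) {
    auto at = [&](int x, int y) {
        x = std::max(0, std::min(w - 1, x));
        y = std::max(0, std::min(h - 1, y));
        return toLinear(src[y * w + x]);  // filter in linear light
    };
    static const float k[3] = {1.0f, 2.0f, 1.0f};  // 3x3 kernel, sum 16
    std::vector<float> dst((w / 2) * (h / 2));
    for (int y = 0; y < h / 2; ++y)
        for (int x = 0; x < w / 2; ++x) {
            float sum = 0.0f;
            for (int dy = 0; dy < 3; ++dy)
                for (int dx = 0; dx < 3; ++dx)
                    sum += k[dx] * k[dy] * at(2 * x + dx - 1, 2 * y + dy - 1);
            dst[y * (w / 2) + x] = toGamma(sum / 16.0f);
        }
    return dst;
}
</code>

Note the kernel is centered on an even source texel, which is exactly the corner-versus-center convention debated earlier in the thread.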
