Lossless Photo Compression

Last week, I had the privilege of giving a lecture on image coding and compression for an IEEE course on Image Processing, organized by the Boston chapter of the Signal Processing Society. My lecture slides (PDF) are available for download.

The attendees were very engaged, and I got lots of great questions. When we got into wavelet-based JPEG2000 compression, there was one question in particular that I couldn’t quite answer at the time: if wavelet transforms are like digital FIR filters, how can they be reversible to provide lossless compression?

It’s a great question. The wavelet transform is based on tree-like cascading filter banks, where the signal is sent through a low-pass and high-pass filter at each stage, as shown here:

Wavelets are like a tree of cascading filter banks

Wavelets are like a tree of cascading filter banks
Source: http://en.wikipedia.org/wiki/Discrete_wavelet_transform

In the diagram above, the filter denoted g[n] represents a low-pass filter, and h[n] a high-pass filter. In lossless image compression, the wavelet transform is carried on for a particular number of stages, and the resulting 2-D signal is compressed using entropy coding (which is lossless). In order for the original image to be recovered, the wavelet transform, and by association the filters g[n] and h[n], must be reversible.

In general, a digital Finite Impulse Response (FIR) filter is not reversible. Even filter banks, which pass a single input signal through a number of FIR filters, are generally not reversible. However, the Cohen-Daubechies-Feauveau (CDF) 5/3 wavelet employed by JPEG2000 in lossless mode has a couple interesting properties of the wavelet transform used in JPEG2000 lossless compression that make the filters reversible.

First, the CDF 5/3 wavelet is designed to be an integer-to-integer map, such that the output can be expressed with finite precision. This is achieved in part by approximating the filter coefficients with rational numbers—in fact, each coeffient is a dyadic fraction (i.e. fraction whose denominator is a power of two). Second, the low-pass and high-pass coefficients are designed such that the two filter outputs can be combined to recover the original input. In other words, the two filters are complementary. Actually, the two sub-band signals are generated using a lifting scheme, as described here.

In short, to call the wavelet filters high-pass and low-pass FIR filters is an over-simplification. The CDF 5/3 wavelet is a carefully designed reversible transform that makes it well-suited to lossless image compression.

(Wavelets are a dense, complex topic. If you’re interested in really understanding them, I would suggest Wavelets and Filter Banks by Strang and Nguyen. There are a host of other books available, but I’ve found their book to be the most readable.)

Image Compression in 60 Minutes

I’ve been invited to give a lecture on Image Coding and Compression next week. The audience is mostly IEEE members—engineering professionals working in various fields looking to bolster their background in image processing. I get one hour to cover this huge topic area. So what are the most important bits to cover?

I like lectures that tell a story, so I’m approaching the topic chronologically. I got into computers as a kid in the 90s, and back then, web graphics were all about GIFs and JPEGs. These formats are based on very different compression techniques, and it was (and still is) important to choose the right one based on your image content. Beyond the web, many engineering applications require lossless photo compression, and JPEG2000 has long been the de facto standard. More recently, new vectorized formats (Scalable Vector Graphics, in particular) have taken off on the web, largely driven by devices with very high display resolution, such as Apple’s retina display.

With that narrative in mind, here’s my outline of the most important 60 minutes in Image Coding and Compression:

  • How color works
  • Simple image coding (e.g. run length encoding)
  • LZW compression (GIF)
  • Compression with the discrete cosine transform (JPEG)
  • Compression with wavelet transforms (JPEG2000)
  • New graphics formats (e.g. SVG)

Of course, there are a lot of interesting topics that didn’t make the cut (most notably, perhaps, predictive coding). What are your thoughts?