Sample Quantization

Once you’ve processed (e.g. resized) an image and determined the ideal sample values for the target image, you need to actually create the target image. The output “device” will usually be an image file, or a bitmap to display on the screen, and will usually support only a finite set of discrete color shades. The color value you calculated might be, say, 0.3692704 on a linear brightness scale from 0.0 to 1.0, and you need to convert it to an integer value from 0 to 255 for a device that uses the sRGB colorspace. How exactly do you go about doing that?

A simple but imperfect method is to convert to the target color space, and then choose the numerically nearest available color shade. The problem is that it can be off by one shade. That’s not much of a problem if the output device allows for 256 shades, but it can be a significant problem if there are only a few shades available.
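For a device with 256 shades evenly spaced in sRGB, that simple method amounts to something like the following sketch (a minimal illustration; the function names are mine, not from any particular library):

    #include <math.h>

    /* Standard sRGB encoding of a linear sample in the range 0.0 to 1.0. */
    static double linear_to_srgb(double v)
    {
        if (v <= 0.0031308)
            return 12.92 * v;
        return 1.055 * pow(v, 1.0 / 2.4) - 0.055;
    }

    /* The simple method: encode the sample as sRGB, then round to the
     * nearest of the 256 integer codes. Assumes the sample is already
     * clamped to [0.0, 1.0]. */
    static int quantize_simple_256(double linear_value)
    {
        return (int)(linear_to_srgb(linear_value) * 255.0 + 0.5);
    }

For the 0.3692704 example, linear_to_srgb gives roughly 0.6416, which rounds to code 164.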

Here’s an example. Suppose we have an image consisting of 10 pixels, 3 black and 7 white. (Assume an sRGB colorspace.)

I’m going to shrink it to just a single pixel, using a pixel-mixing algorithm (or, equivalently, a box filter). If the target image supported lots of colors, the resulting pixel would be a particular shade of light gray.

But let’s assume our target image format only supports 4 gray shades, evenly spaced in the sRGB colorspace, and numbered from 0 to 3. Here are those colors, and their values in both a linear and sRGB colorspace:

Code   Linear      sRGB
0      0.0000000   0.0000000
1      0.0908417   0.3333333
2      0.4019777   0.6666666
3      1.0000000   1.0000000

To resize the image, we first convert to a linear colorspace, which in this case changes nothing because all the sample values are either the minimum or maximum brightness. The average color value of the 10 pixels in our image is 0.7, in a linear colorspace. If we convert linear 0.7 to sRGB, we get 0.8543058. Since that’s not one of the available colors, we’ll have to either round it down to shade #2, or up to shade #3. But which?
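(That 0.8543058 is just the standard sRGB transfer function at work: 1.055 × 0.7^(1/2.4) − 0.055 ≈ 0.8543058.)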

In the target sRGB colorspace, 0.8543058 is 56.29% of the way from code 2 (0.6666666) to code 3 (1.0), so it’s closer to 3. So perhaps we should use shade #3.

But in a linear colorspace, 0.7 is 49.83% of the way from code 2 (0.4019777) to code 3 (1.0), so it’s closer to 2. So perhaps we should use shade #2.

Which is best? Should you choose the color that looks the most similar to the ideal color (shade #3), or the color that really is the most similar (shade #2)?

If the image is all one color, or the pixel is part of a large region of that exact color, then you could make the argument that you should choose the color that looks the most similar. But that’s a special case, which should be reserved for advanced adaptive image processing algorithms. For general purposes, the best option is to use the color that really is most similar. Otherwise, as your eye averages together neighboring pixels, there will be a bias that makes the image too light or too dark. So the best answer is the latter: the nearest color in a linear colorspace.

Now the question becomes, how is your computer program going to figure that out? How does it figure out the next-higher and next-lower available shades, given that they aren’t evenly spaced?

You could make a lookup table of all (in this case 4, but in most cases 256) available colors. But keep in mind that the key number you need to find in the table is a floating point number, so it isn’t going to appear in the table exactly. You’ll have to do a binary search to find which table entries it lies between. If you are not dithering (see below), you can optimize this by using a slightly modified lookup table containing the boundary values at which the nearest color changes. Like this:

Linear                    sRGB        Code
0.0000000 to 0.0454208    0.0000000   0
0.0454208 to 0.2464097    0.3333333   1
0.2464097 to 0.7009888    0.6666666   2
0.7009888 to 1.0000000    1.0000000   3

For example, the 0.0454208 value is (0.0000000+0.0908417)/2.

ImageWorsener uses this method in many cases.
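In case it helps, here’s a minimal sketch of how such a boundary table could be built and searched, for the four-shade case above (my own illustration, not ImageWorsener’s actual code):

    #include <math.h>

    #define NUM_CODES 4

    /* Standard sRGB decoding of a value in the range 0.0 to 1.0. */
    static double srgb_to_linear(double s)
    {
        if (s <= 0.04045)
            return s / 12.92;
        return pow((s + 0.055) / 1.055, 2.4);
    }

    /* boundary[i] is the linear value at which the nearest code changes
     * from i to i+1: the midpoint, in linear space, of codes i and i+1. */
    static double boundary[NUM_CODES - 1];

    static void build_boundary_table(void)
    {
        double lin[NUM_CODES];
        int i;

        for (i = 0; i < NUM_CODES; i++)
            lin[i] = srgb_to_linear((double)i / (NUM_CODES - 1));
        for (i = 0; i < NUM_CODES - 1; i++)
            boundary[i] = (lin[i] + lin[i + 1]) / 2.0;
    }

    /* Binary search for the smallest code whose upper boundary lies
     * above the given linear value. */
    static int nearest_code(double linear_value)
    {
        int lo = 0, hi = NUM_CODES - 1;

        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (linear_value < boundary[mid])
                hi = mid;
            else
                lo = mid + 1;
        }
        return lo;
    }

With the four shades above, the boundaries come out to 0.0454208, 0.2464097, and 0.7009888, and nearest_code(0.7) returns 2.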

If you don’t want to do a binary search, you could make a big linear-to-nearest-available-color table with thousands of entries, round your value to the nearest entry in the table, and accept the fact that it’s not quite perfect.
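A sketch of that idea, reusing build_boundary_table() and nearest_code() from the previous sketch, with an arbitrarily chosen table size of 4096 entries:

    #define TABLE_SIZE 4096

    static unsigned char dense_table[TABLE_SIZE];

    /* Precompute the nearest code for TABLE_SIZE evenly spaced linear values.
     * Assumes build_boundary_table() has already been called. */
    static void build_dense_table(void)
    {
        int i;

        for (i = 0; i < TABLE_SIZE; i++)
            dense_table[i] = (unsigned char)nearest_code((double)i / (TABLE_SIZE - 1));
    }

    /* Round to the nearest table entry. Values very close to a boundary can
     * land on the wrong side. Assumes the sample is in [0.0, 1.0]. */
    static int nearest_code_approx(double linear_value)
    {
        return dense_table[(int)(linear_value * (TABLE_SIZE - 1) + 0.5)];
    }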

Or you could do without a lookup table: convert your value to sRGB, find the next-lower and next-higher available sRGB colors, convert both of those colors back to the linear colorspace, and choose the nearer of them. That can be slow, though. ImageWorsener falls back to this method in situations where it doesn’t support a lookup table.
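That approach might look something like the following sketch (again my own illustration, assuming the available shades are evenly spaced in sRGB, as in the examples above):

    #include <math.h>

    static double linear_to_srgb(double v)
    {
        if (v <= 0.0031308)
            return 12.92 * v;
        return 1.055 * pow(v, 1.0 / 2.4) - 0.055;
    }

    static double srgb_to_linear(double s)
    {
        if (s <= 0.04045)
            return s / 12.92;
        return pow((s + 0.055) / 1.055, 2.4);
    }

    /* num_codes shades, evenly spaced in sRGB and numbered 0 to num_codes-1.
     * Returns the code whose linear value is nearest to linear_value. */
    static int nearest_code_no_table(double linear_value, int num_codes)
    {
        double s = linear_to_srgb(linear_value);
        int lower = (int)(s * (num_codes - 1));   /* next-lower available code */
        int upper = lower + 1;                    /* next-higher available code */
        double lin_lower, lin_upper;

        if (upper > num_codes - 1)
            return num_codes - 1;
        lin_lower = srgb_to_linear((double)lower / (num_codes - 1));
        lin_upper = srgb_to_linear((double)upper / (num_codes - 1));
        return (linear_value - lin_lower <= lin_upper - linear_value) ? lower : upper;
    }

For the example above, nearest_code_no_table(0.7, 4) returns 2.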

No doubt there also exist algorithms that are both fast and accurate, but which are more complex.

The first image in the table below is the example image used above. You may be able to use it to test your image scaling software, though this requires very specific test conditions. Restrict the target image to four color intensity levels, and use a box filter to resize to 1×1. The second image should also be tested, to make sure the software supports color correction at all. (Yes, they should both result in the same resized image.)

Image                            Ideal average color for 2-bit grayscale, sRGB
7of10.png (the example above)    (170,170,170)
4of10.png                        (170,170,170)

Commands for ImageWorsener:
imagew 7of10.png result1.png -filter mix -width 1 -height 1 -cc 4
imagew 4of10.png result2.png -filter mix -width 1 -height 1 -cc 4

Posterization

I wanted to mention the term posterization, because that’s essentially what we’re doing here. I think the term only really applies when the number of colors is small enough that the target image will have visible “banding”, but from a programming standpoint, there’s no difference.

Dithering

Instead of always choosing the nearest available color, you might instead use some type of dithering. Dithering means that you sometimes choose a color that is not the nearest color, in order to make the average color among nearby pixels be closer to the ideal color. A good dithering algorithm might take our “49.83%” calculation above, and tell us to use code 2 about 50.17% of the time, and code 3 about 49.83% of the time.
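(Those percentages come straight from the averaging: if a fraction p of such pixels get code 3 and the rest get code 2, the average linear value is p × 1.0 + (1 − p) × 0.4019777, which equals 0.7 when p ≈ 0.4983.)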

In fact, you should probably think of the “always choose the nearest color” strategy as just a special degenerate form of dithering.

I’m not going to discuss the various dithering algorithms here. Just be aware this is the place in your image processing pipeline to apply one if you’re going to.

Predefined or optimized palettes

The strategies described on this page only work when your color channels can be treated independently of each other. For a color RGB image, this means that if you imagine a three-dimensional space of R, G, and B coordinates, the available colors form the shape of a 3-dimensional rectangular lattice. The available color intensities don’t have to be evenly spaced, but your choice of, say, red value must not affect the available options for the green or blue values.

If you have to use a user-defined color palette, or you need to compute an optimal color palette, then that won’t be the case, and you’ll have to use a more sophisticated color quantization algorithm.

I’m not going to discuss this topic here, in part because it can be very complex, and in part because I don’t know much about it. Search the web for “color quantization” or “palette optimization”.