Additional technical documentation about ImageWorsener
======================================================

This file contains extra information about ImageWorsener. The main
documentation is in readme.txt.

Web site:


Acknowledgments
---------------

Some of the inspiration for this project came from these web pages:

  "Gamma error in picture scaling"
  http://www.4p8.com/eric.brasseur/gamma.html

  "How to make a resampler that doesn't suck"
  http://www.virtualdub.org/blog/pivot/entry.php?id=86

Information about resampling functions and other algorithms was gathered from
many sources, but ImageMagick's page on resizing was particularly helpful:
  http://www.imagemagick.org/Usage/resize/


Alternatives
------------

There are many applications and libraries that do image processing, but in the
free software world, the leader is ImageMagick (http://www.imagemagick.org/).
Or you might prefer ImageMagick's conservative alter-ego, GraphicsMagick
(http://www.graphicsmagick.org/).


Installing / Building from source
---------------------------------

Dependencies (optional):
  libpng
  zlib
  libjpeg
  libwebp

Here are four possible ways to build ImageWorsener:

* Prebuilt Visual Studio 2008 project files

  Open the imagew2008.sln file in Visual Studio 2008 or newer.

  To compile without libwebp: Edit the project settings to not link to
  libwebp.lib, and change the line in src/imagew-config.h to
  "#define IW_SUPPORT_WEBP 0".

* Generic Makefile

  In a Unix-ish environment, try typing "make -C scripts". It should build an
  executable file named "imagew" or "imagew.exe".

  To compile without libwebp: Set the "IW_SUPPORT_WEBP" environment variable
  to "0" (type "IW_SUPPORT_WEBP=0 make").

* Using autotools

  Official source releases contain a file named "configure". In simplest form,
  run
    ./configure
  then
    make

  Many options can be passed to the "configure" utility.
  For help, run
    ./configure --help

  Suggested options:
    CFLAGS="-g -O3" ./configure --disable-shared

  If there is no "configure" file in the distribution you're using, you need
  to generate it by running
    scripts/autogen.sh
  You must have GNU autotools (autoconf, automake, libtool) installed.

  To clean up the mess made by autogen.sh, run
    scripts/autogen.sh clean

* Using CMake (deprecated?)

  CMake is a utility that generates Makefiles and project files. If you don't
  have CMake installed, you can download it from .

  In a Unix-ish environment:
    $ mkdir build
    $ cd build
    $ cmake ..
    $ make

  Using CMake from Windows is not recommended at this time, but you can try it
  if you want. First, set your environment variables correctly by running a
  command prompt via Start -> All Programs -> Microsoft Visual [whatever] ->
  Visual Studio Tools -> Visual Studio Command Prompt. If you can't find such
  a utility, look for a script named VCVARS32.BAT and run that.

  To build from the command line:
    > cd
    > mkdir build
    > cd build
    > cmake ..
    > nmake

  To build from the IDE:
    > cd
    > mkdir build
    > cd build
    > cmake -G "Visual Studio 9 2008" ..
  Now open the imagew.sln file.

  Instead of "Visual Studio 9 2008", you can name any "generator" supported
  by CMake. Consult the CMake documentation. Here are some examples:
    "Visual Studio 7 .NET 2003"
    "Visual Studio 8 2005"
    "Visual Studio 8 2005 Win64"
    "Visual Studio 9 2008"
    "Visual Studio 9 2008 Win64"
    "Visual Studio 10"
    "Visual Studio 10 Win64"
  This is not meant to imply that IW is guaranteed to work with all of the
  compilers listed above.


Philosophy
----------

ImageWorsener attempts to have good defaults. The user should not have to know
anything about gamma correction, bit depths, filters, windowing functions,
etc., in order to get good results.

IW tries to be as accurate as possible. It never trades accuracy for speed.
Really, it goes too far, as nearly everyone would rather have a program that
works twice as fast and is imperceptibly less accurate.
But there are lots of utilities that are optimized for speed, and there would
be no reason for IW to exist if it worked the same as everything else.

I don't intend to add millions of options to IW. It is nearly feature complete
as it is. I want most of the options to have some practical purpose (which may
include the ability to imitate what other applications do). Admittedly, some
fairly useless options exist just for orthogonal completeness, or to scratch
some particular itch I had.

I've taken a lot of care to make sure the resizing algorithms are implemented
correctly. I won't add an algorithm until I'm sure that I understand it. This
isn't so easy. There's a lot of confusing and contradictory information out
there.

IW's command line should not be thought of as a sequence of image processing
commands. Instead, imagine you're describing the properties of a display
device, and IW will try to create the best image for that device. For example,
if you tell IW to dither an image and resize it, it knows that it should
resize the image first, then dither it, instead of doing it in the opposite
order.

IW does not really care about the details of how an image is stored in a file;
it only cares about the essential image itself. For example, a 1-bit image is
treated the same as an 8-bit representation of the same image. If you resize a
bilevel image, you'll automatically get a high-quality grayscale image, not a
low-quality bilevel image.


Architecture
------------

IW has three components: The core library, the auxiliary library, and the
command-line utility.

The core library does the image processing, but does not do any file I/O. It
knows almost nothing about specific file formats. It has access to the
internal data structures defined in imagew-internals.h. It does not make any
direct calls to the auxiliary library.

The auxiliary library consists of the file I/O code that is specific to file
formats like PNG and JPEG.
It does not use the internal data structures from imagew-internals.h.

The public interface is completely defined in the imagew.h file. It includes
declarations for both the core and auxiliary library.

The command-line utility is implemented in imagew-cmd.c. It uses both the core
library and the auxiliary library.

The core and auxiliary libraries are separated in order to break dependencies.
For example, if your application supports only PNG files, you can probably
(given how most linkers work) build it without linking to libjpeg.

Files in core library:
  imagew-internals.h, imagew-main.c, imagew-resize.c, imagew-opt.c,
  imagew-api.c, imagew-util.c

Files in auxiliary library:
  imagew-png.c, imagew-jpeg.c, imagew-webp.c, imagew-gif.c, imagew-miff.c,
  imagew-bmp.c, imagew-tiff.c, imagew-zlib.c

Files in command-line utility:
  imagew-cmd.c, imagew.rc, imagew.ico

Other files:
  imagew.h (Public header file, Core, Aux., Command-line)
  imagew-config.h (Core, Aux., Command-line)


Double-precision floating point?
--------------------------------

IW can be compiled to use any available floating point type for its internal
representation of samples. (Unfortunately, it's impractical to make this a
run-time option.) Its default is currently set to be "double", which is
usually an 8-byte floating-point number. This may seem like overkill, and I
admit, it probably is.

Using double precision shouldn't have much effect on performance, especially
if it's compiled as a 64-bit application. But it will use a lot more memory.

The real reason I haven't switched to single-precision is simply because I
haven't found any particular reason to do so. IW is intended to be used on
reasonably modern PCs, and it does not aim for low memory use or the highest
possible performance, so this might not be much of an issue.

For 8-bit target images, switching to 4-byte floating point affects very
roughly one pixel in every 50,000. For 16-bit target images, it's more like
one pixel in every few hundred.
Not that that means anything -- the image processing algorithms being used
aren't even "correct" to that degree.

4-byte floating-point numbers give you about 7 significant digits, which in
extreme cases may not be quite enough. Particularly for 16-bit target images,
when working in a linear colorspace, bright samples are much, much brighter
than the dimmest samples. If IW has to add a huge number of dim pixels
together with just a few bright pixels, 7 significant digits might not be
enough to do the kind of accurate calculations it strives for.


Unicode
-------

Text files like this one notwithstanding, I've had enough of ASCII, and I want
to support Unicode even in an application like this that does very little with
text. IW supports Unicode filenames, and will try to use Unicode quotation
marks, arrows, etc., if possible.

If IW does not correctly figure out the encoding you want, you can explicitly
set it using the "-encoding" option. In a Unix environment, Unicode output can
also probably be turned off with environment variables, such as by setting
"LANG=C".

The encoding setting does not affect the interpretation of the parameters on
the command line. This should not be a problem in Windows, because Windows can
translate them. But on a Unix system, they are always assumed to be UTF-8.

All strings produced by the library (e.g. error messages) are encoded in
UTF-8. Applications must convert them if necessary.


Rationale for the default resizing algorithm
--------------------------------------------

By default, IW uses a Catmull-Rom ("catrom") filter for both upscaling and
downscaling. Why?

For one thing, I don't want to default to a filter that has any inherent
blurring. A casual user would expect that when you "resize" an image without
changing the size, it will not modify the image at all. This requirement
eliminates mitchell, gaussian, etc.
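To illustrate the "no inherent blurring" requirement, here is a minimal Python
sketch (not IW's actual code). It assumes that "catrom" and "mitchell" are the
usual members of the Mitchell-Netravali cubic family, with (B, C) = (0, 1/2)
and (1/3, 1/3) respectively. The function names and the edge handling are made
up for this example. It performs a 1:1 "resize" of one row of samples with
each filter.

```python
def cubic_bc(x, b, c):
    """Mitchell-Netravali cubic filter family, evaluated at offset x (pixels)."""
    x = abs(x)
    if x < 1.0:
        return ((12 - 9 * b - 6 * c) * x**3
                + (-18 + 12 * b + 6 * c) * x**2
                + (6 - 2 * b)) / 6.0
    if x < 2.0:
        return ((-b - 6 * c) * x**3
                + (6 * b + 30 * c) * x**2
                + (-12 * b - 48 * c) * x
                + (8 * b + 24 * c)) / 6.0
    return 0.0


def catrom(x):
    # Assumption: Catmull-Rom is the B=0, C=1/2 member of the family.
    return cubic_bc(x, 0.0, 0.5)


def mitchell(x):
    # Assumption: "mitchell" is the B=1/3, C=1/3 member of the family.
    return cubic_bc(x, 1 / 3, 1 / 3)


def resample_same_size(samples, kernel):
    """Resample a row at its own pixel centers (a 1:1 "resize").

    Hypothetical helper: edges are handled by repeating the edge sample.
    """
    n = len(samples)
    out = []
    for i in range(n):
        total = 0.0
        weight_sum = 0.0
        for j in range(i - 2, i + 3):
            w = kernel(i - j)
            jj = min(max(j, 0), n - 1)  # repeat the edge sample
            total += w * samples[jj]
            weight_sum += w
        out.append(total / weight_sum)
    return out


row = [0.0, 0.0, 1.0, 0.0, 0.0]
print(resample_same_size(row, catrom))
print(resample_same_size(row, mitchell))
```

With catrom, the weights at integer offsets are 1, 0, 0, so each output sample
is exactly the input sample. With mitchell they are 8/9, 1/18, 0, so an
isolated bright pixel leaks into its neighbors even though the size did not
change.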
The "echoes" produced by filters like lanczos(3) are too weird, I think; and
they can be too severe when using proper gamma correction.

When upscaling, hermite, triangle, and pixel mixing just don't have acceptable
quality. That really only leaves catrom and lanczos2. I somewhat arbitrarily
chose catrom over lanczos2 (they are almost identical).

When downscaling, the differences between various algorithms are much more
subtle. Hermite and pixel mixing are both reasonable candidates, and are nice
in that they have no ringing at all. But they're not quite as sharp as catrom,
and can do badly with images that have thin lines or repetitive details.


Colorspaces
-----------

Unless it has reason to believe otherwise, IW assumes that images use the sRGB
colorspace. This is the colorspace that standard computer monitors use, and
it's a reasonable assumption that most computer image files (whether by
accident or design) are intended to be directly displayable on computer
monitors. It does this even if the file format predates the invention of sRGB,
and/or the file format specification says that, by default, colors have a
gamma of 2.2 (which is similar, but not identical, to sRGB).

IW does not support ICC color profiles. Full or partial support for them may
be added in a future version.


TIFF output support
-------------------

IW mainly sticks to the "baseline" TIFF v6 specification, but it will write
images with a sample depth of 16 bits, which is not part of the baseline spec.

It writes transparent images using unassociated alpha, which is probably less
common in TIFF files than associated alpha, and may not be supported as well
by TIFF viewers.


TIFF colorspaces
----------------

When writing TIFF files, IW uses the TransferFunction TIFF tag to describe the
colorspace that the output image uses. I doubt that many TIFF viewers read
this tag, and actually, I don't even know how to test whether I'm using it
correctly.
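Here is a rough Python sketch of what such a table could contain. This is not
IW's actual code. It assumes the layout described in the TIFF 6.0
specification: one entry per possible sample value, each a 16-bit number
giving the linear intensity scaled to 0..65535. The function names are made up
for this example. It also shows why sRGB and a plain gamma of 2.2 are similar
but not identical.

```python
import math


def srgb_to_linear(v):
    """Convert an sRGB-encoded value in [0, 1] to linear intensity in [0, 1]."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def gamma22_to_linear(v):
    """Convert a gamma-2.2-encoded value in [0, 1] to linear intensity."""
    return v ** 2.2


def transfer_table(to_linear, bits_per_sample=8):
    """Build a TransferFunction-style table (assumed layout, see above).

    Entry i is the linear intensity of sample value i, scaled to 0..65535
    and rounded to the nearest integer.
    """
    n = 1 << bits_per_sample
    return [int(math.floor(to_linear(i / (n - 1)) * 65535 + 0.5))
            for i in range(n)]


srgb_table = transfer_table(srgb_to_linear)
gamma_table = transfer_table(gamma22_to_linear)

# Print a few entries side by side: sample value, sRGB entry, gamma 2.2 entry.
for i in (0, 1, 16, 128, 255):
    print(i, srgb_table[i], gamma_table[i])
```

The two tables agree at both ends but differ in between, for example near
black, where sRGB has a linear segment and a pure power curve does not.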
You can disable the TransferFunction tag by using the "-nocslabel" option.


GIF screen size vs. image size
------------------------------

Every GIF file has a global "screen size", and a sequence of one or more
images. Each image has its own size, and an offset to indicate its position on
the screen.

By default, IW treats the screen size as the final image size, and paints the
GIF image (as selected by the -page option) onto the screen at the appropriate
position. Any area not covered by the image will be made transparent.

If you use the -noincludescreen option, it will instead ignore the screen size
and the image position, and extract just the selected image.


MIFF support
------------

IW can write to ImageMagick's MIFF image format, and can read back the small
subset of MIFF files that it writes. MIFF supports floating point samples, and
this is intended to be used to store intermediate images, in order to perform
multiple operations on an image with no loss of precision.

MIFF support is experimental and incomplete. Some features, such as dithering,
may not be supported with floating point output.

To use ImageMagick to write a MIFF file that IW can read, try:
  $ convert -define quantum:format=floating-point -depth 32 \
     -compress Zip


Nonsquare pixels
----------------

If you use one of the scaling options that doesn't change the aspect ratio, IW
always writes an image with square pixels.

Example: Suppose the input image is a fax with an X density of 204dpi and a Y
density of 96dpi. It will scale the Y dimension by a factor that's 204/96
times larger than the X dimension's scaling factor.


"Color" of transparent pixels
-----------------------------

In image formats that use unassociated alpha values to indicate transparency,
pixels that are fully transparent still have "colors", but those colors are
irrelevant. IW will not attempt to retain such colors, and will make
fully-transparent pixels black in most cases.
An exception is if the output image uses color-keyed transparency, in which
case it uses a different strategy.


Box filter
----------

It's not obvious how a box filter should behave when a source pixel falls
right on the boundary between two target pixels. There seem to be several
options:

 1. "Clone" the source pixel, and put it into both "boxes" (target pixels).
 2. "Split" the source pixel, and put it into both boxes, but with half the
    usual weight.
 3. Arbitrarily select one of the two boxes (which could be the left box, the
    right box, or some other strategy like selecting the box nearest to the
    center of the image).
 4. Ignore the problem, in which case the algorithm may behave unpredictably,
    due to the intricacies of floating point rounding. It may sometimes
    clone, sometimes round, and sometimes skip over a pixel completely.

IW arbitrarily selects the left (or top) box. To make it select the right (or
bottom) box instead, you could translate the image by a very small amount;
e.g. "-translate 0.000001,0.000001". To use the "clone" strategy, use a very
small blur; e.g. "-blur 1.000001".


Nearest neighbor
----------------

When using the nearest neighbor algorithm, if a target pixel is equally close
to two source pixels, it will be given the color of the one to the right (or
bottom). This is the same tiebreaking logic as is used for the box filter. (It
may sound like it's the opposite, but it's not: image features are shifted to
the left in each case.) As with a box filter, you can change this by
translating the image by a very small amount.


PNG sBIT chunks
---------------

If a PNG image contains the rarely-used sBIT chunk, IW will ignore any bits
that the sBIT chunk indicates are not significant.

Suppose you have an 8-bit grayscale image with an sBIT chunk that says 3 bits
are significant. If the app that wrote the file was not defective, there will
probably be only 8 colors in the image.
The image might contain these colors:

  00000000 =   0/255 = 0.00000000
  00100100 =  36/255 = 0.14117647
  01001001 =  73/255 = 0.28627450
  01101101 = 109/255 = 0.42745098
  10010010 = 146/255 = 0.57254901
  10110110 = 182/255 = 0.71372549
  11011011 = 219/255 = 0.85882352
  11111111 = 255/255 = 1.00000000

IW, though, will see only the significant bits, and will interpret the image
like this:

  000 = 0/7 = 0.00000000
  001 = 1/7 = 0.14285714
  010 = 2/7 = 0.28571428
  011 = 3/7 = 0.42857142
  100 = 4/7 = 0.57142857
  101 = 5/7 = 0.71428571
  110 = 6/7 = 0.85714285
  111 = 7/7 = 1.00000000

So, the interpretation is slightly different (e.g. 0.14285714 instead of
0.14117647).


Ordered dithering + transparency
--------------------------------

Ordered (or halftone) dithering with IW can produce poor results when used
with images that have partial transparency. If you ordered-dither both the
colors and the alpha channel, you can have a situation where all the (say)
darker pixels are made transparent, leaving only the lighter pixels visible,
and making the image much lighter than it should be. This happens because the
same dither pattern is used for two purposes (color thresholding and
transparency thresholding).


Obscure details about clamping, backgrounds, and alpha channel resizing
-----------------------------------------------------------------------

"Clamping" is the restricting of sample values to the range that is
displayable on a computer monitor. This must be done when writing to any file
format other than MIFF. But if you use -intclamp, it will also be done at
other times. Essentially, it will be done as often as possible, after every
dimension of every resizing operation. If a background is applied after
resizing, clamping will be done individually to both the alpha channel and the
color channels, then the background will be applied.

If you don't use -intclamp, no clamping will be done, except as the very last
step.
If IW applies a background after resizing the image, the alpha channel will
not be clamped first, so it could actually contain negative opacity values.
That's hard to envision, but the math works out, and you generally get the
same result as if you had applied the background before resizing.

Currently, the only time IW applies a background before resizing is when a
channel offset is being used. This means that using -offset can have
unexpected side effects if you also use -intclamp.


Cropping
--------

IW's -crop option crops the image before resizing it, completely ignoring any
pixels outside the region to crop. This is not quite ideal. Ideally, any pixel
that could have an effect on the pixels at the edge of the image should be
kept around until after the resize, then the crop should be completed. This is
not difficult in theory, but coding it would be messy enough that I haven't
attempted it.


To do
-----

Features I'm considering adding:

- More options for specifying the image size to use; e.g. "enlarge the image
  only if it's smaller than a certain size".

- More options for aligning the input pixels with the output pixels.

- Support for reading BMP files.

- Ability to maintain PNG and GIF background colors.

- Hilbert curve dithering.

- Support for ICC color profiles.


Contributing
------------

I may accept code contributions, if they fit the spirit of the project. I will
probably not accept contributions on which you or someone else claims
copyright. At this stage, I want to retain the ability to change the licensing
terms unilaterally.

Of course, the license allows you to fork your own version of ImageWorsener if
you wish to.