Thursday, May 5, 2011

Images: Scaling JPEGs and PNGs

The Problem

Images -- especially JPEG images -- can be huge.

Last weekend, for example, I went to a Demetri Martin book signing and had my photo taken with the author using my phone. When I tried to upload the photo, I was surprised to see it was 800+ KB, at about 2500 x 2000 pixels.

On the one hand: this is great, right? I had no idea my phone had that level of resolution.

On the other hand: can your Java app effortlessly open up and work with this image? Ours can't. We have a painting application where the canvas size is around 1,000 pixels. The problem is: we have to open the JPEG at its original size and then scale it down:

(2500 pixels wide) x (2000 pixels tall) x (4 bytes*) = 19 MB

... but what we ultimately want is a 1000 x 800 pixel image: about 3 MB. We have to churn through 22 MB of memory to get what we want. And while this image is bigger than I expected, it's not "that big". Our testing team found an image to test with that's 20,000 x 15,000 pixels. That's 1.1 GB of memory uncompressed. Our app can't spare 1.1 GB of RAM to open (or create a thumbnail of) an image.

(This article is a little longer than I'd like, but I promise it ends well. Feel free to skip to the charts below if you're in a hurry.)


*I used 4 bytes because although a JPEG is opaque and only requires 3 bytes per pixel, in our apps we nearly always convert it to a TYPE_INT_RGB image. That is backed by an int-based raster, so whether we use the 4th byte or not: it's allocated.
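As a quick sanity check, you can estimate an image's uncompressed footprint before you commit to decoding it: ImageIO can report the pixel dimensions from the file header without reading any pixel data. Here is a minimal sketch (the class and method names are my own, and the 4-bytes-per-pixel figure assumes the TYPE_INT_RGB conversion described above):

import java.io.File;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

public class ImageFootprint {
  /** Estimate the uncompressed size (in bytes) of an image file, assuming
   *  it ends up in an int-backed raster (4 bytes per pixel). */
  public static long estimateUncompressedBytes(File file) throws Exception {
    ImageInputStream in = ImageIO.createImageInputStream(file);
    try {
      Iterator<ImageReader> readers = ImageIO.getImageReaders(in);
      if (!readers.hasNext())
        throw new IllegalArgumentException("unsupported image: "+file);
      ImageReader reader = readers.next();
      reader.setInput(in);
      // getWidth/getHeight only consult the header; no pixels are decoded here
      long w = reader.getWidth(0);
      long h = reader.getHeight(0);
      reader.dispose();
      return w*h*4L;
    } finally {
      in.close();
    }
  }
}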

The Context

I was still in college when Java 1.4 was released, but even then I remember discovering the ImageIO classes and thinking: why would anyone use anything else? In fact, when I did start work a year or two later, I had to support some legacy Java 1.3 code. Loading java.awt.Image objects instead of java.awt.image.BufferedImage objects seemed ridiculous: why would you want to use an Image when you could use a BufferedImage?

But now I've come to the conclusion that the Image class is hugely underrated. (And I'm surprised to see that, several years later when I really needed it, it still does its job so well.)

The Existing Architecture

The Image architecture is intentionally abstract -- and perhaps intentionally cryptic. There are three important interfaces related to images:
  • ImageObserver: this simply receives updates about an image. Things like the width and height, and whether the image is fully loaded.
  • ImageProducer: this is the big hairy abstract entity that actually creates pixel data. Every Image has an ImageProducer behind it.
  • ImageConsumer: this listens to an ImageProducer and makes sense of the incoming data. We could use this to construct a BufferedImage, for example. This is the object we're interested in for this article (a minimal sketch follows this list).
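To make the ImageConsumer role concrete, here is a bare-bones consumer that simply copies incoming pixels into a BufferedImage. (This is only a sketch of the plumbing -- it buffers the whole image, which is exactly what the rest of this article tries to avoid.)

import java.awt.image.BufferedImage;
import java.awt.image.ColorModel;
import java.awt.image.ImageConsumer;
import java.util.Hashtable;

/** A minimal ImageConsumer that copies incoming pixels into a BufferedImage. */
public class BufferingConsumer implements ImageConsumer {
  private BufferedImage image;

  public BufferedImage getImage() { return image; }

  public void setDimensions(int width, int height) {
    image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
  }

  public void setPixels(int x, int y, int w, int h, ColorModel model,
                        int[] pixels, int off, int scansize) {
    for (int row = 0; row < h; row++) {
      for (int col = 0; col < w; col++) {
        image.setRGB(x+col, y+row, model.getRGB(pixels[off+row*scansize+col]));
      }
    }
  }

  public void setPixels(int x, int y, int w, int h, ColorModel model,
                        byte[] pixels, int off, int scansize) {
    for (int row = 0; row < h; row++) {
      for (int col = 0; col < w; col++) {
        image.setRGB(x+col, y+row, model.getRGB(pixels[off+row*scansize+col] & 0xff));
      }
    }
  }

  public void imageComplete(int status) {}
  public void setColorModel(ColorModel model) {}
  public void setHints(int hintFlags) {}
  public void setProperties(Hashtable<?,?> props) {}
}

An ImageProducer drives it via image.getSource().startProduction(consumer); once imageComplete(...) is called, getImage() holds the finished picture.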


The Solution


    What I want to do now is devise an ImageConsumer that does not buffer an image in memory: instead it pipes the pixel data through efficiently and scales the image in one pass.

    There is a very important red flag here: given the very loose contract between an ImageProducer and an ImageConsumer, the pixels may be delivered in any order. They could be tiles. They could come in non-contiguous rows. You don't know. It so happens that in all my tests the default JPEG decoder does deliver the pixel data in a single pass -- one row at a time. But this is not guaranteed, and it may change at any time. (This is a behind-the-scenes detail that you don't have any control over unless you write your own JPEG reader.) I'll come back to this point later.
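    The closest thing to a warning you get is the hints flag: the producer may announce its delivery order through ImageConsumer.setHints(...). Here is a fragment showing how a streaming consumer might check those flags (they are just hints, and abortAndFallBack() is a hypothetical error-handling hook of my own invention):

    // inside an ImageConsumer implementation:
    public void setHints(int hintFlags) {
      boolean singlePass = (hintFlags & ImageConsumer.SINGLEPASS) != 0;
      boolean completeScanlines = (hintFlags & ImageConsumer.COMPLETESCANLINES) != 0;
      if (!(singlePass && completeScanlines)) {
        // pixels may arrive as tiles, in multiple passes, or in random order;
        // a streaming, fixed-memory consumer can't safely handle that
        abortAndFallBack();  // hypothetical: bail out and use another scaling method
      }
    }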

    To address the vague contract of the ImageConsumer, I wrote the PixelIterator for my own use a while ago: it promises pixel data will be delivered one consecutive row at a time (either top-to-bottom or bottom-to-top). Then I used this class in this article to dynamically scale images: the ScalingIterator filters a large image and scales it on-the-fly to a smaller set of dimensions. The resulting image is antialiased, although the rounding is approximate and doesn't strictly adhere to any particular set of interpolation rules.

    So what is missing in this system is the class that converts a java.awt.Image to a PixelIterator so the ScalingIterator can process it. To fill this niche I wrote the GenericImageSinglePassIterator. The ImageProducer passes data to this object in one thread, and as calls to myIterator.next(...) are made this data is processed and passed on to the caller. The ImageProducer thread is blocked until the caller requests more pixel data: this guarantees data is not kept in memory unnecessarily. (Either thread will time out after 5 seconds of no feedback, though, and report that an error has occurred.)
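    To illustrate the idea (this is not the actual internals of the GenericImageSinglePassIterator), the handoff between the two threads might look something like this sketch, which uses a SynchronousQueue so the producer blocks until the consumer asks for the next row:

    import java.util.concurrent.SynchronousQueue;
    import java.util.concurrent.TimeUnit;

    /** Illustrative only: a blocking, row-at-a-time handoff between the
     *  ImageProducer's thread and the thread calling next(). */
    public class RowHandoff {
      private static final long TIMEOUT_SECONDS = 5;
      private final SynchronousQueue<int[]> queue = new SynchronousQueue<int[]>();

      /** Called on the producer's thread for each incoming row of pixels. */
      public void pushRow(int[] row) throws InterruptedException {
        // blocks until the consumer asks for this row, so rows never pile up in memory
        if (!queue.offer(row, TIMEOUT_SECONDS, TimeUnit.SECONDS))
          throw new RuntimeException("consumer stopped requesting rows");
      }

      /** Called by the consuming thread (e.g. via next()) to pull the next row. */
      public int[] nextRow() throws InterruptedException {
        int[] row = queue.poll(TIMEOUT_SECONDS, TimeUnit.SECONDS);
        if (row == null)
          throw new RuntimeException("producer stopped delivering rows");
        return row;
      }
    }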

    Results


    I wrote a class called ScalingJPEGDemo to demonstrate this class. It uses this image from a recent vacation. This image is 2750 x 2063 pixels, or 21 MB uncompressed.

    I wrote a handful of other scaling methods to compare against. In the ScalingJPEGDemo class I defined this interface:

    static interface ScalingReader {
      public Image create(Dimension originalSize, Dimension maxSize) throws Throwable;
      public String getName();
    }

    (The new code I developed for this project can handle incoming images much larger than 2750 x 2063, but some of the other contenders can't.)

    These are the different readers I compared:

    singlePassReader

    This is the new class I wrote (described above). It is fast and antialiased, but most importantly it leaves a low memory footprint.

    ScalingReader singlePassReader = new ScalingReader() {
    public Image create(Dimension originalSize,Dimension maxSize) throws Throwable {
      URL url = getImageURL();
      return GenericImageSinglePassIterator.createScaledImage( url, maxSize);
    }

    public String getName() {
      return "singlePassReader";
    }
    };

    scalingGraphicsReader

    When I searched online, this is one of the two most commonly recommended approaches to scaling graphics. This relies on the Graphics2D interpolation rules to scale the incoming image. The problem is: if the scaling factor is less than 50%, then the resulting image looks aliased.

    ScalingReader scalingGraphicsReader = new ScalingReader() {
    Component component = new JPanel();
    MediaTracker mediaTracker = new MediaTracker(component);
    public Image create(Dimension originalSize,Dimension maxSize) throws Throwable {
      Dimension newSize = Scaling.scaleDimensionsProportionally(originalSize, maxSize);

      URL url = getImageURL();
      Image image = Toolkit.getDefaultToolkit().createImage(url);
      mediaTracker.addImage(image, 0);
      mediaTracker.waitForAll();

      BufferedImage thumbnail = new BufferedImage(newSize.width, newSize.height, BufferedImage.TYPE_INT_ARGB);
      Graphics2D g = thumbnail.createGraphics();
      g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_BILINEAR);
      g.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
      g.drawImage(image, 0, 0, newSize.width, newSize.height, null);
      g.dispose();

      mediaTracker.removeImage(image);
      image.flush();

      return thumbnail;
    }

    public String getName() {
      return "scalingGraphicsReader";
    }
    };

    transformReader

    This is practically the same as the previous model. The only difference is this uses an AffineTransform. You never know how graphics pipelines process these requests, and over the years I've discovered some bizarre performance issues when you paint something two different ways. However in this case: these two image-scaling methods performed identically.

    ScalingReader transformReader = new ScalingReader() {
    Component component = new JPanel();
    MediaTracker mediaTracker = new MediaTracker(component);
    public Image create(Dimension originalSize,Dimension maxSize) throws Throwable {
      Dimension newSize = Scaling.scaleDimensionsProportionally(originalSize, maxSize);

      URL url = getImageURL();
      Image image = Toolkit.getDefaultToolkit().createImage(url);
      mediaTracker.addImage(image, 0);
      mediaTracker.waitForAll();

      BufferedImage thumbnail = new BufferedImage(newSize.width, newSize.height, BufferedImage.TYPE_INT_ARGB);
      Graphics2D g = thumbnail.createGraphics();
      double scale = ((double)newSize.width)/((double)originalSize.width);
      g.scale(scale, scale);
      g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_BILINEAR);
      g.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
      g.drawImage(image, 0, 0, null);
      g.dispose();

      mediaTracker.removeImage(image);
      image.flush();

      return thumbnail;
    }

    public String getName() {
      return "transformReader";
    }
    };

    scaledInstanceReader

    This is the approach most commonly insulted as "the old-fashioned way" to scale images. But it's still a legitimate approach. In my opinion it also looks better (because it is antialiased) than the two approaches above.

    ScalingReader scaledInstanceReader = new ScalingReader() {
    Component component = new JPanel();
    MediaTracker mediaTracker = new MediaTracker(component);
    public Image create(Dimension originalSize,Dimension maxSize) throws Throwable {
      Dimension newSize = Scaling.scaleDimensionsProportionally(originalSize, maxSize);

      URL url = getImageURL();
      Image image = Toolkit.getDefaultToolkit().createImage(url);
      mediaTracker.addImage(image, 0);
      mediaTracker.waitForID(0);

      Image image2 = image.getScaledInstance(newSize.width, newSize.height, Image.SCALE_SMOOTH);
      mediaTracker.addImage(image2, 1);
      mediaTracker.waitForID(1);

      mediaTracker.removeImage(image);
      mediaTracker.removeImage(image2);
      image.flush();

      return image2;
    }

    public String getName() {
      return "scaledInstanceReader";
    }
    };

    replicateScaleFilter

    This is an interesting class I had never used before. Although the API promises it will be an aliased image, I was interested to see how quickly it performed.

    ScalingReader replicateScaleFilter = new ScalingReader() {
    Component component = new JPanel();
    MediaTracker mediaTracker = new MediaTracker(component);
    public Image create(Dimension originalSize,Dimension maxSize) throws Throwable {
      Dimension newSize = Scaling.scaleDimensionsProportionally(originalSize, maxSize);

      URL url = getImageURL();
      Image image = Toolkit.getDefaultToolkit().createImage(url);
      mediaTracker.addImage(image, 0);
      mediaTracker.waitForAll();

      ImageProducer producer = image.getSource();
      ImageFilter scalingFilter = new ReplicateScaleFilter(newSize.width, newSize.height);
      ImageProducer scaledProducer = new FilteredImageSource(producer, scalingFilter);
      Image image2 = Toolkit.getDefaultToolkit().createImage(scaledProducer);

      mediaTracker.addImage(image2, 1);
      mediaTracker.waitForID(1);

      mediaTracker.removeImage(image);
      mediaTracker.removeImage(image2);

      return image2;
    }

    public String getName() {
      return "replicateScaleFilter";
    }
    };

    areaAveragingScaleFilter

    This is similar to the replicateScaleFilter, except this filter antialiases pixels. It will be more expensive, but the results will be better.

    ScalingReader areaAveragingScaleFilter = new ScalingReader() {
    Component component = new JPanel();
    MediaTracker mediaTracker = new MediaTracker(component);
    public Image create(Dimension originalSize,Dimension maxSize) throws Throwable {
      Dimension newSize = Scaling.scaleDimensionsProportionally(originalSize, maxSize);

      URL url = getImageURL();
      Image image = Toolkit.getDefaultToolkit().createImage(url);
      mediaTracker.addImage(image, 0);
      mediaTracker.waitForAll();

      ImageProducer producer = image.getSource();
      ImageFilter scalingFilter = new AreaAveragingScaleFilter(newSize.width, newSize.height);
      ImageProducer scaledProducer = new FilteredImageSource(producer, scalingFilter);
      Image image2 = Toolkit.getDefaultToolkit().createImage(scaledProducer);

      mediaTracker.addImage(image2, 1);
      mediaTracker.waitForID(1);

      mediaTracker.removeImage(image);
      mediaTracker.removeImage(image2);

      return image2;
    }

    public String getName() {
      return "areaAveragingScaleFilter";
    }
    };

    imageIOReader

    As far as I can tell, the ImageIO package doesn't support scaling, but it does support subsampling. In addition to the lack of antialiasing, this presents a new problem none of the other readers face:

    Subsampling only works in discrete intervals. Suppose you want to subsample a 400 x 400 image with a subsample factor of 2: you'll get a 200 x 200 pixel image. Now suppose you use a subsample factor of 4: you'll get a 100 x 100 pixel image. But now suppose you want an image that is 150 x 150 pixels? You're out of luck: there is no subsample factor that gives you this result. The reader below calculates a subsample factor, but still provides a BufferedImage of the requested size: the remaining pixels are filled in with black.

    The thumbnail to the right is "lucky", because it only introduces 1 row and column of black. But in other cases, or with other pixel dimensions, this extra padding can be several dozen pixels.

    final ImageReader reader = (ImageReader)ioReaderList.get(a);
    ScalingReader imageIOReader = new ScalingReader() {
    public Image create(Dimension originalSize,Dimension maxSize) throws Throwable {
      URL url = getImageURL();
      Dimension newSize = Scaling.scaleDimensionsProportionally(originalSize, maxSize);
      InputStream in = url.openStream();
      try {
        ImageInputStream imageIn = ImageIO.createImageInputStream( in );
        reader.setInput(imageIn);
        int w = reader.getWidth(0);
        int h = reader.getHeight(0);

        int sub = 1;
        if (w>h)
          sub = MathG.ceilInt(w/maxSize.getWidth());
        else
          sub = MathG.ceilInt(h/maxSize.getHeight());

        BufferedImage bi = new BufferedImage(newSize.width, newSize.height,BufferedImage.TYPE_INT_RGB);

        ImageReadParam param = reader.getDefaultReadParam();
        param.setSourceSubsampling(sub,sub,0,0);
        param.setDestination(bi);
        return reader.read(0,param);
      } finally {
        in.close();
      }
    }

    public String getName() {
      if(ioReaderList.size()==1) {
        return "imageIOReader";
      } else {
        String s = reader.getClass().getName();
        if(s.indexOf('.')!=-1) {
        s = s.substring(s.lastIndexOf('.')+1);
        }
        return "imageIOReader<"+s+">";
      }
    }
    };

    I tested these 7 methods with my test image, measuring the time and memory allocation each used. The memory allocation measurement is shaky because I'm simply referring to Runtime.getRuntime().freeMemory(), but it should give us a general idea. (However an unplanned garbage collection can skew the results.) In the tests I slowly incremented the destination image width by 100 pixels (from a 100 pixel wide copy to a 2000 pixel wide copy).
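    For the record, the kind of measurement I'm describing looks roughly like this (a sketch of the approach, not the exact harness in ScalingJPEGDemo):

    /** Sample the heap before and after a task runs. This is only an estimate:
     *  an unplanned garbage collection in the middle will skew the numbers. */
    static long measureAllocation(Runnable task) {
      Runtime rt = Runtime.getRuntime();
      long before = rt.totalMemory() - rt.freeMemory();
      task.run();
      long after = rt.totalMemory() - rt.freeMemory();
      return after - before;
    }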

    I collected results from 6 different environments (including XP, Mac, Windows 7, and Linux). All of those results are available, but for brevity's sake I'll just show the results from Windows 7. (Not because it's remarkably better or worse than the other OSes, but because I assume Windows 7 is incredibly common, and showing all the results would bloat this article.)

    Here is a graph of the time each method required:


    The slowest methods here are areaAveragingScaleFilter and scaledInstanceReader. Based on this chart I'd suggest they're the same function (that is: I believe Image.getScaledInstance() uses an AreaAveragingScaleFilter under the hood). But other than my new class, these are the only two models that stay antialiased when you scale down by more than 50%.

    Next in line is the replicateScaleFilter. This is the slowest of the aliased methods.

    Next is the new class I wrote: it takes around 0.8 seconds. It's by no means the fastest method, but it is the fastest antialiased method.

    The other results are a little garbled. I'd suggest the transformReader and the scalingGraphicsReader are also identical under the hood, based on these numbers. The imageIOReader appears to jump around at specific intervals: I assume this relates to when the subsample factor increases.

    But none of that is terribly important to me: I'm OK with asking the user to tolerate a 2-second delay when they give me a multi-megapixel JPEG. What I'm interested in is the memory allocation:


    Just like before, the transformReader and scalingGraphicsReader overlap so much it's hard to see that there are 2 separate results. And the areaAveragingScaleFilter and scaledInstanceReader results are so close they, too, seem identical.

    There are two clear winners in this chart: imageIOReader and the new singlePassReader.

    The imageIOReader, however, doesn't let you scale to arbitrary dimensions. For example: when we attempt to scale the 2750 pixel wide image to 2000 pixels: we use a subsample factor of 2. So the resulting image is 1375 pixels wide. The remaining 625 pixels (approximately 30% of the width) are black.

    That leaves the singlePassReader as my winner. Was this contest fair? Probably not. I created the tests, and I created a custom-tailored solution to pass the tests. But I don't think my goals (low memory allocation and antialiasing) are asking too much; and I'm sharing this solution because I believe other developers might want the same results.

    I could discuss the results of each OS in detail, but the broad conclusions are the same in each case. Some other charts show strong dips in memory at certain levels: I assume this is unsolicited garbage collection at work. This raises the interesting point that although other methods may churn through several MB of memory: they don't necessarily keep it all in use. So that's promising. I haven't tried to study exactly how much they would require if they had constant garbage collection. (For example: if you were in an environment with a fixed amount of memory and frequent garbage collection, which methods could still scale the incoming image?)

    Conclusion


    These results are great. This lets me efficiently create thumbnails and open images at a maximum resolution without burning through half my app's memory.

    You can download the app that created the output above here. Note it includes a large embedded JPEG to test with. You can download a smaller package that includes these classes here.

    This article focused on JPEGs, but the images all came from the abstract Toolkit.createImage(...), which supports PNGs as well as JPEGs. And PNGs are stored in an orderly row-based model, so they should automatically work with this architecture, too.

    However, it is important to note that this code is not guaranteed to work, because it assumes the image data can be collected in a single pass. Which is to say: I still definitely plan on using this class, but I'll always wrap it in try/catch clauses in case it fails.
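    That fallback might look something like the following sketch. (The createScaledImage(url, maxSize) call is the one used earlier in this article; the fallback path is the getScaledInstance() approach from the scaledInstanceReader, and for brevity the proportional-size calculation is omitted here.)

    import java.awt.Dimension;
    import java.awt.Image;
    import java.awt.MediaTracker;
    import java.awt.Toolkit;
    import java.net.URL;
    import javax.swing.JPanel;

    public static Image createThumbnail(URL url, Dimension maxSize) throws Exception {
      try {
        // fast path: stream the pixels through the single-pass scaler
        return GenericImageSinglePassIterator.createScaledImage(url, maxSize);
      } catch (Exception e) {
        // fallback: the slower (but delivery-order-agnostic) getScaledInstance() approach
        Image image = Toolkit.getDefaultToolkit().createImage(url);
        MediaTracker tracker = new MediaTracker(new JPanel());
        tracker.addImage(image, 0);
        tracker.waitForID(0);
        Image scaled = image.getScaledInstance(maxSize.width, maxSize.height, Image.SCALE_SMOOTH);
        tracker.addImage(scaled, 1);
        tracker.waitForID(1);
        image.flush();
        return scaled;
      }
    }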

    Someday I'd like to also develop the GenericImageRandomAccessIterator. For this I'm envisioning a RandomAccessFile-based model. This will work with arbitrary passes, and it may require even less memory: but it will be noticeably slower because it relies on the file system.