informatique -solutions series: March 2010

Monday, March 29, 2010

Shapes: Bezier Control Points & Data Models

This article explores a GUI that manipulates a shape with bezier shape segments.

The end result is the following applet. To use this applet:

Click the "Create New Path" button.
Click and drag in the large empty JComponent below the controls.
Repeat the previous step until you like your shape.
Double-click, or click "Create New Path" to leave path-creation mode.
Interact with the square and circular handles in the workspace.
Try fiddling with the scale and constraints.

You can download this jar (source included) here.

Data Models

The GeneralPath (or Path2D in Java 1.6) is the simplest way to represent an arbitrary shape in Java.

However for the applet above we need something a lot more specific. We need:

A mechanism to change curve data after it has been added.
A model that emphasizes a shape as a list of nodes, with optional control points before and after each node.

Also the book About Face mentions:

The closer the represented model comes to the user's mental model, the easier he will find the program to use and to understand.

In my case "the user" is actually the developer who wants to use the classes I write. I want to obscure from the developer that a PathIterator is really going to define a cubic segment #N with 3 points:

The second control point for the (n-1)th point.
The first control point for the nth point.
The nth point.

To mimick the GUI presented in the applet I defined the CubicPath. With this class you can iterate through the data in a shape in a couple of ways:

for(int nodeIndex = 0; nodeIndex < myPath.getNodeCount(); nodeIndex++) {
 myPath.getNode(nodeIndex, dest);
 //do something
}

... or:

for(int pathIndex = 0; pathIndex < myPath.getPathCount(); pathIndex++) {
 for(int nodeIndex = 0; nodeIndex < myPath.getNodeCount(); nodeIndex++) {
  myPath.getNode(pathIndex, nodeIndex, dest);
  //do something
 }
}

Similarly there are methods for getNextControlForNode(nodeIndex) and getPrevControlForNode().

(In this case "next" means "defined after" and "prev" means "defined before".)

It is possible for control points to be null. In the absence of control points on either side of a segment: that segment will look like a line.

(Also, deep down the SimplifiedPathIterator is used so if possible a cubic segment will be converted to a line in the PathIterator. But this is exactly the kind of implementation detail I'm trying to bury, so pretend I didn't say anything.)

Scaling Factor

I applied the first draft of the CubicPath to a project at work, and I was surprised that one my boss's first critiques was: the handles are too sensitive. That is: in order to get certain interesting curves you have the drag the handles outside of the JComponent.

At work we've modeled cubic-based shapes before, and a long time ago (in a very different, complicated architecture) we did in fact scale the vector the control points form with the actual end point. I don't think we scaled it by very much, and I was impressed my boss immediately noticed this contrast in the two models.

So to address this problem I added the scalingFactor field in the CubicPath. (See setScaleFactor and getScaleFactor). I'd have to fumble around for the right words to mathematically/programmatically express exactly what it does: but if you play around with this feature in the applet above it should be pretty self explanatory.

It in no way affects the underlying shape; but it does affect the points returned by getNextControlForNode() and getPrevControlForNode(). This lets you adjust the "sensitivity" of the handles.

Continuous Curves

In order to maintain continuous curves: the two control points surrounding a node must be collinear. Their magnitudes can vary, but their angle needs to be the same.

To help in this area I added a couple extra methods (see the "Constraint" feature in the applet to try these out):

setNextControlForNodeFromPrev(int nodeIndex,boolean includeDistance);
setPrevControlForNodeFromNext(int nodeIndex,boolean includeDistance);

So if you just defined the previous control point for a node, you can call setNextControlForNodeFromPrev to make sure the next control point matches the previous (and vice versa).

These methods will always force the angles of the control points to match. The boolean includeDistance controls whether the magnitude is supposed to match, too.

Conclusion

Like all my blog articles: this may be revised in coming days/weeks after I first publish it. But it is my hope that this is a solid interface for designing java.awt.Shapes that will really simplify my life. My goal is to never have to delve into this business again. Hopefully if this meets your needs you can use it and save some time, too.

Sunday, March 21, 2010

Images: Reading JPEG Thumbnails

The problem: I want to create thumbnails for a folder of JPEGs. Many of these JPEGs came from a digital camera, so they're several thousand pixels wide. Loading the entire image and then scaling down to a 64x64 thumbnail sounds pretty wasteful: what else can I do? I'm 99% sure most digital cameras these days are embedding thumbnails inside the JPEG files for just this kind of problem...

Discussion

As far as I can tell: ImageIO won't read JPEG thumbnails unless you have JAI installed?

I googled "java example: ImageIO reading thumbnail" to see what I could find on the subject:

The first link was from Sun, and it said this:

3.3.4 Reading "Thumbnail" Images

Some image formats allow a small preview image (or multiple previews) to be stored alongside the main image. These "thumbnail" images are useful for identifying image files quickly, without the need to decode the entire image.

Applications can determine how many thumbnail images associated with a particular image are available by calling:
reader.getNumThumbnails(imageIndex);
If a thumbnail image is present, it can be retrieved by calling:
int thumbailIndex = 0;
BufferedImage bi;
bi = reader.readThumbnail(imageIndex, thumbnailIndex);

Right. Good. This makes sense. The problem is: it doesn't seem to really work.

The second link was much more insightful. A developer wrote:

I should have remarked that the JAI Image I/O Tools JPEG reader supports via the thumbnail method calls all thumbnails embedded in the JFIF APP0, JFXX APP0, and EXIF APP1 marker segments. Please see this javadoc for more information:

http://download.java.net/media/jai-imageio/javadoc/1.1/overview-summary.html#JPEG

I think that the only thumbnails supported by the Java SE Image I/O JPEG reader via the thumbnail method calls are those in the JFIF and JFXX marker segments. If you are unable to use JAI Image I/O Tools for some reason you could however derive the EXIF thumbnail by parsing the contents of the "unknown" node in the image metadata corresponding to the EXIF APP1 marker segment.

I think he just summed up the problem I was seeing. If I run this code (without JAI installed), I get an exception:

Iterator iterator = ImageIO.getImageReadersBySuffix("jpeg");
while(iterator.hasNext()) {
 ImageReader reader = (ImageReader)iterator.next();
 try {
  reader.setInput( ImageIO.createImageInputStream(jpeg) );
  BufferedImage thumbnail = reader.readThumbnail(0, 0);
 } catch(Exception e) {
  e.printStackTrace();
 }
}

(The exception I get is:

java.lang.IndexOutOfBoundsException: No such thumbnail at com.sun.imageio.plugins.jpeg.JPEGImageReader.readThumbnail(JPEGImageReader.java:1354)

. Which makes sense based on what the second link said.)

Now I have nothing against JAI itself. But when I went to grab a copy just now: the jar was over 5 MB!

This is ridiculous. For widgets I present in this blog I'm not going to try to attach a 5 MB jar. (Is that even legal? I didn't look up the licensing.) Or worse: I'm not going to insist that the user separately download other packages to get my software to run well. It should run well straight out of the box.

I decided to try to extract the thumbnails myself. I did not want to write a fully-functional JPEG decoder (what are the odds I could compete with OS-specific optimized code that already exists?), but I might benefit from my own metadata retrieval.

Research

The first thing I needed was test cases. Lots of and lots of test cases. In addition to all the JPEGs lying around on my computer, I wanted some public domain images I could write unit tests against. I explored Flickr, and found a collection of images with no known copyright. I saved a few of them here.

Next I'd need to roll up my sleeves and study file specifications. My first search result pointed to this specifications PDF, but it is dated to 1992. I did more research and found that this PDF is a better fit.

A JPEG is (until the image data begins) clearly separated into discrete chunks of data. So modeled after the ZipInputStream, I wrote the JPEGMarkerInputStream to read each chunk separately.

According to the specifications: there are two types of markers that are supposed to provide all our thumbnails: APP0 and APP1.

The JAI documentation confirms this:

The [JAI] JPEG reader supports thumbnails. These may be derived from JFIF APP0, JFXX APP0, or EXIF APP1 marker segments.

About the APP0 Block

This marker was introduced in the original 1992 specification. It includes lots of basic metadata (resolution, version info), and can optionally include a thumbnail.

However none of my sample images use this marker to embed a thumbnail. I even branched out and scanned every JPEG on my computer: not one in over 11,000 JPEGs embedded a thumbnail this way. Technically the package I introduce later should support these thumbnails, but since I have zero test cases to work with: I can't be certain I parse them correctly.

At this point I believe it is safe to say: these are (at best) extremely rare in the real world.

About the APP1 Block

This block is the trove of information. This includes designated (but optional) fields for everything you can imagine: width, resolution, aperture value, the encoding software name, copyright info, etc. Also it includes a pointer to a mini-JPEG file (still in the APP1 block) to serve as a thumbnail.

A Surprise Block

... but as I was exploring my test files, I found several other markers. These were often application-specific. I don't have specifications for most of these, and I didn't invest too much time in trying to figure them out. However one started to stand out: the APP13 marker.

This marker is supposedly written by Adobe. One page refers to this as the "Adobe IRB" block, where I assume (from googling) that "IRB" stands for "Image Resource Block". What's interesting about this block is that it also appears to contain a mini-JPEG. In fact: if a test image contained only 1 thumbnail, it was several times more likely to contain a thumbnail in an APP13 block than an APP1 block.

So I invested time in parsing this block, too. (If I ignored this block I'd be missing nearly 1/3 of my possible thumbnails!)

To be fair: no one collection of test cases can represent "the majority" of JPEGs in the world. It might be the case that my collection is extremely skewed, and on your computer not a single JPEG has an APP13 marker. But still: this seemed like a lead worth following.

To be safe, I wanted to check to see if other markers also existed that embedded mini-JPEG thumbnails. I added a unit test that searched all my test files for the number of occurrences of the marker 0xFFD8 (the start-of-image marker): the results indicate that I'm recording all the mini-JPEGs embedded in my test cases. (But if a thumbnail is encoded in another format -- such as a PNG -- then I may still be missing it.)

Results

Finally I put all this together in the JPEGMetaData class. You can download this class (source included) here

I casually added methods to retrieve properties and comments, too, because they were trivial to parse. Unfortunately what isn't easy to parse is: the dimensions of the actual image! (Or maybe someone reading this can cue me in to where to find this information? For now I'll continue to use my ImageSize
class to fetch this information without actually loading the image.)

Monday, March 15, 2010

Shapes: Parametric Equations (and Spirals!)

This article is split up in 2 parts:

Converting a parametric graph into a PathIterator
An applet demoing a Spiral2D class

Introduction

A coworker recently wanted to paint a spiral. He found a programming book that included some sample code to generate spirals, but there were two catches:

For a small company like ours: use of that code required a $500 license fee.

Even then: the spiral he rendered was basically a polyline. (That is: it was rendered as a series of lines, and had no curvature.)

Both of these restrictions made that implementation a bad idea, so I decided to flesh out a better model. What I ended up with is a model that helps convert parametric equations to java.awt.Shapes, and a subclass that caters specifically to spirals.

Parametric Equations and PathIterators

Suppose you have a parametric graph expressed as x(t) and y(t): how do you draw that in Java?

At its most basic level: drawing a shape in Java requires a java.awt.geom.PathIterator.

Preferably we'll then package this iterator in a java.awt.Shape, but with just an iterator you can use the following code to create a fillable/drawable shape:

GeneralPath path = new GeneralPath();
path.append(myPathIterator, false);

A PathIterator breaks a path up into a series of bezier segments: so our first task is to bridge the gap between a parametric expression and bezier curve.

Mathematically this is trivial to compute, but it may require a couple of pages of scrap paper to derive/understand everything.

Converting Parametric Equations into Bezier Curves

The most flexible bezier segment type that Java supports is the cubic segment, so that's the segment type we want to focus on. A cubic bezier segment has 4 control points: two for each end of the segment, and two to define the tangent slopes at each endpoint. Our goal is break our parametric function into a series of cubic bezier segments. That is: we're starting out with the parametric functions x(t) and y(t), and we have to extract from that a series of segments, where each segment is defined by 4 points.

Before we dig in to the math here, we need to make an important distinction: we're starting with a parametric equation that uses a variable t. We're going to convert that into a series of adjacent bezier curves, but a bezier curve is also a parametric equation. So I'm going to refer to those segments as using t_bezier for their t-values. So we have "t" (for the master equation) and "t_bezier" (for the individual segments used to replicate the master equation).

If we want our bezier curve to fit a range of t-values from [ta, tb], then we know that:

x_bezier(0) = x(ta);
dx_bezier(0)/dt = dx(ta)/dt;
x_bezier(1) = x(tb);
dx_bezier(1)/dt = dx(tb)/dt;

This only matches our bezier curve to the endpoints. The closer those endpoints are: the better a match this piecewise solution is going to give us. (If they are infinitesimally close: they will be a perfect match.) This is discussed more below. For now: assume that "ta" and "tb" are reasonably spaced.

Since a cubic bezier curve can be expressed in terms of a cubic polynomial, we also know that:

x_bezier(t_bezier) = ax*(t_bezier^3) + bx*(t_bezier^2) + cx*(t_bezier) + dx;
x_bezier(0) = ax*(0^3) + bx*(0^2) + cx*(0) + dx = dx;
dx_bezier(0) = 3*ax*(0^2) + 2*bx*(0) + cx = cx;
x_bezier(1) = ax*(1^3) + bx*(1^2) + cx*(1) + dx = ax + bx + cx + dx;
dx_bezier(1) = 3*ax*(1^2) + 2*bx*(1) + cx = 3*ax + 2*bx + cx;

So it follows that:

x(ta) = dx;
dx(ta)/dt = cx;
x(tb) = ax + bx + cx + dx;
dx(tb)/dt = 3*ax + 2*bx + cx;

So now we've mapped our source equation to the polynomial coefficients we need. But a cubic bezier curve is traditionally expressed as a series of 4 control points -- not as 4 coefficients. The article "Shapes: an Introduction" explains most of the math needed to make this conversion. Eventually if we solve for the control points in terms of the original parametric equation, then we'll end up with:

end1 = x(ta);
end2 = x(tb);
ctrl1 = (dx(ta)/dt+3*x(ta))/3;
ctrl2 = (3*x(tb)-dx(tb)/dt)/3;

That is all the information we need to define a cubic bezier segment that replicates our original graph from [ta, tb]. (We also have to apply exactly the same logic to the y-equations/coefficients, but the end result is the same system using "x" instead of "y".)

This logic has been built in to the abstract ParametricPathIterator, so there is a PathIterator that breaks a parametric equation up into thousands of smaller bezier segments as needed. The most crucial piece of information we're lacking now is: what are the values for "ta" and "tb"? If they are too widely spaced apart: then the piecewise segments will not match the graph well. If they are too close together: then the shape becomes unnecessarily complex. This spacing is the responsibility of the subclass, because the varying complexity of the parametric graph we're plotting will affect the interval between t values.

Spirals

Now that we have a model to convert parametric graphs to PathIterators: creating a Spiral2D should be easy. First we have to identify the parametric equations involved. Traditionally a spiral is expressed as:

x(t) = centerX+coilGap*t*cos(2*pi*t)
y(t) = centerY+coilGap*t*sin(2*pi*t)

After working with this format, though, I quickly realized I wanted at least two other variables:

angularOffset: to affect the initial angle.

coilOffset: to affect the coil offset.

My revised definition of a spiral looks like this:

x(t) = centerX+coilGap*(t + coilOffset)*cos(2*pi*t + angularOffset)
y(t) = centerY+coilGap*(t + coilOffset)*sin(2*pi*t + angularOffset)

We'll also need to know the derivative. (Of course we could also just numerically approximate the derivative, but let's calculate it correctly since this article is supposed to be instructive...) After refreshing my memory regarding the product rule, I ended up with these equations:

dx/dt = coilGap*cos(2*pi*t+angularOffset)-2*pi*coilGap*(t + coilOffset)*sin(2*pi*t+angularOffset)
dy/dt = coilGap*sin(2*pi*t+angularOffset)+2*pi*coilGap*(t + coilOffset)*cos(2*pi*t+angularOffset)

The Spiral2DPathIterator is the finished product. It extends the ParametricPathIterator and implements the spiral equations above. Each coil (from 0 to 360 degrees) is broken up into 8 bezier segments; this is not a magic number -- it is simply what empirically looked "good". Also (in keeping with Java conventions) there is a separate Spiral2D object that implements the java.awt.Shape interface for your convenience.

Here is an applet demonstrating the final implementation:

This applet (source included) is available here.

p.s. Try clicking in the applet to define the endpoint for the spiral. (Also try using the shift key.)

Friday, March 12, 2010

Shapes: An Introduction

I try to sprinkle in some fun open-source code snippets in most of my articles -- but this article is different. This article is simply an overview of the java.awt.Shape object. Several other articles build on (or assume knowledge of) what I cover here.

What is a `java.awt.Shape`?

When I refer to a "shape" in Java, I'm probably referring to the java.awt.Shape interface. The most interesting method in this class is getPathIterator(AffineTransform) -- the results of all the other methods can be derived from this method.

OK, so what is a `java.awt.geom.PathIterator`?

This is the heart of shape information in Java. This iterator is similar to the human gesture of dragging a pen from one point in space to another. A path is a series of segments, and there are 5 basic types of segments:

SEG_MOVETO: this is the initial position of a path. Each path must begin with this segment. If a MOVETO segment is not followed by any other segments, though: that path is considered empty. You might get a dot if you called Graphics2D.draw(shape), but you won't get anything if you call Graphics2D.fill(shape).
SEG_LINETO: this segment requires an x-coordinate and a y-coordinate. This connects a line from the last path location to the point indicated.
SEG_QUADTO: this segment requires two points: the end point and a control point (more on control points later.)
SEG_CUBICTO: this segment requires three points: two control points and an end point.
SEG_CLOSE: this is an optional segment that connects the previous segment to the SEG_MOVETO that began this shape. Even if you draw 4 sides of a square, you need to add a SEG_CLOSE for some java.awt.Strokes to correctly render the tips of all corners.

A single PathIterator can contain any number of MOVETO's. Each MOVETO is equivalent to lifting your pen off a piece of paper and repositioning it to start a new segment.

Bezier Control Points

Quadratic and cubic segments use control points. These are ingenious points that are used to tug an existing path in certain directions. There's a great wikipedia page that describes this concept in detail (with diagrams and everything). What I want to do is focus on the math. I'll break this up into the quadratic and the cubic cases:

Quadratic Parametric Equations

The javadocs are the place to start when you want to study the math here. They describe a quadratic segment as:

P(t) = B(2,0)*CP + B(2,1)*P1 + B(2,2)*P2
       0 <= t <= 1

B(n,m) = mth coefficient of nth degree Bernstein polynomial
       = C(n,m) * t^(m) * (1 - t)^(n-m)
C(n,m) = Combinations of n things, taken m at a time
       = n! / (m! * (n-m)!)

This sounds a lot more complex than it actually is if you're willing to break it down. We can get rid of C(n,m):

B(n,m) = mth coefficient of nth degree Bernstein polynomial
       = [ ( n! / (m! * (n-m)!) ) * t^(m) * (1 - t)^(n-m) ]

And then expand the first equation:

P(t) = [ ( 2! / (0! * (2-0)!) ) * t^(0) * (1 - t)^(2-0) ]* CP +
       [ ( 2! / (1! * (2-1)!) ) * t^(1) * (1 - t)^(2-1) ] * P1 +
       [ ( 2! / (2! * (2-2)!) ) * t^(2) * (1 - t)^(2-2) ] * P2

Simplify everything:

P(t) = (1 - 2*t + t^2) * CP +
       [ (2*t - 2*t^2) ] * P1 +
       [ t^(2) ] * P2

This is still not really useful to me, though. What I want is an expression in terms of t:

P(t) = (CP-2*P1+P2)*(t^2) +
       (-2*CP+2*P1)*t +
       CP

In my code this usually manifests itself something like this:

double ax = lastX-2*x1+x2;
double bx = -2*lastX+2*x1;
double cx = lastX;

Likewise you can replace "x" with "y" to get the y coefficients. From here you can treat your bezier segment as a good ole parametric equation. You can always spot-check your math (in both the quadratic and cubic case below) by checking the values at t=0 and t=1. At t=0 you get lastX -- which is the right starting point -- and at t=1 you simply add all those terms together -- and the terms cancel to x2.

I hardly ever use quadratic segments, though. If the user needs more than lines, the cubic segment gives you a lot more flexibility...

Cubic Parametric Equations

Again, we'll start by looking at the javadocs for a cubic curve:

P(t) = B(3,0)*CP + B(3,1)*P1 + B(3,2)*P2 + B(3,3)*P3
0 <= t <= 1

B(n,m) = mth coefficient of nth degree Bernstein polynomial
       = C(n,m) * t^(m) * (1 - t)^(n-m)
C(n,m) = Combinations of n things, taken m at a time
       = n! / (m! * (n-m)!)

Expand that:

P(t) = [ 3! / (0! * (3-0)!) * t^(0) * (1 - t)^(3-0) ] * CP +
       [ 3! / (1! * (3-1)!) * t^(1) * (1 - t)^(3-1) ] * P1 +
       [ 3! / (2! * (3-2)!) * t^(2) * (1 - t)^(3-2) ] * P2 +
       [ 3! / (3! * (3-3)!) * t^(3) * (1 - t)^(3-3) ] * P3

Now simplify:

P(t) = [ (1 - t)^3 ] * CP +
       [ 3 * t^(1) * (1 - t)^2 ] * P1 +
       [ 3 * t^(2) * (1 - t)^1 ] * P2 +
       [ t^(3) ] * P3

Expand the exponents:

P(t) = [ (1 - 3*t + 3*t^2 - t^3) ] * CP +
       [ (3*t - 6*t^2 + 3*t^3) ] * P1 +
       [ (3*t^2 - 3*t^3)^1 ] * P2 +
       [ t^(3) ] * P3

But what we really want is the equation in terms of t:
Lastly:

P(t) = [-CP+3*P1-3*P2+P3]*(t^3) + 
       [3*CP-6*P1+3*P2]*(t^2) +
       [-3*CP+3P1]*t +
       CP

So now we have the cubic equation expressing this curve in terms of T. Here is how I'd usually integrate this into my code:

double ax = -lastX+3*x1-3*x2+x3;
double bx = 3*lastX-6*x1+3*x2;
double cx = -3*lastX+3*x1;
double dx = lastX;

... and you can likewise define ay, by, cy and dy.

This concludes the introduction; look up other articles that are prefaced with "Shape" to see possible uses for this.