Trig on my mind

I’m still exploring the world of treemaps. The squarified treemaps that I used to make a map Ben Nadel’s commenters and a map of my recent musical taste are all fine and good for flat data. It turns out that they really aren’t all that great for hierarchical data—since the squarification process requires the boxes to be order of descending area, that doesn’t work so well on shallow-structured datasets.

And what kind of data do you think I’m working on visualizing? Hierarchical and shallow, of course.

My next stop was to dive into Voronoi treemaps. The definitive paper on the subject is a gory, math-filled evisceration (PDF) of the topic. It’s not an easy read. The good news is that Voronoi treemaps provide pretty and intuitive visualization of hierarchical data, whether deep or shallow, balanced or extreme.

The bad news is that generating weighted Voronoi tilings is NP-hard—meaning that there’s no one perfect algorithm to always get the right answer quickly. Now, ColdFusion is great, but you generally don’t want to throw something NP-hard at it. It can do it, but maybe it’s not the best solution.

I wrote a little CF code to see how hairy it looked like it would get. Briefly:

  1. Take a best guess of where each of your data points would be the center of their final shape and plot them there.

  2. Build a Voronoi tiling around those dots. (Fortune’s algorithm is tricky, but not horrible.)

  3. Measure the areas of each tile, and see how far off of the actual areas you are.

  4. Adjust your data point placement to be close to the center of the tile and start over again.

  5. Once you get to within a certain tolerance, you can call it close enough and be done with it.

  6. If your data set is hierarchical, then you do it recursively with smaller and smaller areas.

Yeah. That in no way sounds like something I want to tackle in ColdFusion.

When I moved on, I found this 2008 MIT paper on Circular Paritions. While maybe not as organic and pretty as Voronoi treemaps, this scheme is almost as nice:

Of course, that paper is just as hard to read as the Voronoi paper. I’m no slouch in math, but I have to admit defeat when it comes to most graduate-level theory. I’ve re-read it, bit by bit, probably a dozen times now. I think I’ve got it mostly in my head, but I keep having to read it, walk away, and let my subconscious chew on it for a few hours.

Visually, the algorithm seems simple enough:

  1. Start with a convex polygon, such as a square, and the desired area of the first tile. In this example, we’re going to make 4 tiles each with ¼ area.

  2. A squarified layout would place a square in the top left with the desired ¼ area. We’re not going to do that.

  3. Instead, the circular partitioning scheme works by making a single cut to divide (partition) the polygon into two pieces. It works out that you end up with the most visually-pleasing pieces if you make a cut perpendicular to the bisector of the smallest angle of the polygon. That is, you lay the polygon out so that its ratio of width to height is as oblong as it can be, then make a vertical cut where you will get the area that you want off to your left. When cutting a square, just choose any corner to dog-ear.

  4. See how the Superman-shield is rotated? Notice how the top-most and bottom-most points are equidistant from the horizontal center-line of the polygon, and the smallest angle is left-most and vertically-centered? That’s the rotation you’re going for. Notice that the partition is vertical.

  5. Find the new smallest angle, rotate, and cut again.

  6. Repeat until you run out of data points.

Not nearly as bad as that close enough try-and-repeat crap from the Voronoi algorithm, right?

But, really, I kind of lied. If I’m reading the paper correctly, you can’t just use the bisector of the smallest angle. That part about the most oblong orientation is more important for some cuts, but the bisector is more important for others. And how do you figure out what the optimum rotation is? Do you have to do it iteratively, or is there some kind of shortcut formula? That’s the part that I’m still having trouble interpreting from the paper.

But I’ll get there. And damn, will it be cool to code up in ColdFusion.

By Rick Osborne

I am a web geek who has been doing this sort of thing entirely too long. I rant, I muse, I whine. That is, I am not at all atypical for my breed.