Struct vs Array read times (sparse arrays considered harmful!)

A comment by Jeremy French on a post by Ray Camden asserted:

You could code it as a struct, but an array is a faster lookup.

While it’s easy to say that this is intuitively true, just how true is it? I coded up a quick test to see. There’s nothing mindboggling in my code: make an array and a structure with the name number of elements and data, then see how fast each one of them handles random access. Then, make sparse versions of the array and structure and wrap each access in a check to see if the element exists first, with either StructKeyExists or arrayDefinedAt. Each test performed 10000 lookups in a tight loop and cftimer to time them.

Note that the sparse versions of the array and struct are only missing 1% of their elements. This is important for later!

The results, in accesses-per-second (bigger is better):

ColdFusion MX 7 access time comparison for structures and arrays

It’s pretty easy to see that access times for complete arrays beat everything else hands down. It looks like there is a penalty for small arrays, but I would actually chalk that up more to the inaccurate granularity of the timer. Complete structures start to lag quickly, being almost an order of magnitude slower by the time you hit 10,000 elements. I did a further test with 100,000 elements that isn’t shown on this graph, and the trend continued for both complete types.

When looking at sparse structures, you see about what you would expect: the check introduces some amount of overhead, but nothing radical. Because, really, ColdFusion has to check to see if the element exists to do the assignment anyway, so doing it twice is no big loss. And really, the more sparse you make the structure, the faster the test would run.

But look at the times for sparse arrays! Even with only 1% of their elements missing, they are still one thousand times slower than complete arrays. I actually started off the test with only 50% complete containers and the array test took so long and chewed up so much of the processor (with only 100 elements!) that I had to bounce my ColdFusion server.

Why? Well, look at the code for arrayDefinedAt: all it does is wrap an access in a <cftry> block. Anyone who has done serious programming in Java or C++ or any other OO language with exceptions can tell you that try/catch exception handling is wicked slow. This test is just reflecting that axiom. For each exception, the runtime has to do a ton of work, so the more sparse the array is, the slower the test runs!

In Ray’s defense, it’s not his code’s fault — it’s a limitation of ColdFusion. But that’s why Scorpio is getting an arrayIsDefined function. Until then, consider sparse arrays harmful! Just say struct!

By Rick Osborne

I am a web geek who has been doing this sort of thing entirely too long. I rant, I muse, I whine. That is, I am not at all atypical for my breed.