ColdFusion 9 ORM and relationships

Thanks to the ORM discussion over on Brian Kotek’s blog (original, follow-up), I’ve been doing some extensive CF9 ORM debugging and logging.

Count of Related

When fetching a count of related items, as Ray Camden and Jon Hartmann have shown, it turns out that the laziness of the relationship comes into play.

Both Ray and Jon assert that the naive way to get a count is inefficient:

artistCount = arrayLen(album.getArtists());

This is true … most of the time. For a relationship with lazy="false" or lazy="true", this naive code fetches all Artists, not just a count. This example assumes a many-to-many relationship with a link table named album_artists:

SELECT link.album_id, link.artist_id, artists.id
FROM album_artists AS link
    LEFT OUTER JOIN artists
        ON (link.artist_id = artists.id)
WHERE (link.album_id = ?)

But, if the relationship is defined with lazy="extra", the query is optimized into:

SELECT COUNT(artist_id)
FROM album_artists
WHERE (album_id = ?)

Of course, this isn’t magic—the ORM isn’t really noticing that you only want the count. It’s just checking to see how many there are so it can make an array with the correct number of placeholder objects, which won’t be populated until you actually need them.

Still, it’s certainly an argument for adding lazy="extra" where possible. According to the lazy loading documentation, you can use it for one-to-many and many-to-many relationships.
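For reference, here is a minimal sketch of what such a mapping might look like on the Album side. The component, table, and property names are assumptions inferred from the SQL above, not taken from a real project; only the lazy="extra" attribute is the point.

```cfml
// Album.cfc (hypothetical; table and column names inferred from the SQL above)
component persistent="true" table="albums" {
    property name="id" fieldtype="id" generator="native";
    property name="title";

    // lazy="extra" asks the ORM to avoid loading the whole collection
    // when only its size is needed
    property name="artists"
        fieldtype="many-to-many"
        cfc="Artist"
        linktable="album_artists"
        fkcolumn="album_id"
        inversejoincolumn="artist_id"
        lazy="extra";
}
```

With a mapping along these lines, arrayLen(album.getArtists()) should produce the COUNT query rather than the join.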

Inverse Relationships

Here I must apologize to several people, including Brian Kotek and Rupesh Kumar. In my work with ORM, I’d been playing around but hadn’t built anything really big with it. In doing that, I had formed some opinions that worked in my little test documents, but were really just edge cases and bad tests. I spouted some assertions that just didn’t hold up. Sorry, guys.

Given my example case above, with artists and albums having a many-to-many relationship, I had thought that if you did not specify inverse="true" for the relationship, but instead defined the relationship fully in each direction, that the ORM would automatically mirror any assignments to either side of the relationship.

This assertion was based on the fact that it sometimes appears to work, depending on how your code is set up:

laundry = entityLoad("Album", { Title = "Laundry Service" }, true);
cover = entityNew("CoverImage");
cover.setFileName("laundry.jpg");
cover.setAlbum(laundry);
entitySave(cover);
writeDump(laundry.getCoverImages());

This code works, but only in a quantum-flux, Schrödinger’s-cat sort of way. If your objects are set up with extra lazy relationships, and you have a fresh ORM session, and none of the objects have been loaded before … it looks like the single setAlbum statement causes not only the cover object to see the laundry object, but the other way around at the same time. That is, it appears to have done this, too:

laundry.addCover(cover);

But, it’s a damnable lie. It only looks that way because of the lazy loading.

When we first load the laundry object, the extra lazy loading tells the ORM to not bother looking at the cover images until someone wants them. The writeDump statement enumerates the cover images when it is inspecting the array returned from getCoverImages, so that is the first time the ORM has to load the covers—after we’ve saved the new cover image. We’d see a completely different result if we had done anything with the laundry object’s cover images before we saved the cover entity.

Long story short, as goofy as it may seem, you really do have to use inverse="true" and double-set your relationships:

cover.setAlbum(laundry);
laundry.addCover(cover);

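With inverse="true" on one side, one of those function calls is reduced to a no-op as far as the database is concerned: Hibernate ignores the inverse side when it writes the relationship, though the in-memory objects still need both calls.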

That double-setting seems dumb to me and just begs for someone to screw it up. It seems to me like the ORM should be able to notice that the relationship is two-way and should magically mirror the assignment for you. That is, if you called cover.setAlbum it should implicitly call album.addCover for you, and vice-versa. Having to resort to goofy hacks to avoid the performance implications of double-setting when you are explicitly double-setting seems … well, hackish.
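One way to approximate that mirroring by hand is to override the generated setter so that it also updates the other side. This is only a sketch under assumptions: the CoverImage mapping shown here is hypothetical, and it assumes the Album's coverImages collection is declared with singularname="cover", so that the generated methods are addCover and hasCover.

```cfml
// CoverImage.cfc (hypothetical sketch, not the post's actual mapping)
component persistent="true" {
    property name="id" fieldtype="id" generator="native";
    property name="fileName";
    property name="album" fieldtype="many-to-one" cfc="Album" fkcolumn="album_id";

    // Overrides the generated setter so a single call updates both sides
    public void function setAlbum(required any album) {
        variables.album = arguments.album;
        // hasCover()/addCover() are assumed to be the generated collection
        // methods on Album (singularname="cover"); the check guards against
        // adding the same cover twice
        if (!arguments.album.hasCover(this)) {
            arguments.album.addCover(this);
        }
    }
}
```

Note that the hasCover check may itself cause the ORM to touch the collection, so this trades the "forgot to set the other side" bug for a possible extra query. If Album.addCover is also overridden to call setAlbum, the guard is what keeps the two methods from calling each other forever.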

Logging ORM Queries

Rupesh has a great post on how to log CF9 ORM SQL, but there’s one slight gotcha that I’ve run into: it doesn’t mix perfectly well with <cflog> and writeLog. There’s obviously some sort of race-condition buffering thing going on.

I had thought it would be rather clever to use writeLog statements to the same hibernatesql.log file that the ORM writes its query debugging information to. This worked … for a while. Eventually, only the writeLog statements showed up, as if the ORM could no longer write to the file.

Also, when the ORM and writeLog were both writing to the file, one would occasionally get ahead of the other. If I had a writeLog statement followed by an entityLoad statement, the SQL might come before the log message, or vice-versa. They would both have the same timestamp (as the resolution is only a whole second), but clearly you can’t entirely trust the exact order of the log lines when doing something like this.

Long story short, you probably shouldn’t use the CF logging tags and functions to write to the same log file the ORM is using.
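A simple alternative is to send your own messages to a separate log and correlate the two files by timestamp. The log name below is a made-up example:

```cfml
// "ormdebug" is a hypothetical log name; CF appends the .log extension
writeLog(text="About to load Laundry Service", file="ormdebug", type="information");
laundry = entityLoad("Album", { Title = "Laundry Service" }, true);
```

Since the timestamps only resolve to a whole second, this still won't give an exact interleaving, but at least the ORM's file stays untouched.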

Published by

Rick Osborne


8 thoughts on “ColdFusion 9 ORM and relationships”

  1. I’m not sure why the idea of setting up both sides of this relationship is seen as something odd. The fact that Hibernate is being used to persist the objects is intended to be hidden, as much as possible, from the objects themselves.

    In other words, if you WEREN’T using Hibernate, and you wanted to set up this two-way relationship, how would you do it? Certainly you can see that you’d have to add the Cover to the Album’s collection of Covers, AND set the Album on the Cover. There’s nothing magic here: if both objects need to know about each other, that’s only going to happen if you tell them each about the other. That’s the case whether you’re using Hibernate or not.

    The point of having logic in a method like the setter to manage the other side of the relationship is to help make this easier, if you choose to go that route. And again, this would be the case regardless of whether you’re using Hibernate.

  2. That’s true, and a valid point — if I created both of those objects I’d have to tell both objects about it.

    But that’s a programmer’s way of thinking about the solution. Each object is an island, and thus each relationship is that one object’s view of the world. That’s imperative. That’s a Java solution.

    To a database guy, whose world is declarative, it’s redundant. A table isn’t an island or black box. A relationship by definition has two sides, both of which see the exact same thing. A database guy doesn’t set up a foreign key constraint from both sides — just the one.

    My argument is that ColdFusion isn’t meant to be Java — it’s not meant to be imperative. It can be imperative, but at its heart it is a declarative language.

    Having to worry about the relationship from both sides goes against that declarative nature. I want to be able to say “there’s a relationship here” and have ColdFusion figure out the details, just like I can say “give me back the results for this query” and expect the same sort of abstraction.

    You and I understand how to solve the problem, and how to debug the code enough to distinguish between what we expect to happen and what actually happens. But not everyone does, and certainly not the non-programmer developers that ColdFusion is intended to target.

    Having ORM in CF is a huge boon for the non-programmer, declarative-minded type of developer. And it’s 95% there … I just want that extra 5%. Right now it’s a tool for the power-user, but I think that with the extra 5%, which would look like magic to most, it can be brought down to the entry-level developer.

  3. Rick, I think part of the problem here is that you’re mixing the concerns between the objects and the database. The issue isn’t one of declarative vs. imperative, but one of consistency.

    Consider the situation where someone is using these objects without knowing that Hibernate is being used as the persistence mechanism. What you’re saying is that SOMETIMES the relationship should be managed automatically (if Hibernate is being used), and SOMETIMES the relationship should be managed manually (if Hibernate is not being used). To me, this sounds like a recipe for disaster. It makes much more sense that the presence or lack of the ORM have as little an effect as possible on how someone uses these objects.

  4. Heh, I was thinking the same thing about what you said: that it’s an issue of consistency. And I would use a similar example: someone who doesn’t know that Hibernate is in the picture.

    To me, it seems more logical that if I know there’s a relationship between two things, if they are bound together in some fashion, then each can see the changes to the other. Quantum entanglement, as it were.

    I’m not saying that sometimes object relationships should be magical (if using ORM) and other times they shouldn’t (if you hand-coded it). I’m saying that they should always be magical, whether using ORM or not. If they aren’t magical, well, that’s a bug as far as I’m concerned.

    Let me give you a hypothetical. In a scenario where you’re using ORM with a relationship and have only called one of the setters, is that a valid state for each of:

    1. The object that has been set?

    2. The object that has not been set?

    3. The object-to-object relationship?

    Now, take each of those questions and replace the word “object” with “underlying data”. Do your answers still make sense? For any given ORM solution, they should still make sense. If they don’t, that means it’s possible for your objects to not correctly model your data — and that’s bad because that’s the entire point of ORM.

    Or, let me put it another way.

    What is the underlying data representation of an ORM relationship that has only been set on one side? Is it set? Is it not? That you can even have that ambiguity is bad.

  5. First, what you’re proposing then is a fundamental change in the way object relationships work in CF (the “always magical” approach).

    Second, the “ambiguity” that you mention is necessary because it depends on what is being modeled. The question of whether the state of an object is valid or not isn’t something that can be universally enforced or inferred. Just because one has MODELED a bidirectional relationship between two objects does not mean that, in every case and under all circumstances, setting one but not the other is valid or invalid. Nor can one know in what order, or under which sequence of steps the concept of valid or invalid is applied to the objects in question.

    This can only be determined on a case-by-case basis, since it depends on the nature of the relationship in question. One primary differentiator is whether the relationship is a composition or an aggregation.

    In other words, is it a HAS-A relationship or a PART-OF relationship? An Invoice may have many LineItems. Can an Invoice exist with no LineItems? An invoice for nothing sure wouldn’t make much sense. If I delete the invoice, should all of the LineItems also be deleted? Probably, since a LineItem probably has no use without being associated with an Invoice. But what about a Car that has many Tires? Can I have a Car without Tires? Can a Tire exist without being associated with a Car? Sure. But there’s no way to infer this just from knowing that there is a one-to-many/many-to-one relationship in place.

    Along the same lines, one can’t know if simply adding an object to one side should immediately and automatically create the reciprocal relationship. I may want validation in place before I let this go through. I may need to execute additional steps before I let it go through.

    If the system did as you’re proposing and automatically created both relationships, now what? Rather than expecting the developer to set this up themselves, we’d have to expect them to add code to STOP this from happening, or worse, manually UNDO it, in situations where creating it automatically isn’t desired.

    I understand where you’re coming from, and see how the initial appeal of trying to make this “just happen” might seem to be useful. But there are too many possible situations for this to work without introducing a lot more potential problems.

  6. All valid points, each of which would have to be considered when architecting an application. However, I’d hope that they would have been considered when the underlying data model was designed, not after the code on top of the model had already been started.

    I figured out what bugs me about the current CF ORM.

    var laundry = entityLoad("Album", { Title = "Laundry Service" }, true);
    var coverCount = arrayLen(laundry.getCoverImages());
    var cover = entityNew("CoverImage");
    cover.setAlbum(laundry);
    var newCoverCount = arrayLen(cover.getAlbum().getCoverImages());
    

    It breaks chaining. That code will leave newCoverCount equal to zero, just like coverCount. It’s a contrived example, as you’d never do that intentionally, but in a scenario where you’re passing objects around, you might do it accidentally. And the fact that it returns a different value if you delete the first reference … just makes me feel icky inside.

  7. Well, an ORM is really meant to allow you to design the object model without thinking about the data model at all. The ORM translates your object model into a schema that can persist it.

    Of course, you can do this in reverse, and create the schema first and then attempt to shoehorn an object model on top of the schema. Using an ORM can make this much easier than it would otherwise be, but it isn’t the ideal use of the ORM. It’s meant to let you focus on objects and forget about the database as much as possible.

    Your chaining example is still trying to show this to be some problem with the ORM implementation. Again, I have to point out that what you show doesn’t work whether you’re using Hibernate or not. Setting the Album into the Cover, but not setting the Cover onto the Album, means that what you show won’t work, ORM or no ORM. If the chosen design requires both to know about each other, then both have to be told about each other.

  8. And just to add, enforcing the chosen rule that both know about each other is exactly what relationship management methods are for (and my blog entries on that were meant to help explain why one might use them).
