Writing a Parent-Child Query Sorter in CF, Part 7

Let’s start looking at turning what we’ve done into a custom tag and UDF. I’m actually going to start with the UDF and work backwards to an old-school custom tag. I find that the structure imposed by a proper UDF helps me think more about what is going on, and then I can de-structure it later for deployment on older servers. For simplicity’s sake, my UDF will be named QueryTreeSort.

Here is the code we have so far:

<cfset BaseDepth=0>
<cfquery datasource="#Request.DSN#" name="Stuff">
SELECT ItemID, Name, ParentID
FROM Stuff
ORDER BY Name
</cfquery>
<cfset RowFromID=StructNew()>
<cfset ChildrenFromID=StructNew()>
<cfloop query="Stuff">
	<cfset RowFromID[ItemID]=CurrentRow>
	<cfif NOT StructKeyExists(ChildrenFromID, ParentID)>
		<cfset ChildrenFromID[ParentID]=ArrayNew(1)>
	</cfif>
	<cfset ArrayAppend(ChildrenFromID[ParentID], ItemID)>
</cfloop>
<cfset RootItems=ArrayNew(1)>
<cfset Depth=ArrayNew(1)>
<cfloop query="Stuff">
	<cfif NOT StructKeyExists(RowFromID, ParentID)>
		<cfset ArrayAppend(RootItems, ItemID)>
		<cfset ArrayAppend(Depth, BaseDepth)>
	</cfif>
</cfloop>
<cfloop condition="ArrayLen(RootItems) GT 0">
	<cfset ThisID=RootItems[1]>
	<cfset ArrayDeleteAt(RootItems, 1)>
	<cfset ThisDepth=Depth[1]>
	<cfset ArrayDeleteAt(Depth, 1)>
	<cfif StructKeyExists(RowFromID, ThisID)>
		<cfset RowID=RowFromID[ThisID]>
		<cfoutput>#RepeatString("",ThisDepth)##Stuff.Name[RowID]#<br /></cfoutput>
	</cfif>
	<cfif StructKeyExists(ChildrenFromID, ThisID)>
		<cfset ChildrenIDs=ChildrenFromID[ThisID]>
		<cfloop from="#ArrayLen(ChildrenIDs)#" to="1" step="-1" index="i">
			<cfset ArrayPrepend(RootItems, ChildrenIDs[i])>
			<cfset ArrayPrepend(Depth, ThisDepth + 1)>
		</cfloop>
	</cfif>
</cfloop>

Now let’s talk about component design. What do we want out of this component? There are a few ways to come at a component like this:

  1. Return a new query, which is already correctly sorted and has all of the original query’s data plus a new column with depth information.
  2. Return a sorted list of the row numbers that the calling code can then loop over.
  3. Return a sorted list of the row numbers and a list of depth measurements.

The first option is the more complete, black-box way to do it. However, for large datasets it may not be efficient to make a copy of them just to add a column for depth information. The second option is more memory-friendly, but the calling code is then responsible for figuring out depth information via the ParentID data. The third option returns both order and depth information, but is a bit trickier to work into a UDF since you are returning multiple distinct chunks of data. (You would most likely return a struct of arrays, as does REFindNoCase.) I’m going to just work on the first option for now, leaving the other two as exercises for the reader.

Our first step is to UDF-ize what we’ve already written. I’ve also added in a few comments, as we’re starting to grow beyond plainly-obvious.

First we need to start with a basic function signature:

<cffunction name="QueryTreeSort" returntype="query" output="No">
	<cfargument name="Stuff" type="query" required="Yes">
	<cfargument name="ParentID" type="string" required="No" default="ParentID">
	<cfargument name="ItemID" type="string" required="No" default="ItemID">
	<cfargument name="BaseDepth" type="numeric" required="No" default="0">
	<cfargument name="DepthName" type="string" required="No" default="TreeDepth">

We start off with the query to be sorted, which is the only required argument. The next two arguments are provided for convenience so that we don’t have to make sure that the queries we want to sort conform to some random naming scheme. Instead, we can pass in the name of our Item and Parent ID columns, thus allowing us to sort queries where the IDs are named ID or ItemID or Item or even TheItemID_int_4bytes_ricko_was_here.

The BaseDepth argument allows us to specify that we’d prefer that the depth numbering start at 1 or -1 or 42 or whatever. Similarly, we can specify the name of the new column with the TreeDepth argument. In this way if our query already has a column named Depth or Level then we can have the depth returned in a new column called MyTreeDepthInfo if we wanted.

We have essentially taken out anything that might have been hard-coded or assumed, and left it up to the caller to specify. This makes for code later that is a bit more interesting, but it makes the function disturbingly portable.

Next we need to declare our local variables. This function is truly black-box, so it should neither assume anything about the outside world, nor reveal anything to the outside about its inner workings.

	<cfset var RowFromID=StructNew()>
	<cfset var ChildrenFromID=StructNew()>
	<cfset var RootItems=ArrayNew(1)>
	<cfset var Depth=ArrayNew(1)>
	<cfset var ThisID=0>
	<cfset var ThisDepth=0>
	<cfset var RowID=0>
	<cfset var ChildrenIDs="">
	<cfset var ColName="">
	<cfset var Ret=QueryNew(ListAppend(Stuff.ColumnList,Arguments.DepthName))>

Most of the variables here are simply moved up to the top of the function from where they were before. The only new ones are the last two. We’ll need the ColName variable later when we loop over the columns in the query since we assume nothing about the structure of the query. The Ret variable is what will be returned, and it is a query with the same columns as what was passed into the function, plus the new depth information.

Next, as the comment says, we set up our indexes as before. This time, however, we’ve needed to substitute out our hard-coded column names for the dynamic ones passed into the function.

	<!--- Set up all of our indexing --->
	<cfloop query="Stuff">
		<cfset RowFromID[Stuff[Arguments.ItemID][Stuff.CurrentRow]]=CurrentRow>
		<cfif NOT StructKeyExists(ChildrenFromID, Stuff[Arguments.ParentID][Stuff.CurrentRow])>
			<cfset ChildrenFromID[Stuff[Arguments.ParentID][Stuff.CurrentRow]]=ArrayNew(1)>
		</cfif>
		<cfset ArrayAppend(ChildrenFromID[Stuff[Arguments.ParentID][Stuff.CurrentRow]], Stuff[Arguments.ItemID][Stuff.CurrentRow])>
	</cfloop>

The Stuff[Arguments.ItemID][Stuff.CurrentRow] stuff is just nasty-looking, isn’t it? But if you pick it apart it’s not all that bad. We’re looking into the Stuff query, in the column whose name was specified by our ItemID argument, for the data in the current row. Long-winded and ugly, but really no big deal.

Finding the parents requires similar find-and-replace gymnastics:

	<!--- Find parents without rows --->
	<cfloop query="Stuff">
		<cfif NOT StructKeyExists(RowFromID, Stuff[Arguments.ParentID][Stuff.CurrentRow])>
			<cfset ArrayAppend(RootItems, Stuff[Arguments.ItemID][Stuff.CurrentRow])>
			<cfset ArrayAppend(Depth, Arguments.BaseDepth)>
		</cfif>
	</cfloop>

Second verse, same as the first. And since we’re not going to loop over the query again, no more nasty-looking code. Up next is the sorting part, the preamble to which is unchanged.

	<!--- Do the deed --->
	<cfloop condition="ArrayLen(RootItems) GT 0">
		<cfset ThisID=RootItems[1]>
		<cfset ArrayDeleteAt(RootItems, 1)>
		<cfset ThisDepth=Depth[1]>
		<cfset ArrayDeleteAt(Depth, 1)>

Look! New code:

		<cfif StructKeyExists(RowFromID, ThisID)>
			<!--- Add this row to the query --->
			<cfset RowID=RowFromID[ThisID]>
			<cfset QueryAddRow(Ret)>
			<cfset QuerySetCell(Ret, Arguments.DepthName, ThisDepth)>
			<cfloop list="#Stuff.ColumnList#" index="ColName">
				<cfset QuerySetCell(Ret, ColName, Stuff[ColName][RowID])>
			</cfloop>
		</cfif>

Okay, so it’s not all that exciting. We add a row to the return query, set the depth, and then loop over the original query’s columns to copy the data to our return query.

The rest is unchanged, except that we also end our function, returning our resultant query.

		<cfif StructKeyExists(ChildrenFromID, ThisID)>
			<!--- Push children into the stack --->
			<cfset ChildrenIDs=ChildrenFromID[ThisID]>
			<cfloop from="#ArrayLen(ChildrenIDs)#" to="1" step="-1" index="i">
				<cfset ArrayPrepend(RootItems, ChildrenIDs[i])>
				<cfset ArrayPrepend(Depth, ThisDepth + 1)>
			</cfloop>
		</cfif>
	</cfloop>
	<cfreturn Ret>
</cffunction>

The following code can be used to test the function, presuming the table data we started with:

<cfquery datasource="#Request.DSN#" name="Stuff">
SELECT ItemID, Name, ParentID
FROM Stuff
ORDER BY Name
</cfquery>
<cfset TreeStuff=QueryTreeSort(Stuff)>

Dump out TreeStuff and you get:

ItemID Name ParentID TreeDepth
12Coding00
15Applications121
8C++152
9Hybrid121
19Perl92
26Web121
16ColdFusion262
7Food00
1Cheese71

As you can see, we now have a function that can take a query, tree-sort it, and return a new query that is not only in the right order, but also includes depth information to help us figure out how to format it. In the next column we’ll do the fun stuff: formatting.

Published by

Rick Osborne

I am a web geek who has been doing this sort of thing entirely too long. I rant, I muse, I whine. That is, I am not at all atypical for my breed.