Using ColdFusion and Timeline to visualize DTS task logs

Some smart guys from the SIMILE project at MIT recently released Timeline, a really nifty AJAX application for visualizing exactly that: timelines. Much of my job involves building automated systems for shoveling data from Point A to Point B, sometimes under time constraints. It is then very useful for me to have a graphical way of finding the slow points in these processes. This tutorial is a condensed version of how I use Timeline and ColdFusion to visualize the SQL Server DTS packages and logs.

First, the setup. There is a particular process that runs every 10 minutes during business hours that synchronizes my (SQL Server-driven) data-mining application with the (DB2-driven) AS/400 mainframe. This process is a single DTS job that is constantly being tweaked and updated to bring over more and more information as the users demand more and more of the data in the mainframe be crunchable in real-time. (Why, then, they spent gobs of money on a mainframe that can’t do this level of crunching in real time is beyond me, and not the point of this article.)

The theory behind this process is simple: (at this time) four tables need to be queried to fetch whatever has been updated since the last fetch. These changes need to then be integrated into the data-mining tables, some of which are denormalized for speed. The entire process is was originally split into roughly two dozen steps:

DTS Graph, Original

Given the correct configuration options, SQL Server keeps a log of the steps as the DTS package runs. These logs are kept in the msdb.dbo.sysdtspackagelog and sysdtssteplog tables. The former is per-job, while the latter is per-step. Getting a list of all of the jobs for today is easy:

SELECT name, lineagefull, starttime, endtime, elapsedtime,
  errorcode, errordescription
FROM msdb.dbo.sysdtspackagelog
WHERE (starttime > DATEADD(DAY,-1,GETDATE()))
ORDER BY starttime DESC

The lineagefull column is the primary key for the table, and is referenced via a foreign key in the step log table. Thus, getting the step details for a job is similarly easy:

SELECT stepname, stepexecstatus, stepexecresult, starttime, endtime, elapsedtime,
  errorcode, errordescription, progresscount
FROM msdb.dbo.sysdtssteplog
WHERE (lineagefull = <cfqueryparam cfsqltype="CF_SQL_VARCHAR" value="#whatever#">)
ORDER BY stepexecutionid

That’s pretty much it for the SQL you need to make your log visualizer. I created a page, dts_details.cfm, that takes three parameters: Date, Job, and Format. The drilldown works as so:

  • Overview List: Last 2 Weeks (number of jobs, errors, etc., by day)
    • Daily Overview List: (list of jobs run on the given Date)
      • Job Detail List: (list of tasks run in the given Job, record counts, etc.)
        • Job Timeline: (visualization of the given Job, which gets its data in an XML Format)

You may choose to promote your Job Timeline to be on the same level as the Job Detail List by presenting the user with two choices per job in the Daily Overview List. Whatever floats your boat. I went for this:


Once you’ve got your Job Detail List, going to a Job Timeline is really easy, as its the exact same data, and thus the exact same query. The Timeline script takes an input XML file that looks like this:

  <event title="Log Start" start="Jul 10 2006 12:30:01 GMT">Status: completed</event>
  <event title="Find Log ID" start="Jul 10 2006 12:30:01 GMT">Status: completed</event>
  <event title="Clear Open Temp" start="Jul 10 2006 12:30:01 GMT">Status: completed</event>
  <event title="Import" start="Jul 10 2006 12:30:01 GMT" end="Jul 10 2006 12:30:37 GMT" isDuration="true">Status: completed, Progress: 52</event>

As you can see, I’ve chosen to collapse steps that took less than one second to run to be no-duration steps. Anything taking longer than one second is given end and isDuration attributes. I also store the completion status, record count, and any errors in the text of each entry. The entry tag also has an optional color attribute that can be used to make steps red if they error out, green if they are checkpoints, or whatever.

A little bit of CF glue and voila:

DTS Timeline

As you can see, Timeline produces a nice visual representation of the execution of your DTS task. I made some modifications to Timeline to give it QUARTERHOUR and QUARTERMINUTE time spans, along with a bit of CF logic as to when to use them.

Based on the visual feedback given from Timeline, I was able to figure out that running the data transfers off the same connection was a bad idea. I cloned the connection and split the data transfers up to increase parallelization. I also reshuffled some of the tasks on the SQL server side to accommodate the increased parallelization, and this was the result:

DTS Graph, New

The new version of the DTS package runs almost twice as fast as the old one. Thanks, Timeline!

If anyone is interested, I can post my code. But really, it only took an hour to wire up, from start to finish, so it would probably be more worthwhile for you to write one from scratch so that you can more easily see where you want to tweak things.