MongoDB + NikePlus

I’ve been playing around with MongoDB, comparing and contrasting the group() and mapReduce() functions. For test data, I’ve been using the last two years’ worth of Nike+ data from my iPod. For my future reference, here’s how to aggregate all of that data using the group() function:

db.workouts.group({
    key: { },
    cond: { distancekm: { $gt: 0.5 } },
    initial: {
        km: 0,   // total kilometers
        n: 0,    // workout count
        g: 0,    // sum of goal values
        d: 0,    // duration in ms
        w: 0,    // "wins" = better than goal
        l: 0,    // "losses" = not better than goal
        mn: -1,  // minimum km
        mx: -1,  // maximum km
        wg: 0    // count of workouts with goals
    },
    reduce: function(obj,prev) {
        prev.km += obj.distancekm;
        prev.n++;
        prev.d += obj.duration;
        if(obj.distancekm > prev.mx)
            prev.mx = obj.distancekm;
        if((prev.mn < 0) || (obj.distancekm < prev.mn))
            prev.mn = obj.distancekm;
        if(obj.goalType) {
            prev.wg++;
            prev.g += obj.goalValue;
            if(obj.goalType == "Time")
                prev[(obj.duration >= obj.goalValue * 1000) ? 'w' : 'l']++;
            else if(obj.goalType == "Distance") {
                var dg = obj.goalValue * (obj.goalUnit == "mi" ? 1.609344 : 1.0);
                prev[obj.distancekm >= dg ? 'w' : 'l']++;
            }   // else Distance
        }   // if goalType
    },
    finalize: function(out) {
        out.avgkm = out.km / out.n;
        out.mi = out.km / 1.609344;
        out.d /= 1000.0;   // convert ms to seconds
        out.dm = out.d / 60.0;   // convert sec to min
        out.dh = out.dm / 60.0;   // convert min to hr
        out.wp = (out.wg > 0) ? out.w / out.wg : 0;
    }
});

This produces a result set like so:

[
    {
        "km" : 775.1911999999996,
        "n" : 80,
        "g" : 8307.25,
        "d" : 322547.941,
        "w" : 24,
        "l" : 5,
        "mn" : 0.7493,
        "mx" : 26.2581,
        "wg" : 29,
        "avgkm" : 9.689889999999995,
        "mi" : 481.68148015588935,
        "dm" : 5375.799016666667,
        "dh" : 89.59665027777778,
        "wp" : 0.8275862068965517
    }
]

I think what trips me up about this is just remembering which argument is which. The prev object is the accumulator, the one being assigned to, so it is the only thing that should ever show up as an l-value. The obj object is the incoming document, so it should only ever be read as an r-value and never assigned to. Going forward, I should probably name them left and right or something to make it clearer.
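To make the accumulator/input split concrete, here is a minimal sketch in plain JavaScript that imitates what group() does for a single empty key. The groupLocally helper and the doc/acc parameter names are my own stand-ins, not part of MongoDB, and cond is a plain function here rather than a query document, so treat it as an illustration only.

```javascript
// Hypothetical in-memory stand-in for group() with an empty key.
// Not a MongoDB API: it only mimics the initial/reduce/finalize flow.
function groupLocally(docs, spec) {
    // Copy the initial accumulator so the spec object is not mutated
    var acc = JSON.parse(JSON.stringify(spec.initial));
    docs.forEach(function (doc) {
        // Assumption: cond is a predicate function here, not a query document
        if (!spec.cond || spec.cond(doc)) {
            spec.reduce(doc, acc);
        }
    });
    if (spec.finalize) {
        spec.finalize(acc);
    }
    return [acc];
}

var totals = groupLocally(
    [{ distancekm: 5.0 }, { distancekm: 10.0 }, { distancekm: 0.2 }],
    {
        cond: function (doc) { return doc.distancekm > 0.5; },
        initial: { km: 0, n: 0 },
        // doc is the incoming document (read only); acc is the one written to
        reduce: function (doc, acc) {
            acc.km += doc.distancekm;
            acc.n++;
        },
        finalize: function (acc) { acc.avgkm = acc.km / acc.n; }
    }
);
```

With names like doc and acc, an assignment to the wrong argument tends to look wrong at a glance.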

I do like the cleaner syntax of the group() function over the mapReduce() function. But I also get that it is just syntactic sugar: you could easily translate from the one to the other. Hmm … maybe I should update my SQL→MongoDB cheat sheet to go through a group() example first?
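As a rough idea of what that translation might look like, here is a cut-down version (distance totals only) written as map, reduce, and finalize functions. The function names, the "all" key, and the runLocally harness are assumptions of mine; inside MongoDB, emit is provided by the server, so the emit defined here is only a stand-in for trying the functions without one.

```javascript
// Stand-in for the emit() that MongoDB supplies inside mapReduce
var emitted = [];
function emit(key, value) {
    emitted.push({ key: key, value: value });
}

// map: runs with `this` bound to each document
function mapWorkout() {
    if (this.distancekm > 0.5) {
        emit("all", {
            km: this.distancekm, n: 1,
            mn: this.distancekm, mx: this.distancekm
        });
    }
}

// reduce: returns the same shape it receives, because the server
// may feed earlier reduce output back in as input
function reduceWorkouts(key, values) {
    var out = { km: 0, n: 0, mn: values[0].mn, mx: values[0].mx };
    values.forEach(function (v) {
        out.km += v.km;
        out.n += v.n;
        if (v.mn < out.mn) out.mn = v.mn;
        if (v.mx > out.mx) out.mx = v.mx;
    });
    return out;
}

// finalize: derived fields go here, as with group()
function finalizeWorkouts(key, out) {
    out.avgkm = out.km / out.n;
    return out;
}

// Assumed shell usage (not run here):
// db.workouts.mapReduce(mapWorkout, reduceWorkouts,
//     { out: { inline: 1 }, finalize: finalizeWorkouts });

// Hypothetical local harness so the functions can be tried without a server
function runLocally(docs) {
    emitted = [];
    docs.forEach(function (doc) { mapWorkout.call(doc); });
    var values = emitted.map(function (e) { return e.value; });
    return finalizeWorkouts("all", reduceWorkouts("all", values));
}

var summary = runLocally([
    { distancekm: 5.0 }, { distancekm: 10.0 }, { distancekm: 0.2 }
]);
```

The main difference from group() is that the initial object disappears: each document emits its own one-workout summary, and reduce merges summaries rather than folding documents into an accumulator.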

Unfortunately, using group() means you can’t use the server-side post-filtering that you can with mapReduce(). Since your results come back as a plain array instead of being written to a new collection, functions like find() and sort() throw an error if you try to chain them. Maybe this will be included/fixed in a future version of MongoDB?
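One workaround, since group() hands back an ordinary array, is to do the post-filtering on the client with plain array methods. The sketch below assumes a group() call keyed by goalType; the rows and their numbers are made up for illustration, and postFilter is a hypothetical helper rather than anything MongoDB provides.

```javascript
// Made-up rows shaped like group() output keyed by goalType
var results = [
    { goalType: "Time", n: 12, km: 80.5 },
    { goalType: "Distance", n: 17, km: 210.0 },
    { goalType: null, n: 51, km: 484.7 }
];

// Client-side stand-in for something like
// find({ n: { $gte: minCount } }).sort({ km: -1 })
function postFilter(rows, minCount) {
    return rows
        .filter(function (row) { return row.n >= minCount; })  // new array
        .sort(function (a, b) { return b.km - a.km; });        // km descending
}

var busiest = postFilter(results, 15);
```

This is fine for a handful of groups, but it does all the work in the shell rather than on the server, which is exactly what the collection-backed mapReduce() output avoids.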

Published by

Rick Osborne

I am a web geek who has been doing this sort of thing entirely too long. I rant, I muse, I whine. That is, I am not at all atypical for my breed.