Reduce All the Things! (part 2)

If you're not comfortable with how reduce generically combines values in a list, check out Reduce All the Things! first.

Hopefully you're now comfortable with how reduce works, but maybe not grokking how to implement it diferently, or how to approach solving problems with reduce. This post is split into two logical parts: First I'm going to include a few different approaches to how you might write a reduce function, then we'll explore ways to apply reduce to actual problems.

Going into these different solutions, I intentionally mixed up conditional checks and variable names; you want to focus on what's happening rather than memorizing any particular implementation.

Imperative

This implementation is basically what we wrote in the first post, but accounting for the optional starting value.

Imperative Reduce

function imperativeReduce(collection, accumulator, startValue) {

  var startIndex = 0;
  if (startValue===undefined) {
    startValue = collection[0];
    startIndex = 1;
  }

  for (var i=startIndex,len=collection.length; i<len; i++) {
    startValue = accumulator(startValue,collection[i]);
  }

  return startValue;
};

As you can see, other than modifying the starting index and accounting for a starting value, there's no real change.

First Declarative

I'm going to refer to the more functional approaches as declarative. Basically it just means I'm not using a direct loop, but rather a function that iterates for me.

First Declarative Reduce

function decReduce1(list, operation, start) {

  start      = arguments.length>2 ? start  : list[0];
  list       = arguments.length>2 ? list   : list.slice(1);

  list.forEach(function(value) {
    start = operation(start,value);
  });

  return start;

};

This one uses basically the same logic as the imperative version, but we have to account for iterating differently. Rather than modifying the starting index, we have to change the list itself - in this case, we reset the variable to be a slice starting from the 1st index to avoid mutating the original array (do you know what 'passed by reference' means?).

Second Declarative

This time we mix up the logic a bit - can you think of any downsides to this approach?

Second Declarative Reduce

function decReduce2(arr, combiner, init) {

  var iterator = function(item, index, array) {
    if (init!==undefined) {
      init = combiner(init, item);
    } else {
      init = item;
    }
  };

  collection.forEach(iterator);

  return init;
};

One big thing should jump out at you - we only have logic inside of the iterator function being called by forEach. While this will normally work, there are a few downsides to this method. One is the overhead of doing the conditional check on every iteration. The performance hit from this should be minimal, but at millions or billions of iterations we could expect to see a difference. Another is that if the given combiner function ever returns undefined, we will probably get unexpected output. You could argue that the given function shouldn't ever return undefined, and I'd agree with that, but it is something to consider if you're writing library functions (meaning functions intended for use by other programmers).

Recursive

If you're familiar with recursion, you can probably imagine there's at least 2 ways to write reduce this way. Let's take a look:

Recursive Reduces

function pureRecurReduce(list, mix, start) {
  list = list.slice();
  if (list.length===0) return start;
  if (start===undefined) start = list.shift();
  return mix(start, pureRecurReduce(list.slice(1),mix,list[0]));
}

function iterRecurReduce(list, mix, start) {
  list = list.slice();
  if (list.length===0) return start;
  if (start===undefined) start = list.shift();
  return iterRecurReduce(list.slice(1), mix, mix(start,list[0]));
}

They're deceptively similar. The first will always inflate the call stack, as every return statement has to wait until the pureRecurReduce function hits the base case to finish executing the mix function. The second could be tail-optimized, as it returns a call to the same function with modified arguments.

No Loops or Recursion!

And finally, we have an academic exercise - how do you iterate without some kind of a loop or recursion? You could use eval in Javascript, or do some sort of parent-child message passing, or in a low-level language manipulate the stack directly. I did none of these, but implemented it using something called the Y-Combinator:

Y-Combinator Reduce

var reduce = function(g,o,l) {
  return (function(F) {
    return (function(x) {
      return F(function(y) { return (x(x))(y);});
    })(function(x) {
      return F(function(y) { return (x(x))(y);});
    });
  })(function(r) {
    return function(arr,cb,acc) {
      acc = (acc===undefined)?arr.shift():acc;
      if (arr.length===1) return o(acc,arr[0]);
      return o(acc,r(arr));
    };
  })(g,o,l);
};

// more concisely:

let reduce =
(g,o,l)=>(f=>(x=>f(y=>x(x)(y)))(x=>f(y=>x(x)(y))))(r=>(a,s=a.shift())=>(a.length<2&&o(s,...a))||o(s,r(a)))([...g],l)

This is, of course, mostly useless. Please don't write code this way. If you're interested in how I shrunk the first function to one (long) long, look for a future post on code golf!

Application

Great, so you get how reduce works, you can write it several ways, you're dreaming in folded numbers; what the hell is the point? Programming at its core is thinking in abstractions. Abstractly, this reduce function you now know so well takes a series of things and condenses it into one thing. Any time you have a list of some kind and want that changed to something else, you can use reduce.

What's a practical example? You're writing code for a game site, where you get long lists of games and the outcomes. We can think of it as a long string like so:

1	WWWWLLLLTTTTWWWLLWWWWLLLLLTTTTLLLLWWWWWW

Basically a stream of Wins, Losses, and Ties. If we wanted to get a player's record at that moment, we basically want to reduce those characters into a count object, right? That might look something like this:

Calculate Record From String

var recordStr = 'WWWWLLLLTTTTWWWLLWWWWLLLLLTTTTLLLLWWWWWW';
var recordObj = recordStr.split('').reduce(function(growing, current) {
  if (current === 'W') {
    growing.wins++;
  } else if (current === 'L') {
    growing.losses++;
  } else if (current === 'T') {
    growing.ties++;
  }
  return growing;
}, {wins:0,losses:0,ties:0});
console.log(recordObj); // { wins: 17, losses: 15, ties: 8 }

Notice a few important things here - the callback we pass in takes two values, we know this from the implementation. Remember, you the programmer always know what the first argument is. We define it in the starting value (the object we pass in with 0 for all the values we're counting), then by returning a value inside of the callback. That first value is the one we build upon. It's often called accumulating or growing to signify that, while the second argument is the current item you're combining.

The types don't have to be similar. In the next example, imagine you're part of an engineering team building an assembly line for some mechanical component. The team has dozens of tests that return various numbers, representing a component's variance from a given tolerance. Every test needs to be within 0.3 of the original to pass the inspection, otherwise it fails and needs to be scrapped. This is also a good reduce problem:

Tolerance Test

// Create an array of multiple test objects
var firstBuild  = [0.2,0.1,-0.29,-0.11,0.2,0.123,0.02,0];
var secondBuild = [0.2,0.1,-0.29,-0.27,0.01,0.103,0.08,0];
var thirdBuild  = [0.2,0.31,-0.22,-0.31,0.2,0.223,0.01,0];
var testResults = [firstBuild, secondBuild, thirdBuild];

// We would actually use TWO helpers here - first a map,
// as we want a 1-to-1 match between a pass/fail for each
// item in the testResults array, then a reduce to change
// the numbered results to a single Boolean
var passFailArray = testResults.map(function(testArray) {
  return testArray.reduce(function(passing, current) {
    if (current < -0.3 || current > 0.3) return false;
    return passing;
  }, true);
});
console.log(passFailArray); // [ true, true, false ]

Why does this work? Here it's a logic puzzle - we assume the component passes all the tests by starting at true. If any test results are outside of our given tolerance (more than 0.3 away from 0), we return false. This flips the value of passing from true to false, and so the ultimate value will have to be false. Remember that the first argument in each subsequent call to our accumulating function is the return value of the previous one (after the start value), meaning all we have to do to change the result is return a different value. Otherwise, we want to continue returning our starting value of true.

Hopefully those examples help get you thinking about how to apply a reduce function in your own programming. If you have any questions or better examples, let me know! You can file an issue against the blog.