My JavaScript book is out! Don't miss the opportunity to upgrade your beginner or average dev skills.

Tuesday, October 14, 2014

Unusual tricks via Array .some()

A very good reason to use Array#some instead of forEach is that once a single truthy value is returned the loop simply stops.

This is not only more efficient, but also way more powerful than .indexOf(value) or eventually .contains(value) thanks to the callback that gives us the ability to perform any sort of synchronous check:
// entering in a chat room
if (chatRoomUsers.some(function admin(user, i) {
  return user.isAdmin;
})) {
  console.log('Watch out, this room is moderated!');
}
chatRoomUsers.push(you);
chatRoomUsers.forEach(
  room.sendEachMessage,
  you.name + ' entered the room'
);

/* example of room.sendEachMessage
Room.prototype.sendEachMessage = function (user) {
  user.sendMessage(this);
};
*/
The inevitable unfortunate caveat with .some(), compared to an .indexOf() approach, is that the returned value is either true or false, but when it is true we need eventually to loop that very same collection again in order to find the element we might want to keep in any sort of "should contain only one element per type" situation, as a list of users or products usually is.

1. A classic outer scope variable approach

The most obvious but tedious, boring, and quite error prone approach we could think about is the following:
// generic list of users
var lotteryUsers = [
  {ticket: 'Z1023', name: 'Z'},
  {ticket: 'AR1J5', name: 'F'},
  {ticket: 'F89DD', name: 'X'}
];

// the outer scope variable
var index;
if (lotteryUsers.some(function (user, i) {
  index = i;
  return user.ticket == this;
}, 'AR1J5')) {
  console.log(
    'Congratulation ' +
    lotteryUsers[index].name
  );
}
A simple way to avoid that reassignment each time could be doing it only on success.
lotteryUsers.some(function (user, i) {
  return user.ticket == this &&
         ~(index = i);
})
The ~ will simply ensure that the returned value will still be truthy, since any index, zero included, will be converted into its (i + 1) * -1 counter part so that 0 will be -1, a truthy value, 1 will be -2, still truthy ... and so on. We'll check later performance against always reassigning.
Another alternative could be used when functions are recycled instead of created each time.
if (lotteryUsers.some(lotteryWinner, ticket)) {
  console.log(
    'Congratulations ' +
     lotteryUsers[lotteryWinner.index].name
  );
}

// anywhere it is defined
function lotteryWinner(user, i) {
  return user.ticket == this &&
         ~(lotteryWinner.index = i);
}
In latter case the win over the outer scope variable and the tilde ~ trick works pretty well combined.

2. A common case based on RegExp

Since regular expressions update results in the global object each time, we can actually use this code to understand, as example, if a node program.js has been called with -j4 or --cpus=2 arguments, in order to limit the usage of the cluster module:
var numOfCPUs = process.argv.some(
  // the testing function, same RegExp each arg
  function(arg){return this.test(arg)},
  /(?:-j|--cpus=)(\d+)/
) ?
  // if it was true, we can use
  // the value contained in the $1 match
  parseInt(RegExp.$1, 10) :
  // otherwise we can require a module
  // and grab all the CPUs
  require('os').cpus().length
;

// in case we are in multi core env
if (1 < numOfCPUs) {
  var i = 0, cluster = require('cluster');
  if (cluster.isMaster) {
    // go multi core
    while(i++ < numOfCPUs) cluster.fork();
  }
}
Since the nature of the RegExp is this one since about ever, I wouldn't mind using a convenient and semantic approach recyclable per each different case:
// it would be untouched anyway ...
RegExp.test = function (v) { return this.test(v); };

// for any similar case 
var numOfCPUs = process.argv.some(
  // same function, different tests
  RegExp.test,      /(?:-j|--cpus=)(\d+)/
) ?
  parseInt(RegExp.$1, 10) :
  require('os').cpus().length
;

3. An index solution based on RegExp

We have reached the final trick of this post, the one that will combine the need to verify a generic value of an Array and store the index without bothering the outer scope or the function itself, suitable then with inline functions too.
if (lotteryUsers.some(function (user, i) {
  return user.ticket == this &&
         /\d+/.test(i); // <== see this ?
}, 'AR1J5')) {
  console.log(
    'Congratulation ' +
    // this is how you grab it ;-)
    lotteryUsers[RegExp['$&']].name
  );
}
Not only the regular expression inside the some(cb) loop will be created only if a check will be truthy, but we don't need to create outer scope variables or reachable callbacks since RegExp at the end will instantly give us that result.

Caveat

This works only if absolutely nothing else creates or use regular expressions between some() and the moment you address the result so ... do not ever call something else before you have retrieved the expected object, others pieces of code might be based on the same trick.
// GOOD
var value = arr.some(check) ? arr[RegExp['$&']] : null;

// BAD
if (arr.some(check)) {
  update(somethingElse);
  controller.emit('found', arr[RegExp['$&']]);
}

Performance

In order to have as many random cases as possible, I've added a test in jsperf.com which aim is to try all these approaches and see which one wins.
I think that's very irrelevant this time, since all approaches are mostly equivalent, but at least we have some little proof whatever we need, will be good for that case.

P.S. yes, you might have different results every single time you run that bench, it's part of the real world ;-)