#Map Reduce is Simple (in Javascript)
##The Implementation
var mf = function (item){return item;};//Sample MAP FUNCTION -- this one does nothings to the initial values of the DATA array
//For more information, check out https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Array/map
var rf = function (iv,current) {iv.push(current);return iv;};//Sample REDUCE FUNCTION -- this one simply adds mapped values to the INITIAL VALUE array
//For more information, check out https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Array/reduce
var iv = [];//Sample INITIAL VALUE for reduction step -- values must be "folded" into this using the REDUCE FUNCTION
var d = [];//Sample DATA to be processed
console.log(d
.map(mf)//Map Step using MAP FUNCTION
.reduce(rf,iv));//Reduce Step using REDUCE FUNCTION and INITIAL VALUE
##The Idea
The idea behind MAP REDUCE is that you have a data set (as an array) in which each member of the set is processed using a MAP FUNCTION. The results are then be combined using a REDUCE FUNCTION.
The "trick" lies in the fact that values produced by the REDUCE FUNCTION can be combined progressively, that is, data can start off on several different machines where the mapping (and possibly some of the reducing) is done before it's reduced at a central location into it's final form. I haven't read all the way through Google's famous paper on the subject (http://research.google.com/archive/mapreduce.html) but I imagine that it revolves around the fact that Google can use this "trick" to efficiently process their data as many of their data sets span several machines. I think there's some extra on keys using keys to sort data, but there's a few ways you can handle that. I also imagine that there might be some interesting stuff on NoSQL databases -- I really should read that thing...
BONUS KNOWLEDGE: If you ensure that the REDUCE FUNCTION is an ASSOCIATIVE FUNCTION, then the order in which data is combined will have no effect on the final outcome of the data. Associative means f(a,b) === f(b,a)
Keep in mind that Map Reduce is an absurdly simple concept. That's it.
##Examples ###Just a few to get you started Please note that in many examples, I create arrays of string splitting strings rather than creating arrays explicityly.
//Example: Calculate 15% tip on a list of bills and order them
//No keys used
//Reduce simply creates a sorted array
var mapper = function (item){return Math.round(parseFloat(item)*1.15*100)/100};
var reducer = function (intermediate, current) { intermediate.push(current); return intermediate.sort();};
var initial = [];
var data = "14.22 15.28 19.42 16.80 11.96".split(" ");
console.log(data.map(mapper).reduce(reducer,initial));
//Example: Zip Arrays
//No keys used
//Reduce simply creates a sorted array
var mapper = function (item){return item;}
var reducer = function (intermediate, current) {
for(var i = 0; i<current.length; i++){
intermediate[i] = intermediate[i] || [];
intermediate[i].push(current[i]);
}
return intermediate;
}
var initial = [];
var data = [[1,2,3],[4,5,6],[7,8,9]]
console.log(data.map(mapper).reduce(reducer,initial));
//Example: Get count of each word in a sentence
//Keys are generated dynamically from data
//Inspired by http://java.dzone.com/articles/javascript-mapreduce-one-liner
var mapper = function (item){return item.toLowerCase()};
var reducer = function (intermediate, current) { if (current) intermediate[current] = (intermediate[current] || 0) + 1; return intermediate;};
var initial = {};
var data = "Did that little red hen just run over that larger, redder black hen over there?".split(" ");
console.log(data.map(mapper).reduce(reducer,initial));
//Example: Count the number of characeters in each word in a sentence and group them by parity
//Possible keys know before hand and are incorporated into the initial value
//Doing this can simply the reduce function
var mapper = function (item){
return {
parity : (item.length % 2)? "odd" : "even",
len : item.length
}
};
var reducer = function (intermediate, current) {
if (current)
intermediate[current.parity].push(current);
return intermediate;
};
var initial = {"even":[],"odd":[]};
var data = "1 111 11 1111 11 111 1111 111 11 111 11 111111 11 1 11 11 1111".split(" ");
console.log(data.map(mapper).reduce(reducer,initial));
//Example: Count the number of characeters in each word in a sentence and group them by parity
//Possible keys not know before hand
var mapper = function (item){
return {
parity : item.length % 2,
len : item.length
}
};
var reducer = function (intermediate, current) {
if (current)
intermediate[current.parity] = intermediate[current.parity]? intermediate[current.parity].concat(current.len) : [current.len];
return intermediate;
};
var initial = {};
var data = "1 111 11 1111 11 111 1111 111 11 111 11 111111 11 1 11 11 1111".split(" ");
console.log(data.map(mapper).reduce(reducer,initial));
##License -- (MIT License) Copyright (c) 2013 John Henry <john@iamjohnhenry.com>
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the 'Software'), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.