Instructor code that was shown on screen
import sys

salesTotal = 0
oldKey = None

for line in sys.stdin:
    data = line.strip().split("\t")
    if len(data) != 2:
        # Something has gone wrong. Skip this line.
        continue

    thisKey, thisSale = data
    if oldKey and oldKey != thisKey:
        print oldKey, "\t", salesTotal
        oldKey = thisKey
        salesTotal = 0

    oldKey = thisKey
    salesTotal += float(thisSale)

if oldKey != None:
    print oldKey, "\t", salesTotal

piggybox Jan 28, 2014

The ";" isn't needed on line 13

"sys" needs to be imported at the beginning

sanoops Mar 10, 2014

import sys
salesTotal = 0.0
oldKey = None
dummy_Data=["Miami 12.34","Miami 99.07","Miami 55.07","NYC 88.97","NYC 33.56"]

for line in dummy_Data:
    data = line.strip().split(" ")
    if len(data) != 2:
        # Something has gone wrong. Skip this line.
        continue

    thisKey, thisSale = data
    if oldKey and oldKey != thisKey:
        print oldKey, ":", salesTotal
        oldKey = thisKey
        salesTotal = 0

    oldKey = thisKey
    salesTotal += float(thisSale)

if oldKey != None:
    print oldKey, ":", salesTotal

reducer.py https://gist.github.com/sanoops/9471084

spenceronuffer Oct 4, 2016

Would it be cleaner to store this info in a dictionary? That way you don't have to keep track of oldKey vs. thisKey, and it still works if the sort is imperfect. I'm not sure whether it would break anything MapReduce-specific, though.

import sys

salesTotals = {}

for line in sys.stdin:
    data = line.strip().split("\t")
    if len(data) != 2:
        # Something has gone wrong. Skip this line.           
        continue

    store, sale = data
    salesTotals.setdefault(store, 0)
    salesTotals[store] += float(sale)

for store in salesTotals:
    print "{0}\t{1}".format(store, salesTotals[store])
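
To make the trade-off concrete, here is a minimal sketch in Python 3 syntax (the gist itself is Python 2). The helper names `streaming_totals` and `dict_totals` are hypothetical and not part of the gist, and the sketch works on already-parsed (key, value) pairs rather than on stdin. It only illustrates the "imperfect sort" point: the streaming version emits a key again whenever that key reappears later, while the dictionary version still yields one total per key. The price is that the dictionary keeps every key in memory until the input ends, whereas the streaming version holds a single running total.

```python
def streaming_totals(pairs):
    """Sum values per key, assuming equal keys arrive next to each other."""
    results = []
    old_key = None
    total = 0.0
    for key, value in pairs:
        if old_key is not None and old_key != key:
            # Key changed: emit the finished group and start a new one.
            results.append((old_key, total))
            total = 0.0
        old_key = key
        total += value
    if old_key is not None:
        results.append((old_key, total))
    return results


def dict_totals(pairs):
    """Sum values per key in a dictionary; input order does not matter."""
    totals = {}
    for key, value in pairs:
        totals[key] = totals.get(key, 0.0) + value
    return totals


# "Miami" appears twice, but not on adjacent lines (imperfect sort).
unsorted_pairs = [("Miami", 10.0), ("NYC", 5.0), ("Miami", 2.5)]

print(streaming_totals(unsorted_pairs))  # "Miami" is emitted twice
print(dict_totals(unsorted_pairs))       # one total per key
```

On properly sorted input both functions agree, so the choice is mostly about memory use versus robustness to ordering.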

digitalmacgyver Jan 23, 2017

Line 13:

if oldKey and oldKey != thisKey:

Would be better written with an explicit check against None:

if oldKey is not None and oldKey != thisKey:

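As it is, this check malfunctions when the reducer sees a record whose key is the empty string, because "" is falsy just like None. For example, given the parsed records:

NY\t100
\t200
SF\t300

the total for the empty key is never printed, and its 200 is silently added to SF, so the output is:

NY\t100
SF\t500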
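
The difference between the two checks can be reproduced with a small sketch in Python 3 syntax. The function `reduce_pairs` and its `explicit_none_check` flag are hypothetical names introduced here for illustration. The sketch feeds already-parsed (key, value) pairs into the loop, because the exact result on raw text depends on how lines are parsed: in the code as posted, `line.strip()` also removes a leading tab, so a line with an empty key would be skipped as malformed before it ever reaches this check.

```python
def reduce_pairs(pairs, explicit_none_check):
    """Replay the gist's loop over already-parsed (key, value) pairs."""
    output = []
    old_key = None
    total = 0.0
    for key, value in pairs:
        if explicit_none_check:
            key_changed = old_key is not None and old_key != key
        else:
            # Truthiness check, as in the original "if oldKey and ..."
            key_changed = bool(old_key) and old_key != key
        if key_changed:
            output.append((old_key, total))
            total = 0.0
        old_key = key
        total += value
    if old_key is not None:
        output.append((old_key, total))
    return output


records = [("NY", 100.0), ("", 200.0), ("SF", 300.0)]

print(reduce_pairs(records, explicit_none_check=False))
# [('NY', 100.0), ('SF', 500.0)]
print(reduce_pairs(records, explicit_none_check=True))
# [('NY', 100.0), ('', 200.0), ('SF', 300.0)]
```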

senthil1988 Feb 16, 2017

Can anyone explain what the line below does? I understand one part, but I don't get the first condition.

"if oldKey and oldKey != thisKey"

I don't get what the first condition, "if oldKey and", does. Thanks in advance.

Senthil

pabloalicante Mar 18, 2017

Hi @senthil1988

The expression "if oldKey..." tests whether oldKey is truthy. It is false while oldKey is still None (that is, before the first line has been processed), but it is also false for other falsy values such as the empty string.

It would be clearer and safer to write "if oldKey is not None..." instead of "if oldKey..."

Regards!
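
As a small illustration in Python 3 syntax, the loop below evaluates both forms of the test for three sample values. The list `candidates` and the variable names are made up for this sketch.

```python
candidates = [None, "", "NYC"]

results = []
for old_key in candidates:
    truthy = bool(old_key)          # what "if oldKey" evaluates
    not_none = old_key is not None  # what "if oldKey is not None" evaluates
    results.append((old_key, truthy, not_none))
    print(repr(old_key), truthy, not_none)

# Printed lines:
# None False False
# '' False True
# 'NYC' True True
```

The two tests only disagree for the empty string, which is exactly the case where the truthiness check treats a real key as if no key had been seen yet.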

wbl17 Jul 2, 2017

Just tested the code locally. It seems to me that line 15 is not necessary, is it? In this example (and presumably in general, with the keys sorted), when a new city gets processed, the assignment oldKey = thisKey is done in line 18 anyway. Resetting salesTotal = 0 is necessary, though.

Happy coding!
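
This can be checked with a short sketch in Python 3 syntax. The function `reduce_pairs` and the flag `reassign_inside_if` are hypothetical names used only here; the loop otherwise mirrors the gist's logic on already-parsed (key, value) pairs.

```python
def reduce_pairs(pairs, reassign_inside_if):
    """Run the gist's loop with or without the assignment on line 15."""
    output = []
    old_key = None
    total = 0.0
    for key, value in pairs:
        if old_key and old_key != key:
            output.append((old_key, total))
            if reassign_inside_if:
                old_key = key  # the assignment on line 15
            total = 0.0
        old_key = key  # line 18: runs on every iteration regardless
        total += value
    if old_key is not None:
        output.append((old_key, total))
    return output


pairs = [("Miami", 12.0), ("Miami", 8.0), ("NYC", 5.0)]
with_line_15 = reduce_pairs(pairs, reassign_inside_if=True)
without_line_15 = reduce_pairs(pairs, reassign_inside_if=False)

print(with_line_15)                     # [('Miami', 20.0), ('NYC', 5.0)]
print(with_line_15 == without_line_15)  # True
```

Because the unconditional assignment always runs immediately afterwards, the value written inside the if block is overwritten with the same key before it is ever read.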

yashgyy Dec 27, 2017

I'm a bit confused here. The reducer script reads from sys.stdin, so how does the mapper pass its output to the reducer? The mapper code only prints each line; it doesn't store the lines anywhere for the reducer to read. Since the reducer reads from stdin, wouldn't it read from the keyboard rather than the lines produced by the mapper?
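
One way to picture this: stdin is only the keyboard when nothing else is connected to it. When programs are chained with a shell pipe, for example `cat input.txt | ./mapper.py | sort | ./reducer.py` (file names here are hypothetical), whatever one program prints becomes the stdin of the next. Hadoop Streaming wires the mapper's output through its sort step and into the reducer's stdin in a comparable way. The sketch below, in Python 3 syntax, imitates that wiring inside a single process using in-memory streams. The `mapper` function and the comma-separated input format are assumptions made up for this example, not the course's mapper.

```python
import io


def mapper(in_stream, out_stream):
    """Hypothetical mapper: turn 'store,amount' lines into 'store<TAB>amount'."""
    for line in in_stream:
        store, amount = line.strip().split(",")
        out_stream.write("{0}\t{1}\n".format(store, amount))


def reducer(in_stream, out_stream):
    """Running-total logic like the gist's reducer, reading any file-like object."""
    old_key = None
    total = 0.0
    for line in in_stream:
        data = line.strip().split("\t")
        if len(data) != 2:
            continue  # malformed line, skip it
        key, value = data
        if old_key is not None and old_key != key:
            out_stream.write("{0}\t{1}\n".format(old_key, total))
            total = 0.0
        old_key = key
        total += float(value)
    if old_key is not None:
        out_stream.write("{0}\t{1}\n".format(old_key, total))


raw_lines = io.StringIO("NYC,5.0\nMiami,12.0\nMiami,8.0\n")

# Stage 1: the mapper "prints" into a buffer instead of the terminal.
mapper_output = io.StringIO()
mapper(raw_lines, mapper_output)

# Stage 2: sort the mapper's lines so equal keys end up next to each other.
sorted_lines = sorted(mapper_output.getvalue().splitlines(keepends=True))

# Stage 3: the sorted lines become the reducer's input stream.
reducer_output = io.StringIO()
reducer(io.StringIO("".join(sorted_lines)), reducer_output)

print(reducer_output.getvalue(), end="")
```

In the real scripts, `sys.stdin` and `print` play the roles of `in_stream` and `out_stream`; neither script needs to know what is on the other end.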

DeepanshKhurana May 19, 2018

Hi, I'm very new to this and I was wondering why we need the code at lines 21 and 22:

if oldKey != None:
    print oldKey, "\t", salesTotal
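
A short sketch in Python 3 syntax shows what those two lines are for. Inside the loop, a total is only printed when the key changes, so the last key never triggers a print; the final block flushes it. The function `reduce_pairs` and the flag `flush_at_end` are hypothetical names for this illustration, and the loop runs on already-parsed (key, value) pairs.

```python
def reduce_pairs(pairs, flush_at_end):
    """Run the reducer loop, optionally skipping the final print."""
    output = []
    old_key = None
    total = 0.0
    for key, value in pairs:
        if old_key is not None and old_key != key:
            # A total is emitted only when the next key shows up.
            output.append((old_key, total))
            total = 0.0
        old_key = key
        total += value
    if flush_at_end and old_key is not None:
        # Equivalent of lines 21 and 22: emit the last group.
        output.append((old_key, total))
    return output


pairs = [("Miami", 12.0), ("Miami", 8.0), ("NYC", 5.0)]

print(reduce_pairs(pairs, flush_at_end=True))   # [('Miami', 20.0), ('NYC', 5.0)]
print(reduce_pairs(pairs, flush_at_end=False))  # [('Miami', 20.0)]
print(reduce_pairs([], flush_at_end=True))      # []
```

The None test matters for the last case: with no input at all, oldKey is still None and nothing should be printed.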
