import sys

salesTotal = 0
oldKey = None

for line in sys.stdin:
    data = line.strip().split("\t")
    if len(data) != 2:
        # Something has gone wrong. Skip this line.
        continue

    thisKey, thisSale = data
    if oldKey and oldKey != thisKey:
        print oldKey, "\t", salesTotal
        oldKey = thisKey
        salesTotal = 0

    oldKey = thisKey
    salesTotal += float(thisSale)

if oldKey != None:
    print oldKey, "\t", salesTotal
Just tested the code locally. To me, line 15 (the oldKey = thisKey inside the if block) is not necessary, is it? In this example (and presumably in general, as long as the keys are sorted), when a new city gets processed, the assignment oldKey = thisKey will be done in line 18 anyway; setting salesTotal = 0 is necessary, though.
Happy coding!
I'm a bit confused here. The reducer script reads from sys.stdin, so how does the mapper pass its output on to the reducer? The mapper code only prints each line; it doesn't store the lines anywhere to hand them over. Since the reducer reads from stdin, won't it read from the keyboard rather than the lines printed by the mapper?
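For what it's worth, stdin is not tied to the keyboard: it is whatever stream the process gets connected to. When testing locally, a shell pipe connects one program's printed output to the next program's stdin, and a streaming MapReduce setup wires the mapper and reducer together in a similar way. Below is a minimal sketch of that idea in Python 3 syntax (an assumption; the gist itself is Python 2). `REDUCER_SRC` is a hypothetical, simplified stand-in for the reducer, and `mapper_output` is made-up sample text standing in for what a mapper would print.

```python
import subprocess
import sys

# Hypothetical, simplified stand-in for a reducer: it sums the second
# tab-separated field of every line it receives on stdin.
REDUCER_SRC = r'''
import sys
total = 0.0
for line in sys.stdin:
    key, value = line.strip().split("\t")
    total += float(value)
print(total)
'''

# Made-up sample text standing in for what a mapper would have printed.
mapper_output = "newyork\t10\nnewyork\t18\n"

# Run the reducer as a child process and connect our text to its stdin,
# which is what a pipe does between two programs.
result = subprocess.run(
    [sys.executable, "-c", REDUCER_SRC],
    input=mapper_output,
    capture_output=True,
    text=True,
    check=True,
)
reducer_output = result.stdout.strip()
print(reducer_output)
```

The reducer never touches the keyboard here; it only sees the text that was fed into its stdin.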
Hi, I'm very new to this and I was wondering why we need these lines of code at lines 21 and 22.
if oldKey != None: print oldKey, "\t", salesTotal
This is for printing the last line. The loop only prints a key's total when it sees the next, different key, so the total for the final key is still unprinted when the loop ends.
The test oldKey != None checks whether oldKey has a value. Once the code has come out of the for loop, oldKey will normally have one.
You might ask why we need the if condition at all just to print the last line. This is where it gets interesting: if the input is empty, or every line fails the len(data) != 2 check because the input data is malformed, oldKey is still None and the program simply won't print anything.
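To make the role of that final check concrete, here is a small sketch that restates the reducer's logic as a function, so it can be fed lists instead of stdin. It is written in Python 3 syntax (an assumption; the gist is Python 2), and the name `reduce_sales` and the sample inputs are made up for illustration.

```python
def reduce_sales(lines):
    """Yield (key, total) pairs from 'key<TAB>amount' lines sorted by key."""
    sales_total = 0.0
    old_key = None
    for line in lines:
        data = line.strip().split("\t")
        if len(data) != 2:
            continue  # malformed line: skip it
        this_key, this_sale = data
        if old_key is not None and old_key != this_key:
            # The key changed, so the previous key's total is complete.
            yield old_key, sales_total
            sales_total = 0.0
        old_key = this_key
        sales_total += float(this_sale)
    # Final flush: the last key is never followed by a different key,
    # so its total has to be emitted here. The guard keeps us from
    # emitting a bogus (None, 0.0) pair when no valid line was seen.
    if old_key is not None:
        yield old_key, sales_total


good = list(reduce_sales(["newyork\t10\n", "newyork\t18\n", "washdc\t1\n"]))
empty = list(reduce_sales([]))
bad = list(reduce_sales(["no tab here\n"]))
print(good, empty, bad)
```

With valid input the last key (`washdc`) only appears because of the final flush; with empty or entirely malformed input nothing is emitted at all.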
I have implemented the code, and this is the output I get:
newyork 28
amazon 22
washdc 1
I wonder why the tab doesn't work and why the numbers are not on the same line. Thanks.
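One thing that may contribute to the odd spacing: printing several comma-separated items adds a space separator between them, so the tab is not the only whitespace between the key and the total. The sketch below uses Python 3 syntax (an assumption; the gist uses the Python 2 print statement, whose exact spacing differs slightly) and made-up sample values. It compares comma-separated printing with building the line explicitly; it does not explain output that is split across lines.

```python
import io

key, total = "newyork", 28.0  # sample values for illustration only

# Comma-separated arguments: print() puts its default separator
# (a single space) between every pair of items.
buf = io.StringIO()
print(key, "\t", total, file=buf)
with_commas = buf.getvalue()

# Building the line explicitly puts exactly one tab between the fields.
explicit = "{0}\t{1}\n".format(key, total)

print(repr(with_commas))
print(repr(explicit))
```

Anything that later splits the line on the tab character gets clean fields only from the explicit version.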
Can I use the groupby function in pandas? That was my first thought.
I think this is probably how the groupby function in pandas works.
Hi @senthil1988
The test "if oldKey ..." checks that the variable oldKey has been assigned a value other than None. Strictly speaking, a bare "if oldKey" tests truthiness, so it is also false for an empty string.
It would be clearer to write "if oldKey is not None ..." instead of "if oldKey ...".
Regards!
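A quick sketch of the difference between the two checks, in Python 3 syntax (an assumption; the gist is Python 2, though these expressions behave the same there). The sample keys are made up, and the empty string is included only to show where the checks disagree.

```python
candidates = [None, "", "newyork"]  # made-up sample keys

results = []
for key in candidates:
    truthy = bool(key)          # what a bare `if key:` checks
    not_none = key is not None  # what `if key is not None:` checks
    results.append((truthy, not_none))
    print(repr(key), truthy, not_none)
```

Only the empty string separates the two: it is falsy, yet it is not None. With city names as keys that case should not come up, but `is not None` states the intent directly.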