@zhmz1326
Created July 22, 2015 05:57
Check a CSV file for duplicate keys (the first column) and print every line whose key appears more than once.
import codecs

filename = 'input.csv'

# The input is read as Shift_JIS; for a plain ASCII file, open(filename) also works.
with codecs.open(filename, "r", "shift_jis") as fr:
    lines = fr.readlines()
lines.sort()

# Count how many lines share each key (the first comma-separated field).
counts = {}
for line in lines:
    line = line.strip()
    key = line.split(',')[0]
    if key in counts:
        counts[key] += 1
    else:
        counts[key] = 1

# Print every line whose key occurs more than once.
for line in lines:
    line = line.strip()
    key = line.split(',')[0]
    if counts[key] > 1:
        print(line)
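
The script above splits each line on commas by hand, so a quoted field that itself contains a comma would yield the wrong key. As a rough sketch of one alternative (not part of the original gist), the same check can be written with the standard `csv` module and `collections.Counter`. The function name `find_duplicate_rows`, its `key_index` parameter, and the sample data are hypothetical, chosen only for illustration. It also sorts by key rather than by whole line, which is an assumption about the desired output order.

```python
import csv
import io
from collections import Counter


def find_duplicate_rows(text, key_index=0):
    """Return rows whose key column value appears more than once, sorted by key."""
    # csv.reader handles quoted fields; blank lines come back as empty rows and are skipped.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    counts = Counter(row[key_index] for row in rows)
    duplicates = [row for row in rows if counts[row[key_index]] > 1]
    # sorted() is stable, so rows with the same key keep their original order.
    return sorted(duplicates, key=lambda row: row[key_index])


# Hypothetical sample data: key "a" appears twice, the quoted key "c,d" once.
sample = 'a,1\nb,2\na,3\n"c,d",4\n'
for row in find_duplicate_rows(sample):
    print(",".join(row))
```

To run this against a real file, the decoded contents could be passed in, for example the result of `codecs.open(filename, "r", "shift_jis").read()`.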