@peterdalle
Created November 18, 2015 14:25
Remove lines in file A that are present in file B
#!/usr/bin/env python3
# Delete all lines in file A (keep.txt) that are present in file B (delete.txt).
# Rules:
# 1. A line found in B but not in A is never printed (i.e., the files are not merged).
# 2. A line found in both A and B is dropped from the output
#    (i.e., lines shared by both files are deleted).

# Read the file with lines that we shall keep.
with open("keep.txt") as keepfile:
    keeplines = keepfile.readlines()

# Read the file with lines that we should delete.
with open("delete.txt") as delfile:
    dellines = delfile.readlines()

keep, remove, total = 0, 0, 0

# Strip surrounding whitespace, including the newline (\n) at the end.
keeplines = [s.strip() for s in keeplines]
dellines = [s.strip() for s in dellines]

# Print every line of A that does not occur in B, counting as we go.
for line in keeplines:
    if line not in dellines:
        keep += 1
        print(line)
    else:
        remove += 1
    total += 1

print()
print("------------------------------")
print("Keep: " + str(keep))
print("Remove: " + str(remove))
print("Total iterations: " + str(total))
print("Total in both lists: " + str(len(keeplines) + len(dellines)))
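
The `line not in dellines` test in the script scans the whole delete list once for every line of keep.txt, which can get slow for large files. A minimal sketch of the same filter with a set for the lookups could look like this. It is written for Python 3, and the function name `remove_lines` is hypothetical, not part of the original script.

```python
def remove_lines(keep_lines, delete_lines):
    """Return the stripped lines of keep_lines that do not occur in delete_lines.

    Hypothetical helper: order and repeated lines of keep_lines are preserved,
    matching the behaviour of the loop in the script.
    """
    # Build the set once so each membership test is O(1) on average.
    delete_set = {s.strip() for s in delete_lines}
    stripped = (s.strip() for s in keep_lines)
    return [s for s in stripped if s not in delete_set]


if __name__ == "__main__":
    # Small in-memory example instead of keep.txt / delete.txt.
    print(remove_lines(["a\n", "b\n", "c\n"], ["b\n", "d\n"]))
```

The function takes any iterables of strings, so open file objects can be passed directly in place of the lists.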