
@timrprobocom
Created August 19, 2023 00:27
Untangle improperly sorted columns
lines = """\
ABCDE12345_001,FGHIJ6789_002,ABCDE12345_001
KLMNO5432_003,KLMNO5432_003,FGHIJ6789_002
PQRST24680_123,UVWXY13579_555,UVWXY13579_555
ZABCD876530_009,ZABCD876530_009,AABBCCDDEE_987
AABBCCDDEE_987,AABBCCDDEE_987,LMNOP98765_999
LMNOP98765_999,ZYXWV54321_777,ZYXWV54321_777""".splitlines()
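# Flatten every row into (word, column index) pairs.
record = []
for row in lines:
    record += [(word, ordinal) for ordinal, word in enumerate(row.split(','))]

# Sorting brings identical words together, whatever row they came from.
record.sort()

last = None
build = None
for word, posn in record:
    if word != last:
        # A new word starts a new output row; flush the previous one first.
        if build:
            print(','.join(build))
        build = ['', '', '']
        last = word
    # Put the word back in the column it originally occupied.
    build[posn] = word
# Flush the final row.
print(','.join(build))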
@timrprobocom
Author

Note that I fixed the lone "7" in the 4th data line, and removed an extra "," in the 3rd data line.
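
The same idea can be sketched as a reusable function that returns the rows instead of printing them. This is only an illustration, not part of the original script: the name `untangle` and its `sep` parameter are hypothetical, the column count is taken from the widest input row rather than fixed at three, and empty cells are skipped (an assumption; the script above would treat an empty string as a word of its own).

```python
from collections import defaultdict


def untangle(rows, sep=","):
    """Group identical values onto one output row.

    Each value is kept in the column(s) where it originally appeared.
    `untangle` and `sep` are illustrative names, not from the original gist.
    """
    split_rows = [row.split(sep) for row in rows]
    # Assumption: the output width is that of the widest input row.
    width = max((len(cells) for cells in split_rows), default=0)

    # Map each word to the set of column indexes it was seen in.
    columns_by_word = defaultdict(set)
    for cells in split_rows:
        for column, word in enumerate(cells):
            if word:  # assumption: empty cells carry no value, so skip them
                columns_by_word[word].add(column)

    # One output row per distinct word, in sorted order.
    result = []
    for word in sorted(columns_by_word):
        out = [""] * width
        for column in columns_by_word[word]:
            out[column] = word
        result.append(sep.join(out))
    return result
```

For example, `untangle(["a,b,a", "c,c,b"])` returns `["a,,a", ",b,b", "c,c,"]`. Grouping with a dictionary avoids the `last`/`build` bookkeeping of the sorted-pairs loop, at the cost of holding every distinct word in memory at once.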
