Last active
December 30, 2015 02:19
This snippet shows how to reconstitute columns from a text file. See http://blog.karl.w-sts.com/2013/12/02/code-break-data-file-manipulations-in-python/ for the background and data format.
'''
Copyright (C) 2013 Karl R. Wurst

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
'''
f = open(...)  # fill in your path to the file
for line in f:
    department = line.split()
    if len(department) < 4:
        continue  # skip blank or short lines that cannot hold all four columns
    # get the three values we know are at the end of the line
    undergrad = department[-3]
    graduate = department[-2]
    total = department[-1]
    # reconstitute the department name from the list items at the front of the list
    name = ' '.join(department[:-3])  # rejoin the front sub-list with single spaces
    # at this point we have all four columns: name, undergrad, graduate, total
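The loop above can also be packaged as a small function, which makes it easy to try on a single line. This is only a sketch: the name `split_department_line` and the sample line are made up for illustration, and the values are left as strings, as in the original snippet.

```python
def split_department_line(line):
    """Return (name, undergrad, graduate, total) from one whitespace-separated line.

    Hypothetical helper: assumes the last three fields are always the
    undergrad, graduate and total columns, as in the snippet above.
    """
    fields = line.split()
    if len(fields) < 4:
        raise ValueError('expected a name followed by three values: %r' % line)
    name = ' '.join(fields[:-3])          # everything before the last three fields
    undergrad, graduate, total = fields[-3:]
    return name, undergrad, graduate, total


# Made-up sample line, not taken from the real data file.
print(split_department_line('Computer Science 120 30 150\n'))
```

Because the name is rebuilt with `' '.join(...)`, runs of several spaces inside a department name collapse to a single space.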
'''
Copyright (C) 2013 Karl R. Wurst

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
'''
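f = open(...)  # fill in your path to the file
for line in f:
    # combine districts that have been broken across multiple lines
    while line.find('0') == -1:  # if there is no org code (starting with '0') on this line
        next_line = f.readline()
        if not next_line:  # end of file reached: stop instead of looping forever
            break
        line = line + next_line  # combine lines until there is one
    code_start = line.find('0')
    if code_start == -1:
        break  # ran out of file without finding an org code
    name = ' '.join(line[:code_start].split())  # district name is everything up to '0', line breaks collapsed to spaces
    data = line[code_start:].split()  # data is a list created by splitting the rest of the line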
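The same merging idea can be written as a generator that works on any iterable of lines, which avoids mixing `for` iteration with `readline()` and is easy to test without a file. This is a sketch under the same assumption as the code above (the first `'0'` on a record marks the start of the org code, so district names never contain `'0'`); the function name `parse_districts` and the sample lines are hypothetical.

```python
def parse_districts(lines):
    """Yield (name, data) pairs, merging records broken across several lines."""
    pending = ''
    for line in lines:
        pending += line
        code_start = pending.find('0')
        if code_start == -1:
            continue  # no org code yet: keep accumulating lines
        # Collapse the line breaks and extra spaces left over from merging.
        name = ' '.join(pending[:code_start].split())
        data = pending[code_start:].split()
        pending = ''
        yield name, data


# Made-up sample input; the second record is split across two lines.
sample = [
    'Abington 00010000 100 200\n',
    'Northern Berkshire\n',
    'Regional 08510000 10 20\n',
]
for name, data in parse_districts(sample):
    print(name, data)
```

Any trailing text that never reaches an org code is silently dropped, which mirrors stopping at end of file in the loop above.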