Skip to content

Instantly share code, notes, and snippets.

@nathanhinchey
Last active October 25, 2017 16:01
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save nathanhinchey/8edf9d2380f48ebd2561ba909edfcb89 to your computer and use it in GitHub Desktop.
Save nathanhinchey/8edf9d2380f48ebd2561ba909edfcb89 to your computer and use it in GitHub Desktop.
Script to find students who have been absent full days
import csv
import sys
from operator import itemgetter
# To use, export the excel file to CSV,
# then in the shell run `python absentee.counter.py <name of csv goes here>`
"""
Original Stackoverflow question:
https://stackoverflow.com/questions/46858010/excel-script-to-find-names-that-meet-multiple-criteria
Text of question at the time I wrote this:
A little background is required I think. I work at a school and we take
attendance by class, not by day as it is not practical. However, we are being
asked to compile a list of names that have missed a percentage of days. The
problem is that we are looking at approximately 80,000 entries that list name
and date (in separate cells) for every block they were absent. We classify a
person absent for the "day" if they have missed either 3 or 4 blocks. I wasn't
sure if there was a script that could find a name that appears 3 or 4 times
with the same date beside it (indicating that someone was absent 3 or 4
blocks). I know that none of the formulas I tried were successful. Just sorting
wasn't sufficient either as there are over 1500 students appearing on the list.
I can't share the specific workbook as it is confidential. Any help that can be
given would be greatly appreciated! (Even if it's only to tell me that what I
want isn't possible).
Here is a dummy sheet with the basic set up of my sheet. Keep in mind that my
sheet has approximately 80,000 rows not 29.
https://www.dropbox.com/s/lw3sp8tuin90ih6/Sample%20Attendance%20Sheet.xlsx?dl=0
"""
def make_absences_dict_from_file(input_filename):
absences = {}
with open(input_filename, 'rU') as input_file:
reader = csv.reader(input_file, dialect='excel')
for row in reader:
current_name = row[2]
current_date = row[0]
if not absences.has_key(current_name):
absences[current_name] = {}
if not absences[current_name].has_key(current_date):
absences[current_name][current_date] = 0
absences[current_name][current_date] += 1
return absences
def absence_counter(input_filename, output_filename='absentees.txt'):
absences = make_absences_dict_from_file(input_filename)
with open(output_filename, 'wb') as output_file:
writer = csv.writer(output_file, delimiter='\t')
absentees = []
for name in absences:
missing_days = 0
for date in absences[name]:
if absences[name][date] == 3 or absences[name][date] == 4:
missing_days += 1
if missing_days:
absentees.append([name, missing_days])
writer.writerow(['Student Name', 'Absences'])
absentees.sort(key=itemgetter(0))
for absentee in absentees:
writer.writerow(absentee)
print('Output file: ' + output_filename)
if __name__ == '__main__':
input_filename = sys.argv[1]
absence_counter(input_filename)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment