@aksiksi
Last active August 29, 2015 13:56
{
"metadata": {
"name": ""
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Python Workshop: Timing Optimization"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Outline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The general idea behind this notebook is to determine the best time for the Python workshop I'll be presenting this semester (Spring 2014).\n",
"\n",
"The libraries used:\n",
"\n",
"* [cPickle](http://docs.python.org/2/library/pickle.html#module-cPickle)\n",
"* [pandas](http://pandas.pydata.org/)\n",
"* [datetime](http://docs.python.org/2/library/datetime.html)\n",
"\n",
"I'll be using a pre-formatted dump of all UAEU courses available this semester. The course dump will be parsed into a `DataFrame` for easier processing.\n",
"\n",
"The final output will be a `pandas` `DataFrame` containing the candidate times, in order of least amount of courses at that time."
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Implementation"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import cPickle as pickle\n",
"import pandas as pd\n",
"from datetime import datetime"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 3
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's first load up the course dump from the `pickle` file."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"with open('classes.pickle') as f:\n",
" course_dump = pickle.load(f)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 4
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we compile a list of all of the sections that are both for boys and are at the undergraduate level."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"section_list = []\n",
"\n",
"for courses in course_dump.values():\n",
" for sections in courses.values():\n",
" for section in sections.values():\n",
" # Get the first digit of course code to determine level\n",
" try:\n",
" course_level = int(section['code'][0])\n",
" except:\n",
" continue\n",
" \n",
" # Take a section only if it is for boys and is undergrad level\n",
" if section['gender'] == 'B' and course_level < 6 and section['time'] != 'TBA':\n",
" section_list.append(section.values())\n",
"\n",
"columns = section.keys()"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 16
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using `section_list`, we create a `DataFrame` to represent the sections and view its tail."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"df = pd.DataFrame(data=section_list, columns=columns)\n",
"df.tail()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>status</th>\n",
" <th>total</th>\n",
" <th>code</th>\n",
" <th>title</th>\n",
" <th>gender</th>\n",
" <th>section</th>\n",
" <th>time</th>\n",
" <th>days</th>\n",
" <th>current</th>\n",
" <th>credits</th>\n",
" <th>abbrev</th>\n",
" <th>location</th>\n",
" <th>attribute</th>\n",
" <th>duration</th>\n",
" <th>instructor</th>\n",
" <th>crn</th>\n",
" <th>remaining</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>827</th>\n",
" <td> NR</td>\n",
" <td> 27</td>\n",
" <td> 220</td>\n",
" <td> Engineering thermodynamics</td>\n",
" <td> B</td>\n",
" <td> 03</td>\n",
" <td> 12:00 pm-01:50 pm</td>\n",
" <td> TU</td>\n",
" <td> 23</td>\n",
" <td> 3.000</td>\n",
" <td> GENG</td>\n",
" <td> F3 238</td>\n",
" <td> English</td>\n",
" <td> 02/16-06/19</td>\n",
" <td> Naeema I. Al Darmaki (P)</td>\n",
" <td> 21864</td>\n",
" <td> 4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>828</th>\n",
" <td> C</td>\n",
" <td> 30</td>\n",
" <td> 220</td>\n",
" <td> Engineering thermodynamics</td>\n",
" <td> B</td>\n",
" <td> 01</td>\n",
" <td> 12:00 pm-01:50 pm</td>\n",
" <td> MW</td>\n",
" <td> 30</td>\n",
" <td> 3.000</td>\n",
" <td> GENG</td>\n",
" <td> F3 238</td>\n",
" <td> English</td>\n",
" <td> 02/16-06/19</td>\n",
" <td> Mohammad O. Hamdan (P)</td>\n",
" <td> 21862</td>\n",
" <td> 0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>829</th>\n",
" <td> NR</td>\n",
" <td> 27</td>\n",
" <td> 220</td>\n",
" <td> Engineering thermodynamics</td>\n",
" <td> B</td>\n",
" <td> 02</td>\n",
" <td> 08:00 am-09:50 am</td>\n",
" <td> TU</td>\n",
" <td> 26</td>\n",
" <td> 3.000</td>\n",
" <td> GENG</td>\n",
" <td> F3 238</td>\n",
" <td> English</td>\n",
" <td> 02/16-06/19</td>\n",
" <td> Naeema I. Al Darmaki (P)</td>\n",
" <td> 21863</td>\n",
" <td> 1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>830</th>\n",
" <td> C</td>\n",
" <td> 25</td>\n",
" <td> 240</td>\n",
" <td> Statics</td>\n",
" <td> B</td>\n",
" <td> 02</td>\n",
" <td> 02:00 pm-03:50 pm</td>\n",
" <td> MW</td>\n",
" <td> 25</td>\n",
" <td> 3.000</td>\n",
" <td> GENG</td>\n",
" <td> F3 036</td>\n",
" <td> \u00a0</td>\n",
" <td> 02/16-06/19</td>\n",
" <td> Said A. Elkhouly (P)</td>\n",
" <td> 20131</td>\n",
" <td> 0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>831</th>\n",
" <td> NR</td>\n",
" <td> 25</td>\n",
" <td> 240</td>\n",
" <td> Statics</td>\n",
" <td> B</td>\n",
" <td> 01</td>\n",
" <td> 04:00 pm-05:50 pm</td>\n",
" <td> TU</td>\n",
" <td> 22</td>\n",
" <td> 3.000</td>\n",
" <td> GENG</td>\n",
" <td> F1 2126</td>\n",
" <td> \u00a0</td>\n",
" <td> 02/16-06/19</td>\n",
" <td> Hamad A. Al Jassmi (P)</td>\n",
" <td> 20130</td>\n",
" <td> 3</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows \u00d7 17 columns</p>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 17,
"text": [
" status total code title gender section \\\n",
"827 NR 27 220 Engineering thermodynamics B 03 \n",
"828 C 30 220 Engineering thermodynamics B 01 \n",
"829 NR 27 220 Engineering thermodynamics B 02 \n",
"830 C 25 240 Statics B 02 \n",
"831 NR 25 240 Statics B 01 \n",
"\n",
" time days current credits abbrev location attribute \\\n",
"827 12:00 pm-01:50 pm TU 23 3.000 GENG F3 238 English \n",
"828 12:00 pm-01:50 pm MW 30 3.000 GENG F3 238 English \n",
"829 08:00 am-09:50 am TU 26 3.000 GENG F3 238 English \n",
"830 02:00 pm-03:50 pm MW 25 3.000 GENG F3 036 \u00a0 \n",
"831 04:00 pm-05:50 pm TU 22 3.000 GENG F1 2126 \u00a0 \n",
"\n",
" duration instructor crn remaining \n",
"827 02/16-06/19 Naeema I. Al Darmaki (P) 21864 4 \n",
"828 02/16-06/19 Mohammad O. Hamdan (P) 21862 0 \n",
"829 02/16-06/19 Naeema I. Al Darmaki (P) 21863 1 \n",
"830 02/16-06/19 Said A. Elkhouly (P) 20131 0 \n",
"831 02/16-06/19 Hamad A. Al Jassmi (P) 20130 3 \n",
"\n",
"[5 rows x 17 columns]"
]
}
],
"prompt_number": 17
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This `DataFrame` can easily be written to a CSV using the `to_csv` method. We don't need to yet - there's still the analysis to do!\n",
"\n",
"We'll need to define a new class that represents a section's time. Let's call it `SectionTime`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"class SectionTime(object):\n",
" dt_cache = {}\n",
" \n",
" def __init__(self, time_string, day_string, format_string, seperator='-'):\n",
" # Time\n",
" start_string, end_string = time_string.split(seperator)\n",
" dt_cache = SectionTime.dt_cache\n",
" \n",
" self.start = dt_cache.get(start_string, None)\n",
" \n",
" if not self.start:\n",
" self.start = datetime.strptime(start_string, format_string)\n",
" dt_cache[start_string] = self.start\n",
" \n",
" self.end = dt_cache.get(end_string, None)\n",
" \n",
" if not self.end:\n",
" self.end = datetime.strptime(end_string, format_string)\n",
" dt_cache[end_string] = self.end\n",
" \n",
" self.start_string = self.start.strftime('%H:%M')\n",
" self.end_string = self.end.strftime('%H:%M')\n",
" \n",
" # Days\n",
" self.days = list(day_string)\n",
" \n",
" def is_conflict(self, other):\n",
" '''Check for a conflict with another SectionTime object.'''\n",
" if any([day in self.days for day in other.days]):\n",
" if other.start >= self.start and other.end <= self.end:\n",
" return True\n",
" elif self.start >= other.start and self.end <= other.end:\n",
" return True\n",
" else:\n",
" return False\n",
" \n",
" def __str__(self):\n",
" return '%s: %s-%s' % (', '.join(self.days), self.start_string, self.end_string)\n",
"\n",
"format_string = '%I:%M %p'\n",
"tr1 = SectionTime(df.time[831], 'MW', format_string)\n",
"tr2 = SectionTime(df.time[831], 'TU', format_string)\n",
"\n",
"tr1.is_conflict(tr2) # False\n",
"tr1.is_conflict(tr1) # True"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 18,
"text": [
"True"
]
}
],
"prompt_number": 18
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we merge the `time` and `days` columns into a list of tuples, and then create a `SectionTime` for each pair."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"pairs = list(zip(df.time, df.days))\n",
"sectiontimes = []\n",
"\n",
"for pair in pairs:\n",
" sectiontimes.append(SectionTime(pair[0], pair[1], format_string))\n",
" \n",
"len(sectiontimes)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 41,
"text": [
"832"
]
}
],
"prompt_number": 41
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we'll take each `SectionTime`, compare it to the rest, and then store the stats in a dictionary."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"results = {}\n",
"\n",
"for time in sectiontimes:\n",
" conflict_count = 0\n",
" \n",
" for each in sectiontimes:\n",
" if each != time:\n",
" if time.is_conflict(each):\n",
" conflict_count += 1\n",
" \n",
" results[str(time)] = conflict_count"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 42
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, we convert our results into a `DataFrame`, sort by number of conflicts, and then output the 30 times with the least conflicts."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"results_df = pd.DataFrame(data=results.items(), columns=['time', 'conflicts'])\n",
"results_df.sort(columns=['conflicts'])[:30]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>time</th>\n",
" <th>conflicts</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>76 </th>\n",
" <td> S: 13:00-15:50</td>\n",
" <td> 0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>133</th>\n",
" <td> S: 09:00-11:50</td>\n",
" <td> 0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>153</th>\n",
" <td> F: 09:00-11:50</td>\n",
" <td> 0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>170</th>\n",
" <td> F: 13:00-15:50</td>\n",
" <td> 0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>203</th>\n",
" <td> R: 17:00-20:00</td>\n",
" <td> 1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>82 </th>\n",
" <td> U: 17:30-19:30</td>\n",
" <td> 1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>45 </th>\n",
" <td> U, T: 18:30-19:20</td>\n",
" <td> 1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28 </th>\n",
" <td> R: 16:00-18:50</td>\n",
" <td> 2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>129</th>\n",
" <td> M, W: 18:30-19:45</td>\n",
" <td> 4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>73 </th>\n",
" <td> M, W: 17:30-18:50</td>\n",
" <td> 8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>44 </th>\n",
" <td> W: 17:00-19:50</td>\n",
" <td> 9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25 </th>\n",
" <td> U, T, R: 17:00-18:50</td>\n",
" <td> 9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>64 </th>\n",
" <td> R: 11:00-11:50</td>\n",
" <td> 12</td>\n",
" </tr>\n",
" <tr>\n",
" <th>155</th>\n",
" <td> R: 10:00-10:50</td>\n",
" <td> 12</td>\n",
" </tr>\n",
" <tr>\n",
" <th>142</th>\n",
" <td> R: 10:00-11:50</td>\n",
" <td> 13</td>\n",
" </tr>\n",
" <tr>\n",
" <th>102</th>\n",
" <td> U, T: 17:00-18:15</td>\n",
" <td> 14</td>\n",
" </tr>\n",
" <tr>\n",
" <th>158</th>\n",
" <td> W: 16:00-17:50</td>\n",
" <td> 14</td>\n",
" </tr>\n",
" <tr>\n",
" <th>125</th>\n",
" <td> M, W: 17:00-18:50</td>\n",
" <td> 15</td>\n",
" </tr>\n",
" <tr>\n",
" <th>88 </th>\n",
" <td> U, T: 11:00-12:50</td>\n",
" <td> 17</td>\n",
" </tr>\n",
" <tr>\n",
" <th>178</th>\n",
" <td> U: 16:00-18:45</td>\n",
" <td> 18</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11 </th>\n",
" <td> W: 15:00-17:00</td>\n",
" <td> 18</td>\n",
" </tr>\n",
" <tr>\n",
" <th>96 </th>\n",
" <td> M: 17:00-18:00</td>\n",
" <td> 18</td>\n",
" </tr>\n",
" <tr>\n",
" <th>160</th>\n",
" <td> U: 16:00-17:50</td>\n",
" <td> 19</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6 </th>\n",
" <td> W: 16:00-20:00</td>\n",
" <td> 20</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10 </th>\n",
" <td> M, W: 17:00-18:15</td>\n",
" <td> 20</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14 </th>\n",
" <td> T: 16:00-18:50</td>\n",
" <td> 23</td>\n",
" </tr>\n",
" <tr>\n",
" <th>85 </th>\n",
" <td> M: 16:00-18:50</td>\n",
" <td> 25</td>\n",
" </tr>\n",
" <tr>\n",
" <th>195</th>\n",
" <td> M: 16:00-17:50</td>\n",
" <td> 25</td>\n",
" </tr>\n",
" <tr>\n",
" <th>174</th>\n",
" <td> M: 16:00-20:00</td>\n",
" <td> 26</td>\n",
" </tr>\n",
" <tr>\n",
" <th>191</th>\n",
" <td> U: 15:30-17:20</td>\n",
" <td> 26</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>30 rows \u00d7 2 columns</p>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 45,
"text": [
" time conflicts\n",
"76 S: 13:00-15:50 0\n",
"133 S: 09:00-11:50 0\n",
"153 F: 09:00-11:50 0\n",
"170 F: 13:00-15:50 0\n",
"203 R: 17:00-20:00 1\n",
"82 U: 17:30-19:30 1\n",
"45 U, T: 18:30-19:20 1\n",
"28 R: 16:00-18:50 2\n",
"129 M, W: 18:30-19:45 4\n",
"73 M, W: 17:30-18:50 8\n",
"44 W: 17:00-19:50 9\n",
"25 U, T, R: 17:00-18:50 9\n",
"64 R: 11:00-11:50 12\n",
"155 R: 10:00-10:50 12\n",
"142 R: 10:00-11:50 13\n",
"102 U, T: 17:00-18:15 14\n",
"158 W: 16:00-17:50 14\n",
"125 M, W: 17:00-18:50 15\n",
"88 U, T: 11:00-12:50 17\n",
"178 U: 16:00-18:45 18\n",
"11 W: 15:00-17:00 18\n",
"96 M: 17:00-18:00 18\n",
"160 U: 16:00-17:50 19\n",
"6 W: 16:00-20:00 20\n",
"10 M, W: 17:00-18:15 20\n",
"14 T: 16:00-18:50 23\n",
"85 M: 16:00-18:50 25\n",
"195 M: 16:00-17:50 25\n",
"174 M: 16:00-20:00 26\n",
"191 U: 15:30-17:20 26\n",
"\n",
"[30 rows x 2 columns]"
]
}
],
"prompt_number": 45
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The slots 11:00-12:50 (UT), 16:00-18:45 (U), and 15:30-17:20 (U) seem appropriate. I'll leave it up to the attendees to decide between these time slots."
]
}
],
"metadata": {}
}
]
}