aksiksi/timing-optimization.ipynb

## timing-optimization.ipynb
{
 "metadata": {
  "name": ""
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "heading",
     "level": 1,
     "metadata": {},
     "source": [
      "Python Workshop: Timing Optimization"
     ]
    },
    {
     "cell_type": "heading",
     "level": 2,
     "metadata": {},
     "source": [
      "Outline"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "This notebook determines the best time for the Python workshop I'll be presenting this semester (Spring 2014).\n",
      "\n",
      "The libraries used:\n",
      "\n",
      "* [cPickle](http://docs.python.org/2/library/pickle.html#module-cPickle)\n",
      "* [pandas](http://pandas.pydata.org/)\n",
      "* [datetime](http://docs.python.org/2/library/datetime.html)\n",
      "\n",
      "I'll be using a pre-formatted dump of all UAEU courses available this semester. The course dump will be parsed into a `DataFrame` for easier processing.\n",
      "\n",
      "The final output will be a `pandas` `DataFrame` of candidate times, sorted so that the times with the fewest conflicting sections come first."
     ]
    },
    {
     "cell_type": "heading",
     "level": 2,
     "metadata": {},
     "source": [
      "Implementation"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import cPickle as pickle\n",
      "import pandas as pd\n",
      "from datetime import datetime"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 3
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Let's first load up the course dump from the `pickle` file."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Pickle files should be opened in binary mode\n",
      "with open('classes.pickle', 'rb') as f:\n",
      "    course_dump = pickle.load(f)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 4
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Next, we compile a list of all sections that are for boys, are at the undergraduate level, and have a scheduled time (not `TBA`)."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "section_list = []\n",
      "\n",
      "for courses in course_dump.values():\n",
      "    for sections in courses.values():\n",
      "        for section in sections.values():\n",
      "            # Get the first digit of course code to determine level\n",
      "            try:\n",
      "                course_level = int(section['code'][0])\n",
      "            except (KeyError, IndexError, TypeError, ValueError):\n",
      "                # Skip sections whose code does not start with a digit\n",
      "                continue\n",
      "            \n",
      "            # Take a section only if it is for boys, is undergrad level, and has a scheduled time\n",
      "            if section['gender'] == 'B' and course_level < 6 and section['time'] != 'TBA':\n",
      "                section_list.append(section.values())\n",
      "\n",
      "# Reuse the keys of the last section visited as column names\n",
      "# (this assumes every section dict has the same keys in the same order)\n",
      "columns = section.keys()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 16
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Using `section_list`, we create a `DataFrame` to represent the sections and view its tail."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df = pd.DataFrame(data=section_list, columns=columns)\n",
      "df.tail()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>status</th>\n",
        "      <th>total</th>\n",
        "      <th>code</th>\n",
        "      <th>title</th>\n",
        "      <th>gender</th>\n",
        "      <th>section</th>\n",
        "      <th>time</th>\n",
        "      <th>days</th>\n",
        "      <th>current</th>\n",
        "      <th>credits</th>\n",
        "      <th>abbrev</th>\n",
        "      <th>location</th>\n",
        "      <th>attribute</th>\n",
        "      <th>duration</th>\n",
        "      <th>instructor</th>\n",
        "      <th>crn</th>\n",
        "      <th>remaining</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>827</th>\n",
        "      <td> NR</td>\n",
        "      <td> 27</td>\n",
        "      <td> 220</td>\n",
        "      <td> Engineering thermodynamics</td>\n",
        "      <td> B</td>\n",
        "      <td> 03</td>\n",
        "      <td> 12:00 pm-01:50 pm</td>\n",
        "      <td> TU</td>\n",
        "      <td> 23</td>\n",
        "      <td> 3.000</td>\n",
        "      <td> GENG</td>\n",
        "      <td>  F3 238</td>\n",
        "      <td> English</td>\n",
        "      <td> 02/16-06/19</td>\n",
        "      <td> Naeema I. Al Darmaki (P)</td>\n",
        "      <td> 21864</td>\n",
        "      <td> 4</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>828</th>\n",
        "      <td>  C</td>\n",
        "      <td> 30</td>\n",
        "      <td> 220</td>\n",
        "      <td> Engineering thermodynamics</td>\n",
        "      <td> B</td>\n",
        "      <td> 01</td>\n",
        "      <td> 12:00 pm-01:50 pm</td>\n",
        "      <td> MW</td>\n",
        "      <td> 30</td>\n",
        "      <td> 3.000</td>\n",
        "      <td> GENG</td>\n",
        "      <td>  F3 238</td>\n",
        "      <td> English</td>\n",
        "      <td> 02/16-06/19</td>\n",
        "      <td>   Mohammad O. Hamdan (P)</td>\n",
        "      <td> 21862</td>\n",
        "      <td> 0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>829</th>\n",
        "      <td> NR</td>\n",
        "      <td> 27</td>\n",
        "      <td> 220</td>\n",
        "      <td> Engineering thermodynamics</td>\n",
        "      <td> B</td>\n",
        "      <td> 02</td>\n",
        "      <td> 08:00 am-09:50 am</td>\n",
        "      <td> TU</td>\n",
        "      <td> 26</td>\n",
        "      <td> 3.000</td>\n",
        "      <td> GENG</td>\n",
        "      <td>  F3 238</td>\n",
        "      <td> English</td>\n",
        "      <td> 02/16-06/19</td>\n",
        "      <td> Naeema I. Al Darmaki (P)</td>\n",
        "      <td> 21863</td>\n",
        "      <td> 1</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>830</th>\n",
        "      <td>  C</td>\n",
        "      <td> 25</td>\n",
        "      <td> 240</td>\n",
        "      <td>                    Statics</td>\n",
        "      <td> B</td>\n",
        "      <td> 02</td>\n",
        "      <td> 02:00 pm-03:50 pm</td>\n",
        "      <td> MW</td>\n",
        "      <td> 25</td>\n",
        "      <td> 3.000</td>\n",
        "      <td> GENG</td>\n",
        "      <td>  F3 036</td>\n",
        "      <td>       \u00a0</td>\n",
        "      <td> 02/16-06/19</td>\n",
        "      <td>     Said A. Elkhouly (P)</td>\n",
        "      <td> 20131</td>\n",
        "      <td> 0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>831</th>\n",
        "      <td> NR</td>\n",
        "      <td> 25</td>\n",
        "      <td> 240</td>\n",
        "      <td>                    Statics</td>\n",
        "      <td> B</td>\n",
        "      <td> 01</td>\n",
        "      <td> 04:00 pm-05:50 pm</td>\n",
        "      <td> TU</td>\n",
        "      <td> 22</td>\n",
        "      <td> 3.000</td>\n",
        "      <td> GENG</td>\n",
        "      <td> F1 2126</td>\n",
        "      <td>       \u00a0</td>\n",
        "      <td> 02/16-06/19</td>\n",
        "      <td>   Hamad A. Al Jassmi (P)</td>\n",
        "      <td> 20130</td>\n",
        "      <td> 3</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "<p>5 rows \u00d7 17 columns</p>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 17,
       "text": [
        "    status total code                       title gender section  \\\n",
        "827     NR    27  220  Engineering thermodynamics      B      03   \n",
        "828      C    30  220  Engineering thermodynamics      B      01   \n",
        "829     NR    27  220  Engineering thermodynamics      B      02   \n",
        "830      C    25  240                     Statics      B      02   \n",
        "831     NR    25  240                     Statics      B      01   \n",
        "\n",
        "                  time days current credits abbrev location attribute  \\\n",
        "827  12:00 pm-01:50 pm   TU      23   3.000   GENG   F3 238   English   \n",
        "828  12:00 pm-01:50 pm   MW      30   3.000   GENG   F3 238   English   \n",
        "829  08:00 am-09:50 am   TU      26   3.000   GENG   F3 238   English   \n",
        "830  02:00 pm-03:50 pm   MW      25   3.000   GENG   F3 036         \u00a0   \n",
        "831  04:00 pm-05:50 pm   TU      22   3.000   GENG  F1 2126         \u00a0   \n",
        "\n",
        "        duration                instructor    crn remaining  \n",
        "827  02/16-06/19  Naeema I. Al Darmaki (P)  21864         4  \n",
        "828  02/16-06/19    Mohammad O. Hamdan (P)  21862         0  \n",
        "829  02/16-06/19  Naeema I. Al Darmaki (P)  21863         1  \n",
        "830  02/16-06/19      Said A. Elkhouly (P)  20131         0  \n",
        "831  02/16-06/19    Hamad A. Al Jassmi (P)  20130         3  \n",
        "\n",
        "[5 rows x 17 columns]"
       ]
      }
     ],
     "prompt_number": 17
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "This `DataFrame` can easily be written to a CSV using the `to_csv` method. We don't need to yet - there's still the analysis to do!\n",
      "\n",
      "We'll need to define a new class that represents a section's time. Let's call it `SectionTime`."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class SectionTime(object):\n",
      "    # Cache of parsed time strings, shared by all instances\n",
      "    dt_cache = {}\n",
      "    \n",
      "    def __init__(self, time_string, day_string, format_string, separator='-'):\n",
      "        # Time\n",
      "        start_string, end_string = time_string.split(separator)\n",
      "        dt_cache = SectionTime.dt_cache\n",
      "        \n",
      "        self.start = dt_cache.get(start_string)\n",
      "        \n",
      "        if self.start is None:\n",
      "            self.start = datetime.strptime(start_string, format_string)\n",
      "            dt_cache[start_string] = self.start\n",
      "            \n",
      "        self.end = dt_cache.get(end_string)\n",
      "        \n",
      "        if self.end is None:\n",
      "            self.end = datetime.strptime(end_string, format_string)\n",
      "            dt_cache[end_string] = self.end\n",
      "            \n",
      "        self.start_string = self.start.strftime('%H:%M')\n",
      "        self.end_string = self.end.strftime('%H:%M')\n",
      "        \n",
      "        # Days: one letter per day, e.g. 'MW' -> ['M', 'W']\n",
      "        self.days = list(day_string)\n",
      "        \n",
      "    def is_conflict(self, other):\n",
      "        '''Return True if both share a day and one time range fully contains the other.\n",
      "\n",
      "        Partially overlapping ranges are not counted as conflicts.'''\n",
      "        if any(day in self.days for day in other.days):\n",
      "            if other.start >= self.start and other.end <= self.end:\n",
      "                return True\n",
      "            if self.start >= other.start and self.end <= other.end:\n",
      "                return True\n",
      "        return False\n",
      "    \n",
      "    def __str__(self):\n",
      "        return '%s: %s-%s' % (', '.join(self.days), self.start_string, self.end_string)\n",
      "\n",
      "format_string = '%I:%M %p'\n",
      "tr1 = SectionTime(df.time[831], 'MW', format_string)\n",
      "tr2 = SectionTime(df.time[831], 'TU', format_string)\n",
      "\n",
      "tr1.is_conflict(tr2) # False\n",
      "tr1.is_conflict(tr1) # True"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 18,
       "text": [
        "True"
       ]
      }
     ],
     "prompt_number": 18
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Now we merge the `time` and `days` columns into a list of tuples, and then create a `SectionTime` for each pair."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "pairs = list(zip(df.time, df.days))\n",
      "sectiontimes = []\n",
      "\n",
      "for pair in pairs:\n",
      "    sectiontimes.append(SectionTime(pair[0], pair[1], format_string))\n",
      "    \n",
      "len(sectiontimes)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 41,
       "text": [
        "832"
       ]
      }
     ],
     "prompt_number": 41
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Next, we'll take each `SectionTime`, compare it to all the others, and store its number of conflicts in a dictionary keyed by its string form."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "results = {}\n",
      "\n",
      "for time in sectiontimes:\n",
      "    conflict_count = 0\n",
      "    \n",
      "    for each in sectiontimes:\n",
      "        # Skip comparing a section with itself\n",
      "        if each is not time and time.is_conflict(each):\n",
      "            conflict_count += 1\n",
      "                \n",
      "    results[str(time)] = conflict_count"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 42
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Finally, we convert our results into a `DataFrame`, sort by number of conflicts, and then output the 30 times with the fewest conflicts."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "results_df = pd.DataFrame(data=results.items(), columns=['time', 'conflicts'])\n",
      "results_df.sort(columns=['conflicts'])[:30]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>time</th>\n",
        "      <th>conflicts</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>76 </th>\n",
        "      <td>       S: 13:00-15:50</td>\n",
        "      <td>  0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>133</th>\n",
        "      <td>       S: 09:00-11:50</td>\n",
        "      <td>  0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>153</th>\n",
        "      <td>       F: 09:00-11:50</td>\n",
        "      <td>  0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>170</th>\n",
        "      <td>       F: 13:00-15:50</td>\n",
        "      <td>  0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>203</th>\n",
        "      <td>       R: 17:00-20:00</td>\n",
        "      <td>  1</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>82 </th>\n",
        "      <td>       U: 17:30-19:30</td>\n",
        "      <td>  1</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>45 </th>\n",
        "      <td>    U, T: 18:30-19:20</td>\n",
        "      <td>  1</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>28 </th>\n",
        "      <td>       R: 16:00-18:50</td>\n",
        "      <td>  2</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>129</th>\n",
        "      <td>    M, W: 18:30-19:45</td>\n",
        "      <td>  4</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>73 </th>\n",
        "      <td>    M, W: 17:30-18:50</td>\n",
        "      <td>  8</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>44 </th>\n",
        "      <td>       W: 17:00-19:50</td>\n",
        "      <td>  9</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>25 </th>\n",
        "      <td> U, T, R: 17:00-18:50</td>\n",
        "      <td>  9</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>64 </th>\n",
        "      <td>       R: 11:00-11:50</td>\n",
        "      <td> 12</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>155</th>\n",
        "      <td>       R: 10:00-10:50</td>\n",
        "      <td> 12</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>142</th>\n",
        "      <td>       R: 10:00-11:50</td>\n",
        "      <td> 13</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>102</th>\n",
        "      <td>    U, T: 17:00-18:15</td>\n",
        "      <td> 14</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>158</th>\n",
        "      <td>       W: 16:00-17:50</td>\n",
        "      <td> 14</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>125</th>\n",
        "      <td>    M, W: 17:00-18:50</td>\n",
        "      <td> 15</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>88 </th>\n",
        "      <td>    U, T: 11:00-12:50</td>\n",
        "      <td> 17</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>178</th>\n",
        "      <td>       U: 16:00-18:45</td>\n",
        "      <td> 18</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>11 </th>\n",
        "      <td>       W: 15:00-17:00</td>\n",
        "      <td> 18</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>96 </th>\n",
        "      <td>       M: 17:00-18:00</td>\n",
        "      <td> 18</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>160</th>\n",
        "      <td>       U: 16:00-17:50</td>\n",
        "      <td> 19</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>6  </th>\n",
        "      <td>       W: 16:00-20:00</td>\n",
        "      <td> 20</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>10 </th>\n",
        "      <td>    M, W: 17:00-18:15</td>\n",
        "      <td> 20</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>14 </th>\n",
        "      <td>       T: 16:00-18:50</td>\n",
        "      <td> 23</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>85 </th>\n",
        "      <td>       M: 16:00-18:50</td>\n",
        "      <td> 25</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>195</th>\n",
        "      <td>       M: 16:00-17:50</td>\n",
        "      <td> 25</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>174</th>\n",
        "      <td>       M: 16:00-20:00</td>\n",
        "      <td> 26</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>191</th>\n",
        "      <td>       U: 15:30-17:20</td>\n",
        "      <td> 26</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "<p>30 rows \u00d7 2 columns</p>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 45,
       "text": [
        "                     time  conflicts\n",
        "76         S: 13:00-15:50          0\n",
        "133        S: 09:00-11:50          0\n",
        "153        F: 09:00-11:50          0\n",
        "170        F: 13:00-15:50          0\n",
        "203        R: 17:00-20:00          1\n",
        "82         U: 17:30-19:30          1\n",
        "45      U, T: 18:30-19:20          1\n",
        "28         R: 16:00-18:50          2\n",
        "129     M, W: 18:30-19:45          4\n",
        "73      M, W: 17:30-18:50          8\n",
        "44         W: 17:00-19:50          9\n",
        "25   U, T, R: 17:00-18:50          9\n",
        "64         R: 11:00-11:50         12\n",
        "155        R: 10:00-10:50         12\n",
        "142        R: 10:00-11:50         13\n",
        "102     U, T: 17:00-18:15         14\n",
        "158        W: 16:00-17:50         14\n",
        "125     M, W: 17:00-18:50         15\n",
        "88      U, T: 11:00-12:50         17\n",
        "178        U: 16:00-18:45         18\n",
        "11         W: 15:00-17:00         18\n",
        "96         M: 17:00-18:00         18\n",
        "160        U: 16:00-17:50         19\n",
        "6          W: 16:00-20:00         20\n",
        "10      M, W: 17:00-18:15         20\n",
        "14         T: 16:00-18:50         23\n",
        "85         M: 16:00-18:50         25\n",
        "195        M: 16:00-17:50         25\n",
        "174        M: 16:00-20:00         26\n",
        "191        U: 15:30-17:20         26\n",
        "\n",
        "[30 rows x 2 columns]"
       ]
      }
     ],
     "prompt_number": 45
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The slots 11:00-12:50 (UT), 16:00-18:45 (U), and 15:30-17:20 (U) seem appropriate. I'll leave it up to the attendees to decide between these time slots."
     ]
    }
   ],
   "metadata": {}
  }
 ]
}
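
The `SectionTime.is_conflict` check in the notebook flags two sections only when they share a day and one time range fully contains the other. Below is a minimal sketch of a more general check, assuming partially overlapping slots should also count as conflicts. The names `Slot`, `parse_slot`, `slots_conflict`, and `count_conflicts` are hypothetical and not part of the notebook, and the sample sections are made up for illustration rather than taken from the course dump.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for the notebook's SectionTime: a set of day letters
# plus start and end datetimes parsed with the same '%I:%M %p' format.
Slot = namedtuple('Slot', ['days', 'start', 'end'])

TIME_FORMAT = '%I:%M %p'


def parse_slot(time_string, day_string, separator='-'):
    """Build a Slot from strings such as '04:00 pm-05:50 pm' and 'MW'."""
    start_string, end_string = time_string.split(separator)
    start = datetime.strptime(start_string.strip(), TIME_FORMAT)
    end = datetime.strptime(end_string.strip(), TIME_FORMAT)
    return Slot(frozenset(day_string), start, end)


def slots_conflict(a, b):
    """Return True if the slots share a day and their time ranges overlap."""
    shares_day = bool(a.days & b.days)
    # Two ranges overlap when each one starts before the other ends.
    overlaps = a.start < b.end and b.start < a.end
    return shares_day and overlaps


def count_conflicts(candidate, slots):
    """Count how many slots in the list conflict with the candidate."""
    return sum(1 for slot in slots if slots_conflict(candidate, slot))


# Made-up sample data, for illustration only
sections = [
    parse_slot('04:00 pm-05:50 pm', 'MW'),
    parse_slot('05:00 pm-06:50 pm', 'MW'),  # partially overlaps the first
    parse_slot('04:00 pm-05:50 pm', 'TU'),  # same time, different days
]
candidate = parse_slot('05:30 pm-07:00 pm', 'W')
print(count_conflicts(candidate, sections))  # prints 2
```

With this test a candidate workshop slot does not need to match an existing section's time range exactly, so arbitrary candidate times can be scored against the section list. Conflict counts computed this way would generally be higher than the containment-based counts shown in the notebook.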
	{
	"metadata": {
	"name": ""
	},
	"nbformat": 3,
	"nbformat_minor": 0,
	"worksheets": [
	{
	"cells": [
	{
	"cell_type": "heading",
	"level": 1,
	"metadata": {},
	"source": [
	"Python Workshop: Timing Optimization"
	]
	},
	{
	"cell_type": "heading",
	"level": 2,
	"metadata": {},
	"source": [
	"Outline"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"The general idea behind this notebook is to determine the best time for the Python workshop I'll be presenting this semester (Spring 2014).\n",
	"\n",
	"The libraries used:\n",
	"\n",
	"* [cPickle](http://docs.python.org/2/library/pickle.html#module-cPickle)\n",
	"* [pandas](http://pandas.pydata.org/)\n",
	"* [datetime](http://docs.python.org/2/library/datetime.html)\n",
	"\n",
	"I'll be using a pre-formatted dump of all UAEU courses available this semester. The course dump will be parsed into a `DataFrame` for easier processing.\n",
	"\n",
	"The final output will be a `pandas` `DataFrame` containing the candidate times, in order of least amount of courses at that time."
	]
	},
	{
	"cell_type": "heading",
	"level": 2,
	"metadata": {},
	"source": [
	"Implementation"
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"import cPickle as pickle\n",
	"import pandas as pd\n",
	"from datetime import datetime"
	],
	"language": "python",
	"metadata": {},
	"outputs": [],
	"prompt_number": 3
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Let's first load up the course dump from the `pickle` file."
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"with open('classes.pickle') as f:\n",
	" course_dump = pickle.load(f)"
	],
	"language": "python",
	"metadata": {},
	"outputs": [],
	"prompt_number": 4
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Next, we compile a list of all of the sections that are both for boys and are at the undergraduate level."
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"section_list = []\n",
	"\n",
	"for courses in course_dump.values():\n",
	" for sections in courses.values():\n",
	" for section in sections.values():\n",
	" # Get the first digit of course code to determine level\n",
	" try:\n",
	" course_level = int(section['code'][0])\n",
	" except:\n",
	" continue\n",
	" \n",
	" # Take a section only if it is for boys and is undergrad level\n",
	" if section['gender'] == 'B' and course_level < 6 and section['time'] != 'TBA':\n",
	" section_list.append(section.values())\n",
	"\n",
	"columns = section.keys()"
	],
	"language": "python",
	"metadata": {},
	"outputs": [],
	"prompt_number": 16
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Using `section_list`, we create a `DataFrame` to represent the sections and view its tail."
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"df = pd.DataFrame(data=section_list, columns=columns)\n",
	"df.tail()"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"html": [
	"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
	"<table border=\"1\" class=\"dataframe\">\n",
	" <thead>\n",
	" <tr style=\"text-align: right;\">\n",
	" <th></th>\n",
	" <th>status</th>\n",
	" <th>total</th>\n",
	" <th>code</th>\n",
	" <th>title</th>\n",
	" <th>gender</th>\n",
	" <th>section</th>\n",
	" <th>time</th>\n",
	" <th>days</th>\n",
	" <th>current</th>\n",
	" <th>credits</th>\n",
	" <th>abbrev</th>\n",
	" <th>location</th>\n",
	" <th>attribute</th>\n",
	" <th>duration</th>\n",
	" <th>instructor</th>\n",
	" <th>crn</th>\n",
	" <th>remaining</th>\n",
	" </tr>\n",
	" </thead>\n",
	" <tbody>\n",
	" <tr>\n",
	" <th>827</th>\n",
	" <td> NR</td>\n",
	" <td> 27</td>\n",
	" <td> 220</td>\n",
	" <td> Engineering thermodynamics</td>\n",
	" <td> B</td>\n",
	" <td> 03</td>\n",
	" <td> 12:00 pm-01:50 pm</td>\n",
	" <td> TU</td>\n",
	" <td> 23</td>\n",
	" <td> 3.000</td>\n",
	" <td> GENG</td>\n",
	" <td> F3 238</td>\n",
	" <td> English</td>\n",
	" <td> 02/16-06/19</td>\n",
	" <td> Naeema I. Al Darmaki (P)</td>\n",
	" <td> 21864</td>\n",
	" <td> 4</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>828</th>\n",
	" <td> C</td>\n",
	" <td> 30</td>\n",
	" <td> 220</td>\n",
	" <td> Engineering thermodynamics</td>\n",
	" <td> B</td>\n",
	" <td> 01</td>\n",
	" <td> 12:00 pm-01:50 pm</td>\n",
	" <td> MW</td>\n",
	" <td> 30</td>\n",
	" <td> 3.000</td>\n",
	" <td> GENG</td>\n",
	" <td> F3 238</td>\n",
	" <td> English</td>\n",
	" <td> 02/16-06/19</td>\n",
	" <td> Mohammad O. Hamdan (P)</td>\n",
	" <td> 21862</td>\n",
	" <td> 0</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>829</th>\n",
	" <td> NR</td>\n",
	" <td> 27</td>\n",
	" <td> 220</td>\n",
	" <td> Engineering thermodynamics</td>\n",
	" <td> B</td>\n",
	" <td> 02</td>\n",
	" <td> 08:00 am-09:50 am</td>\n",
	" <td> TU</td>\n",
	" <td> 26</td>\n",
	" <td> 3.000</td>\n",
	" <td> GENG</td>\n",
	" <td> F3 238</td>\n",
	" <td> English</td>\n",
	" <td> 02/16-06/19</td>\n",
	" <td> Naeema I. Al Darmaki (P)</td>\n",
	" <td> 21863</td>\n",
	" <td> 1</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>830</th>\n",
	" <td> C</td>\n",
	" <td> 25</td>\n",
	" <td> 240</td>\n",
	" <td> Statics</td>\n",
	" <td> B</td>\n",
	" <td> 02</td>\n",
	" <td> 02:00 pm-03:50 pm</td>\n",
	" <td> MW</td>\n",
	" <td> 25</td>\n",
	" <td> 3.000</td>\n",
	" <td> GENG</td>\n",
	" <td> F3 036</td>\n",
	" <td> \u00a0</td>\n",
	" <td> 02/16-06/19</td>\n",
	" <td> Said A. Elkhouly (P)</td>\n",
	" <td> 20131</td>\n",
	" <td> 0</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>831</th>\n",
	" <td> NR</td>\n",
	" <td> 25</td>\n",
	" <td> 240</td>\n",
	" <td> Statics</td>\n",
	" <td> B</td>\n",
	" <td> 01</td>\n",
	" <td> 04:00 pm-05:50 pm</td>\n",
	" <td> TU</td>\n",
	" <td> 22</td>\n",
	" <td> 3.000</td>\n",
	" <td> GENG</td>\n",
	" <td> F1 2126</td>\n",
	" <td> \u00a0</td>\n",
	" <td> 02/16-06/19</td>\n",
	" <td> Hamad A. Al Jassmi (P)</td>\n",
	" <td> 20130</td>\n",
	" <td> 3</td>\n",
	" </tr>\n",
	" </tbody>\n",
	"</table>\n",
	"<p>5 rows \u00d7 17 columns</p>\n",
	"</div>"
	],
	"metadata": {},
	"output_type": "pyout",
	"prompt_number": 17,
	"text": [
	" status total code title gender section \\\n",
	"827 NR 27 220 Engineering thermodynamics B 03 \n",
	"828 C 30 220 Engineering thermodynamics B 01 \n",
	"829 NR 27 220 Engineering thermodynamics B 02 \n",
	"830 C 25 240 Statics B 02 \n",
	"831 NR 25 240 Statics B 01 \n",
	"\n",
	" time days current credits abbrev location attribute \\\n",
	"827 12:00 pm-01:50 pm TU 23 3.000 GENG F3 238 English \n",
	"828 12:00 pm-01:50 pm MW 30 3.000 GENG F3 238 English \n",
	"829 08:00 am-09:50 am TU 26 3.000 GENG F3 238 English \n",
	"830 02:00 pm-03:50 pm MW 25 3.000 GENG F3 036 \u00a0 \n",
	"831 04:00 pm-05:50 pm TU 22 3.000 GENG F1 2126 \u00a0 \n",
	"\n",
	" duration instructor crn remaining \n",
	"827 02/16-06/19 Naeema I. Al Darmaki (P) 21864 4 \n",
	"828 02/16-06/19 Mohammad O. Hamdan (P) 21862 0 \n",
	"829 02/16-06/19 Naeema I. Al Darmaki (P) 21863 1 \n",
	"830 02/16-06/19 Said A. Elkhouly (P) 20131 0 \n",
	"831 02/16-06/19 Hamad A. Al Jassmi (P) 20130 3 \n",
	"\n",
	"[5 rows x 17 columns]"
	]
	}
	],
	"prompt_number": 17
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"This `DataFrame` can easily be written to a CSV using the `to_csv` method. We don't need to yet - there's still the analysis to do!\n",
	"\n",
	"We'll need to define a new class that represents a section's time. Let's call it `SectionTime`."
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"class SectionTime(object):\n",
	" dt_cache = {}\n",
	" \n",
	" def __init__(self, time_string, day_string, format_string, seperator='-'):\n",
	" # Time\n",
	" start_string, end_string = time_string.split(seperator)\n",
	" dt_cache = SectionTime.dt_cache\n",
	" \n",
	" self.start = dt_cache.get(start_string, None)\n",
	" \n",
	" if not self.start:\n",
	" self.start = datetime.strptime(start_string, format_string)\n",
	" dt_cache[start_string] = self.start\n",
	" \n",
	" self.end = dt_cache.get(end_string, None)\n",
	" \n",
	" if not self.end:\n",
	" self.end = datetime.strptime(end_string, format_string)\n",
	" dt_cache[end_string] = self.end\n",
	" \n",
	" self.start_string = self.start.strftime('%H:%M')\n",
	" self.end_string = self.end.strftime('%H:%M')\n",
	" \n",
	" # Days\n",
	" self.days = list(day_string)\n",
	" \n",
	" def is_conflict(self, other):\n",
	" '''Check for a conflict with another SectionTime object.'''\n",
	" if any([day in self.days for day in other.days]):\n",
	" if other.start >= self.start and other.end <= self.end:\n",
	" return True\n",
	" elif self.start >= other.start and self.end <= other.end:\n",
	" return True\n",
	" else:\n",
	" return False\n",
	" \n",
	" def __str__(self):\n",
	" return '%s: %s-%s' % (', '.join(self.days), self.start_string, self.end_string)\n",
	"\n",
	"format_string = '%I:%M %p'\n",
	"tr1 = SectionTime(df.time[831], 'MW', format_string)\n",
	"tr2 = SectionTime(df.time[831], 'TU', format_string)\n",
	"\n",
	"tr1.is_conflict(tr2) # False\n",
	"tr1.is_conflict(tr1) # True"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"metadata": {},
	"output_type": "pyout",
	"prompt_number": 18,
	"text": [
	"True"
	]
	}
	],
	"prompt_number": 18
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Now we merge the `time` and `days` columns into a list of tuples, and then create a `SectionTime` for each pair."
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"pairs = list(zip(df.time, df.days))\n",
	"sectiontimes = []\n",
	"\n",
	"for pair in pairs:\n",
	" sectiontimes.append(SectionTime(pair[0], pair[1], format_string))\n",
	" \n",
	"len(sectiontimes)"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"metadata": {},
	"output_type": "pyout",
	"prompt_number": 41,
	"text": [
	"832"
	]
	}
	],
	"prompt_number": 41
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Next, we'll take each `SectionTime`, count how many of the other sections it conflicts with, and store that count in a dictionary keyed by the time slot's string form."
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"results = {}\n",
	"\n",
	"for time in sectiontimes:\n",
	"    conflict_count = 0\n",
	"\n",
	"    for each in sectiontimes:\n",
	"        # Skip comparing a section time against itself\n",
	"        if each != time and time.is_conflict(each):\n",
	"            conflict_count += 1\n",
	"\n",
	"    # Sections with identical days and times map to the same key\n",
	"    results[str(time)] = conflict_count"
	],
	"language": "python",
	"metadata": {},
	"outputs": [],
	"prompt_number": 42
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Finally, we convert the results into a `DataFrame`, sort by number of conflicts, and display the 30 time slots with the fewest conflicts."
	]
	},
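	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"results_df = pd.DataFrame(data=list(results.items()), columns=['time', 'conflicts'])\n",
	"# DataFrame.sort was removed in pandas 0.20; newer versions use sort_values(by='conflicts')\n",
	"results_df.sort(columns=['conflicts'])[:30]"
	],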
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"html": [
	"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
	"<table border=\"1\" class=\"dataframe\">\n",
	" <thead>\n",
	" <tr style=\"text-align: right;\">\n",
	" <th></th>\n",
	" <th>time</th>\n",
	" <th>conflicts</th>\n",
	" </tr>\n",
	" </thead>\n",
	" <tbody>\n",
	" <tr>\n",
	" <th>76 </th>\n",
	" <td> S: 13:00-15:50</td>\n",
	" <td> 0</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>133</th>\n",
	" <td> S: 09:00-11:50</td>\n",
	" <td> 0</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>153</th>\n",
	" <td> F: 09:00-11:50</td>\n",
	" <td> 0</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>170</th>\n",
	" <td> F: 13:00-15:50</td>\n",
	" <td> 0</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>203</th>\n",
	" <td> R: 17:00-20:00</td>\n",
	" <td> 1</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>82 </th>\n",
	" <td> U: 17:30-19:30</td>\n",
	" <td> 1</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>45 </th>\n",
	" <td> U, T: 18:30-19:20</td>\n",
	" <td> 1</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>28 </th>\n",
	" <td> R: 16:00-18:50</td>\n",
	" <td> 2</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>129</th>\n",
	" <td> M, W: 18:30-19:45</td>\n",
	" <td> 4</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>73 </th>\n",
	" <td> M, W: 17:30-18:50</td>\n",
	" <td> 8</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>44 </th>\n",
	" <td> W: 17:00-19:50</td>\n",
	" <td> 9</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>25 </th>\n",
	" <td> U, T, R: 17:00-18:50</td>\n",
	" <td> 9</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>64 </th>\n",
	" <td> R: 11:00-11:50</td>\n",
	" <td> 12</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>155</th>\n",
	" <td> R: 10:00-10:50</td>\n",
	" <td> 12</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>142</th>\n",
	" <td> R: 10:00-11:50</td>\n",
	" <td> 13</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>102</th>\n",
	" <td> U, T: 17:00-18:15</td>\n",
	" <td> 14</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>158</th>\n",
	" <td> W: 16:00-17:50</td>\n",
	" <td> 14</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>125</th>\n",
	" <td> M, W: 17:00-18:50</td>\n",
	" <td> 15</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>88 </th>\n",
	" <td> U, T: 11:00-12:50</td>\n",
	" <td> 17</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>178</th>\n",
	" <td> U: 16:00-18:45</td>\n",
	" <td> 18</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>11 </th>\n",
	" <td> W: 15:00-17:00</td>\n",
	" <td> 18</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>96 </th>\n",
	" <td> M: 17:00-18:00</td>\n",
	" <td> 18</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>160</th>\n",
	" <td> U: 16:00-17:50</td>\n",
	" <td> 19</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>6 </th>\n",
	" <td> W: 16:00-20:00</td>\n",
	" <td> 20</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>10 </th>\n",
	" <td> M, W: 17:00-18:15</td>\n",
	" <td> 20</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>14 </th>\n",
	" <td> T: 16:00-18:50</td>\n",
	" <td> 23</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>85 </th>\n",
	" <td> M: 16:00-18:50</td>\n",
	" <td> 25</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>195</th>\n",
	" <td> M: 16:00-17:50</td>\n",
	" <td> 25</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>174</th>\n",
	" <td> M: 16:00-20:00</td>\n",
	" <td> 26</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>191</th>\n",
	" <td> U: 15:30-17:20</td>\n",
	" <td> 26</td>\n",
	" </tr>\n",
	" </tbody>\n",
	"</table>\n",
	"<p>30 rows \u00d7 2 columns</p>\n",
	"</div>"
	],
	"metadata": {},
	"output_type": "pyout",
	"prompt_number": 45,
	"text": [
	"                     time  conflicts\n",
	"76         S: 13:00-15:50          0\n",
	"133        S: 09:00-11:50          0\n",
	"153        F: 09:00-11:50          0\n",
	"170        F: 13:00-15:50          0\n",
	"203        R: 17:00-20:00          1\n",
	"82         U: 17:30-19:30          1\n",
	"45      U, T: 18:30-19:20          1\n",
	"28         R: 16:00-18:50          2\n",
	"129     M, W: 18:30-19:45          4\n",
	"73      M, W: 17:30-18:50          8\n",
	"44         W: 17:00-19:50          9\n",
	"25   U, T, R: 17:00-18:50          9\n",
	"64         R: 11:00-11:50         12\n",
	"155        R: 10:00-10:50         12\n",
	"142        R: 10:00-11:50         13\n",
	"102     U, T: 17:00-18:15         14\n",
	"158        W: 16:00-17:50         14\n",
	"125     M, W: 17:00-18:50         15\n",
	"88      U, T: 11:00-12:50         17\n",
	"178        U: 16:00-18:45         18\n",
	"11         W: 15:00-17:00         18\n",
	"96         M: 17:00-18:00         18\n",
	"160        U: 16:00-17:50         19\n",
	"6          W: 16:00-20:00         20\n",
	"10      M, W: 17:00-18:15         20\n",
	"14         T: 16:00-18:50         23\n",
	"85         M: 16:00-18:50         25\n",
	"195        M: 16:00-17:50         25\n",
	"174        M: 16:00-20:00         26\n",
	"191        U: 15:30-17:20         26\n",
	"\n",
	"[30 rows x 2 columns]"
	]
	}
	],
	"prompt_number": 45
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"The slots 11:00-12:50 (UT), 16:00-18:45 (U), and 15:30-17:20 (U) seem appropriate, with 17, 18, and 26 conflicting sections respectively. I'll leave it to the attendees to decide between them."
	]
	}
	],
	"metadata": {}
	}
	]
	}