{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# CMB Data Interview Question\n",
"\n",
"In CMB, Education is considered one of the most important factors when generating a match.\n",
"Matches of people that went to the same school, are more likely to convert into a connection.\n",
"\n",
"On our app, we have a free form text field to input education. This means that people:\n",
"- refer to the same school with a different name\n",
"- make spelling mistakes\n",
"\n",
"Interesting fact: people with comparable educations are more likely to connect.\n",
"\n",
"Example: consider these 4 names that all refer to the same school:\n",
"- MIT\n",
"- M.I.T.\n",
"- Mass. Uni\n",
"- Massachusetts Institute of Technology\n",
"\n",
"\n",
"We want to be able to group all these schools together.\n",
"\n",
"\n",
"Luckily our CS team went through the list of schools, and built a CSV\n",
"for us. The CSV contains 2 columns that describe 2 scool names as string\n",
"\n",
"| New or already existing school | New school name |\n",
"|--------------|---------------|\n",
"| MIT | Mass Uni. |\n",
"| Stanford | Stanford ENG |\n",
"| Mass Uni | Mass. I. Tech |\n",
"| Stanford | U. Stanford |\n",
"\n",
"Every row describes 2 schools that are perceived as the same school, with a different name.\n",
"\n",
"**NOTE:** CS has made sure that the second column `New school name` will always contain a new school that we have not seen before. Therefore, assuming that we scan our CSV from top to bottom, we can always guarantee that `New school name` has not been seen before. We cannot make the same assumption about `New or already existing school`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Task\n",
"\n",
"Write a function that takes a parsed CSV file (a list of tuples) and a school name and outputs a collection of school names connected to it\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"SCHOOLS = [\n",
" ('A', 'B'),\n",
" ('C', 'D'),\n",
" ('B', 'E'),\n",
" ('C', 'F'),\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Question 1: Given two school names, check if they refer to the same school\n",
"def check_if_schools_are_the_same(school_a: str, school_b: str) -> bool:\n",
" \"\"\"Returns True if `school_a` and `school_b` are the same school, else returns False\"\"\"\n",
" pass\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Question 2: Get the root school of a specific school\n",
"def get_all_school_names(a_school: str) -> str:\n",
" \"\"\"Returns the name of the root school, given a school name\n",
" \n",
" If school name does not exist, `get_root_school` will return None\n",
" \"\"\"\n",
" pass\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tests"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Question 1 tests\n",
"assert check_if_root_and_child_are_same('A', 'B') == True\n",
"assert check_if_root_and_child_are_same('A', 'E') == True\n",
"assert check_if_root_and_child_are_same('A', 'C') == False"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Question 2 tests\n",
"assert get_root_school('B') == 'A'\n",
"assert get_root_school('A') == 'A'"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}