@clee-jana
Forked from abdinoor-jana/coding-chalenge.md
Last active December 17, 2017 13:46

Challenge:

Create a command-line program that takes an internet domain name (e.g. “jana.com”) and prints a list of the email addresses found on that website only.
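As a rough illustration of the "find email addresses" part, here is a minimal Python 3 sketch that pulls email-like strings out of a page's text. The names `EMAIL_PATTERN` and `extract_emails` are hypothetical, and the pattern is a deliberately simple assumption rather than a full address grammar, so it can produce false positives (for example on file names such as `logo@2x.png`).

```python
import re

# Deliberately simple pattern: an assumption for illustration, not a full
# RFC 5322 address grammar.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text):
    """Return the set of lowercased email-like strings found in text."""
    return {match.lower() for match in EMAIL_PATTERN.findall(text)}
```

Returning a set means an address that appears on several pages is reported once, which matches the deduplicated look of the expected output.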

Example:

The following is the expected output for jana.com and web.mit.edu, but the program should also run on other websites. In the jana.com example, the program should not crawl other subdomains (blog.jana.com, technology.jana.com).

# This is the expected output from www.jana.com
> python find_email_addresses.py www.jana.com
Found these email addresses:
sales@jana.com
press@jana.com
info@jana.com

# Here are some examples from web.mit.edu (subject to change)
> python find_email_addresses.py web.mit.edu
Found these email addresses:
campus-map@mit.edu
mitgrad@mit.edu
sfs@mit.edu
llwebmaster@ll.mit.edu
webmaster@ll.mit.edu
whatsonyourmind@mit.edu
fac-officers@mit.edu
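The rule that other subdomains must not be crawled could be checked along these lines. This is a sketch under an assumption: a link counts as "on the website" only when its host name matches the host given on the command line exactly, so `blog.jana.com` is rejected when the start host is `www.jana.com`. The helper name `same_host` is hypothetical.

```python
from urllib.parse import urlsplit


def same_host(url, start_host):
    """Return True if url points at exactly start_host (case-insensitive)."""
    # urlsplit lowercases the hostname; it is None for links such as mailto:.
    host = urlsplit(url).hostname or ""
    return host == start_host.lower()
```

Whether `jana.com` and `www.jana.com` should count as the same site is not stated in the challenge, so exact matching is only one possible reading.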

More information:

  • You can use any modern programming language you like. We work in Python and Java, so one of those is preferred but not required.
  • Create a new GitHub repository for this project. The repository should be public, but please give it a codename that doesn't contain the word jana. The master branch should be empty; create a separate branch with your code in it.
  • Push your branch up to GitHub and create a pull request. Send me the link to the pull request, and I can comment directly on it. All our code goes through this code review process, so it's a little glimpse into how we work.
  • In your repo, please include a readme with any instructions we might need to set up and install your solution.
  • Your program must work on another computer, so be sure to include any required libraries (using libraries is OK). You do not need to check in the source for those libraries. Build scripts and/or a requirements.txt file would be preferred.

Hints:

  • Make sure to find email addresses on any discoverable page of the website, not just the home page.
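One way to reach "any discoverable page" is a breadth-first walk over the links found on each page. The sketch below assumes Python 3 and uses only the standard library. `LinkParser`, `discover_pages`, the `fetch` callback, and the `max_pages` cap are all hypothetical names chosen for illustration. `fetch` is supplied by the caller and maps a URL to HTML text (or `None` on failure), which keeps the sketch free of network code.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def discover_pages(start_url, fetch, max_pages=100):
    """Breadth-first walk over same-host links; returns the visited URLs."""
    host = urlsplit(start_url).hostname
    queue = deque([start_url])
    seen = {start_url}
    visited = set()
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.add(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments to avoid revisits.
            link, _fragment = urldefrag(urljoin(url, href))
            parts = urlsplit(link)
            if parts.scheme not in ("http", "https"):
                continue  # skips mailto:, javascript:, and similar links
            if parts.hostname == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

The `seen` set prevents loops between pages that link to each other, and `max_pages` is an arbitrary safety cap for large sites rather than anything the challenge requires.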

Style:

  • At Jana we follow the Google Style Guides for Python and Java. However, it is not critical for this challenge.