@jsgriffin
Created March 7, 2011 16:41

Task 1: Twitter Wars!

idio has two offices in the UK, the first in Exeter and the second in London. As avid and curious Twitter users, we'd like to find out whether the population of Exeter or of London is better at spelling. Using the Twitter API, aggregate and analyse tweets from Exeter and London.

Assumptions

  • Hashtags and @ replies contained in tweets can be ignored and should not have an impact on the overall spelling quality of a tweet.
  • Decisions about how you classify a tweet as being from London or from Exeter are for you to make.
  • Decisions about the scoring of individual tweets are for you to make.
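
As an illustration of the first assumption, here is a minimal sketch of one way to strip hashtags and @ replies before scoring a tweet. The names `KNOWN_WORDS`, `clean_tweet` and `spelling_score` are hypothetical, the word list is a tiny placeholder rather than a real dictionary, and scoring by "fraction of recognised words" is only one possible choice; the brief leaves the scoring method open.

```python
import re

# Hypothetical placeholder word list; a real run would load a full dictionary.
KNOWN_WORDS = {"the", "weather", "in", "is", "lovely", "today"}

# Matches hashtags (#tag) and @ replies (@user), which the brief says to ignore.
IGNORED_TOKEN = re.compile(r"[#@]\w+")
WORD = re.compile(r"[a-z']+")


def clean_tweet(text):
    """Replace hashtags and @ replies with spaces so they cannot affect the score."""
    return IGNORED_TOKEN.sub(" ", text)


def spelling_score(text, known_words=KNOWN_WORDS):
    """Return the fraction of remaining words found in known_words.

    Returns None when nothing is left to score (for example, a tweet made
    up entirely of hashtags and @ replies).
    """
    words = WORD.findall(clean_tweet(text).lower())
    if not words:
        return None
    correct = sum(1 for word in words if word in known_words)
    return correct / len(words)
```

Averaging `spelling_score` over the tweets collected for each city (skipping `None` results) would give one comparable figure per city under these assumptions.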

Deliverables

  • Full source code in PHP, Ruby or Python, along with instructions on how we can run it ourselves.
  • Results of the analysis along with any comments you wish to make, and a short description of the decisions you made whilst programming your solution, written in either Markdown or Textile.
  • Upload the source and description as a private Gist on GitHub.

Task 2: Database Dilemma

Part A:

A client has asked for a new website with a custom platform to host articles written by users. Users sign up with the following data: email, name, password. Articles contain the following data: title, content, photo. Users should be able to create articles and tag each one with metadata, and the client would like users to be able to search for articles by tag or by keyword. The client also wishes to record each time an identified user views an article (and, by association, the article's tags) in order to infer the user's favourite topics of interest, to be displayed in a personalised tag cloud.

Design a relational database schema which will support this site (this can be written descriptions of each table, UML is not required), and explain any design decisions you have made.
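
To make the requirements concrete, the sketch below shows one possible starting point for such a schema, using SQLite through Python's standard `sqlite3` module. Every table and column name here (`article_tags`, `article_views`, `password_hash`, `photo_path`, and so on) is an assumption made for illustration, as is the choice to store a password hash and a photo path. It is not a model answer, and the design decisions remain yours to make and justify.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE users (
    id            INTEGER PRIMARY KEY,
    email         TEXT NOT NULL UNIQUE,
    name          TEXT NOT NULL,
    password_hash TEXT NOT NULL
);
CREATE TABLE articles (
    id         INTEGER PRIMARY KEY,
    author_id  INTEGER NOT NULL REFERENCES users(id),
    title      TEXT NOT NULL,
    content    TEXT NOT NULL,
    photo_path TEXT
);
CREATE TABLE tags (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- Many-to-many link between articles and tags.
CREATE TABLE article_tags (
    article_id INTEGER NOT NULL REFERENCES articles(id),
    tag_id     INTEGER NOT NULL REFERENCES tags(id),
    PRIMARY KEY (article_id, tag_id)
);
-- One row per view of an article by an identified user.
CREATE TABLE article_views (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    article_id INTEGER NOT NULL REFERENCES articles(id),
    viewed_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def create_schema(conn):
    """Create all tables on the given sqlite3 connection."""
    conn.executescript(SCHEMA)


def tag_cloud(conn, user_id):
    """Return (tag name, view count) pairs for one user, most viewed first."""
    return conn.execute(
        """
        SELECT t.name, COUNT(*) AS views
        FROM article_views v
        JOIN article_tags link ON link.article_id = v.article_id
        JOIN tags t ON t.id = link.tag_id
        WHERE v.user_id = ?
        GROUP BY t.name
        ORDER BY views DESC, t.name
        """,
        (user_id,),
    ).fetchall()
```

The `tag_cloud` query joins views to tags at read time, which is the simplest approach but also the kind of query that can become expensive as the views table grows.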

Part B:

The client is experiencing high traffic volumes and knock-on performance issues with both site search and the creation of personalised tag clouds. The client has asked you to investigate alternative technologies which could provide performance gains. Suggest which aspects of the system could be replaced with alternatives for performance reasons, and which aspects should remain in a relational database schema.
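
One family of alternatives is to maintain per-user tag counts incrementally at write time instead of recomputing them with joins on every page load. The sketch below illustrates that idea with an in-memory structure; the class name `TagCloudCounters` and its methods are hypothetical, and in practice the counters would live in whichever store you choose to recommend rather than in process memory.

```python
from collections import Counter, defaultdict


class TagCloudCounters:
    """Hypothetical in-memory stand-in for a store of precomputed tag counts."""

    def __init__(self):
        # Maps user id -> Counter of tag name -> number of views.
        self._counts = defaultdict(Counter)

    def record_view(self, user_id, article_tags):
        """Increment the user's count for every tag on the viewed article."""
        self._counts[user_id].update(article_tags)

    def top_tags(self, user_id, limit=10):
        """Return up to `limit` (tag, count) pairs, most viewed first."""
        return self._counts.get(user_id, Counter()).most_common(limit)
```

Reading a tag cloud then becomes a single lookup, at the cost of a small extra write on every article view and the need to keep the counters consistent with the source data.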

Part C:

The client is keen to reduce the cost of hosting the platform while improving performance and scalability. The client has asked you to investigate Amazon Web Services to help achieve this. Suggest which Amazon Web Services may be useful for running each part of the system, and how the architecture could be adjusted to make best use of them.

Deliverables

  • Write a document outlining your suggestions, in either Markdown or Textile.
  • Write a separate document with suggested schemas, in a format of your choice.
  • Upload these as a private Gist on GitHub.