@lmaccherone
Last active December 17, 2015 20:29
Lumenize programming assignments

Assignment A: functions

We have a functions "module" that contains a number of functions. We want to add a sumCubes function. Start by looking at the code for functions.sum and functions.sumSquares. The code for sumCubes will be very similar, but instead of summing the raw values (temp += v) or the squares of the values (temp += v * v), it will sum the cubes (temp += v * v * v). Note: temp += v is shorthand for temp = temp + v.

Lumenize.functions are designed to incrementally calculate new results from prior results. Did you notice how that works? The trick is to understand that in CoffeeScript it is perfectly legal to call a function with fewer than the full list of parameters. So, if you call functions.sum([1, 2, 3, 4]) (using only the first parameter, values), the answer will be 10. However, let's say you previously calculated the sum of [1, 2, 3] as 6 and you want to take advantage of that prior calculation to speed up the calculation of the sum of [1, 2, 3, 4]. In that case, you could call functions.sum(null, 6, [4]). The answer will also be 10, but the code will execute much faster. It won't make a difference for this small example, but it would matter if the array had thousands of numbers and was incrementally updated dozens of times.
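Here is a minimal sketch of that incremental pattern, written in plain JavaScript rather than CoffeeScript so it runs as-is under Node. It only mirrors the (values, oldResult, newValues) call shape described above; the function body is my own illustration, not the Lumenize source, and the real functions may take additional parameters.

```javascript
// Illustrative only: same call shape as functions.sum described above,
// but NOT copied from Lumenize.
function sum(values, oldResult, newValues) {
  // Incremental path: start from the prior result and add only the new values.
  // Full path: start from 0 and add everything in `values`.
  const incremental = oldResult != null;
  let temp = incremental ? oldResult : 0;
  const toAdd = (incremental ? newValues : values) || [];
  for (const v of toAdd) {
    temp += v; // shorthand for temp = temp + v
  }
  return temp;
}

console.log(sum([1, 2, 3, 4])); // full calculation over all four values
console.log(sum(null, 6, [4])); // incremental: prior sum of 6 plus the new value
```

Both calls give the same answer, but the second one only touches the one new value.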

Note, the goal of this assignment is to get you familiar with the codebase, CoffeeScript language, and a test-driven approach to coding. You'll complete this assignment by cutting and pasting existing code to create similar functionality. If you follow instructions, you'll be able to get this working without too much difficulty. However, I fully expect to have to answer questions either about the code or setting up the environment. The bonus phase and later assignments will test your ability to think about algorithms.

Phase 1: A sumCubes() function

Steps:

  1. Run cake test to confirm that all tests are passing. Note, I have some work in progress on undocumented features (classifier and ANOVA) that will spit out some extra text when tests are run. Ignore that for now, but make sure you get an "OK" at the end.
  2. Devise a few simple test cases and add them to test/functionsTest.coffee. You can start with the test for sumSquares by cutting and pasting, renaming, and modifying.
  3. Run cake test and see that your tests fail. I know this sounds silly but it's how you do Test-Driven Development (TDD), aka test-first design.
  4. Cut and paste the code for sumSquares, modify it to do sumCubes instead.
  5. Run the tests again and see if they pass. Edit your code until they do.
  6. What other edge case tests would you ideally have for your new function? Add those. Can you suggest any improvements to the existing sum or sumSquares tests?
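The steps above can be sketched as follows, again in plain JavaScript so it runs without the project set up. Both sumCubes and the expectEqual helper here are hypothetical stand-ins: the real tests belong in test/functionsTest.coffee and should follow the style of the existing sumSquares test, not this helper.

```javascript
// Hypothetical stand-in for the function you will write; it assumes the same
// (values, oldResult, newValues) call shape as sum and sumSquares.
function sumCubes(values, oldResult, newValues) {
  const incremental = oldResult != null;
  let temp = incremental ? oldResult : 0;
  const toAdd = (incremental ? newValues : values) || [];
  for (const v of toAdd) {
    temp += v * v * v;
  }
  return temp;
}

// Hypothetical helper, only here so the example is self-contained.
function expectEqual(actual, expected, label) {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${expected}, got ${actual}`);
  }
  console.log(`ok: ${label}`);
}

// Simple cases first (step 2), then edge cases (step 6).
expectEqual(sumCubes([1, 2, 3]), 36, "full calculation");       // 1 + 8 + 27
expectEqual(sumCubes(null, 36, [4]), 100, "incremental update"); // 36 + 64
expectEqual(sumCubes([]), 0, "empty array");
expectEqual(sumCubes([-2]), -8, "negative value keeps its sign");
```

In real test-first work you would write the expectEqual lines before the function exists and watch them fail.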

Phase 2: a product() function

Like the sum function, which adds the numbers, we want a product function, which multiplies them. So, product([2, 3, 7]) is 42. If we've previously calculated that and we want to know the product of [2, 3, 7, 2], then the incremental calculation capability would simply multiply the oldResult (42) by the newValues ([2]) and return 84.

Write a few tests for a product function, see them fail, then write the code for a product function until the tests pass. Add documentation and more edge case tests.
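A rough sketch of the idea in plain JavaScript, assuming the same (values, oldResult, newValues) call shape as sum. Returning 1 for an empty array is my assumption (the usual empty-product convention); decide for yourself what the real function should do and write a test for it.

```javascript
// Illustrative only; not the Lumenize implementation.
function product(values, oldResult, newValues) {
  // `!= null` (rather than a truthiness check) so an oldResult of 0 still
  // counts as a prior result.
  const incremental = oldResult != null;
  let temp = incremental ? oldResult : 1; // 1 is the multiplicative identity
  const toMultiply = (incremental ? newValues : values) || [];
  for (const v of toMultiply) {
    temp *= v;
  }
  return temp;
}

console.log(product([2, 3, 7]));     // full calculation
console.log(product(null, 42, [2])); // incremental: 42 times the new value
```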

Phase 3 (bonus): a mode() function

The mode(s) of a list of values is(are) the value(s) that occurs most often. So, the mode of [1, 2, 3, 2] is 2. I write in the singular (and plural) because it's possible for there to be a tie for the mode. For instance, if we have [1, 1, 2, 2, 3], then the modes are both 1 and 2. Since there is a possibility of more than one, let's make the return value of our function always be an array and let's name the function modes() to help the user remember this. So, the answers to the two examples above would be [2] and [1, 2] respectively.

Note: It is perfectly acceptable to Google for "mode algorithm javascript" or something like that. I still want the code to be yours though. If you lean heavily on some web resource, please give credit in your comments like I do in my own documentation (see the reference to Wikipedia in my percentileCreator documentation).

I suspect that the hardest part of this assignment will be the creation of a data structure to store your temporary calculations. The ideal algorithm uses a plain old JavaScript object storing a list of key:value pairs where the key is one of the numbers in the input and the value is the count of occurrences. If this doesn't make sense to you after reading up on CoffeeScript/JavaScript, don't spend more than 15 minutes struggling with it. I can do a quick data structures lesson with you that will make this much easier.

Also, note that oldResult and newValues don't help you with a modes() function. Write it to work only with the values parameter, expecting it to be non-null every time the function is called. However, we will have to upgrade it to support dependentValues and prefix before we go live with it.
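If the counting object is the sticking point, here is one possible shape for it in plain JavaScript. This is a sketch under my own assumptions, not the expected solution: it takes only values, and it sorts the result so ties come back in ascending order. Note that object keys are always strings, so they have to be converted back to numbers.

```javascript
// Illustrative only: counts occurrences in a plain object, then collects
// every value whose count ties for the maximum.
function modes(values) {
  const counts = {}; // key: a value from the input, value: how often it occurs
  let max = 0;
  for (const v of values) {
    counts[v] = (counts[v] || 0) + 1;
    if (counts[v] > max) {
      max = counts[v];
    }
  }
  const result = [];
  for (const key of Object.keys(counts)) {
    if (counts[key] === max) {
      result.push(Number(key)); // keys are strings; convert back to numbers
    }
  }
  return result.sort((a, b) => a - b);
}

console.log(modes([1, 2, 3, 2]));    // a single mode
console.log(modes([1, 1, 2, 2, 3])); // a tie gives two modes
```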

Again, don't struggle with this. If it's not coming to you and you need more training in CoffeeScript, don't fret. I don't expect you to be able to write this yet.

Assignment B: Use an OLAPCube

On-line analytical processing (OLAP) cubes are the bread and butter of business intelligence (BI) tools. The typical example involves categorizing sales by region, month, and type. The heart of Lumenize is its OLAP Cube implementation. In this assignment, you are going to use it to do simple aggregations and data analysis. Go read the documentation for Lumenize.OLAPCube.

Note, the goal of this assignment is to understand how well you do at reading documentation and duplicating examples.

Let's say you have some data about the best books of all time (note pages are made up by me) and you want to do some analysis on it.

Phase 1: A simple group-by example

Here is the code to use the Lumenize OLAPCube to find out how many books were written in each century.

books = [
  {title: "The Three Musketeers", author: "Alexandre Dumas", century: 19, pages: 500},
  {title: "The Count of Monte Cristo", author: "Alexandre Dumas", century: 19, pages: 600},
  {title: "Pride and Prejudice", author: "Jane Austen", century: 19, pages: 500},
  {title: "Emma", author: "Jane Austen", century: 19, pages: 400},
  {title: "Mansfield Park", author: "Jane Austen", century: 19, pages: 650},
  {title: "Ulysses", author: "James Joyce", century: 20, pages: 400},
  {title: "The Great Gatsby", author: "F. Scott Fitzgerald", century: 20, pages: 350},
  {title: "A Portrait of the Artist", author: "James Joyce", century: 20, pages: 325},
  {title: "The Shining", author: "Stephen King", century: 20, pages: 1000},
  {title: "The Dark Tower", author: "Stephen King", century: 21, pages: 900}
]

{OLAPCube} = require("Lumenize")

dimensions = [{field: "century"}]
metrics = [{f: "count"}]
config = {dimensions}

cube = new OLAPCube(config, books)

console.log(cube.toString())

Save this code in a file named b1.coffee in the "play" subdirectory of Lumenize. Drop to a command prompt in that directory and type coffee b1.coffee. You should see:

| century | _count |
|==================|
| 19      |      5 |
| 20      |      4 |
| 21      |      1 |
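To see what the cube is computing in this one-dimensional case, the same counts can be reproduced by hand in plain JavaScript without Lumenize. The countBy helper is hypothetical and only illustrates the group-by idea; it says nothing about how OLAPCube is implemented.

```javascript
// Same books as above, trimmed to the one field this example needs.
const books = [
  { title: "The Three Musketeers", century: 19 },
  { title: "The Count of Monte Cristo", century: 19 },
  { title: "Pride and Prejudice", century: 19 },
  { title: "Emma", century: 19 },
  { title: "Mansfield Park", century: 19 },
  { title: "Ulysses", century: 20 },
  { title: "The Great Gatsby", century: 20 },
  { title: "A Portrait of the Artist", century: 20 },
  { title: "The Shining", century: 20 },
  { title: "The Dark Tower", century: 21 },
];

// Hypothetical helper: count rows per distinct value of `field`.
function countBy(rows, field) {
  const counts = {};
  for (const row of rows) {
    const key = row[field];
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

const byCentury = countBy(books, "century");
for (const century of Object.keys(byCentury)) {
  console.log(`${century}: ${byCentury[century]}`);
}
```

The counts should match the _count column in the table above.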

Phase 2: Your own group-by example

Write the code in a file named b2.coffee to find the sum of pages for each author.

Phase 3: A two-dimensional example

Write the code in a file named b3.coffee to create a pivot table with author as the rows, century as the columns, and the average number of pages as the metric. Read up on the toString() function and use its optional parameters to create the output requested.
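If the idea of a two-dimensional aggregation is new to you, this plain JavaScript sketch shows what one is, without Lumenize. The pivotAverage helper is hypothetical; your b3.coffee should still use OLAPCube and its toString() options. You can use the numbers it prints to check your cube's output.

```javascript
// Same books as above, trimmed to the fields this example needs.
const books = [
  { author: "Alexandre Dumas", century: 19, pages: 500 },
  { author: "Alexandre Dumas", century: 19, pages: 600 },
  { author: "Jane Austen", century: 19, pages: 500 },
  { author: "Jane Austen", century: 19, pages: 400 },
  { author: "Jane Austen", century: 19, pages: 650 },
  { author: "James Joyce", century: 20, pages: 400 },
  { author: "F. Scott Fitzgerald", century: 20, pages: 350 },
  { author: "James Joyce", century: 20, pages: 325 },
  { author: "Stephen King", century: 20, pages: 1000 },
  { author: "Stephen King", century: 21, pages: 900 },
];

// Hypothetical helper: one cell per (row value, column value) pair,
// each cell holding a running sum and count so the average can be derived.
function pivotAverage(rows, rowField, colField, valueField) {
  const cells = {};
  for (const row of rows) {
    const r = row[rowField];
    const c = row[colField];
    if (!cells[r]) cells[r] = {};
    if (!cells[r][c]) cells[r][c] = { sum: 0, count: 0 };
    cells[r][c].sum += row[valueField];
    cells[r][c].count += 1;
  }
  const result = {};
  for (const r of Object.keys(cells)) {
    result[r] = {};
    for (const c of Object.keys(cells[r])) {
      result[r][c] = cells[r][c].sum / cells[r][c].count;
    }
  }
  return result;
}

const pivot = pivotAverage(books, "author", "century", "pages");
console.log(pivot);
```

Combinations with no books (for example, Jane Austen in century 20) simply have no cell.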

Phase 4: Answering questions

Write the code in a file named b4.coffee to answer the following questions (hint: you'll want to use keepTotals):

  1. Which century had the highest average page count?
  2. Which author had the lowest average page count?
  3. Which century had the highest minimum page count?