@dgrapov
Created May 22, 2019 13:03
Data Science Exercise
The following challenge requires the beer reviews data set, beer_reviews.csv, which can be downloaded from https://data.world/socialmediadata/beeradvocate . A free temporary account is enough to download this .csv.
Questions to answer using this data:
1. Which brewery produces the strongest beers by ABV%?
2. If you had to pick 3 beers to recommend using only this data, which would you pick?
3. Which of the factors (aroma, taste, appearance, palate) are most important in determining the overall quality of a beer?
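As one possible starting point for question 1, the sketch below ranks breweries by the mean ABV of their distinct beers, so that a heavily reviewed beer is not counted once per review. The column names `brewery_name`, `beer_name` and `beer_abv` are assumptions about the CSV header and should be checked against the downloaded file; `strongest_breweries`, `load_rows` and the `min_beers` threshold are hypothetical helpers introduced only for illustration. Other definitions of "strongest" (for example the maximum ABV, or a review-weighted mean) are equally defensible.

```python
import csv
from collections import defaultdict


def strongest_breweries(rows, min_beers=1, top=5):
    """Rank breweries by mean ABV over their distinct beers.

    `rows` is any iterable of dicts; the keys used here are assumed
    column names, not confirmed against the real file.
    """
    # brewery -> {beer name -> ABV}; keying on the beer name avoids
    # counting the same beer once per review.
    beers = defaultdict(dict)
    for row in rows:
        abv = (row.get("beer_abv") or "").strip()
        if not abv:
            continue  # skip reviews with no recorded ABV
        beers[row["brewery_name"]][row["beer_name"]] = float(abv)

    ranked = [
        (name, sum(abvs.values()) / len(abvs))
        for name, abvs in beers.items()
        if len(abvs) >= min_beers  # ignore breweries with too few beers
    ]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked[:top]


def load_rows(path="beer_reviews.csv"):
    """Stream the CSV one row at a time as dicts keyed by the header."""
    with open(path, newline="", encoding="utf-8") as handle:
        yield from csv.DictReader(handle)
```

Calling `strongest_breweries(load_rows(), min_beers=5)` would then list candidate breweries; raising `min_beers` guards against a brewery that tops the ranking on the strength of a single unusual beer.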
Additional math/coding question unrelated to the data:
4. Generate 10,000 random numbers (i.e. a sample) from a binomial distribution with p = 0.5 and N = 20. Do not use any libraries or packages except basic math library functions and a random number generator (such as runif in R or random.random in Python).
Plot the histogram of the data.
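A minimal sketch of one way to approach question 4, assuming Python and only the standard `random` module: a Binomial(N, p) value is the number of successes in N independent Bernoulli(p) trials, and each trial succeeds when a uniform draw falls below p. The function names, the `seed` parameter and the text-based histogram are illustrative choices, not part of the exercise; a plotting library could be used for the final plot, since the restriction applies to generating the sample.

```python
import random


def binomial_draw(n=20, p=0.5, rng=random.random):
    """One Binomial(n, p) draw: count successes in n Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng() < p)


def sample_binomial(size=10_000, n=20, p=0.5, seed=None):
    """Return `size` independent Binomial(n, p) draws as a list of ints."""
    rng = random.Random(seed).random  # seeded generator for repeatable runs
    return [binomial_draw(n, p, rng) for _ in range(size)]


def histogram(samples, n=20):
    """Count how often each value 0..n occurs in the sample."""
    counts = [0] * (n + 1)
    for value in samples:
        counts[value] += 1
    return counts


def print_histogram(counts, width=50):
    """Draw a text histogram, scaling the tallest bar to `width` characters."""
    peak = max(counts) or 1
    for k, count in enumerate(counts):
        bar = "#" * round(width * count / peak)
        print(f"{k:2d} | {bar} {count}")


if __name__ == "__main__":
    print_histogram(histogram(sample_binomial(seed=0)))
```

With p = 0.5 and N = 20 the sample mean should sit near N·p = 10 and the histogram should look roughly symmetric around that value, which gives a quick sanity check on the generator.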
Document your thinking process as you attempt to answer these questions. Include any plots used to support your answers and provide the code, either packaged with the analysis as a markdown or notebook file or separately in a code file. Complete the work using any open source programming language with statistics capabilities (e.g. R, Python, Scala, Java, Octave, Julia). Do not use SQL or SAS derivatives.