Dmitry Grapov dgrapov

View gist:d15aedea295f32fa43d76b0a864c577b
DATA SCIENCE EXERCISE
This challenge uses the beer reviews data set, beer_reviews.csv, which can be downloaded from https://data.world/socialmediadata/beeradvocate. Note that you can create a free temporary account to download the .csv file.
Questions to answer using this data:
Which brewery produces the strongest beers by ABV%?
If you had to pick 3 beers to recommend using only this data, which would you pick?
Which of the factors (aroma, taste, appearance, palate) is most important in determining the overall quality of a beer?
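As a starting point for the first question, here is a minimal sketch in base R. It runs on a small made-up data frame rather than the real file, and the column names `brewery_name`, `beer_name`, and `beer_abv` are assumptions about the layout of beer_reviews.csv. "Strongest" is read here as the highest mean ABV across a brewery's distinct beers; using the maximum instead would be an equally defensible reading.

```r
# Toy stand-in for beer_reviews.csv (one row per review); column names are assumed
reviews <- data.frame(
  brewery_name = c("A", "A", "A", "B", "B", "C"),
  beer_name    = c("a1", "a1", "a2", "b1", "b2", "c1"),
  beer_abv     = c(5, 5, 7, 9, 11, NA)
)

# Drop reviews without an ABV and keep one row per beer,
# so heavily reviewed beers do not dominate the brewery average
has_abv <- !is.na(reviews$beer_abv)
beers <- unique(reviews[has_abv, c("brewery_name", "beer_name", "beer_abv")])

# Mean ABV per brewery, strongest first
mean_abv <- aggregate(beer_abv ~ brewery_name, data = beers, FUN = mean)
mean_abv <- mean_abv[order(-mean_abv$beer_abv), ]
head(mean_abv, 3)
```

On the real data it may also be worth requiring a minimum number of beers per brewery, so that a brewery with a single strong beer does not top the ranking by default.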
Additional math/coding question unrelated to the data:
View replace_in.R
> in
Error: unexpected 'in' in "in"
dgrapov / pca.R
Created Mar 24, 2018
basic principal components analysis and visualization in R
View pca.R
# Basic PCA example
# use www.createdatasol.com for
# an advanced user interface
#required packages for plotting
library(ggplot2)
library(ggrepel)
#load data
data<-read.csv('~/Sampledata.csv',
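The preview above is cut off before the analysis itself. As a rough illustration of what a basic PCA in R can look like, the sketch below uses `prcomp` on the built-in `iris` data. Both choices are assumptions: the contents of Sampledata.csv are unknown, and the gist may use a different PCA routine. The plot is drawn only if ggplot2 is installed.

```r
# Stand-in data: the four numeric iris measurements
data(iris)
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Scores on the first two components, with the grouping variable attached
scores <- data.frame(pca$x[, 1:2], Species = iris$Species)

# Optional scores plot
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(scores, aes(x = PC1, y = PC2, colour = Species)) +
    geom_point() +
    labs(
      x = sprintf("PC1 (%.1f%%)", 100 * var_explained[1]),
      y = sprintf("PC2 (%.1f%%)", 100 * var_explained[2])
    )
  print(p)
}
```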
dgrapov / example.R
Created Feb 2, 2018
Example of a shiny app with data upload and different plot options
View example.R
#initialize
library(shiny)
library(ggplot2)
library(purrr)
library(dplyr)
#example data
data(iris)
dgrapov / tanimoto.R
Created Jan 10, 2018
fast (?) implementations of tanimoto distance calculations
View tanimoto.R
#' @title fast_tanimoto
#' @param mat matrix or data frame of numeric values
#' @param output 'matrix' (default) or 'edge list' (non-redundant and undirected)
#' @param progress TRUE, show progress
#' @import reshape2
fast_tanimoto<-function(mat,output='matrix',progress=TRUE){
mat[is.na(mat)]<-0
#scoring function
score<-function(x){sum(x==2)/sum(x>0)}
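The scoring function in the preview suggests that two binary (0/1) vectors are summed, so that a value of 2 marks a feature present in both and any value above 0 marks a feature present in at least one. Their ratio is the Tanimoto (Jaccard) similarity. The sketch below computes the same quantity for all row pairs at once using a matrix cross product. This is an illustrative alternative, not necessarily what the gist does, and `tanimoto_matrix` is a hypothetical name.

```r
# Pairwise Tanimoto similarity between the rows of a binary (0/1) matrix
tanimoto_matrix <- function(mat) {
  mat <- as.matrix(mat)
  mat[is.na(mat)] <- 0
  shared <- tcrossprod(mat)                      # features present in both rows
  totals <- rowSums(mat)                         # features present in each row
  either <- outer(totals, totals, "+") - shared  # features present in either row
  shared / either                                # NaN for two all-zero rows
}

fp <- rbind(
  a = c(1, 1, 0, 1),
  b = c(1, 0, 0, 1),
  c = c(0, 0, 1, 0)
)
sim <- tanimoto_matrix(fp)
sim
```

Here rows `a` and `b` share two features out of three present in either, so their similarity is 2/3, while `c` shares nothing with the others. A distance can be obtained as `1 - sim`.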
dgrapov / plotly_select_DT.R
Last active Jun 19, 2019
ggplot2 to plotly to shiny to box/lasso select to DT
View plotly_select_DT.R
#plotly box or lasso select linked to
# DT data table
# using Wage data
# the out group: is sex:Male, region:Middle Atlantic +
library(ggplot2)
library(plotly)
library(dplyr)
library(ISLR)
dgrapov / SOM example.R
Last active Feb 21, 2018
Self-organizing map (SOM) example in R
View SOM example.R
#SOM example using wines data set
library(kohonen)
data(wines)
set.seed(7)
#create SOM grid
sommap <- som(scale(wines), grid = somgrid(2, 2, "hexagonal"))
## use hierarchical clustering to cluster the codebook vectors
groups<-3
dgrapov / example.R
Last active Sep 21, 2015
Convert adjacency (or other) matrix to edge list
View example.R
library(reshape2)
gen.mat.to.edge.list<-function(mat,symmetric=TRUE,diagonal=FALSE,text=FALSE){
#create edge list from matrix
# if symmetric duplicates are removed
mat<-as.matrix(mat)
id<-is.na(mat) # used to allow missing
mat[id]<-"nna"
if(symmetric){mat[lower.tri(mat)]<-"na"} # use to allow missing values
if(!diagonal){diag(mat)<-"na"}
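The preview marks excluded cells with sentinel strings before reshaping. For comparison, here is a small base R sketch of the same idea that uses a logical mask instead, so numeric values keep their type and no reshape2 call is needed. The function name `mat_to_edge_list` is hypothetical, and the sketch assumes the matrix has row and column names.

```r
# Convert a named matrix to an edge list (source, target, value)
mat_to_edge_list <- function(mat, symmetric = TRUE, diagonal = FALSE) {
  mat <- as.matrix(mat)
  keep <- matrix(TRUE, nrow(mat), ncol(mat))
  if (symmetric) keep[lower.tri(keep)] <- FALSE  # drop mirrored duplicates
  if (!diagonal) diag(keep) <- FALSE             # drop self-edges
  idx <- which(keep, arr.ind = TRUE)             # row/col positions of kept cells
  data.frame(
    source = rownames(mat)[idx[, "row"]],
    target = colnames(mat)[idx[, "col"]],
    value  = mat[idx],
    stringsAsFactors = FALSE
  )
}

adj <- matrix(
  c(0, 1, 2,
    1, 0, 3,
    2, 3, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("a", "b", "c"), c("a", "b", "c"))
)
edges <- mat_to_edge_list(adj)
edges
```

Missing values stay as `NA` in the `value` column and can be filtered afterwards with `edges[!is.na(edges$value), ]`.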
dgrapov / RECA_test.R
Created Aug 21, 2015
Testing RECA: Relevant Component Analysis for Supervised Distance Metric Learning
View RECA_test.R
#R code, testing RECA with the iris data
library(RECA)
#test data
data(iris)
x<-iris[,-5]
y<-iris$Species
#similar groups (species) in each chunk (n=3)
chunksvec<-as.numeric(y)
dgrapov / app
Last active Aug 29, 2015
ggvis linked brushing bug
View app
library(shiny)
library(ggvis)
shinyApp(
ui =bootstrapPage(
actionButton("randomize", "Randomize"),
ggvisOutput("plot1"),
ggvisOutput("plot2"),
verbatimTextOutput("summary")
),