Gregory Kanevsky grigory93
from datetime import date
import datatable as dt
dt.Frame([dt.Type.date32.min, date(2021, 7, 20), dt.Type.date32.max], stype='date32')
@grigory93
grigory93 / covid-19-products-google-trends.R
Last active April 6, 2020 08:01
Google Trends datasets for product searches during the COVID-19 crisis
@grigory93
grigory93 / 0-covid-19-blog.R
Last active March 30, 2020 23:49
COVID-19 Blog visualizations using R and ggplot2
library(ggplot2)
library(ggthemes)
library(scales)
@grigory93
grigory93 / 01-connect-to-DAI.R
Last active December 16, 2019 17:44
creating visuals for DAI
# Load dai package
library(dai)
# Connect to Driverless AI instance
dai_uri = "http://your.instance.name:12345"
usr = "user_id"
pwd = "password"
dai.connect(uri = dai_uri, username = usr, password = pwd, force_version = FALSE)
#!/bin/bash
# Install core packages
sudo apt -y update && \
sudo apt -y --no-install-recommends install \
curl \
apt-utils \
wget \
libblas-dev \
default-jre \
@grigory93
grigory93 / creaate-decision-tree-and-get-tree.R
Last active October 7, 2021 05:43
Plotting decision trees with H2O-3
titanic_1tree = h2o.gbm(x = predictors, y = response,
training_frame = titanicHex,
ntrees = 1, min_rows = 1, sample_rate = 1, col_sample_rate = 1,
max_depth = 5,
# use early stopping once the AUC doesn't improve by at least 1%
# (stopping_tolerance = 0.01) for 3 consecutive scoring events (stopping_rounds = 3)
stopping_rounds = 3, stopping_tolerance = 0.01,
stopping_metric = "AUC",
seed = 1)
@grigory93
grigory93 / ggplot_mtcars_barplot.R
Last active September 16, 2017 22:49
How to expand color palette with ggplot and RColorBrewer
library(ggplot2)
data(mtcars)
ggplot(mtcars) +
geom_bar(aes(factor(cyl), fill=factor(cyl)))
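The preview above shows only the base bar plot. One common way to expand a ColorBrewer palette past its built-in size is to interpolate it with `colorRampPalette()`. The following is a minimal sketch, not necessarily what the rest of the gist does: the choice of `hp` as the fill variable, the `"Set1"` palette, and the names `n_colors`, `expand_palette`, and `fill_values` are assumptions made for illustration.

```r
library(ggplot2)
library(RColorBrewer)

data(mtcars)

# Number of distinct fill levels the plot needs
# (hp is an assumed choice; it has more levels than any Brewer palette offers)
n_colors = length(unique(mtcars$hp))

# brewer.pal() returns at most 9 colors for "Set1"; colorRampPalette()
# builds a function that interpolates them to any requested length
expand_palette = colorRampPalette(brewer.pal(9, "Set1"))
fill_values = expand_palette(n_colors)

# Supply the interpolated colors through a manual fill scale
p = ggplot(mtcars) +
  geom_bar(aes(factor(cyl), fill = factor(hp))) +
  scale_fill_manual(values = fill_values)
```

Printing `p` draws the plot. Interpolated colors become hard to tell apart as the number of levels grows, so this works best when the legend is secondary to the overall pattern.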
@grigory93
grigory93 / Survival Analysis in Dallas Animal Shelters using Dallas OpenData.R
Last active August 2, 2017 22:32
Survival Analysis in Dallas Animal Shelters using Dallas OpenData.R
library(RSocrata)
data15.source = read.socrata(url = "https://www.dallasopendata.com/resource/8pn8-24ku.csv")
data16.source = read.socrata(url = "https://www.dallasopendata.com/resource/4qfv-27du.csv")
data17.source = read.socrata(url = "https://www.dallasopendata.com/resource/8849-mzxh.csv")
@grigory93
grigory93 / vacation-recap-Italy-googledocs.R
Last active July 5, 2017 16:46
Small Data Visualization - Vacation Recap
library(googlesheets)
library(dplyr)
library(lubridate)
gsh = gs_title("Rome-Siracusa-vacations")
vdata = gsh %>% gs_read()
vdata$ts = mdy(vdata$Date)
@grigory93
grigory93 / data-set-model-and-K-folds.R
Last active May 31, 2016 20:01
Running parallel jobs on Aster with R and toaster
library(toaster)
# close a connection left over from an earlier session, if there is one
if (exists("conn")) close(conn)
conn = odbcDriverConnect(connection="driver={Aster ODBC Driver};server=10.xx.xx.xx;port=2406;database=dallas;uid=beehive;pwd=beehive",
interpretDot=TRUE)
dallasPermitsTableInfo = getTableSummary(conn, "dallasbuildingpermits")
getNumericColumns(dallasPermitsTableInfo)