Ed Berry eddjberry

@eddjberry
eddjberry / prop_test_power_curves.R
Created Jul 12, 2019
Power curves for a prop.test created using pwr & ggplot2
View prop_test_power_curves.R
#========================================================#
# Setup
#========================================================#
library(dplyr)
library(ggplot2)
library(here)
library(pwr)
library(scales)
library(stringr)
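The preview stops after the setup block, so the curve-building code is not visible here. As a minimal sketch of the idea in the description, assuming a two-group comparison of proportions, power can be computed over a grid of sample sizes with `pwr.2p.test()` and plotted with ggplot2. The baseline proportion, variant proportions, and sample-size grid below are hypothetical values chosen only for illustration. Note that `pwr.2p.test()` uses the arcsine effect size from `ES.h()`, so its results approximate rather than reproduce the power of `prop.test()`.

```r
library(pwr)
library(ggplot2)

# Hypothetical inputs, chosen only for illustration
p_baseline <- 0.10
power_grid <- expand.grid(
  n = seq(100, 2000, by = 100),   # sample size per group
  p_variant = c(0.12, 0.14, 0.16) # proportions to compare against the baseline
)

# Power of a two-sided, two-sample test of proportions for each grid row
power_grid$power <- mapply(
  function(n, p_variant) {
    pwr.2p.test(h = ES.h(p_variant, p_baseline), n = n, sig.level = 0.05)$power
  },
  power_grid$n, power_grid$p_variant
)

# One curve per variant proportion, with a dashed reference line at 80% power
power_plot <- ggplot(power_grid,
                     aes(x = n, y = power, colour = factor(p_variant))) +
  geom_line() +
  geom_hline(yintercept = 0.8, linetype = "dashed") +
  labs(x = "Sample size per group", y = "Power",
       colour = "Variant proportion")
```

Printing `power_plot` draws the curves; the point where a curve crosses the dashed line gives the approximate per-group sample size needed for 80% power.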
eddjberry / round_nearest.R
Created Jun 5, 2019
Round values to the nearest 0.05, 5, 10 etc.
View round_nearest.R
# function to round a value to the nearest multiple of `nearest`
# e.g. if nearest = 5 then 42 would round to 40
# and 47 would be rounded to 45
# source: http://r.789695.n4.nabble.com/Rounding-to-the-nearest-5-td863189.html
round_nearest <- function(x, nearest) {
  nearest * round(x / nearest)
}
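A short usage sketch for the function above, covering the 0.05, 5 and 100 cases. The function is repeated so that the block runs on its own. The last call shows a behaviour worth knowing: R's `round()` sends an exact .5 to the even neighbour, so a value lying exactly halfway between two multiples does not always round up.

```r
round_nearest <- function(x, nearest) {
  nearest * round(x / nearest)
}

round_nearest(c(42, 47), 5)  # 40 45
round_nearest(1234, 100)     # 1200
round_nearest(0.13, 0.05)    # 0.15, up to floating-point error
round_nearest(42.5, 5)       # 40, because 8.5 rounds to the even value 8
```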
eddjberry / group_prop.R
Created Feb 19, 2019
Get counts and proportions by group(s)
View group_prop.R
# get counts and proportions by group(s)
group_prop <- function(df, ...) {
  # enquo the dots
  var <- enquos(...)
  # count then calculate
  # proportions
  df %>%
    count(!!!var) %>%
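The preview cuts off before the proportion step. One way the idea could be finished, as a sketch rather than the author's code: pass the dots straight to `count()` and divide each count by the overall total. The name `group_prop_sketch` and the choice to compute proportions over all rows (rather than within an outer group) are assumptions.

```r
library(dplyr)

# Hypothetical completion: counts per group plus each group's share of all rows
group_prop_sketch <- function(df, ...) {
  df %>%
    count(...) %>%              # one row per combination of the grouping columns
    mutate(prop = n / sum(n))   # count() returns ungrouped data, so sum(n) is the total
}

cyl_props <- group_prop_sketch(mtcars, cyl)
cyl_gear_props <- group_prop_sketch(mtcars, cyl, gear)
```

Because the proportions are taken over the whole data frame, the `prop` column sums to 1 in both results.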
eddjberry / str_proper.R
Last active Feb 19, 2019
Format a string such that the first character is upper case and the rest are lower case
View str_proper.R
# a function to format strings
# to be in Proper case
str_proper <- function(string) {
  # get the first letter
  first_letter <- substring(string, first = 1, last = 1)
  # get the other letters
  other_letters <- substring(string, first = 2)
  # combine the first letter (upper case)
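The final step is cut off in the preview. Going by the description (first character upper case, the rest lower case), a base R sketch of the whole function might look like the following; the name `str_proper_sketch` and the use of `paste0()` with `toupper()`/`tolower()` are assumptions, not the original code.

```r
# Hypothetical completion based on the gist description
str_proper_sketch <- function(string) {
  first_letter <- substring(string, first = 1, last = 1)
  other_letters <- substring(string, first = 2)
  # upper-case the first letter, lower-case the rest, then join them
  paste0(toupper(first_letter), tolower(other_letters))
}

str_proper_sketch(c("hELLO world", "r"))  # "Hello world" "R"
```

Only the first character of the whole string is capitalised, so this is sentence-style casing rather than capitalising every word.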
eddjberry / tibble_select_column.R
Last active Jan 24, 2019
Different return types for selecting columns from a tibble
View tibble_select_column.R
# create a tibble----------------------
tbl <- tibble::tibble(x = letters[1:5],
                      y = letters[5:1])
# returns a tibble --------------------
dplyr::select(tbl, x)
tbl[1]
tbl[, 1]
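The description mentions different return types, but only the tibble-returning calls are visible in the preview. For contrast, here is a short sketch of the standard ways to get a plain vector out of the same tibble (the variable names are only for illustration):

```r
library(tibble)
library(dplyr)

tbl <- tibble(x = letters[1:5],
              y = letters[5:1])

# single-bracket indexing keeps the tibble class, unlike a base data.frame
as_tibble_result <- tbl[, 1]

# these three all return a character vector
as_vector_dollar <- tbl$x
as_vector_brackets <- tbl[["x"]]
as_vector_pull <- pull(tbl, x)
```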
eddjberry / show_palette_cols.R
Created Jan 24, 2019
Show the colours in a palette with hex codes
View show_palette_cols.R
library(scales)
library(viridis)
show_col(viridis(12))
eddjberry / filter_at_remove_nas.R
Last active Feb 19, 2019
Using filter_at to remove rows with some or all NAs for a specified set of columns. If we wanted to do this for all columns we could use janitor::remove_empty('rows')
View filter_at_remove_nas.R
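library(dplyr)
# create some data
# (tibble() replaces the deprecated data_frame())
(df <- tibble(x = 1:2,
              y = c(NA, NA),
              z = c(NA, 3)))
# remove rows where either col y or z contain NA
# i.e. keep rows where all variables are not NA
# (no rows remain here, because y is NA in every row)
df %>%
  filter_at(vars(y:z), all_vars(!is.na(.)))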
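The description also covers the other case: dropping only the rows where all of the chosen columns are NA. That part is not visible in the preview, so the following is a sketch of how it can be done with `any_vars()`. The second call shows the equivalent `if_any()` form, which assumes dplyr 1.0.4 or later, where the `filter_at()` family is superseded.

```r
library(dplyr)

df <- tibble(x = 1:2,
             y = c(NA, NA),
             z = c(NA, 3))

# keep rows where at least one of y or z is not NA,
# i.e. remove rows where both are NA
some_non_na <- df %>%
  filter_at(vars(y:z), any_vars(!is.na(.)))

# the same filter written with across-style helpers
same_with_if_any <- df %>%
  filter(if_any(y:z, ~ !is.na(.x)))
```

With this data only the second row survives, since its `z` value is present.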
eddjberry / split_df_csv.R
Created Aug 14, 2018
Create separate csv files of the data for each level of some categorical column
View split_df_csv.R
library(tidyverse)
# Nest iris by Species
iris_nest <- iris %>%
  group_by(Species) %>%
  nest()
# Get the data list and set the names of the list to Species
# write_csv for each df in the data list with its name as the filename
iris_nest %>%
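The writing step is cut off in the preview. Following the two comments above (name the list of data frames by Species, then write each one under its name), a sketch using `purrr::iwalk()` could look like this. The use of `iwalk()`, the `tempdir()` output directory, and the variable names are assumptions made for illustration.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(readr)

# Hypothetical output location; swap in any writable directory
out_dir <- tempdir()

iris_nest <- iris %>%
  group_by(Species) %>%
  nest()

# Name each nested data frame after its Species level
species_data <- set_names(iris_nest$data, as.character(iris_nest$Species))

# iwalk() passes each data frame as .x and its name as .y
iwalk(species_data,
      ~ write_csv(.x, file.path(out_dir, paste0(.y, ".csv"))))
```

Each file holds the rows for one species without the `Species` column, because `nest()` keeps the grouping column outside the nested data.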
eddjberry / sim_binom.R
Last active May 26, 2018
Simulate a binomial target and some features
View sim_binom.R
sim_binom <- function(n_samples = 1000, n_features = 2,
                      true_target_prob = 0.5, beta = NULL, seed = NULL) {
  if (!is.null(seed)) {
    set.seed(seed)
  }
  x <- matrix(rnorm(n_samples * n_features),
              nrow = n_samples, ncol = n_features)
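The rest of the function is not shown, so how `true_target_prob` and `beta` are used is unknown. As one plausible sketch in base R, the target could be drawn from a logistic model. Everything after the feature matrix is an assumption: that `beta` holds one coefficient per feature (drawn at random when `NULL`), that `true_target_prob` sets the intercept on the logit scale, and that the result is returned as a data frame.

```r
# Hypothetical completion; only the signature and the feature matrix come from the gist
sim_binom_sketch <- function(n_samples = 1000, n_features = 2,
                             true_target_prob = 0.5, beta = NULL, seed = NULL) {
  if (!is.null(seed)) {
    set.seed(seed)
  }
  x <- matrix(rnorm(n_samples * n_features),
              nrow = n_samples, ncol = n_features)
  colnames(x) <- paste0("x", seq_len(n_features))

  # Assumption: one coefficient per feature, random if not supplied
  if (is.null(beta)) {
    beta <- rnorm(n_features)
  }

  # Assumption: true_target_prob is the success probability when all features are 0
  linear_predictor <- qlogis(true_target_prob) + as.vector(x %*% beta)
  y <- rbinom(n_samples, size = 1, prob = plogis(linear_predictor))

  data.frame(y = y, x)
}

sim <- sim_binom_sketch(n_samples = 500, seed = 42)
```

Under these assumptions `true_target_prob` is the probability at the feature means, not the overall share of successes, which drifts away from it as the coefficients grow.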
eddjberry / sparklyr_cv_pipeline_example.R
Last active Aug 22, 2019
An example of creating a Spark pipeline with sparklyr
View sparklyr_cv_pipeline_example.R
# Load packages
library(dplyr)
library(sparklyr)
# Set up the connection
sc <- spark_connect(master = "local")
# Create a Spark DataFrame of mtcars
mtcars_sdf <- copy_to(sc, mtcars)