Sam Firke sfirke

sfirke /
Created Dec 30, 2020
Retrieving SolarEdge solar panel generation data from the API using Python
import pandas as pd
import solaredge
import time
s = solaredge.Solaredge("YOUR-API-KEY")
site_id = "YOUR-SITE-ID"  # placeholder: replace with your own site ID
# Edit this date range as you see fit
# At the maximum resolution (15-minute intervals), the API limits each query to one month of data
# This script queries one day at a time, pausing one second between requests; the pause is polite but probably not necessary
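The day-at-a-time loop described in the comments above could be sketched roughly as follows. This is a minimal sketch, not the gist's actual code: `daily_windows`, `fetch_by_day`, and the `fetch_day` callback are hypothetical names introduced for illustration, and `fetch_day` stands in for whichever `solaredge` client call you use to request one window of data.

```python
import time
from datetime import date, datetime, timedelta


def daily_windows(start, end):
    """Yield (window_start, window_end) datetime pairs, one per day, including `end`."""
    day = start
    while day <= end:
        window_start = datetime.combine(day, datetime.min.time())  # midnight of that day
        yield window_start, window_start + timedelta(days=1)
        day += timedelta(days=1)


def fetch_by_day(fetch_day, start, end, pause_seconds=1.0):
    """Call fetch_day(window_start, window_end) once per day and collect the results.

    fetch_day is a hypothetical callback wrapping the real API request.
    """
    results = []
    for window_start, window_end in daily_windows(start, end):
        results.append(fetch_day(window_start, window_end))
        time.sleep(pause_seconds)  # polite pause between requests
    return results
```

Passing the request in as a callback keeps the date arithmetic separate from the API client, so the loop can be checked with a stub before it is pointed at the real service.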
sfirke / tidytext_wordclouds.R
Created Mar 9, 2018
Make wordclouds from a text column in R
pacman::p_load(tidytext, wordcloud, janeaustenr, dplyr) # p_load comes from the pacman package
ppdf <- data.frame(prideprejudice, stringsAsFactors = FALSE)
# create a word cloud
create_word_cloud <- function(dat, col_name, exclude = "", max.words = 50, colors = "#034772", ...){
col <- deparse(substitute(col_name))
dat %>%
sfirke / split_tinker_combine_tidyverse.R
Created Mar 7, 2018
Using split with magrittr's %$% to reference the data.frames in the resulting list by name
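# I want to remove duplicate mpg rows where cylinder is 4
# Split, tinker with the data.frames by name, bind_rows
library(dplyr)
library(magrittr) # provides the %$% exposition pipe
mtcars %>%
  split(., .$cyl == 4) %$%
  bind_rows(
    `FALSE`, # rows where cyl != 4, left untouched
    `TRUE` %>%
      distinct(mpg, .keep_all = TRUE))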
sfirke / render_keep_md.R
Last active May 22, 2020
Function to build R package vignettes, retaining both .md and .Rmd
# From
# Usage: render_keep_md("tabyls")
render_keep_md <- function(vignette_name){
# added the "encoding" argument so that the oe character is passed through correctly to the resulting .md
rmarkdown::render(paste0("./vignettes/",vignette_name, ".Rmd"), clean=FALSE, encoding = 'UTF-8')
files_to_remove = paste0("./vignettes/",vignette_name, c(".html","",""))
lapply(files_to_remove, file.remove)
sfirke / fix_surveymonkey_two_row_headers.R
Last active Mar 29, 2018
(roughly) handle SurveyMonkey exports where the variable names are split over the first two rows
# Fix dual-row names: if the first-row value is neither NA nor contains the word "response", use it as the column name
# Note: read your SurveyMonkey .csv with readr::read_csv, not read.csv; otherwise this may not work
fix_SM_dual_row_names <- function(dat){
current_names <- names(dat)
row_1 <- unlist(dat[1, ])
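The naming rule stated in the comments above can be illustrated with a short, language-neutral sketch in plain Python. It is not the gist's R implementation: `merge_header_rows` is a hypothetical name, `None` stands in for NA, and the case-insensitive match on "response" is an assumption.

```python
def merge_header_rows(current_names, row_1):
    """For each column, use the first-row value when it is informative,
    otherwise keep the current column name."""
    merged = []
    for name, first in zip(current_names, row_1):
        is_missing = first is None  # None stands in for NA (assumption)
        # Case-insensitive match is an assumption, not taken from the gist
        is_placeholder = not is_missing and "response" in str(first).lower()
        merged.append(name if is_missing or is_placeholder else str(first))
    return merged
```

For example, `merge_header_rows(["Q1", "Q2", "Q3"], ["Response", None, "Apples"])` keeps `Q1` and `Q2` and replaces the third name with `Apples`.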
sfirke / add_centered_title.R
Last active Sep 21, 2017
Center all of your ggplot2 titles over the whole plot using a function
add_centered_title <- function(p, text, font_size){
title.grob <- textGrob(
label = text,
gp = gpar(fontsize = font_size,
View file28dc3223345c.R
Package: janitor
Title: Simple Tools for Examining and Cleaning Dirty Data
Authors@R: c(person("Sam", "Firke", email = "", role = c("aut", "cre")),
person("Chris", "Haid", email = "", role = "ctb"),
person("Ryan", "Knight", email = "", role = "ctb"))
Description: The main janitor functions can: perfectly format data.frame column
names; provide quick one- and two-variable tabulations (i.e., frequency
tables and crosstabs); and isolate duplicate records. Other janitor functions
nicely format the tabulation results. These tabulate-and-report functions
sfirke / final_predictions.R
Created Mar 16, 2017
Making final Kaggle March Mania predictions
final_blank <- read_csv("data/kaggle/SampleSubmission.csv") %>%
separate(Id, into = c("year", "lower_team", "higher_team"), sep = "_", convert = TRUE, remove = FALSE) %>%
final_blank_with_data <- final_blank %>%
add_kp_data %>%
create_vars_for_prediction %>%
mutate(lower_team_court_adv = as.factor("N")) %>%
dplyr::select(contains("diff"), lower_team_court_adv, contains("rank")) %>%
dplyr::select(-lower_pre_seas_rank_all, -higher_pre_seas_rank_all)
sfirke / gist:c0bd2b9c4d4e044b040966841e19a73b
Last active Oct 19, 2016
quick hack at get_fuzzy_dupes() function
pacman::p_load(fuzzyjoin, dplyr) # p_load comes from the pacman package
# returns clusters of records that almost match
get_fuzzy_dupes <- function(x, max_dist = 2){
result <- stringdist_inner_join(x, x, max_dist = max_dist, distance_col = "distance")
result <- result[result[[1]] != result[[2]], ] # remove actual 100% accurate duplicates
result <- t(apply(result, 1, sort)) # these two lines treat A, B as a duplicate of B, A and remove it. From
result <- result[!duplicated(result), ]
as_data_frame(result) %>%
sfirke / email_split.R
Created Jul 5, 2016
Separating first and last names in email addresses
get_part_before_dot <- function(email){
x <- str_split(email, "[.]")
lapply(x, `[[`, 1) %>%
dat <- data.frame(email = c("", ""))