Rob Story wrobstory

@wrobstory
wrobstory / forecast.txt
Created February 14, 2024 15:17
Feb 14 2024 Forecast
.DISCUSSION...Today through next Tuesday...Well, we mentioned a
nonzero chance of lowland snow in the last few discussions, and that
appears to be coming to fruition in what will be an extremely
challenging forecast for the lowlands north of about Salem. The
addition of high resolution guidance has significantly increased the
probabilities of snow accumulation for these areas, including for the
greater Portland and Vancouver metro area. Several inches of snow are
likely in the Columbia River Gorge east of Multnomah Falls, with over
a foot likely for the Cascades and upper portions of the Hood River
Valley by the time snow diminishes late Thursday or early Friday.
wrobstory / dataeng.md
Last active September 24, 2023 16:14
Data Engineering Problem

You're the first data engineer and find yourself in the following scenario:

Your company has three user-facing clients: Web, iOS, and Android. Your data science team is interested in analyzing the following data:

  1. Support messages
  2. Client interactions (clicks, touches, how they move through the app, etc.)

The data scientists need to be able to join these two data streams together on a common user_id to perform their analysis. Currently the support messages are going to a service owned by the backend team; they go through standard HTTP endpoints and are getting written to PostgreSQL. You're going to be responsible for the service receiving the client interactions.

Q1: Knowing that you're going to be in charge of getting this to some sort of data store downstream, what would your schemas look like? The only hard requirement is that support messages must have the message body, and client interactions have to have event and target fields to represent actions like click on login button and t
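
One possible shape for the two schemas, as a minimal sketch rather than a reference answer. Only `user_id`, `body`, `event`, and `target` come from the problem statement; the class names, the `client` field, and the timestamp fields are assumptions added for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SupportMessage:
    user_id: str           # join key shared with ClientInteraction
    body: str              # required by the problem statement
    created_at: datetime   # assumed field


@dataclass(frozen=True)
class ClientInteraction:
    user_id: str           # join key shared with SupportMessage
    event: str             # required, e.g. "click"
    target: str            # required, e.g. "login_button"
    client: str            # assumed field: "web", "ios", or "android"
    occurred_at: datetime  # assumed field


def join_on_user_id(messages, interactions):
    """Pair every support message with each interaction by the same user."""
    by_user = {}
    for interaction in interactions:
        by_user.setdefault(interaction.user_id, []).append(interaction)
    return [
        (message, interaction)
        for message in messages
        for interaction in by_user.get(message.user_id, [])
    ]


now = datetime(2023, 9, 24, tzinfo=timezone.utc)
messages = [SupportMessage("u1", "I can't log in", now)]
interactions = [
    ClientInteraction("u1", "click", "login_button", "ios", now),
    ClientInteraction("u2", "click", "signup_button", "web", now),
]
pairs = join_on_user_id(messages, interactions)
```

In a real pipeline the join would happen in the downstream data store; the in-memory version here only shows that a shared `user_id` column is enough to relate the two streams.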

| PostgreSQL Data Types | AWS DMS Data Types | Redshift Data Types |
|---|---|---|
| INTEGER | INT4 | INT4 |
| SMALLINT | INT2 | INT2 |
| BIGINT | INT8 | INT8 |
| NUMERIC (p,s) | If precision is 39 or greater, then use STRING. | If the scale is >= 0 and <= 37, then NUMERIC (p,s). If the scale is >= 38 and <= 127, then VARCHAR (Length). |
| DECIMAL(P,S) | If precision is 39 or greater, then use STRING. | If the scale is >= 0 and <= 37, then NUMERIC (p,s). If the scale is >= 38 and <= 127, then VARCHAR (Length). |
| REAL | REAL4 | FLOAT4 |
| DOUBLE | REAL8 | FLOAT8 |
| SMALLSERIAL | INT2 | INT2 |
| SERIAL | INT4 | INT4 |
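
The mapping above can be sketched as a small lookup, assuming the NUMERIC/DECIMAL rows read as "scale 0 through 37 maps to NUMERIC(p,s), scale 38 through 127 maps to VARCHAR". The names `REDSHIFT_SIMPLE_TYPES` and `redshift_type` are hypothetical helpers, not part of AWS DMS, and the VARCHAR length is left unspecified because the table does not give it.

```python
# Fixed-width types copied from the table: PostgreSQL type -> Redshift type.
REDSHIFT_SIMPLE_TYPES = {
    "INTEGER": "INT4",
    "SMALLINT": "INT2",
    "BIGINT": "INT8",
    "REAL": "FLOAT4",
    "DOUBLE": "FLOAT8",
    "SMALLSERIAL": "INT2",
    "SERIAL": "INT4",
}


def redshift_type(pg_type, precision=None, scale=None):
    """Return the Redshift type for a PostgreSQL type, per the table above."""
    name = pg_type.upper()
    if name in ("NUMERIC", "DECIMAL"):
        if precision is None or scale is None:
            raise ValueError("NUMERIC/DECIMAL need both precision and scale")
        if 0 <= scale <= 37:
            return f"NUMERIC({precision},{scale})"
        if 38 <= scale <= 127:
            return "VARCHAR"  # length is not specified in the table
        raise ValueError(f"scale {scale} is outside the documented range")
    try:
        return REDSHIFT_SIMPLE_TYPES[name]
    except KeyError:
        raise ValueError(f"no mapping listed for {pg_type!r}") from None
```

For example, `redshift_type("numeric", 18, 4)` yields `NUMERIC(18,4)`, while a scale of 40 falls through to `VARCHAR`.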
wrobstory / README.md
Last active March 28, 2023 02:44
D3 Brush and Tooltip Complete

Example for Cooperative Brushing and Tooltips in D3.

The completed chart, with both tooltips and brushing working cooperatively. You can start a brush-zoom on either the background or a data point.

wrobstory / README.md
Last active August 29, 2022 07:07
Folium Click-for-marker

A Leaflet.js map created with Folium: click on the map to add markers, double-click to remove them. This map was generated with the following Python code:

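import folium

map_4 = folium.Map(location=[46.8527, -121.7649], tiles='Stamen Terrain',
                   zoom_start=13)
# Fixed marker with a popup label
map_4.simple_marker(location=[46.8354, -121.7325], popup='Camp Muir')
# Each click on the map adds a new marker with this popup
map_4.click_for_marker(popup='Waypoint')
# Write the map to a standalone HTML file
map_4.create_map(path='mtrainier.html')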
wrobstory / redshift.sql
Created November 5, 2014 16:53
Redshift debugging queries
-- Gets all queries for a given date range
select starttime, endtime, trim(querytxt) as query
from stl_query
where starttime between '2014-11-04' and '2014-11-05'
order by starttime desc;
-- Gets all queries that have been aborted for a given date range
select starttime, endtime, trim(querytxt) as query, aborted
from stl_query
where aborted=1
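
When these queries are run from a script, the date range can be passed as parameters instead of being edited into the SQL. The sketch below is one way to do that; `build_stl_query` is a hypothetical helper, and the `%s` placeholders assume a DB-API driver that uses the "format" parameter style, such as psycopg2.

```python
def build_stl_query(start, end, aborted_only=False):
    """Build a parameterized stl_query lookup for a date range.

    Returns (sql, params) suitable for cursor.execute(sql, params).
    """
    columns = "starttime, endtime, trim(querytxt) as query"
    if aborted_only:
        columns += ", aborted"
    sql = (
        f"select {columns}\n"
        "from stl_query\n"
        "where starttime between %s and %s\n"
    )
    if aborted_only:
        sql += "and aborted = 1\n"
    sql += "order by starttime desc;"
    return sql, (start, end)


sql, params = build_stl_query("2014-11-04", "2014-11-05", aborted_only=True)
```

Only the column list and the optional `aborted` filter are interpolated by the helper itself; the user-supplied dates always travel as bound parameters.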
wrobstory / README.md
Last active March 22, 2022 21:06
Folium Employed Choropleth

A Leaflet.js map created with Folium and the default D3 threshold scale, with data bound between the Pandas DataFrame and the TopoJSON. See the Gist for the Python code to generate the DataFrame. The map was generated with the following Python code:

import folium

# county_geo (TopoJSON path) and df (Pandas DataFrame) are defined in the Gist
map_1 = folium.Map(location=[48, -102], zoom_start=3)
map_1.geo_json(geo_path=county_geo, data_out='data1.json', data=df,
               columns=['GEO_ID', 'Employed_2011'],
               key_on='feature.id',
               fill_color='YlOrRd', fill_opacity=0.7, line_opacity=0.3,
               topojson='objects.us_counties_20m')
map_1.create_map(path='map_1.html')
wrobstory / README.md
Last active March 22, 2022 20:03
Folium Choropleth Custom

A Leaflet.js map created with Folium and a custom D3 threshold scale. See the Gist for the Python code to generate the DataFrame. The map was generated with the following Python code:

map.geo_json(geo_path=state_geo, data=state_data,
             columns=['State', 'Unemployment'],
             threshold_scale=[5, 6, 7, 8, 9, 10],
             key_on='feature.id',
             fill_color='BuPu', fill_opacity=0.7, line_opacity=0.5,
             legend_name='Unemployment Rate (%)',
             reset=True)
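
To see what a threshold scale does with `threshold_scale=[5, 6, 7, 8, 9, 10]`, the sketch below reproduces the binning rule of a D3 threshold scale in plain Python: six thresholds split the values into seven bins, and a value equal to a threshold falls into the bin above it. The function name `threshold_bin` is hypothetical, and this illustrates the scale's semantics only, not Folium's own implementation.

```python
from bisect import bisect_right


def threshold_bin(value, thresholds):
    """Return the index of the bin that value falls into.

    thresholds must be sorted ascending; n thresholds give n + 1 bins.
    """
    return bisect_right(thresholds, value)


thresholds = [5, 6, 7, 8, 9, 10]
rates = {"A": 4.9, "B": 5.0, "C": 7.3, "D": 10.0, "E": 12.5}  # made-up values
bins = {state: threshold_bin(rate, thresholds) for state, rate in rates.items()}
```

Each bin index then selects one color from the chosen palette, so the palette needs one more color than there are thresholds.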
wrobstory / README.md
Last active May 1, 2020 02:31
Choropleth with Vincent

Demonstration of binding Vincent graphs to individual GeoJSON data.

wrobstory / traverse.rs
Last active April 26, 2020 21:16
Rust Traverse
// Wraps a borrowed integer in Ok; this example never returns Err.
fn into_result(input: &i32) -> Result<&i32, String> {
    Ok(input)
}

fn main() {
    let numbers: Vec<i32> = vec![1, 2, 3, 4, 5];
    // Lazily map each element to a Result
    let mapper = numbers.iter().map(|x| into_result(x));
    // Collect into a Vec of individual Results
    let vector_of_results = mapper.collect::<Vec<Result<&i32, String>>>();
    println!("{:?}", vector_of_results);
    // [Ok(1), Ok(2), Ok(3), Ok(4), Ok(5)]
}