@davidbjourno
Created July 4, 2018 14:22
Hydrogen multiple kernels demo

Load, join and filter with Python/pandas

import pandas as pd

# Referendum results, indexed by area code
df_results = pd.read_csv(
    'data/raw/EU-referendum-result-data.csv',
    index_col='Area_Code')
# Second CSV: skip 7 leading lines, read 380 data rows, treat '-' as missing
df_aps = pd.read_csv(
    'data/raw/2395818381.csv',
    index_col='mnemonic',
    skiprows=7,
    nrows=380,
    na_values='-')
# Mid-year estimates workbook, sheet MYE6: skip 4 leading rows
df_pop = pd.read_excel(
    'data/raw/ukmidyearestimates20122016.xls',
    sheet_name='MYE6',
    skiprows=4,
    index_col='Code')

# Join the three frames on their shared area-code index, then drop
# Northern Ireland, Scotland and any area missing either rate
df = df_results \
    .join(df_aps.iloc[:, [3, 7]]) \
    .join(df_pop.iloc[:, 1]) \
    .pipe(lambda df: df[~df['Region'].isin(['Northern Ireland', 'Scotland'])]) \
    .pipe(lambda df: df[df['Economic activity rate - aged 16-64'].notnull()]) \
    .pipe(lambda df: df[df['Employment rate - aged 16-64'].notnull()])

df

df.to_csv('joined-and-filtered.csv', index=False)

Mutate with R

library(tidyverse)

df <- read_csv('joined-and-filtered.csv')

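df %>%
  # Label each area by the side that won more votes
  mutate(result = ifelse(Remain > Leave, "remain", "leave")) %>%
  # Flag leave-voting areas where `Mid-2016` is 50 or more and the
  # employment rate is at or below the mean across all areas
  mutate(is_flagged =
    `Mid-2016` >= 50 &
      `Employment rate - aged 16-64` <= mean(`Employment rate - aged 16-64`) &
      result == "leave"
  ) %>%
  # write_csv (readr) omits the row-name column that write.csv would add
  write_csv('mutated.csv')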
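
The flagging rule above can be cross-checked outside R. The following is a minimal Python sketch of the same logic using only the standard library. The sample rows, the `Area` column and the `flag_rows` helper are made up for illustration and are not part of the original data or notebook.

```python
import csv
import io
from statistics import mean

# Made-up rows for illustration only; not real referendum or ONS figures.
SAMPLE_CSV = """Area,Remain,Leave,Mid-2016,Employment rate - aged 16-64
A,100,200,52.0,65.0
B,300,100,51.0,60.0
C,100,150,40.0,70.0
D,120,180,50.0,80.0
"""

EMPLOYMENT = 'Employment rate - aged 16-64'


def flag_rows(rows):
    """Add 'result' and 'is_flagged' keys, mirroring the two mutate() steps."""
    # The mean is taken over all rows, as mean() is inside mutate()
    avg_employment = mean(float(r[EMPLOYMENT]) for r in rows)
    flagged = []
    for r in rows:
        result = 'remain' if int(r['Remain']) > int(r['Leave']) else 'leave'
        is_flagged = (
            float(r['Mid-2016']) >= 50
            and float(r[EMPLOYMENT]) <= avg_employment
            and result == 'leave'
        )
        flagged.append({**r, 'result': result, 'is_flagged': is_flagged})
    return flagged


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in flag_rows(rows):
    print(row['Area'], row['result'], row['is_flagged'])
```

With these sample rows only area A is flagged: it voted leave, its `Mid-2016` value is at least 50, and its employment rate (65.0) is below the sample mean (68.75).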

Plot with D3

const jsdom = require('jsdom');
const d3 = require('d3');
const fs = require('fs');

const { JSDOM } = jsdom;
const { document } = (new JSDOM('')).window;

global.document = document;

const margin = { top: 20, right: 20, bottom: 20, left: 30 };
const width = 500 - margin.left - margin.right;
const height = 500 - margin.top - margin.bottom;
const x = d3.scaleLinear()
  .range([0, width]);
const y = d3.scaleLinear()
  .range([height, 0]);
const color = d3.scaleOrdinal(d3.schemeCategory10);
const xAxis = d3.axisBottom(x);
const yAxis = d3.axisLeft(y);
const svg = d3.select(document.body).append('svg')
  .attr('width', width + margin.left + margin.right)
  .attr('height', height + margin.top + margin.bottom);
const chart = svg.append('g')
  .attr('transform', `translate(${margin.left}, ${margin.top})`);

fs.readFile('mutated.csv', 'utf8', (error, data) => {
  if (error) throw error;

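  data = d3.csvParse(data);

  // csvParse returns every field as a string; coerce to numbers so that
  // d3.extent compares values numerically rather than alphabetically
  x.domain(d3.extent(data, d => +d['Mid-2016']));
  y.domain(d3.extent(data, d => +d['Employment rate - aged 16-64']));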

  chart.append('g')
      .attr('class', 'x axis')
      .attr('transform', 'translate(0,' + height + ')')
      .call(xAxis)
    .append('text')
      .attr('class', 'label')
      .attr('x', width)
      .attr('y', -6)
      .style('text-anchor', 'end')
      .style('fill', '#000')
      .text('Median age');

  chart.append('g')
      .attr('class', 'y axis')
      .call(yAxis)
    .append('text')
      .attr('class', 'label')
      .attr('transform', 'rotate(-90)')
      .attr('y', 6)
      .attr('dy', '.71em')
      .style('text-anchor', 'end')
      .style('fill', '#000')
      .text('Employment rate - aged 16-64');

  chart.selectAll('.dot')
      .data(data)
    .enter().append('circle')
      .attr('class', 'dot')
      .attr('r', 3.5)
      .attr('cx', d => x(d['Mid-2016']))
      .attr('cy', d => y(d['Employment rate - aged 16-64']))
      .style('fill', d => color(d.is_flagged));
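});

// fs.readFile is asynchronous: run this line only after the callback above
// has finished, otherwise the SVG is rendered without axes or dots
$$.svg(svg.node().outerHTML);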
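
CSV parsers such as `d3.csvParse` and Python's `csv.DictReader` return every field as a string, and the minimum and maximum of strings are decided alphabetically. The sketch below shows the effect in Python with the standard library only. The `rate` column, its values and the `extent` helper are made up for illustration.

```python
import csv
import io

# Made-up values for illustration only.
SAMPLE_CSV = """rate
9.5
72.1
100.0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))


def extent(values):
    """Return (min, max) of the values, similar in spirit to d3.extent."""
    values = list(values)
    return min(values), max(values)


string_extent = extent(r['rate'] for r in rows)          # compares text
numeric_extent = extent(float(r['rate']) for r in rows)  # compares numbers

print(string_extent)   # ('100.0', '9.5')
print(numeric_extent)  # (9.5, 100.0)
```

Compared as text, `'100.0'` sorts before `'9.5'`, so an axis domain built from unparsed strings can come out inverted or truncated. Converting each field to a number before computing the extent avoids this.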