
@zippeurfou
zippeurfou / usa_particule.md
Created December 12, 2014 05:26
Getting the particulate level from 1999 to 2013 in R, inspired by the EDA class on Coursera

Air pollution USA

Wednesday, December 10, 2014

## We will be looking at air pollution in the USA

Data are extracted from http://www.epa.go.

zippeurfou / eda.md
Last active August 29, 2015 14:15
EDA analysis

EDA analysis

Marc Ferradou
Wednesday, November 26, 2014

## Data Sets

The data set consists of 6 CSV files:

  • episode_country.csv => association between country id and country name
  • episode_region.csv => association between region id and region name (Europe...)
zippeurfou / codehunter.m
Last active August 29, 2015 14:16
codehunter
%Marc Ferradou
%v0.0.1
%Licence: WTFPL
%Date: 3/16/14
%This code is an answer to the BNP codehunter challenge. It took me about 2 hours, and I picked Matlab because it made the job simple.
%As easy as this code looks, I was actually one of the 20 winners. As we say, keep it simple (KISS).
%Read more here: https://graduates.bnpparibas.com/codehunter/crack-the-challenge/
%Result: this is a 29x29 transformation of a London (I believe) landscape
%with 3 Easter eggs.
%Note: I chose Matlab because it was, in my opinion, the quickest (in terms
zippeurfou / rmachinelearning.md
Last active November 27, 2016 08:58
Playing with R and machine learning

## Introduction

The question is: Tell us something interesting about the ping backs we receive from videos.

Input:

  • Question asked before
  • tsv data file
  • pdf file with data format

Output:

  • This document

I picked R for this analysis because it appears to me to be mainly an exploratory data analysis, and in my opinion R Markdown + ggplot2 are very convenient for that.

zippeurfou / rsqlite.md
Created February 13, 2015 21:08
Playing with R and SQLite

There are multiple questions:

  1. The operating regions are indicated by region_id. Generate a report of the average hourly_charge in each operating region as well as the overall average.

  2. Assuming that a booking is completed if it is not cancelled by the customer and has no reschedule events, generate a report based on the calendar week (running Sun-Sat) of the number of bookings done, number of bookings done using coupons, total hours booked, and number of bookings which were cancelled by the customer.

  3. Recurring bookings are bookings which happen on a regularly scheduled basis and are indicated by recurring_id and a frequency (freq) indicating how many weeks pass between each booking in the series. Determine the distribution of bookings based on the frequency of the recurring booking to which they belong across the days of the week on which they were completed.

  4. Say we have a problem with customers canceling and rescheduling bookings. Assuming all the bookings are from different users, pull metric
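
Question 1 above (average hourly_charge per operating region, plus the overall average) can be sketched roughly as follows. This is only an illustration, not the gist's own R code: it uses Python's standard `sqlite3` module so it runs on its own, and the table name `bookings`, its two-column layout, and the sample rows are assumptions made for the example. Only `region_id` and `hourly_charge` come from the question text.

```python
import sqlite3

# Hypothetical schema and sample rows: only the column names region_id and
# hourly_charge appear in the question; everything else is assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookings (region_id INTEGER, hourly_charge REAL)")
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?)",
    [(1, 20.0), (1, 30.0), (2, 40.0)],
)


def average_hourly_charge(conn):
    """Return (per-region averages, overall average) of hourly_charge."""
    per_region = conn.execute(
        "SELECT region_id, AVG(hourly_charge) "
        "FROM bookings GROUP BY region_id ORDER BY region_id"
    ).fetchall()
    # The overall average is computed over all rows, not as a mean of the
    # regional means, so regions with more bookings weigh more.
    overall = conn.execute(
        "SELECT AVG(hourly_charge) FROM bookings"
    ).fetchone()[0]
    return per_region, overall


per_region, overall = average_hourly_charge(conn)
```

Computing the overall figure in a second query keeps it a true average over bookings; averaging the regional means instead would give a different number whenever regions have unequal booking counts.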

zippeurfou / R_and_FB_graph_API.md
Last active August 21, 2018 04:09
Using the Facebook Graph API with R

Playing with Facebook

Tuesday, December 09, 2014

library(knitr)
opts_knit$set(upload.fun = imgur_upload, base.url = NULL) # upload all images to imgur.com
d3 = function() {
var d3 = {
version: "3.2.7"
};
if (!Date.now) Date.now = function() {
return +new Date();
};
var d3_document = document, d3_documentElement = d3_document.documentElement, d3_window = window;
try {
d3_document.createElement("div").style.setProperty("opacity", 0, "");