# Drop columns
df = df.drop(['col1', 'col2', 'col3'], axis=1)
# Count NaN values for each column
df.isnull().sum()
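A minimal runnable sketch of the two snippets above, using a toy frame (the column names are illustrative, not from any real dataset):

```python
import numpy as np
import pandas as pd

# Toy frame to demonstrate dropping columns.
df = pd.DataFrame({
    "col1": [1, 2, 3],
    "col2": [4.0, np.nan, 6.0],
    "col3": ["a", "b", None],
    "keep": [7, 8, 9],
})
df = df.drop(["col1", "col2", "col3"], axis=1)
print(list(df.columns))  # -> ['keep']

# Separate toy frame to demonstrate per-column NaN counts.
df2 = pd.DataFrame({"x": [1.0, np.nan], "y": [np.nan, np.nan]})
print(df2.isnull().sum().to_dict())  # -> {'x': 1, 'y': 2}
```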
import logging

import tornado.httpclient
import tornado.web

class ForwardingRequestHandler(tornado.web.RequestHandler):
    def handle_response(self, response):
        # Anything other than an HTTPError from the upstream fetch is
        # unexpected, so surface it as a 500.  (The original gist called an
        # undefined `note()` helper; stdlib logging is used here instead.)
        if response.error and not isinstance(response.error,
                                             tornado.httpclient.HTTPError):
            logging.error("response has error %s", response.error)
            self.set_status(500)
            self.write("Internal server error:\n" + str(response.error))
import pandas as pd

def _map_to_pandas(rdds):
    """Needs to be here due to pickling issues."""
    return [pd.DataFrame(list(rdds))]

def toPandas(df, n_partitions=None):
    """
    Returns the contents of `df` as a local `pandas.DataFrame` in a speedy fashion.
    The DataFrame is repartitioned if `n_partitions` is passed.
    """
    # Body reconstructed from the docstring (the original was truncated here):
    # collect each Spark partition as a pandas frame, then concatenate locally.
    if n_partitions is not None:
        df = df.repartition(n_partitions)
    df_pand = df.rdd.mapPartitions(_map_to_pandas).collect()
    df_pand = pd.concat(df_pand, ignore_index=True)
    df_pand.columns = df.columns
    return df_pand
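The collect-and-concat pattern this snippet relies on can be illustrated without Spark: each partition's rows become one small DataFrame, and the pieces are concatenated locally. A sketch with made-up partition contents:

```python
import pandas as pd

def _map_to_pandas(rows):
    # Mirrors the snippet: turn one partition's rows into a
    # single-element list holding a DataFrame.
    return [pd.DataFrame(list(rows))]

# Fake "partitions" standing in for what df.rdd.mapPartitions(...).collect()
# would return in the Spark version.
partitions = [[{"a": 1}, {"a": 2}], [{"a": 3}]]
pieces = [frame for part in partitions for frame in _map_to_pandas(iter(part))]
local = pd.concat(pieces, ignore_index=True)
print(local["a"].tolist())  # -> [1, 2, 3]
```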
Copyright 2014, 2016 Kyle Kingsbury
This outline accompanies a 12-16 hour overview class on distributed systems fundamentals. The course aims to introduce software engineers to the practical basics of distributed systems, through lecture and discussion. Participants will gain an intuitive understanding of key distributed systems terms, an overview of the algorithmic landscape, and explore production concerns.
For the network config, you just need to declare the FQDNs of the nodes (e.g. nodeX.filrouge.com) in /etc/hosts and in /etc/sysconfig/network (for CentOS or Red Hat).
The catch with this offer is that it only provides a single public IP address, so the other nodes have no internet access; you therefore need to install a proxy on the machine that is connected to the internet and configure the remaining nodes to go through it.
Here are tutorials for installing the proxy:
http://www.krizna.com/centos/how-to-install-squid-proxy-on-centos-6/
http://www.cyberciti.biz/tips/linux-unix-squid-proxy-server-authentication.html
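A sketch of the first step, generating the /etc/hosts entries for the nodes. The IP range and the three-node count are assumptions; substitute the real addresses, then append the output to /etc/hosts on every node:

```shell
#!/bin/sh
# Generate hosts entries for nodeX.filrouge.com (IPs are illustrative).
OUT="${1:-hosts.generated}"
: > "$OUT"
for i in 1 2 3; do
    printf '192.168.1.%s\tnode%s.filrouge.com\tnode%s\n' "$i" "$i" "$i" >> "$OUT"
done
cat "$OUT"
```

Once Squid is running on the internet-facing node, the other nodes would point at it with something like `export http_proxy="http://node1.filrouge.com:3128"` (3128 being Squid's default port; the hostname is an assumption).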
#!/usr/bin/env python
# encoding: utf-8

import tweepy  # https://github.com/tweepy/tweepy
import csv

# Twitter API credentials
consumer_key = ""
consumer_secret = ""
access_key = ""
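The gist imports `csv` to dump fetched tweets to disk; that writing step can be sketched with the stdlib alone. The tweet rows below are hypothetical stand-ins for what the tweepy calls would return:

```python
import csv

# Hypothetical (id, created_at, text) rows standing in for real API results.
tweets = [
    ("1", "2023-01-01", "hello"),
    ("2", "2023-01-02", "world"),
]

with open("tweets.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "created_at", "text"])  # header row
    writer.writerows(tweets)

with open("tweets.csv") as f:
    print(sum(1 for _ in f))  # header + 2 rows -> 3
```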
""" | |
A webserver to test Google OAuth in a couple of scenarios. | |
""" | |
import argparse | |
import time | |
import tornado.ioloop | |
import tornado.web | |
import tornado.auth | |
import tornado.gen |
#!/bin/sh
TABLE_SCHEMA=$1
TABLE_NAME=$2
mytime=$(date '+%y%m%d%H%M')
hostname=$(hostname | tr 'A-Z' 'a-z')
file_prefix="trimax${TABLE_NAME}${mytime}${TABLE_SCHEMA}"
bucket_name=$file_prefix
splitat="4000000000"
bulkfiles=200
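A quick check of how the naming variables above compose into a per-run dump prefix (the schema, table, and timestamp values here are illustrative):

```shell
#!/bin/sh
# Illustrative values -- the real script takes these from $1, $2, and date.
TABLE_SCHEMA=sales
TABLE_NAME=orders
mytime=2401151230   # stand-in for $(date '+%y%m%d%H%M')
file_prefix="trimax${TABLE_NAME}${mytime}${TABLE_SCHEMA}"
echo "$file_prefix"  # -> trimaxorders2401151230sales
```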