- Setting up a new development VM should be as easy as two commands. And it is:
vagrant init; vagrant up
- In this example we convert the HDP Sandbox to be used this way, but the how-to works with any existing VM.
/*
 * I add this to html files generated with pandoc.
 */
html {
  font-size: 100%;
  overflow-y: scroll;
  -webkit-text-size-adjust: 100%;
  -ms-text-size-adjust: 100%;
}
# Plotting networks in R
# An example of how to use R and the rgexf package to create a .gexf file for network visualization in Gephi
############################################################################################
# Clear workspace
rm(list = ls())
# Load libraries
library("igraph")
library("plyr")
throttle = new.env(parent = emptyenv())
throttle$recent = data.frame(domain = character(), last_visit = character())
#' A throttled version of GET
#'
#' This uses \code{httr::GET} to fetch a web page, but throttles based on domains.
#'
#' \code{slowGET} keeps a list of domains recently accessed by itself in a
#' separate environment. If a domain has been accessed since \code{pause}
#' seconds ago, it will delay execution until that time has passed
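The throttling idea described in the comment above (remember when each domain was last visited, and wait if it is revisited within `pause` seconds) can be sketched in Python. This is only an illustration under stated assumptions, not a port of `slowGET`: the class name `DomainThrottle`, the `wait` method, and the injectable `clock` and `sleep` parameters are all hypothetical, and the sketch performs no HTTP request itself.

```python
import time
from urllib.parse import urlparse


class DomainThrottle:
    """Hypothetical helper: pause if a domain is revisited too soon."""

    def __init__(self, pause=1.0, clock=time.monotonic, sleep=time.sleep):
        self.pause = pause          # minimum seconds between visits to one domain
        self._clock = clock         # injectable so the logic can be tested without waiting
        self._sleep = sleep
        self._last_visit = {}       # domain -> clock reading at the last visit

    def wait(self, url):
        """Block until `pause` seconds have passed since the last visit
        to this URL's domain, then record the visit. Returns the delay applied."""
        domain = urlparse(url).netloc
        last = self._last_visit.get(domain)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.pause - (self._clock() - last))
        if delay > 0:
            self._sleep(delay)
        self._last_visit[domain] = self._clock()
        return delay
```

A caller would invoke `throttle.wait(url)` immediately before fetching `url` with whatever HTTP client is in use. Keeping the bookkeeping in a dictionary keyed by domain mirrors the separate environment that the R comment mentions.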
sudo wget http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo -O /etc/yum.repos.d/epel-apache-maven.repo
sudo sed -i s/\$releasever/6/g /etc/yum.repos.d/epel-apache-maven.repo
sudo yum install -y apache-maven
mvn --version
The dplyr package in R makes data wrangling significantly easier.
The beauty of dplyr is that, by design, the options available are limited.
Specifically, a set of key verbs forms the core of the package.
Using these verbs you can solve a wide range of data problems effectively in a shorter timeframe.
While transitioning to Python I have greatly missed the ease with which I can think through and solve problems using dplyr in R.
The purpose of this document is to demonstrate how to execute the key dplyr verbs when manipulating data using Python (with the pandas package).
dplyr is organised around six key verbs:
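The verbs themselves are not listed in this excerpt. As a rough sketch, assuming the verbs meant are the commonly cited `filter`, `select`, `arrange`, `mutate`, `summarise` and `group_by`, their pandas counterparts might look like the following. The data frame and its column names are made up for illustration.

```python
import pandas as pd

# Made-up example data; the column names are illustrative only.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b"],
    "length": [1.0, 3.0, 2.0, 4.0],
    "width": [0.5, 1.5, 1.0, 2.0],
})

# filter(): keep the rows that match a condition
long_rows = df[df["length"] > 1.5]

# select(): keep a subset of columns
lengths = df[["species", "length"]]

# arrange(): sort rows, here in descending order of length
by_length = df.sort_values("length", ascending=False)

# mutate(): add a derived column without modifying df in place
with_area = df.assign(area=df["length"] * df["width"])

# group_by() + summarise(): one summary row per group
summary = (
    df.groupby("species")["length"]
    .mean()
    .reset_index(name="mean_length")
)
```

Each step returns a new object rather than changing `df`, which keeps the style close to dplyr's chained, non-mutating verbs.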
user www-data;
worker_processes 4;
pid /run/nginx.pid;
events {
    worker_connections 1024;
}
http {
#!/bin/bash
# These variables can be overwritten using the arguments below
VERSION="1.1.463"
# drwho is listed as user in YARN's Resource Manager UI.
USER="drwho"
# Depending on where the EMR cluster lives, you might have to change this to avoid security issues.
# To change the default password (and user), use the arguments below.
# If the cluster is not visible on the Internet, you can just leave the defaults for convenience.
PASS="tardis"
Ansible playbook to set up HTTPS using Let's Encrypt on nginx.
The Ansible playbook installs everything needed to serve static files from an nginx server over HTTPS.
The server gets an A rating on [SSL Labs](https://www.ssllabs.com/).
To use:
1. Install [Ansible](https://www.ansible.com/)
2. Set up an Ubuntu 16.04 server accessible over SSH
3. Create `/etc/ansible/hosts` according to the template below and change example.com to your domain
4. Copy the rest of the files to an empty directory (`playbook.yml` in the root of that folder and the rest in the `templates` subfolder)
#!/bin/bash
set -x -e
# AWS EMR bootstrap script
# for installing open-source R (www.r-project.org) with RHadoop packages and RStudio on AWS EMR
#
# tested with AMI 4.0.0 (hadoop 2.6.0)
#
# schmidbe@amazon.de
# 24. September 2014