Steven Ring smr547

smr547 / hdfs.md
Last active October 6, 2016 01:23
Hadoop Distributed File System on HPC

Introduction

The department's HPC platform offers users 25TB of storage space within the [Hadoop Distributed File System](http://www.aosabook.org/en/hdfs.html) (HDFS). This disk space is intended for large datasets that are read by programs built around the Map/Reduce pattern and run on the Hadoop platform.

User disk space

If you are user fred, you can view the contents of your personal HDFS space with the hadoop command.
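
The exact command is not shown here, so the following is only a sketch of what such a listing might look like when driven from Python. It assumes the conventional HDFS home directory layout `/user/<username>` (so `/user/fred` for fred) and that the `hadoop` CLI is on the `PATH`; the platform described here may be configured differently. The helper names `build_ls_command` and `list_hdfs_home` are hypothetical and exist only for this illustration. The equivalent shell invocation under the same assumption would be `hadoop fs -ls /user/fred`.

```python
import subprocess


def build_ls_command(user, subpath=""):
    """Build the argv list for listing a directory under a user's HDFS home.

    Assumption: home directories live at /user/<username>, which is the
    usual HDFS convention but is not confirmed for this cluster.
    """
    path = "/user/" + user
    if subpath:
        # Normalise the optional sub-directory so slashes are not doubled.
        path = path + "/" + subpath.strip("/")
    return ["hadoop", "fs", "-ls", path]


def list_hdfs_home(user):
    """Run the listing and return its text output.

    Requires the hadoop CLI on PATH; raises CalledProcessError if the
    command exits with a non-zero status.
    """
    output = subprocess.check_output(build_ls_command(user))
    return output.decode("utf-8")


# Example call (only works on a machine with the hadoop CLI installed):
#     print(list_hdfs_home("fred"))
```

Keeping the command construction separate from its execution makes it possible to check the generated arguments on a machine that has no Hadoop installation.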

#!/usr/bin/env python
import numpy
import luigi
import luigi.contrib.mpi as mpi
import cPickle as pickle
from os.path import exists
import time
from datetime import datetime
from mpi_log_utils import MPILogFilter