Skip to content

Instantly share code, notes, and snippets.

View nellaivijay's full-sized avatar
🎯
Focusing

Vijayakumar Ramdoss nellaivijay

🎯
Focusing
View GitHub Profile
#!/bin/bash
#
# MongoDB Backup Script
# VER. 0.1
# Note, this is a lobotomized port of AutoMySQLBackup
# (http://sourceforge.net/projects/automysqlbackup/) for use with
# MongoDB.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@sritchie
sritchie / WholeFile.java
Created February 2, 2011 17:32
Hadoop input format for swallowing entire files.
package forma;
import forma.WholeFileInputFormat;
import cascading.scheme.Scheme;
import cascading.tap.Tap;
import cascading.tuple.Fields;
import cascading.tuple.Tuple;
import cascading.tuple.TupleEntry;
import java.io.IOException;
import org.apache.hadoop.mapred.JobConf;
@cchandler
cchandler / gist:939951
Created April 24, 2011 22:55
glpk model for cloud vs colo costs
#Colo Server costs
set ServerTypes;
set InstanceTypes;
param CoreDemand; #How many cores do we need for a workload
param OurMoney; #The maximum upper-bound of what we're willing to spend
param ColoCostPerU; #How much are we paying per U of colocation
param Months; # How many months do we know we need this hardware
@mhermans
mhermans / neo4R_example.R
Created August 29, 2011 09:50
Neo4j-Cypher-R
# Requirements
#sudo apt-get install libcurl4-gnutls-dev # for RCurl on linux
#install.packages('RCurl')
#install.packages('RJSONIO')
library('RCurl')
library('RJSONIO')
query <- function(querystring) {
h = basicTextGatherer()
@brikis98
brikis98 / Average page views by industry
Created November 2, 2011 03:57
Apache Pig Script Example
pv_by_industry = GROUP profile_view by viewee_industry_id
pv_avg_by_industry = FOREACH pv_by_industry
GENERATE group as viewee_industry_id, AVG(profie_view) AS average_pv;
@ChrisWills
ChrisWills / .screenrc-main-example
Created November 3, 2011 17:50
A nice default screenrc
# GNU Screen - main configuration file
# All other .screenrc files will source this file to inherit settings.
# Author: Christian Wills - cwills.sys@gmail.com
# Allow bold colors - necessary for some reason
attrcolor b ".I"
# Tell screen how to set colors. AB = background, AF=foreground
termcapinfo xterm 'Co#256:AB=\E[48;5;%dm:AF=\E[38;5;%dm'
@tonyc
tonyc / gist:1384523
Last active June 3, 2024 15:34
Using strace and lsof

Using strace and lsof to debug blocked processes

You can use strace on a specific pid to figure out what a specific process is doing, e.g.:

strace -fp <pid>

You might see something like:

select(9, [3 5 8], [], [], {0, 999999}) = 0 (Timeout)

@jteso
jteso / ZipFileInputFormat.java
Created February 20, 2012 05:56
How to read zip files from Map Reduce job -- Rolling your own input format (source:https://github.com/cotdp)
package com.jteso.hadoop.contrib.inputformat;
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
@darkseed
darkseed / apache-logs-hive.sql
Created March 12, 2012 00:04 — forked from emk/apache-logs-hive.sql
Apache log analysis with Hadoop, Hive and HBase
-- This is a Hive program. Hive is an SQL-like language that compiles
-- into Hadoop Map/Reduce jobs. It's very popular among analysts at
-- Facebook, because it allows them to query enormous Hadoop data
-- stores using a language much like SQL.
-- Our logs are stored on the Hadoop Distributed File System, in the
-- directory /logs/randomhacks.net/access. They're ordinary Apache
-- logs in *.gz format.
--
-- We want to pretend that these gzipped log files are a database table,
@tariqmislam
tariqmislam / instructions and how-to
Created March 22, 2012 15:58
Setting Up Hadoop 0.20.2 on Windows 7 With Cygwin
=================================================================
SETTING UP SSHD AS A SERVICE FOR RUNNING HADOOP DAEMONS ON WINDOWS 7
=================================================================
Steps:
1. Download 'setup.exe' from Cygwin website
2. Right-click on 'setup.exe'
3. Leave settings as they are, click through until you come to the plugin selection window
3.1 - Make sure that the installation directory is 'C:\cygwin'