@sureshsaggar
sureshsaggar / Apache Oozie - JA009: java.io.IOException: Invalid job requirements.
Created November 27, 2012 05:30
Apache Oozie - JA009: java.io.IOException: Invalid job requirements.
2012-11-26 18:46:14,525 WARN ActionStartXCommand:542 - USER[hduser] GROUP[-] TOKEN[] APP[map-reduce-wf] JOB[0000000-121126183401740-oozie-hdus-W] ACTION[0000000-121126183401740-oozie-hdus-W@mr-node] Error starting action [mr-node]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: java.io.IOException: job_201211260754_0024(-1 memForMapTasks -1 memForReduceTasks): Invalid job requirements.
at org.apache.hadoop.mapred.JobTracker.checkMemoryRequirements(JobTracker.java:4992)
at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3752)
at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3695)
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
sureshsaggar / bucket_mqtt_logs.pl
Last active December 10, 2015 01:48
Perl script that sorts log files into separate output directories based on the file suffix.
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy;
my $usage = <<'END';
# Usage: perl $0 <input_logs_dir> <output_logs_dir> <suffix_regex?>
# Example: perl bucket_mqtt_logs.pl ./snapshots/ ./snapshots/ 'log\.((\d{4})-(\d{2})-(\d{2}))-(\d{2})'
END
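The script body is cut off here, so the following is only a rough Python sketch of the bucketing idea described above, not the author's Perl. It reuses the suffix regex from the usage example. Using the date part of the suffix (capture group 1) as the bucket name is an assumption, and the file names are made up for illustration.

```python
import re

# Suffix pattern copied from the usage example above.
SUFFIX_RE = re.compile(r'log\.((\d{4})-(\d{2})-(\d{2}))-(\d{2})')


def bucket_for(filename):
    """Return the date part of the suffix (group 1), or None if there is no match.

    Assumption: one output directory per date; the original script may differ.
    """
    match = SUFFIX_RE.search(filename)
    return match.group(1) if match else None


def plan_buckets(filenames):
    """Group file names by bucket without touching the file system."""
    buckets = {}
    for name in filenames:
        key = bucket_for(name)
        if key is not None:
            buckets.setdefault(key, []).append(name)
    return buckets


# Hypothetical file names, for illustration only.
names = [
    "mqtt.log.2012-12-20-05",
    "mqtt.log.2012-12-20-06",
    "mqtt.log.2012-12-21-00",
    "README",
]
plan = plan_buckets(names)
for bucket, files in sorted(plan.items()):
    print(bucket, files)
```

Planning the grouping separately from the copy step keeps the sketch free of side effects; a real run would then copy each file into `<output_logs_dir>/<bucket>/`.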
sureshsaggar / Setting up STUD network proxy
Last active December 10, 2015 01:58
Setting up STUD network proxy
Put the configuration in a file and validate it with the "-t" option.
root:~/stud# ./stud -t --config=ss_default_config.conf --ssl
Trying to initialize SSL contexts with your certificates{core}
Note: no DH parameters found in /root/stud/saggar.in.includesprivatekey.pem
stud configuration looks ok.
STUD creates two separate processes. Run ps from a different terminal and you will see two PIDs:
root:~/stud# ps -ef| grep stud
sureshsaggar / Setup GeoIP with NginX & PHP
Last active June 26, 2022 20:35
Setup GeoIP with NginX & PHP
On my Ubuntu machine I located the GeoIP.dat file. If it is not available, download and install it.
root@localhost:~# locate GeoIP.dat
/usr/share/GeoIP/GeoIP.dat
Open the Nginx configuration (/etc/nginx/nginx.conf) and add a geoip_country <path to GeoIP.dat>
line under the "http" block. Example block:
http {
# SS - meant to find country code from the client IP
sureshsaggar / Analytics - Join two collections in MongoDB
Last active December 10, 2015 14:49
Analytics - Join two collections in MongoDB
/*
* Pending - Convert _id to uid or vice versa
*/
// Collection 01 - users
users_map = function() {
// Simply emit the msisdn and 0 for the file length.
// The file length will come from the other collection.
emit(this._id, { msisdn: this.msisdn, file_length: 0 });
}
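The snippet stops after the first mapper, so here is a minimal Python simulation of the join pattern the comments describe: both collections emit under the same key, and a reduce step merges the values. It is a sketch under assumptions, not the original MongoDB code. The second mapper, its `uid` and `length` field names, and the sample documents are all hypothetical.

```python
from collections import defaultdict


def users_map(doc):
    # Mirrors users_map above: key by _id, with 0 as the file_length placeholder.
    yield doc["_id"], {"msisdn": doc["msisdn"], "file_length": 0}


def files_map(doc):
    # Hypothetical mapper for the other collection; 'uid' and 'length' are assumed names.
    yield doc["uid"], {"msisdn": None, "file_length": doc["length"]}


def reduce_values(values):
    # Keep whichever msisdn is present and sum the file lengths.
    merged = {"msisdn": None, "file_length": 0}
    for value in values:
        if value["msisdn"] is not None:
            merged["msisdn"] = value["msisdn"]
        merged["file_length"] += value["file_length"]
    return merged


def map_reduce_join(users, files):
    # Group emitted values by key, then reduce each group to one joined record.
    grouped = defaultdict(list)
    for mapper, docs in ((users_map, users), (files_map, files)):
        for doc in docs:
            for key, value in mapper(doc):
                grouped[key].append(value)
    return {key: reduce_values(values) for key, values in grouped.items()}


# Made-up sample documents, for illustration only.
users = [{"_id": 1, "msisdn": "msisdn-A"}, {"_id": 2, "msisdn": "msisdn-B"}]
files = [{"uid": 1, "length": 120}, {"uid": 1, "length": 30}, {"uid": 2, "length": 75}]

joined = map_reduce_join(users, files)
print(joined)
```

Emitting values of the same shape from both mappers is what lets a single reduce function merge them, which is why the users mapper carries a zero `file_length`.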
sureshsaggar / gist:5270339
Last active December 15, 2015 13:49
ERROR 1066: Unable to open iterator for alias log
hdfs@hadoop-prod-growthui:~$ cat /var/lib/hdfs/pig_1364556796933.log
Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias log
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias log
at org.apache.pig.PigServer.openIterator(PigServer.java:836)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
sureshsaggar / gist:5493610
Created May 1, 2013 03:40
Forecasting in R: Developing Python API to analyze time series data in Redis
#!/usr/bin/env python
'''
Description: dashboard/* primarily powers the APIs required to support the growth dashboards.
@author: sureshsaggar
'''
from httpserver import *
from werkzeug.routing import *
import time
import redis
sureshsaggar / Snippet from Redis
Created May 1, 2013 08:40
Forecasting in R: Python API to analyze time series data in Redis
redis 127.0.0.1:6379> hgetall linearregression
1) "1367373828" <<<< timestamp
2) "30860" <<<< visitors count
3) "1367473828"
4) "32860"
5) "1367273828"
6) "28860"
7) "1367073828"
8) "27060"
.....
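To illustrate how a trend line can be derived from timestamp/count pairs like these, here is a minimal ordinary least squares sketch in plain Python. It is not the author's API code, which is not shown here; the function names are hypothetical. Only the four pairs visible in the listing are used.

```python
# Timestamp -> visitor count pairs copied from the hash listing above.
samples = {
    1367373828: 30860,
    1367473828: 32860,
    1367273828: 28860,
    1367073828: 27060,
}


def fit_line(points):
    """Ordinary least squares fit of y = m * x + c.

    Centering on the means keeps the arithmetic stable for large epoch timestamps.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


def predict(m, c, timestamp):
    """Evaluate the fitted line at a given epoch timestamp."""
    return m * timestamp + c


m, c = fit_line(sorted(samples.items()))
print(m, c)  # roughly 0.0143429 and -19581061.76
```

For these four points the slope and intercept agree with the "m" and "c" values in the API response listed further down this page. A slope of about 0.0143 visitors per second corresponds to roughly 1,239 additional visitors per day.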
sureshsaggar / gist:5494401
Created May 1, 2013 09:02
Forecasting in R: Python API to analyze time series data in Redis
curl -X GET http://0.0.0.0:6600/rpy/linearregression
{
"stat": "pass",
"future": {
"c": -19581061.761599176,
"points": 4,
"m": 0.01434285714285654,
"predictions": {
"2013-05-05": 36307,
"2013-05-04": 35068,
sureshsaggar / Hadoop distcp between hortonworks and cloudera
Last active December 18, 2015 04:48
Hadoop distcp between hortonworks and cloudera
hdfs@hadoop-prod-growthui:~$ hadoop distcp -i hdfs://hadoop-prod-master.vpc:8020/data/analytics/smsrecords hdfs://10.0.0.144:8020/data/analytics/smsrecords
13/06/07 07:18:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=true, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[hdfs://hadoop-prod-master.vpc:8020/data/analytics/smsrecords], targetPath=hdfs://10.0.0.144:8020/data/analytics/smsrecords}
13/06/07 07:18:22 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
13/06/07 07:18:23 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
13/06/07 07:18:26 ERROR tools.DistCp: Exception encountered
java.io.IOException: Failed on local exception: java.io.IOException: Broken pipe; Host Details : local host is: "hadoop-prod-growthui.vpc/10.0.0.230"; destination host is: "ip-10-0-0-144.ap-southeast-1.comp