Dedunu Dhananjaya dedunumax

from urllib.request import urlopen
from xml.dom import minidom

__author__ = 'dedunu'


class Reader:
    def get_url_list(self, url_string):
        data = urlopen(url_string)
        rss_string = b''
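The preview above is cut off before the feed is parsed. As a rough illustration only, and not the author's actual code, the sketch below shows one way `minidom` could pull the `<link>` of every `<item>` out of RSS bytes that have already been downloaded. The function name `extract_item_links` and the sample feed are assumptions made for this example.

```python
from xml.dom import minidom


def extract_item_links(rss_bytes):
    """Return the text of every <link> nested inside an <item> element.

    Hypothetical helper: the original gist's parsing logic is not shown,
    so this is only one plausible way to read links from an RSS 2.0 feed.
    """
    document = minidom.parseString(rss_bytes)
    links = []
    for item in document.getElementsByTagName("item"):
        for link in item.getElementsByTagName("link"):
            # Skip empty <link/> elements that have no text node.
            if link.firstChild is not None:
                links.append(link.firstChild.data.strip())
    return links


# Made-up feed used only to exercise the helper (no network access needed).
SAMPLE_RSS = b"""<?xml version="1.0"?>
<rss version="2.0"><channel><title>Example</title>
<link>http://example.com/</link>
<item><title>A</title><link>http://example.com/a</link></item>
<item><title>B</title><link>http://example.com/b</link></item>
</channel></rss>"""

item_links = extract_item_links(SAMPLE_RSS)
```

The channel-level `<link>` is ignored on purpose, because only links inside `<item>` elements are collected.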
import urllib2
from xml.dom import minidom

__author__ = 'dedunu'


class Reader:
    def get_url_list(self, url_string):
        data = urllib2.urlopen(url_string)
        rss_string = ''
dedunumax / remove_all_followers.py
Last active December 15, 2022 13:57
Remove all the followers from your Twitter account.
import tweepy
__author__ = 'dedunumax'
'''
This script removes all the followers from your Twitter account. To do that, it blocks each follower one by one and
then unblocks them. If you are following your followers, you will no longer be subscribed to them once you run this job.
Run this script carefully.
Install the tweepy module using pip. To install tweepy, run the command below in your terminal.
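The block-then-unblock approach described above can be sketched roughly as follows. This is a minimal illustration, not the gist's actual code: the function name `remove_followers` is hypothetical, and `api` is assumed to be an authenticated `tweepy.API`-like object exposing `create_block(user_id=...)` and `destroy_block(user_id=...)`. How the follower IDs are fetched depends on the tweepy version, so the sketch takes them as an argument.

```python
def remove_followers(api, follower_ids):
    """Block and then immediately unblock every follower ID.

    Blocking a user removes them from your followers; unblocking afterwards
    leaves them unblocked but no longer following you.

    Assumptions: `api` provides create_block(user_id=...) and
    destroy_block(user_id=...); `follower_ids` is any iterable of user IDs.
    """
    removed = []
    for user_id in follower_ids:
        api.create_block(user_id=user_id)
        api.destroy_block(user_id=user_id)
        removed.append(user_id)
    return removed
```

Passing the client in as a parameter makes it possible to try the loop against a stub object before pointing it at a real account.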
dedunumax / Hadoop Cluster ID mismatch error log
Last active August 29, 2015 14:21
Hadoop Cluster ID mismatch error
FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool (Datanode Uuid unassigned) service to master/192.168.1.1:9000. Exiting.
java.io.IOException: Incompatible clusterIDs in /home/hadoop/hadoop/data: namenode clusterID = CID-68a4c0d2-5524-486e-8bc9-e1fc3c5c2e29; datanode clusterID = CID-c6c3e9e5-be1c-4a3f-a4b2-bb9441a989c5
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:646)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:320)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:403)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:422)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1311)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1276)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(
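To make the mismatch in the log above easier to spot, here is a small sketch that extracts the two cluster IDs from the `Incompatible clusterIDs` message and compares them. The helper name `parse_cluster_ids` and the regular expression are assumptions written for this example and are not part of Hadoop.

```python
import re

# Matches the "namenode clusterID = ...; datanode clusterID = ..." part
# of the IOException message shown in the log.
CLUSTER_ID_PATTERN = re.compile(
    r"namenode clusterID = (?P<namenode>CID-[0-9a-f-]+); "
    r"datanode clusterID = (?P<datanode>CID-[0-9a-f-]+)"
)


def parse_cluster_ids(log_line):
    """Return (namenode_id, datanode_id), or None if the line does not match."""
    match = CLUSTER_ID_PATTERN.search(log_line)
    if match is None:
        return None
    return match.group("namenode"), match.group("datanode")


# The exception line copied from the log above.
LOG_LINE = (
    "java.io.IOException: Incompatible clusterIDs in /home/hadoop/hadoop/data: "
    "namenode clusterID = CID-68a4c0d2-5524-486e-8bc9-e1fc3c5c2e29; "
    "datanode clusterID = CID-c6c3e9e5-be1c-4a3f-a4b2-bb9441a989c5"
)

namenode_id, datanode_id = parse_cluster_ids(LOG_LINE)
ids_match = namenode_id == datanode_id
```

For this log line `ids_match` is `False`, which is the condition the DataNode reports before exiting.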
15/05/21 09:48:09 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/05/21 09:48:10 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/05/21 09:48:10 INFO input.FileInputFormat: Total input paths to process : 1
15/05/21 09:48:10 INFO input.FileInputFormat: Total input paths to process : 1
15/05/21 09:48:10 INFO input.FileInputFormat: Total input paths to process : 1
15/05/21 09:48:10 INFO mapreduce.JobSubmitter: number of splits:3
15/05/21 09:48:10 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1432197111554_0002
15/05/21 09:48:10 INFO impl.YarnClientImpl: Submitted application application_1432197111554_0002
15/05/21 09:48:10 INFO mapreduce.Job: The url to track the job: http://hdp101.local:8088/proxy/application_1432197111554_0002/
15/05/21 09:48:10 INFO mapreduce.Job: Running job: job_1432197111554_0002
dedunumax / MultiInputSampleHadoop.java
Created May 21, 2015 09:50
Sample Main class for Hadoop MultiInputs.
package org.dedunu.hadoop.muiltiinputsample;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
dedunumax / AirlineInputFormat.java
Created May 21, 2015 09:49
Sample Custom InputFormat class for Hadoop.
package org.dedunu.hadoop.muiltiinputsample;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.LineRecordReader;
import java.io.IOException;
ABC_America_Airline_Jan_2015.txt
Date SSN From To Amount($)
01/15/15 12345678 CO TX 200
01/16/15 23452345 NV UT 150
01/16/15 34252454 CA CO 200
01/16/15 56785678 CA TX 150
01/17/15 43545666 LA UT 200
01/17/15 67856783 TX CO 150
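As an illustration of the record layout in the sample file above, the sketch below parses the rows and totals the amount per origin state. It assumes the fields are separated by whitespace (spaces or tabs), and the names `parse_records` and `total_by_origin` are hypothetical helpers that do not come from the original project.

```python
from collections import defaultdict

# The sample rows shown above, including the header line.
SAMPLE = """\
Date SSN From To Amount($)
01/15/15 12345678 CO TX 200
01/16/15 23452345 NV UT 150
01/16/15 34252454 CA CO 200
01/16/15 56785678 CA TX 150
01/17/15 43545666 LA UT 200
01/17/15 67856783 TX CO 150
"""


def parse_records(text):
    """Turn each data row into a dict, skipping the header line."""
    rows = text.strip().splitlines()[1:]
    records = []
    for row in rows:
        date, ssn, origin, destination, amount = row.split()
        records.append({
            "date": date,
            "ssn": ssn,
            "from": origin,
            "to": destination,
            "amount": int(amount),
        })
    return records


def total_by_origin(records):
    """Sum the Amount($) column for each value of the From column."""
    totals = defaultdict(int)
    for record in records:
        totals[record["from"]] += record["amount"]
    return dict(totals)


records = parse_records(SAMPLE)
totals = total_by_origin(records)
```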
dedunumax / Increased-Vagrantfile
Created May 18, 2015 12:59
In this Vagrantfile I have increased the memory of the Vagrant nodes.
Vagrant.configure("2") do |config|
  config.vm.define "master" do |master|
    master.vm.box = "ubuntu/trusty64"
    master.vm.hostname = "master.local"
    master.vm.network "private_network", ip: "192.168.2.2"
  end
  config.vm.provider "virtualbox" do |v|
    v.memory = 8192
    v.cpus = 2
dedunumax / vagrantoutput.log
Created May 18, 2015 12:36
vagrant output for sample project
Bringing machine 'master' up with 'virtualbox' provider...
Bringing machine 'slave1' up with 'virtualbox' provider...
Bringing machine 'slave2' up with 'virtualbox' provider...
==> master: Importing base box 'ubuntu/trusty64'...
==> master: Matching MAC address for NAT networking...
==> master: Checking if box 'ubuntu/trusty64' is up to date...
==> master: A newer version of the box 'ubuntu/trusty64' is available! You currently
==> master: have version '20150420.1.1'. The latest is version '20150512.0.1'. Run
==> master: `vagrant box update` to update.
==> master: Setting the name of the VM: test2_master_1431951423815_51724