using System;
using System.Text;
using Microsoft.ServiceBus.Messaging;
using System.Net;
using System.IO;
namespace StreamingAnalyticsEventPublisher
{
    class MeetupRSVPEventSender
    {
import java.util.Properties;
import org.apache.oozie.client.OozieClient;
import org.apache.oozie.client.WorkflowJob;
public class myOozieWorkflowJavaAPICall {
    public static void main(String[] args) {
        OozieClient wc = new OozieClient("http://cdh-dev01:11000/oozie");
This gist includes a mapper, reducer and driver in Java that can parse log files using
regex. The code for the combiner is the same as the reducer.
Use case: Count the number of occurrences of processes that got logged, inception to date.
Includes:
---------
Sample data and scripts for download: 01-ScriptAndDataDownload
Sample data and structure: 02-SampleDataAndStructure
Mapper: 03-LogEventCountMapper.java
Reducer: 04-LogEventCountReducer.java
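The counting logic described above can be tried without a cluster. The following is a minimal sketch, not the gist's actual mapper or reducer: it assumes a syslog-style line layout ("Mon DD HH:MM:SS host process[pid]: message"), and the class name, regex, host name and sample lines are all made up for illustration. The map step (emit the process name) and the reduce step (sum the ones) are collapsed into a single in-memory loop.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the mapper + reducer pair; uses no Hadoop classes.
public final class LogEventCountSketch {

    // Assumed layout: "Mon DD HH:MM:SS host process[pid]: message" (pid optional).
    // Group 1 captures the process name.
    private static final Pattern LINE = Pattern.compile(
        "^\\w{3}\\s+\\d{1,2}\\s+\\d{2}:\\d{2}:\\d{2}\\s+\\S+\\s+([^\\s\\[:]+)(?:\\[\\d+\\])?:\\s.*$");

    // "Map" step: extract the process name, or null if the line does not match.
    static String processOf(String line) {
        Matcher m = LINE.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    // "Reduce" step: sum one count per matching line, keyed by process name.
    static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String process = processOf(line);
            if (process != null) {
                counts.merge(process, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines, including one that should be skipped.
        System.out.println(count(Arrays.asList(
            "May  3 11:52:54 host01 init: tty (/dev/tty6) main process (1208) killed by TERM signal",
            "May  3 11:53:31 host01 kernel: registered taskstats version 1",
            "May  3 11:53:32 host01 kernel: Freeing unused kernel memory",
            "May  3 11:54:10 host01 sshd[1234]: Accepted password for someuser",
            "this line is not in the assumed format")));
    }
}
```

Because summing is associative and commutative, the same reduce logic can be reused as the combiner, which is consistent with the note above that the combiner code matches the reducer.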
This gist is part of a series of gists related to map-side joins in Java MapReduce.
In the gist https://gist.github.com/airawat/6597557, we added the reference data available
in HDFS to the distributed cache from the driver code.
This gist demonstrates adding a local file to the distributed cache via the command line.
Refer to the gist at https://gist.github.com/airawat/6597557 for:
1. Data samples and structure
2. Expected results
3. Commands to load data to HDFS
Kerberos
Kerberos is a network authentication protocol. It is designed to provide strong authentication for client/server applications by using secret-key cryptography.
Kerberos Principals
A user in Kerberos is called a principal, which is made up of three distinct components: the primary, the instance, and the realm.
A Kerberos principal represents a unique identity in a Kerberos-secured system.
The first component of the principal is called the primary, or sometimes the user component.
The primary is an arbitrary string and may be the operating system username of the user or the name of a service.
The primary is followed by an optional section called the instance, which is used to create principals for users in special roles or to define the host on which a service runs.
An instance, if it exists, is separated from the primary by a slash, and its content is used to disambiguate multiple principals for a single user or service.
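To make the three components concrete, here is a minimal sketch that splits a principal string of the form primary[/instance]@REALM. It is illustrative only: the class name, the sample principals and the EXAMPLE.COM realm are hypothetical, and the sketch ignores the escaping rules that real Kerberos libraries apply to special characters.

```java
// Hypothetical helper; not part of any Kerberos library.
public final class KerberosPrincipalSketch {
    public final String primary;
    public final String instance; // empty string when no instance is present
    public final String realm;    // empty string when no realm is present

    private KerberosPrincipalSketch(String primary, String instance, String realm) {
        this.primary = primary;
        this.instance = instance;
        this.realm = realm;
    }

    // Splits "primary[/instance][@REALM]" into its parts (no escape handling).
    public static KerberosPrincipalSketch parse(String principal) {
        int at = principal.lastIndexOf('@');
        String name = at >= 0 ? principal.substring(0, at) : principal;
        String realm = at >= 0 ? principal.substring(at + 1) : "";

        int slash = name.indexOf('/');
        String primary = slash >= 0 ? name.substring(0, slash) : name;
        String instance = slash >= 0 ? name.substring(slash + 1) : "";

        if (primary.isEmpty()) {
            throw new IllegalArgumentException("principal has no primary: " + principal);
        }
        return new KerberosPrincipalSketch(primary, instance, realm);
    }

    public static void main(String[] args) {
        // Made-up principals: a plain user, a user in a special role, a service on a host.
        String[] samples = {
            "alice@EXAMPLE.COM",
            "alice/admin@EXAMPLE.COM",
            "hdfs/nn01.example.com@EXAMPLE.COM"
        };
        for (String s : samples) {
            KerberosPrincipalSketch p = parse(s);
            System.out.println(s + " -> primary=" + p.primary
                + ", instance=" + p.instance + ", realm=" + p.realm);
        }
    }
}
```

The second and third samples show the two uses of the instance mentioned above: distinguishing a user acting in a special role, and naming the host on which a service runs.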
spark-submit --class com.khanolkar.bda.util.CompactRawLogs \
............
MyJar-1.0.jar \
"/user/akhanolk/data/raw/streaming/to-be-compacted/" \
"/user/akhanolk/data/raw/compacted/" \
"2" "128" "oozie-124"
package com.khanolkar.bda.util
/**
 * @author Anagha Khanolkar
 */
import org.apache.spark.sql.SparkSession
import org.apache.hadoop.fs.{ FileSystem, Path }
import org.apache.hadoop.conf.Configuration
import org.apache.spark.sql._
import com.databricks.spark.avro._
This gist includes components of an Oozie (time-initiated) coordinator application: scripts/code, sample data
and commands. Oozie actions covered: HDFS action, email action, Java main action,
Hive action. Oozie controls covered: decision, fork-join. The workflow includes a
sub-workflow that runs two Hive actions concurrently. The Hive table is partitioned.
Parsing uses the Hive regex SerDe and Java regex. Also, the Java mapper gets the input
directory path and includes part of it in the key.
Use case: Parse Syslog-generated log files to generate reports.
Pictorial overview of job:
#!/bin/sh
set -x
# create the input file based on size (you can get size pattern by running fdisk -l as root)
# Be sure to exclude the Root disk if it is part of your config. You must edit this file to do so
size=$1
shift;
fdisk -l|grep $size|awk '{print $2}'|sed -e"s/\:$//g" > foo
My blog has an introduction to reduce-side joins in Java MapReduce:
http://hadooped.blogspot.com/2013/09/reduce-side-join-options-in-java-map.html