Anjaiah Methuku anjijava16

💭
Awesome
View GitHub Profile
@anjijava16
anjijava16 / SkipMapper.java
Created December 26, 2016 10:17 — forked from amalgjose/SkipMapper.java
Mapreduce program for removing stop words from the given text files. Hadoop Distributed cache and counters are used in this program
package com.hadoop.skipper;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;
import java.util.StringTokenizer;
import org.apache.hadoop.fs.Path;
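The gist above is truncated to its package declaration and imports. Its core idea — tokenizing each input line and dropping tokens found in a stop-word set that the mapper loads once (normally from a file shipped via the Hadoop distributed cache) — can be sketched in plain Java. The class and method names below are illustrative, not the gist's own:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.StringTokenizer;

public class StopWordFilter {
    // In the real job the stop words are read from a file placed in the
    // Hadoop distributed cache during setup(); a plain Set stands in here.
    private final Set<String> stopWords;

    public StopWordFilter(Set<String> stopWords) {
        this.stopWords = stopWords;
    }

    // Mirrors what a map() call does per input line: tokenize the line
    // and keep only tokens that are not in the stop-word set.
    public String filter(String line) {
        StringBuilder out = new StringBuilder();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            String word = tokens.nextToken();
            if (!stopWords.contains(word.toLowerCase())) {
                if (out.length() > 0) out.append(' ');
                out.append(word);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("the", "a", "is"));
        // Prints: cat on mat
        System.out.println(new StopWordFilter(stop).filter("the cat is on a mat"));
    }
}
```

In the full MapReduce version, a Hadoop counter would typically be incremented for each skipped word, which is how the gist's description says stop-word removal is tracked.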
anjijava16 / Import XLS all sheets.ipynb
Created April 8, 2018 19:22 — forked from rmoff/Import XLS all sheets.ipynb
Import all sheets of an XLS into Oracle Big Data Discovery

Introduction

This tutorial describes how to refine data for a Trucking IoT Data Discovery (aka IoT Discovery) use case using the Hortonworks Data Platform. The IoT Discovery use case involves vehicles, devices, and people moving across a map or similar surface, and the analysis centers on tying location information to your analytic data.

Hello World is often used by developers to familiarize themselves with new concepts by building a simple program. This tutorial aims to achieve a similar purpose by getting practitioners started with Hadoop and HDP. We will use an Internet of Things (IoT) use case to build your first HDP application.

For this tutorial we look at a use case involving a truck fleet. Each truck is equipped to log location and event data, and these events are streamed back to a datacenter where we will process them. The company wants to use this data to better understand risk.

anjijava16 / Benchmark-results.txt
Created November 14, 2018 03:51 — forked from saurabhmishra/Benchmark-results.txt
Kafka Benchmark Commands
End-to-end Latency
0.0543 ms (median)
0.003125 ms (99th percentile)
5 ms (99.9th percentile)
Producer and consumer
Producer - 1431170.2 records/sec (136.49 MB/sec)
Consumer - 3276754.7021 records/sec (312.4957 MB/sec)
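The gist records results only; numbers like these are typically produced with Kafka's bundled perf tools. A sketch of the usual invocations follows — the topic name, record counts, and broker address are placeholders, and exact flags vary by Kafka version:

```shell
# Producer throughput: 50M records of 100 bytes, unthrottled
bin/kafka-producer-perf-test.sh \
  --topic test \
  --num-records 50000000 \
  --record-size 100 \
  --throughput -1 \
  --producer-props bootstrap.servers=localhost:9092 acks=1

# Consumer throughput over the same topic
bin/kafka-consumer-perf-test.sh \
  --broker-list localhost:9092 \
  --topic test \
  --messages 50000000

# End-to-end latency: broker list, topic, message count, acks, message size
bin/kafka-run-class.sh kafka.tools.EndToEndLatency \
  localhost:9092 test 10000 1 100
```

The producer tool reports records/sec and MB/sec directly; the latency tool prints per-percentile latencies like those above.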
// Spark Structured Streaming demo: create a local SparkSession
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.
  master("local").
  appName("Spark Structured Streaming Demo").
  getOrCreate
spark.sparkContext.setLogLevel("ERROR")

// Get Department Traffic: same setup, plus implicits for Dataset encoders
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.
  master("local").
  appName("Get Department Traffic").
  getOrCreate
spark.sparkContext.setLogLevel("ERROR")
import spark.implicits._