package rnd
import kafka.serializer.StringDecoder
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.dstream.DStream
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.streaming.{Minutes, Seconds, StreamingContext}
import org.apache.spark.{SparkConf, SparkContext}
object KafkaSparkStreamingToES {
df = sc.parallelize([(1, 'Y','F',"Giri",'Y'), (2, 'N','V',"Databricks",'N'),(3,'Y','B',"SparkEdge",'Y'),(4,'N','X',"Spark",'N')]).toDF(["id", "flag1","flag2","name","flag3"])
print('Show Dataframe')
print('Actual Schema of the df')
# df.dtypes is a list of (column name, column type) pairs
for a_dftype in df.dtypes:
    col_name = a_dftype[0]
    col_type = a_dftype[1]
    # print[0][0]
// Spark 2.0 to SQL Server via External Data Source API and SQL JDBC
// References:
// -
// -
// -
// Run spark-shell
// - Get the SQL Server JDBC JAR from the above "Using the JDBC driver" link
CsBigDataHub /
Created April 25, 2018 14:52 — forked from albertbori/
Automatically disable Wifi when an Ethernet connection (cable) is plugged in on a Mac


This is a bash script that automatically turns your wifi off when you connect your computer to an ethernet connection and turns wifi back on when you unplug your ethernet cable/adapter. If you decide to turn wifi on for whatever reason, it will remember that choice. This was improvised from this mac hint to work with Yosemite, and without hard-coding the adapter names. It's supposed to support Growl, but I didn't check that part. I did, however, add OS X Notification Center support. Feel free to fork and fix any issues you encounter.

Most of the credit for these changes goes to Dave Holland.


  • Mac OSX 10+
  • Administrator privileges
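
The toggling behaviour described above can be sketched as a small decision function. This is not the bash script itself: it is a minimal Python illustration that assumes the script reacts only to changes in the ethernet state, which is one way to "remember" a manual wifi choice. The function name `desired_wifi` and its parameters are hypothetical, and the actual macOS calls that read and set the adapter state are left out.

```python
def desired_wifi(eth_now, eth_before, wifi_now):
    """Return whether wifi should be on after this check.

    eth_now    -- ethernet is connected right now
    eth_before -- ethernet was connected at the previous check
    wifi_now   -- wifi is currently on
    """
    if eth_now and not eth_before:
        return False  # cable was just plugged in: turn wifi off
    if not eth_now and eth_before:
        return True   # cable was just unplugged: turn wifi back on
    # Ethernet state did not change, so keep whatever the user chose.
    return wifi_now


# Plugging in the cable switches wifi off.
print(desired_wifi(eth_now=True, eth_before=False, wifi_now=True))
# Wifi turned on by hand while the cable stays connected is left alone.
print(desired_wifi(eth_now=True, eth_before=True, wifi_now=True))
```

Because only transitions trigger a change, a user who switches wifi back on while the cable is still connected is not overridden at the next check.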
CsBigDataHub / .gitconfig
Created June 13, 2018 16:37 — forked from rambabusaravanan/.gitconfig
Git Diff and Merge Tool - IntelliJ IDEA
# Linux
# add the following to "~/.gitconfig" file
[merge]
    tool = intellij
[mergetool "intellij"]
    cmd = /usr/local/bin/idea merge $(cd $(dirname "$LOCAL") && pwd)/$(basename "$LOCAL") $(cd $(dirname "$REMOTE") && pwd)/$(basename "$REMOTE") $(cd $(dirname "$BASE") && pwd)/$(basename "$BASE") $(cd $(dirname "$MERGED") && pwd)/$(basename "$MERGED")
    trustExitCode = true
CsBigDataHub / .gitconfig
Created June 13, 2018 16:38 — forked from samsalisbury/.gitconfig
Git diff and merge with p4merge (OSX)
keepBackup = false
tool = p4merge
[mergetool "p4merge"]
cmd = /Applications/ "\"$PWD/$BASE\"" "\"$PWD/$REMOTE\"" "\"$PWD/$LOCAL\"" "\"$PWD/$MERGED\""
keepTemporaries = false
trustExitCode = false
keepBackup = false
tool = p4merge
CsBigDataHub /
Created June 13, 2018 16:38 — forked from tony4d/
Setup p4merge as a visual diff and merge tool for git
CsBigDataHub / sed cheatsheet
Created July 31, 2018 12:52 — forked from un33k/sed cheatsheet
magic of sed -- find and replace "text" in a string or a file
# double space a file
sed G
# double space a file which already has blank lines in it. Output file
# should contain no more than one blank line between lines of text.
sed '/^$/d;G'
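
For readers who want to see what these two one-liners do to the text, here is a rough Python equivalent. It is only a sketch, assuming ordinary newline-terminated input; the function names `double_space` and `double_space_squeezed` are made up for illustration and are not part of the cheatsheet.

```python
def double_space(text):
    """Roughly `sed G`: append a blank line after every input line."""
    return "".join(line + "\n\n" for line in text.splitlines())


def double_space_squeezed(text):
    """Roughly `sed '/^$/d;G'`: drop empty lines, then double-space the rest."""
    return "".join(line + "\n\n" for line in text.splitlines() if line != "")


sample = "first\n\n\nsecond\n"
print(repr(double_space("a\nb\n")))          # 'a\n\nb\n\n'
print(repr(double_space_squeezed(sample)))   # 'first\n\nsecond\n\n'
```

Like the sed pattern `/^$/`, the second function only drops lines that are completely empty; lines containing spaces or tabs are kept.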
CsBigDataHub /
Created September 3, 2018 13:09 — forked from mhulse/
Simple Python accessors (@getter/@setter and @deleter) example/test using mixin and decorators... I'm using Python 2.6 for testing.
import pprint
# pprint.pprint(dir(obj))
# pprint.pprint(list)
# $ python runscript am_script2
class BaseMixin(object):
# Init:
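
The preview above is cut off, so the following is not the gist's code. It is a minimal sketch of the getter/setter/deleter pattern the description names, written for Python 3 and built on the standard `property` decorator. The class names `NameMixin` and `Person` and the string check in the setter are assumptions made for illustration.

```python
class NameMixin(object):
    """Hypothetical mixin that exposes a managed `name` attribute."""

    @property
    def name(self):
        # Getter: runs on `obj.name`
        return self._name

    @name.setter
    def name(self, value):
        # Setter: runs on `obj.name = value`
        if not isinstance(value, str):
            raise TypeError("name must be a string")
        self._name = value

    @name.deleter
    def name(self):
        # Deleter: runs on `del obj.name`
        del self._name


class Person(NameMixin):
    def __init__(self, name):
        self.name = name  # goes through the setter above


p = Person("Ada")
print(p.name)      # Ada
p.name = "Grace"   # validated by the setter
del p.name         # removes the stored value via the deleter
```

Keeping the accessors in a mixin means any class that inherits from it gets the same validated attribute without repeating the decorators.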
CsBigDataHub / build.gradle
Created October 1, 2018 14:03 — forked from mychaelstyle/build.gradle
Example Gradle build Java with FindBugs and PMD and CPD
apply plugin: "java"
apply plugin: "eclipse"
apply plugin: "maven"
apply plugin: "findbugs"
apply plugin: "pmd"
def defaultEncoding = 'UTF-8'
[compileJava, compileTestJava]*.options*.encoding = defaultEncoding
sourceCompatibility = 1.7