Spark Streaming
Dropwizard metrics:
==================
1. Push metrics into Ganglia, Graphite, etc. (can be enabled using a SQL configuration; a sink-side sketch follows this list):
spark.conf.set("spark.sql.streaming.metricsEnabled", "true")
2. Enable INFO or DEBUG logging levels for org.apache.spark.sql.kafka010.KafkaSource to see what happens inside.
Add the following line to conf/log4j.properties:
log4j.logger.org.apache.spark.sql.kafka010.KafkaSource=DEBUG
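
A minimal sink-side sketch for item 1, assuming a Graphite endpoint (host/port are placeholders) and a plain SparkSession; the metrics.properties keys shown in the comments are the standard Spark GraphiteSink settings:

// Enable streaming metrics so the Dropwizard registry reports them to
// whatever sinks are configured in conf/metrics.properties, e.g.:
//   *.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
//   *.sink.graphite.host=<graphite-host>
//   *.sink.graphite.port=<graphite-port>
//   *.sink.graphite.period=10
//   *.sink.graphite.unit=seconds
import org.apache.spark.sql.SparkSession

object StreamingMetricsSetup {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("streaming-metrics-sketch")
      .getOrCreate()

    // Per item 1 above: report Dropwizard metrics for streaming queries.
    spark.conf.set("spark.sql.streaming.metricsEnabled", "true")
  }
}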
Techniques:
==========
Source: https://databricks.com/session/apache-spark-streaming-programming-techniques-you-should-know
- Self-contained stream generation
- Refreshing external data
- Structured streaming capability
- Keeping arbitrary state (see the sketch after this list)
- Probabilistic accumulators
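
A minimal sketch of the "keeping arbitrary state" bullet, assuming the built-in rate source for self-contained stream generation and hypothetical Event/RunningCount case classes; mapGroupsWithState is the Structured Streaming API for arbitrary per-key state:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, OutputMode}

// Hypothetical record types for the sketch.
case class Event(key: String, value: Long)
case class RunningCount(key: String, count: Long)

object ArbitraryStateSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("arbitrary-state-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Self-contained stream generation: the rate source emits (timestamp, value)
    // rows, so no external system is needed to try the pipeline.
    val events = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "5")
      .load()
      .select(($"value" % 10).cast("string").as("key"), $"value")
      .as[Event]

    // Keep an arbitrary running count per key across micro-batches.
    val counts = events
      .groupByKey(_.key)
      .mapGroupsWithState(GroupStateTimeout.NoTimeout) {
        (key: String, rows: Iterator[Event], state: GroupState[Long]) =>
          val updated = state.getOption.getOrElse(0L) + rows.size
          state.update(updated)
          RunningCount(key, updated)
      }

    counts.writeStream
      .outputMode(OutputMode.Update())   // mapGroupsWithState emits in Update mode
      .format("console")
      .start()
      .awaitTermination()
  }
}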