Divay Jindal divayjindal95

## Terminologies


Apache Spark - Apache Spark is an open-source cluster computing framework for data analytics, originally developed in the AMPLab at UC Berkeley.[1] Spark fits into the Hadoop open-source ecosystem and can run on top of the Hadoop Distributed File System (HDFS).[2] However, Spark is not tied to the two-stage MapReduce paradigm, and it promises performance up to 100 times faster than Hadoop MapReduce for certain applications.
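
To make the "two-stage MapReduce paradigm" concrete, here is a minimal sketch in plain Python. It does not use the Spark or Hadoop APIs: the function names (`map_stage`, `reduce_stage`, `top_words_chained`) and the sample data are hypothetical, chosen only for illustration. The first two functions mimic a single map-then-reduce job. The last one chains extra steps (filter, sort) onto the result, which is the kind of multi-step pipeline that a classic MapReduce setup would typically express as several separate jobs.

```python
from collections import defaultdict


def map_stage(lines):
    """Map stage: emit a (word, 1) pair for every word in every line."""
    return [(word.lower(), 1) for line in lines for word in line.split()]


def reduce_stage(pairs):
    """Reduce stage: sum the counts emitted for each distinct word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


def word_count_mapreduce(lines):
    """One two-stage job: map, then reduce."""
    return reduce_stage(map_stage(lines))


def top_words_chained(lines, min_count):
    """A longer chain (map -> reduce -> filter -> sort) kept in memory."""
    counts = word_count_mapreduce(lines)
    frequent = {w: c for w, c in counts.items() if c >= min_count}
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(frequent.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample input, for illustration only.
sample = ["spark builds on hdfs", "spark is not tied to mapreduce"]

if __name__ == "__main__":
    print(word_count_mapreduce(sample))
    print(top_words_chained(sample, min_count=2))
```

This is only an analogy for the shape of the computation; a real Spark job would distribute these steps across a cluster.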

Database pipelining - http://www.tuplejump.com/img/ff08.theplatform.png
                      As the diagram shows, it is not just about processing the data; many other components are involved. Collection, storage, exploration, ML and visualization are all critical to the project's success.


Solr - Apache Solr is an open-source search platform built on Apache Lucene. Here it is used to build a highly scalable data analytics engine that lets customers engage in fast, real-time knowledge discovery.