Murtaza Kanchwala mkanchwala

## solr-cloud-update-schema.sh
# CREATE COLLECTION
solr create_collection -c my_collection -shards 2 -d path-to-my-conf

# CHECK COLLECTION SCHEMA
curl "http://solr-host.dev:8983/solr/my_collection/schema?wt=schema.xml"
# SCHEMA IS GOOD
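
The schema check can be wrapped in small helper functions so it is reusable in a script. This is a minimal sketch, not part of the original gist: the function names `schema_url` and `check_schema` and the `SOLR_URL` variable are hypothetical, and the host and collection are the placeholders used here. It relies on Solr's Schema API, which also exposes a single field at `/schema/fields/<name>`.

```shell
#!/usr/bin/env bash
# Sketch only: SOLR_URL defaults to the placeholder host used in this gist.
SOLR_URL="${SOLR_URL:-http://solr-host.dev:8983/solr}"

# Build the Schema API URL for a collection. An optional second argument
# names a single field to look up instead of the whole schema.
schema_url() {
  local collection="$1" field="${2:-}"
  if [ -n "$field" ]; then
    printf '%s/%s/schema/fields/%s\n' "$SOLR_URL" "$collection" "$field"
  else
    printf '%s/%s/schema?wt=schema.xml\n' "$SOLR_URL" "$collection"
  fi
}

# Fetch the schema (or one field). With -f, curl exits non-zero on an HTTP
# error, so the call can be used in an if/&& chain.
check_schema() {
  curl -fsS "$(schema_url "$@")"
}
```

For example, `check_schema my_collection` prints the full schema, while `check_schema my_collection id` asks only for the `id` field and fails if the request returns an HTTP error.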


# UPDATE SCHEMA

## Kafka MultiNode - MultiBroker Cluster.md

      
Create Kafka Multi Node, Multi Broker Cluster (last active July 2, 2022 10:44)

How to create a MultiNode - MultiBroker Cluster for Kafka on AWS

Prerequisites:

Kafka binary files: http://kafka.apache.org/downloads.html

At least 2 AWS machines: AWS EMR or EC2 is preferable

A Kafka manager utility to monitor the cluster: https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem


## Spark-SingleRDDfrmMultipleFiles
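Best way to read multiple files into a single RDD
==================================

val fileRDD = sc.textFile(filename).repartition(1)

Here filename is the path to the directory that contains the files, not to an individual file. sc.textFile reads every file in that directory into one RDD, and repartition(1) then collapses it into a single partition.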

## gist:faf3f5e034c8638390ed

      
Spark: How to create a Single RDD from Multiple Files (created April 30, 2015 08:57)

Best way to read multiple files into a single RDD

val fileRDD = sc.textFile(filename).repartition(1)
Here filename is the path to the directory that contains the files, not to an individual file.

  
## flume-ng-ec2-syslog-s3-setup.md

      
              1 file
            
          
              1 fork
            
          
                0 comments
              
            
              0 stars
            
          
                mkanchwala
                / flume-ng-ec2-syslog-s3-setup.md
            
            
              Last active
              January 20, 2016 17:10
                — forked from crowdmatt/flume-ng-ec2-syslog-s3-setup.md
            
          
    Setting up Flume NG, listening to syslog over UDP, with an S3 Sink

My goal was to set up Flume on my web instances and write all events into S3, so I could easily use other tools like Amazon Elastic MapReduce and Amazon Redshift.
I didn't want to deal with log rotation myself, so I set up Flume to read from a syslog UDP source. In this case, Flume NG acts as a syslog server, so as long as Flume is running, my web application can simply write to it in syslog format on the specified port. Most languages have syslog libraries for this.
At the time of this writing, I've been able to get Flume NG up and running on 3 EC2 instances, all writing to the same bucket.
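
To illustrate what "writing in syslog format" over UDP involves, here is a minimal sketch, not taken from the original setup. The function names, the port 5140, and the `webapp` tag are hypothetical. It builds a bare `<PRI>tag: message` line, where the priority is facility * 8 + severity as defined by RFC 3164, and omits the timestamp and hostname; I am assuming the receiver tolerates that, which should be checked against the Flume source in use.

```shell
#!/usr/bin/env bash
# Format one syslog line. Priority = facility * 8 + severity (RFC 3164).
# Example values: facility 16 is local0, severity 6 is info.
syslog_line() {
  local facility="$1" severity="$2" tag="$3" message="$4"
  printf '<%d>%s: %s' "$(( facility * 8 + severity ))" "$tag" "$message"
}

# Send the line as a UDP datagram to a syslog listener.
# /dev/udp/HOST/PORT is a bash redirection feature, not a real device file.
send_syslog() {
  local host="$1" port="$2"
  shift 2
  syslog_line "$@" > "/dev/udp/${host}/${port}"
}
```

A call such as `send_syslog 127.0.0.1 5140 16 6 webapp 'user signed up'` would send one event to a listener on the local machine. Because UDP is fire-and-forget, the command succeeds even if nothing is listening, so delivery has to be verified on the Flume side.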
Install Flume NG on instances