Jumping Qu jumping

## flume-site-agent.xml
<!-- a1 (agent)  -->
<configuration>
  <property>
    <name>flume.master.servers</name>
    <value>$master_IP</value>
    <description>This is the address for the config servers status server (http)</description>
  </property>

  <property>
    <name>flume.collector.event.host</name>

## autoscale_sample
#################################################################################
# Import modules
#################################################################################

import os
import time
import sys
import socket
import string

## autoscaling_boto.py
"""
The MIT License (MIT)
Copyright (c) 2011 Numan Sachwani

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:

## golang_job_queue.md

      
jumping / golang_job_queue.md (forked from harlow/golang_job_queue.md)
Created June 20, 2016 02:08, 4 files
Job queues in Golang

Golang Job Queue

A running example of the code from:

http://marcio.io/2015/07/handling-1-million-requests-per-minute-with-golang
http://nesv.github.io/golang/2014/02/25/worker-queues-in-go.html

Step 1

Small refactorings made to original code:

  
## sysctl.conf
# Configuration file for runtime kernel parameters.
# See sysctl.conf(5) for more information.

# See also http://www.nateware.com/linux-network-tuning-for-2013.html for
# an explanation about some of these parameters, and instructions for
# a few other tweaks outside this file.

# Protection from SYN flood attack.
net.ipv4.tcp_syncookies = 1

## 1_simple.go
package main

import (
  "fmt"
  "reflect"
)

// Name of the struct tag used in examples
const tagName = "validate"

## emr_spark_thrift_on_yarn
#on cluster
thrift /spark/sbin/start-thriftserver.sh --master yarn-client
#ssh tunnel, direct 10000 to unused 8157
ssh -i ~/caserta-1.pem -N -L 8157:ec2-54-221-27-21.compute-1.amazonaws.com:10000 hadoop@ec2-54-221-27-21.compute-1.amazonaws.com
#see this for JDBC config on client http://blogs.aws.amazon.com/bigdata/post/TxT7CJ0E7CRX88/Using-Amazon-EMR-with-SQL-Workbench-and-other-BI-Tools

## Linux Static IP
## Configure eth0
#
# vi /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE="eth0"
NM_CONTROLLED="yes"
ONBOOT=yes
HWADDR=A4:BA:DB:37:F1:04
TYPE=Ethernet
BOOTPROTO=static

## CDHTez.md

      
jumping / CDHTez.md (forked from epiphani/CDHTez.md)
Created June 19, 2018 06:32, 1 file
Getting Tez enabled on CDH5.4+

So Hive in CDH is horribly, painfully slow. Cloudera ships Hive 1.1, which is actually moderately modern. It is, however, very badly configured out of the box and patched with custom code from Cloudera. With a bit of effort, we managed to improve Hive performance considerably. We really shouldn't have to do this, but Cloudera is actively working against supporting a performant Hive.
First, building Tez was fairly straightforward. Following the instructions at https://github.com/apache/tez/blob/master/docs/src/site/markdown/install.md, the only change was to use the version string "2.6.0" for the build. I believe that was the default. Don't use the CDH string; it won't work.
At the bottom of the installation instructions, there's a note that to use the local Hadoop jars (rather than those packaged with Tez) you must unpack the jars in HDFS rather than using the tarball. In this case, unpack the tez-minimal tarball and upload the contents to /apps/tez-0.7.0 (or whatever you prefer). Don't fo

  
## dummy-web-server.py
#!/usr/bin/env python
"""
Very simple HTTP server in python.

Usage::
    ./dummy-web-server.py [<port>]

Send a GET request::
    curl http://localhost