Lyuben Todorov (lyubent)
R version 3.4.3 (2017-11-30) -- "Kite-Eating Tree"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
%python
from pyspark.sql import Row

# Mixed types in one column ("c" pairs with a string where the others pair with ints)
# break DataFrame schema inference.
data = [("a", 318), ("b", 32), ("c", "im_gona_break_everything!")]
rddWorks = spark.sparkContext.parallelize(data)  # an RDD holds the mixed types fine
# dfFail = rddWorks.toDF()  # fails: can not merge LongType and StringType

def func(a):
    pass  # the function body is cut off in the original snippet
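One way around the failure is to make the column homogeneous before conversion; a minimal sketch, assuming a live `spark` session (the column names are invented for illustration):

```python
from pyspark.sql.types import StructType, StructField, StringType

# Cast every value to a string and declare the schema explicitly,
# so inference never sees conflicting types.
schema = StructType([
    StructField("key", StringType(), True),
    StructField("value", StringType(), True),
])
dfWorks = spark.createDataFrame([(k, str(v)) for (k, v) in data], schema)
dfWorks.show()
```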
spark_broadcasting.scala (created December 7, 2017)
scala> val df1 = sc.parallelize(List((1, "Jack"), (2, "Link"))).toDF("id", "name")
df1: org.apache.spark.sql.DataFrame = [id: int, name: string]
scala> val df2 = sc.parallelize(List((1, 41), (2, 29))).toDF("id", "age")
df2: org.apache.spark.sql.DataFrame = [id: int, age: int]
scala> import org.apache.spark.sql.functions.broadcast;
import org.apache.spark.sql.functions.broadcast
scala> val df3 = df1.join(broadcast(df2), Seq("id"))
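The same hint exists in PySpark; a minimal sketch of the equivalent join, assuming a live `spark` session:

```python
from pyspark.sql.functions import broadcast

df1 = spark.createDataFrame([(1, "Jack"), (2, "Link")], ["id", "name"])
df2 = spark.createDataFrame([(1, 41), (2, 29)], ["id", "age"])

# broadcast() marks df2 as small enough to copy to every executor,
# so Spark can plan a broadcast hash join and skip the shuffle.
df3 = df1.join(broadcast(df2), ["id"])
df3.explain()  # the physical plan should list a BroadcastHashJoin
```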
vi /etc/profile
. /etc/profile
yum remove python27
yum groupinstall "Development tools"
yum install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel
yum install gcc
cd /usr/src
wget https://www.python.org/ftp/python/2.7.10/Python-2.7.10.tgz
# the gist stops here; a source build would normally continue with:
tar -xvzf Python-2.7.10.tgz
cd Python-2.7.10
./configure
make && make altinstall
{
"ClusterId": "j-EQPER71I4AQYL"
}
for CLUST_ID in `cat stuff.txt`;
do
  # raka=${INSTANCE:18:15}
  # echo $raka
  # keep only the value token, skipping the braces and the "ClusterId": key
  if [ "$CLUST_ID" != "{" ] && [ "$CLUST_ID" != "}" ] && [ "$CLUST_ID" != "\"ClusterId\":" ]; then
    echo "$CLUST_ID"  # the loop body is cut off in the gist; echo stands in
  fi
done
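Word-splitting JSON in the shell is brittle; a sketch of the same extraction with Python's json module, assuming stuff.txt holds the document shown above:

```python
import json

# Parse the file as JSON rather than splitting it on whitespace.
with open("stuff.txt") as f:
    doc = json.load(f)

print(doc["ClusterId"])  # j-EQPER71I4AQYL
```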
Modify index.html so that when you click the "display stuff" button it uses JavaScript to:
- get the value of the first text input field and store it in a variable
- get the value of the second text input field and store it in a variable
- add the two variables together and display the result somewhere
- the result can be shown with alert(resultGoesHere), or you can add a new field to the HTML that will contain the value
Example solution available here: http://bit.ly/2qi4d6L
Review the solution carefully and notice the new id attributes (e.g. id="input1") added to the input fields.
An id identifies a single element within the page, so each value must be unique.
Classes, by contrast, may be shared by several elements; we will use those later.
## python 2.7.x with gzip
yum remove python
yum groupinstall "Development tools"
cd /usr/src
wget http://python.org/ftp/python/2.7.6/Python-2.7.6.tgz
tar -xvzf Python-2.7.6.tgz
cd Python-2.7.6
yum install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel
# with zlib-devel in place the gzip module gets built; the usual steps follow:
./configure
make && make altinstall
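A quick sanity check that the rebuilt interpreter really has zlib (and therefore gzip) support:

```python
# run this under the freshly built python2.7; an ImportError means
# zlib-devel was missing when ./configure ran
import zlib
import gzip

print(zlib.ZLIB_VERSION)  # prints the linked zlib version
```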
scala> auctionsDF.groupBy("itemtype", "aucid")
res14: org.apache.spark.sql.GroupedData = org.apache.spark.sql.GroupedData@4c155287
scala> auctionsDF.groupBy("itemtype", "aucid").show()
<console>:38: error: value show is not a member of org.apache.spark.sql.GroupedData
auctionsDF.groupBy("itemtype", "aucid").show()
^
scala> auctionsDF.groupBy("itemtype", "aucid").count.show(5)
+--------+----------+-----+
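The error above is the instructive part: groupBy returns a GroupedData, which has no show method, so an aggregation such as count must come first. The same rule applies in PySpark, sketched here against the auctionsDF from the transcript (assuming it is also available in a Python session):

```python
# groupBy alone yields a GroupedData object, not a DataFrame
grouped = auctionsDF.groupBy("itemtype", "aucid")

# an aggregation turns it back into a DataFrame, which can be shown
grouped.count().show(5)
```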

# Eclipse Scala:

  1. Installed Scala IDE 4.4.1 (http://scala-ide.org/download/sdk.html)

  2. File > New Project > Maven Project

  3. Tick "Skip Archetype Selection" (a simple project with no archetype)

  4. Add maven project info

  5. Change to the Scala nature (right-click on the project > Configure > Add Scala Nature)

/* Simple app to inspect Auction data */
/* The following imports bring in SparkContext, its implicit members, and SparkConf */
package exercises
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
// Will use max, min from java.lang.Math
import java.lang.Math
import scala.util.control.NonFatal