

Redwa / scala spark action examples
Created December 25, 2015 22:57, forked from tmcgrath/scala spark action examples
Spark Console Action functions in Scala
scala> val names1 = sc.parallelize(List("abe", "abby", "apple"))
names1: org.apache.spark.rdd.RDD[String] = ParallelCollectionRDD[0] at parallelize at <console>:12
scala> names1.reduce((t1,t2) => t1 + t2)
res0: String = abbyappleabe
scala> names1.flatMap(k => List(k.size) ).reduce((t1,t2) => t1 + t2)
res1: Int = 12
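As an aside on the two reductions above: Spark's reduce expects an associative and commutative function, because partition results are combined in no fixed order, which is why res0 comes back as "abbyappleabe" rather than in input order. The length total can also be computed with a plain map; a minimal sketch, assuming the same names1 RDD and an active shell session:

// Equivalent to the flatMap/reduce combination above: 3 + 4 + 5 = 12
val totalLength = names1.map(_.length).reduce(_ + _)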
scala> val names2 = sc.parallelize(List("apple", "beatty", "beatrice")).map(a => (a, a.size))
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.1.0
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_65)
Type in expressions to have them evaluated.
Type :help for more information.
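The transcript defines names2, a pair RDD of each word with its length, but is cut off before any action runs on it. A hedged sketch of typical actions on such a pair RDD, assuming the same names2 value; exact output formatting depends on the shell session:

names2.count()            // 3
names2.collect()          // Array((apple,5), (beatty,6), (beatrice,8))
names2.lookup("apple")    // Seq(5); lookup is available because names2 is a pair RDD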
Redwa / Spark Transformation Examples Part 3
Created December 25, 2015 22:58, forked from tmcgrath/Spark Transformation Examples Part 3
Scala-based Spark transformations that require key/value pair RDDs
scala> val babyNames = sc.textFile("baby_names.csv")
babyNames: org.apache.spark.rdd.RDD[String] = baby_names.csv MappedRDD[27] at textFile at <console>:12
scala> val rows = babyNames.map(line => line.split(","))
rows: org.apache.spark.rdd.RDD[Array[String]] = MappedRDD[28] at map at <console>:14
scala> val namesToCounties = rows.map(name => (name(1),name(2)))
namesToCounties: org.apache.spark.rdd.RDD[(String, String)] = MappedRDD[29] at map at <console>:16
scala> namesToCounties.groupByKey.collect
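The groupByKey output is cut off above. When the goal is a per-key aggregate rather than the full group of values, reduceByKey is usually preferred because it combines values on each partition before shuffling. A minimal sketch, assuming the same rows RDD and that column 1 holds the first name; the actual counts depend on baby_names.csv:

// Count how many rows mention each name, without materializing full groups per key
val rowsPerName = rows.map(cols => (cols(1), 1)).reduceByKey(_ + _)
rowsPerName.collect()     // Array of (name, rowCount) pairs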
Redwa / useful_pandas_snippets.py
Created November 30, 2017 15:41, forked from bsweger/useful_pandas_snippets.md
Useful Pandas Snippets
# List unique values in a DataFrame column
# h/t @makmanalp for the updated syntax!
df['Column Name'].unique()
# Convert Series datatype to numeric (will error if column has non-numeric values)
# h/t @makmanalp
pd.to_numeric(df['Column Name'])
# Convert Series datatype to numeric, changing non-numeric values to NaN
# h/t @makmanalp for the updated syntax!