Below are notes taken for the "Scala Essential Training for Data Science": https://www.lynda.com/Scala-tutorials/Scala-Essential-Training-Data-Science/559182-2.html
In this course we will need to install Scala, Postgres, and Spark. The Lynda course we will be following along with provides manual installation instructions that do not use package management tools. To simplify the installation directions, I encourage using a command line based package manager: on MacOS there is Homebrew (https://brew.sh/), on Windows there is Chocolatey (https://chocolatey.org/), and on Linux distributions there are SDKMAN! (https://sdkman.io/) and, on Debian based systems, APT (https://wiki.debian.org/AptCLI).
For Linux users, I will provide SDKMAN! installation instructions in some cases and APT instructions in others, depending on which is more straightforward. You may also need to use sudo. MacOS and Windows users may need to provide their system password to install certain packages.
If you are comfortable taking another approach to installing the software used in this course, feel free to do so, but know that I may be unable to support issues encountered with your approach to installation (I primarily use MacOS).
Begin by installing Scala. The Lynda course uses version 2.11.11, which is not the most recent. Since there have been some slight changes to components that we will learn in this course, you should install the same version (at least the same major.minor version) as used by the instructor. To install, run the appropriate command for your system:
$ brew install scala@2.11 ## This should install 2.11.12 ## MacOS with https://brew.sh/ installed
> choco install scala --version=2.11.4 ## Windows with https://chocolatey.org/ installed
$ sdk install scala 2.11.11 ## Linux distributions with https://sdkman.io/ installed
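Afterwards, you can confirm which version was installed (the output shown here is just an example of what to expect):
$ scala -version ## e.g. prints "Scala code runner version 2.11.12 -- Copyright 2002-2017, LAMP/EPFL"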
Scala depends on the Java Standard Edition Platform. Most package managers will resolve dependencies and install those prerequisites for you. For instance, if you do not already have a JDK, Homebrew will install OpenJDK@13 when you use it to install scala@2.11.
- Data Types (Section 1.3)
- Arrays, Vectors, and Ranges (Section 1.5)
- Maps (Section 1.6)
- Expressions (Section 1.7)
- Functions (Section 1.8)
- Objects and Classes (Section 1.9)
Scala provides a simple high-level abstraction for processing across multiple cores or on hyper-threaded processors via its standard library implementation of "parallel collections". The goal of Scala's parallel collections is to make parallelism easy to bring into more code by reusing the well-understood abstractions of sequential collections, such as the array, vector, and range types discussed in the previous section. Scala's parallel collection types include ParArray, ParVector, ParRange, ParHashMap, and ParSet.
You can read more about parallel collections in the "Parallel Collections Overview" section of the Scala documentation.
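As a minimal sketch in the REPL (not from the course), a parallel collection can be created directly, or by converting a sequential collection with its par method:
scala> import scala.collection.parallel.immutable.ParVector
import scala.collection.parallel.immutable.ParVector
scala> val direct = ParVector(1, 2, 3)
direct: scala.collection.parallel.immutable.ParVector[Int] = ParVector(1, 2, 3)
scala> val converted = Vector(1, 2, 3).par
converted: scala.collection.parallel.immutable.ParVector[Int] = ParVector(1, 2, 3)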
Only use parallel collections when the collection contains thousands or tens of thousands of elements. Making a small collection parallel when it need only be sequential incurs unnecessary (albeit fairly minor) overhead.
When creating parallel collections from sequential ones, the conversion may require a deep copy, so also keep your memory quota in mind when defining parallel collections.
Conceptually, Scala’s parallel collections framework parallelizes an operation on a parallel collection by recursively “splitting” a given collection, applying an operation on each partition of the collection in parallel, and re-“combining” all of the results that were completed in parallel.
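One way to see this split/apply/combine structure explicitly (a sketch, not from the course material) is the aggregate method: its first function folds elements within a partition, and its second combines the per-partition results:
scala> val nums = (1 to 1000).toVector.par
nums: scala.collection.parallel.immutable.ParVector[Int] = ParVector(1, 2, 3,…
scala> nums.aggregate(0)(_ + _, _ + _)
res0: Int = 500500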
The concurrent, “out-of-order” semantics of parallel collections lead to two implications:
- Side-effecting operations can lead to non-determinism
- Non-associative operations lead to non-determinism
By "out-of-order" the authors mean the temporal order of operations: different threads operating on partitions of the collection complete their pieces over different durations, so the order in which operations finish is unpredictable. They do not mean a spatial mis-ordering of the results. A parallel collection broken into partitions A, B, C, in that order, will be reassembled in that same order, not some other arbitrary order like B, C, A.
From Scala's "Parallel Collections Overview" documentation. See Wikipedia for more information on non-deterministic algorithms.
Avoid applying procedures with side effects. An example is using an accessor method like foreach to increment a var declared outside of the closure passed to foreach.
scala> var sum = 0
sum: Int = 0
scala> val list = (1 to 1000).toList.par
list: scala.collection.parallel.immutable.ParSeq[Int] = ParVector(1, 2, 3,…
scala> list.foreach(sum += _); sum
res01: Int = 467766
scala> var sum = 0
sum: Int = 0
scala> list.foreach(sum += _); sum
res02: Int = 457073
scala> var sum = 0
sum: Int = 0
scala> list.foreach(sum += _); sum
res03: Int = 468520
Here, summing with foreach over the collection yields a different value each time. This is caused by a data race from concurrent read/write operations on the sum variable, a consequence of splitting the parallel collection and running foreach across multiple cores or threads, illustrated here:
ThreadA: read value in sum, sum = 0 value in sum: 0
ThreadB: read value in sum, sum = 0 value in sum: 0
ThreadA: increment sum by 760, write sum = 760 value in sum: 760
ThreadB: increment sum by 12, write sum = 12 value in sum: 12
Contrast this with a sequential collection, where the result in sum is repeatable and accurate:
scala> var sum = 0
sum: Int = 0
scala> val list = (1 to 1000)
list: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3,…
scala> list.foreach(sum += _); sum
res11: Int = 500500
scala> var sum = 0
sum: Int = 0
scala> list.foreach(sum += _); sum
res12: Int = 500500
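A side-effect-free way to get the same deterministic result from the parallel collection itself is to use a built-in method like sum, rather than mutating external state. A sketch:
scala> val plist = (1 to 1000).toList.par
plist: scala.collection.parallel.immutable.ParSeq[Int] = ParVector(1, 2, 3,…
scala> plist.sum
res0: Int = 500500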
Also avoid non-associative operations on parallel collections.
Example of associative vs. non-associative operations:
Subtraction is non-associative:
(1 − 2) − 3 = −4
1 − (2 − 3) = 2
whereas addition is associative:
(1 + 2) + 3 = 6
1 + (2 + 3) = 6
Because a parallel collection is split into partitions that are operated on and recombined in an arbitrary grouping, a function applied across it (for example via reduce) must be associative; with a non-associative operation the result depends on how the collection happened to be partitioned, so it cannot be relied on.
We can see the impact of a non-associative operation on a parallel collection in Scala by using subtraction with the reduce method on a ParVector:
scala> val list = (1 to 1000).toList.par
list: scala.collection.parallel.immutable.ParSeq[Int] = ParVector(1, 2, 3,…
scala> list.reduce(_-_)
res01: Int = -67860
scala> list.reduce(_-_)
res02: Int = 2350
scala> list.reduce(_-_)
res03: Int = -234948
While reduce with an associative operation like addition on the same ParVector works just fine:
scala> list.reduce(_+_)
res18: Int = 500500
scala> list.reduce(_+_)
res19: Int = 500500
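If you genuinely need an order-dependent, non-associative computation, one option (a sketch, not from the course) is to convert back to a sequential collection with the seq method first, which makes the left-to-right result repeatable:
scala> list.seq.reduce(_-_)
res20: Int = -500498
scala> list.seq.reduce(_-_)
res21: Int = -500498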
- Creating Parallel Collections (Section 2.2)
- Mapping Functions Over Parallel Collections (Section 2.3)
- Filtering Parallel Collections (Section 2.4)
Using SQL in Scala first requires a SQL RDBMS as the data source; in the course we run PostgreSQL, but any RDBMS that you may have already installed on your computer should work. Installing Postgres is straightforward. The video gives installation instructions, but you may find it easier to run one of the following commands (provided you have installed a package manager: Homebrew on MacOS, APT on Debian-like Linux, or Chocolatey on Windows).
$ brew install postgresql ## MacOS with https://brew.sh/ installed
$ apt-get install postgresql ## Debian-like Linux distributions
> choco install postgresql ## Windows with https://chocolatey.org/ installed
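After installing, make sure the server is actually running before trying to connect; for example, on MacOS with Homebrew you can start it and open a quick psql session (commands differ on other systems):
$ brew services start postgresql ## start the server now and on login
$ psql postgres ## connect to the default postgres database as a quick check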
You will also need the PostgreSQL JDBC driver, a step the video lectures appear to skip. The driver is a binary jar file, downloadable from https://jdbc.postgresql.org/ by navigating to Downloads. I've confirmed that the current version as of this writing works.
When you invoke scala, you need to pass the path to the driver jar file after the -classpath flag, for example: $ scala -classpath ~/Downloads/postgresql-42.2.10.jar. If you use a different SQL database management system, like MySQL or SQLite, you will need that system's JDBC driver instead.
It is worth mentioning that executing queries on your SQL database through the JDBC driver connection returns ResultSet objects, each of which maintains its own cursor. Methods of the ResultSet class allow you to interact with the cursor. A ResultSet cursor is initially positioned before the first row; the first call to the next method makes the first row the current row, the second call makes the second row the current row, and so on. More details on the native Java SQL API are available on Oracle's Java specification documentation website: https://docs.oracle.com/javase/8/docs/api/java/sql/package-summary.html
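To make the cursor behavior concrete, here is a minimal sketch of connecting and iterating over rows; the connection URL, database name, and credentials are placeholders you will need to adjust for your setup:
import java.sql.DriverManager

// Placeholder connection details for a local PostgreSQL instance.
val url = "jdbc:postgresql://localhost:5432/testdb"
val connection = DriverManager.getConnection(url, "user", "password")
try {
  val statement = connection.createStatement()
  val results = statement.executeQuery("SELECT 1 AS answer")
  // The cursor starts before the first row; next() advances it one row
  // at a time and returns false once the rows are exhausted.
  while (results.next()) {
    println(results.getInt("answer"))
  }
} finally {
  connection.close()
}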
- Loading Data into PostgreSQL (Section 3.2)
- Connecting to PostgreSQL (Section 3.3)
- Querying with SQL strings (Section 3.4)
- Querying with Prepared Statements (Section 3.5)
- Getting Started with Spark RDDs (Section 4.3)
- Mapping Functions Over RDDs (Section 4.4)
- Statistics Over RDDs (Section 4.5)
- Creating DataFrames (Section 5.1)
- Grouping and Filtering on DataFrames (Section 5.2)
- Joining DataFrames (Section 5.3)
- Working with JSON Files (Section 5.4)