


Elie A. eliasah

@eliasah
eliasah / SQLTransformerWithJoin.scala
Created June 17, 2020 12:37 — forked from MLnick/SQLTransformerWithJoin.scala
Using SQLTransformer to join DataFrames in ML Pipeline
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.0.0
/_/
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77)
Type in expressions to have them evaluated.
Type :help for more information.
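The gist above joins DataFrames inside an ML Pipeline via SQLTransformer. A hedged PySpark sketch of the idea (the DataFrames, view name, and column names below are illustrative, not taken from the gist): SQLTransformer registers the pipeline's input as the temp view __THIS__, so its SQL statement can join against any other registered temp view.

```python
from pyspark.ml.feature import SQLTransformer
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sqltransformer-join").getOrCreate()

# Illustrative data: features keyed by id, labels in a separate DataFrame.
features = spark.createDataFrame([(1, 0.5), (2, 1.5)], ["id", "score"])
labels = spark.createDataFrame([(1, 1.0), (2, 0.0)], ["id", "label"])
labels.createOrReplaceTempView("labels")  # visible to the SQL statement

# __THIS__ refers to whatever DataFrame flows into the transformer.
joiner = SQLTransformer(statement="""
    SELECT __THIS__.*, labels.label
    FROM __THIS__ JOIN labels ON __THIS__.id = labels.id
""")
joiner.transform(features).show()
```

Because SQLTransformer is a pipeline stage, the join happens lazily for every DataFrame the fitted pipeline is applied to.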
@eliasah
eliasah / multiple-files-remove-prefix.md
Created September 27, 2019 14:35
Remove prefix from multiple files in Linux console

Bash

for file in prefix*; do mv "$file" "${file#prefix}"; done;

The for loop iterates over all files that start with the prefix, and mv renames each one, using the ${file#prefix} parameter expansion to strip the prefix from the filename.

Here is an example that removes "bla_" from the following files:

bla_1.txt
bla_2.txt
@eliasah
eliasah / custom_s3_endpoint_in_spark.md
Created August 30, 2019 08:54 — forked from tobilg/custom_s3_endpoint_in_spark.md
Description on how to use a custom S3 endpoint (like Rados Gateway for Ceph)

Custom S3 endpoints with Spark

To be able to use custom endpoints with the latest Spark distribution, one needs to add an external package (hadoop-aws). Then, custom endpoints can be configured according to the docs.

Use the hadoop-aws package

bin/spark-shell --packages org.apache.hadoop:hadoop-aws:2.7.2

SparkContext configuration
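A hedged PySpark sketch of what this configuration can look like (the endpoint URL and credentials are placeholders, and the fs.s3a.* keys assume the s3a connector from hadoop-aws; `sc._jsc` reaches into the JVM and is not a public API):

```python
from pyspark import SparkContext

sc = SparkContext(appName="custom-s3-endpoint")

# Underlying JVM Hadoop Configuration object.
hconf = sc._jsc.hadoopConfiguration()
hconf.set("fs.s3a.endpoint", "http://rgw.example.com:8080")  # placeholder
hconf.set("fs.s3a.access.key", "YOUR_ACCESS_KEY")            # placeholder
hconf.set("fs.s3a.secret.key", "YOUR_SECRET_KEY")            # placeholder
hconf.set("fs.s3a.path.style.access", "true")  # typically needed for Ceph RGW
```

After this, paths like s3a://bucket/key resolve against the custom endpoint instead of AWS.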

strip_glm <- function(cm) {
  # Null out the heavyweight components of a fitted glm object,
  # keeping only what predict() needs, to shrink the saved model.
  cm$y = c()
  cm$model = c()
  cm$residuals = c()
  cm$fitted.values = c()
  cm$effects = c()
  cm$qr$qr = c()
  cm$linear.predictors = c()
  cm$weights = c()
  cm  # return the slimmed model
}
@eliasah
eliasah / bootstrap-install-zeppelin-0.8-aws-linux.sh
Created November 21, 2018 14:16 — forked from vak/bootstrap-install-zeppelin-0.8-aws-linux.sh
Custom bootstrap script to install Zeppelin 0.8 on AWS EMR (tested on EMR 5.16.0)
#!/bin/bash -ex
# ATTENTION:
#
# 1. Ensure you have about 1 GB of free space under /usr/lib/ for the large
#    Zeppelin bundle chosen by default below, or pick a smaller bundle from
#    the Zeppelin website.
#
# 2. Adjust the values of ZEPPELIN_NOTEBOOK_S3_BUCKET and
#    ZEPPELIN_NOTEBOOK_S3_USER if you want S3 persistence of your Zeppelin
#    notebooks; otherwise just remove the last three export lines starting
#    with 'export ZEPPELIN_NOTEBOOK_S'.
@eliasah
eliasah / terminal-git-branch-name.md
Created October 2, 2018 08:28 — forked from joseluisq/terminal-git-branch-name.md
Add Git Branch Name to Terminal Prompt (Mac)



Open ~/.bash_profile in your favorite editor and add the following content to the bottom.

# Git branch in prompt.

parse_git_branch() {
  # Print the current git branch in parentheses; silent outside a repo.
  git branch 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/ (\1)/'
}
@eliasah
eliasah / minikube.md
Last active May 23, 2018 08:54 — forked from codesword/minikube.md
Installing minikube using xhyve driver

Install docker-machine-driver-xhyve

docker-machine-driver-xhyve is a Docker Machine driver plugin for xhyve, a lightweight native OS X hypervisor. In my opinion, it's a far better option than VirtualBox for running minikube.

Brew

On macOS Sierra, install the latest version using

brew install docker-machine-driver-xhyve --HEAD
import pandas as pd
import numpy as np
import scipy
import scipy.stats as sts
import random
import pyspark
import pyspark.sql.types as stypes
import pyspark.sql.functions as sfunctions
@eliasah
eliasah / IntelliJ_IDEA__Perf_Tuning.txt
Created January 9, 2017 15:02 — forked from P7h/IntelliJ_IDEA__Perf_Tuning.txt
Performance tuning parameters for IntelliJ IDEA. Add these params to the idea64.exe.vmoptions or idea.exe.vmoptions file in IntelliJ IDEA. If you are using JDK 8.x, remove the PermSize and MaxPermSize parameters from the tuning configuration, since the permanent generation no longer exists in Java 8.
-server
-Xms2048m
-Xmx2048m
-XX:NewSize=512m
-XX:MaxNewSize=512m
-XX:PermSize=512m
-XX:MaxPermSize=512m
-XX:+UseParNewGC
-XX:ParallelGCThreads=4
-XX:MaxTenuringThreshold=1

Hello, I am using a linear SVM to train my model and generate a separating line through my data. However, my model always predicts 1 for all feature examples. Here is my code:

print data_rdd.take(5)
# [LabeledPoint(1.0, [1.9643,4.5957]), LabeledPoint(1.0, [2.2753,3.8589]), LabeledPoint(1.0, [2.9781,4.5651]), LabeledPoint(1.0, [2.932,3.5519]), LabeledPoint(1.0, [3.5772,2.856])]


from pyspark.mllib.classification import SVMWithSGD
from pyspark.mllib.linalg import Vectors
from sklearn.svm import SVC
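One thing worth checking before blaming the model: the take(5) output above shows only label 1.0, and an SVM fitted on single-class data can only ever predict that class. A quick sanity check with plain scikit-learn on those same five points (the two-class labels below are made up for illustration; the real RDD showed all 1.0):

```python
import numpy as np
from sklearn.svm import SVC

# The five feature vectors from data_rdd.take(5) above.
X = np.array([[1.9643, 4.5957], [2.2753, 3.8589], [2.9781, 4.5651],
              [2.932, 3.5519], [3.5772, 2.856]])

# Hypothetical labels with BOTH classes present, unlike the sampled RDD.
y = np.array([1.0, 1.0, 0.0, 0.0, 0.0])

clf = SVC(kernel="linear").fit(X, y)
preds = clf.predict(X)
print(preds)  # predictions now vary across classes instead of being constant
```

If the full RDD really contains both labels, the next things to inspect are the class balance and whether SVMWithSGD's default iterations and step size converge on this data.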