Shukri hsm207

@hsm207
hsm207 / auto_ssh_keys.sh
Created April 11, 2020 20:20
Create ssh keys without user input
#!/bin/bash
# from https://unix.stackexchange.com/questions/69314/automated-ssh-keygen-without-passphrase-how
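# /dev/zero feeds empty answers to ssh-keygen, so it accepts the default key
# path instead of waiting at the prompt; -N "" sets an empty passphrase
cat /dev/zero |
ssh-keygen -q -N ""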
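A minimal alternative sketch that avoids piping /dev/zero: passing -f names the key file up front, so ssh-keygen has no path prompt left to answer. The helper name `generate_key` and the ed25519 key type are choices made for this example, not part of the gist.

```shell
# Hypothetical helper: create a passphrase-less key pair at the given path.
generate_key() {
  key_path="$1"
  # Refuse to overwrite: ssh-keygen would otherwise ask for confirmation.
  if [ -e "$key_path" ]; then
    echo "refusing to overwrite $key_path" >&2
    return 1
  fi
  # -q: quiet, -t: key type, -N "": empty passphrase, -f: output file
  ssh-keygen -q -t ed25519 -N "" -f "$key_path" </dev/null
}

# Usage (path is an example): generate_key "$HOME/.ssh/id_ed25519"
```

Naming the file explicitly also makes the script safe to rerun, since an existing key is left untouched.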
@hsm207
hsm207 / install_kubectl.sh
Last active April 28, 2020 16:54
Install the latest version of kubectl
#!/bin/bash
# from: https://docs.aws.amazon.com/eks/latest/userguide/install-kubectl.html
BUCKET=amazon-eks
LATEST_KUBECTL_VERSION=`aws s3 ls s3://$BUCKET/ |
grep --invert-match "cloudformation\|manifests" |
awk '{print $2}' |
cut -d/ -f 1 |
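The preview above is cut off mid-pipeline. As a sketch of one way such a pipeline could finish (not necessarily what the gist does), and assuming the remaining prefixes are plain version strings, a version-aware sort can pick the newest one. `latest_version` is a hypothetical helper name and the sample versions are made up.

```shell
# Hypothetical helper: read version strings on stdin, print the newest one.
# Assumes GNU sort, whose -V flag orders version numbers naturally
# (so 1.9.0 sorts before 1.15.10, unlike a plain text sort).
latest_version() {
  sort -V | tail -n 1
}

# Example with made-up version strings (not real bucket contents):
printf '1.9.0\n1.15.10\n1.14.6\n' | latest_version   # prints 1.15.10
```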
@hsm207
hsm207 / install_eksctl.sh
Last active April 28, 2020 16:48
Install eksctl
#!/bin/bash
# from: https://docs.aws.amazon.com/eks/latest/userguide/getting-started-eksctl.html
# download and extract the latest version of eksctl (including prerelease)
# from: https://gist.github.com/steinwaywhw/a4cd19cda655b8249d908261a62687f8
LATEST_EKSCTL=$(curl -s https://api.github.com/repos/weaveworks/eksctl/releases |
jq -r '.[0].assets | map(select(.name == "eksctl_Linux_amd64.tar.gz")) | .[0].browser_download_url')
@hsm207
hsm207 / find_host_ip.sh
Created March 8, 2020 14:04
Command to find the host's ip address from inside a container
# Thanks to: https://forums.docker.com/t/accessing-host-machine-from-within-docker-container/14248/10
docker run --network host \
--rm \
openjdk:8 bash -c "apt update; apt install -y net-tools; route | awk '/^default/ { print \$2 }' | grep -v 0.0.0.0"
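To show what the awk and grep stage of the command above does, here is a small sketch that runs the same filter over sample text shaped like `route` output. The helper name `default_gateway` and the addresses in the sample are assumptions for illustration.

```shell
# Hypothetical helper: read `route` output on stdin, print the default gateway.
default_gateway() {
  # Keep the second column of the "default" row, then drop a 0.0.0.0 gateway.
  awk '/^default/ { print $2 }' | grep -v 0.0.0.0
}

# Sample text shaped like `route` output; the addresses are made up.
sample='Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         192.168.65.1    0.0.0.0         UG    0      0        0 eth0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0'

printf '%s\n' "$sample" | default_gateway   # prints 192.168.65.1
```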
@hsm207
hsm207 / create_build_env.sh
Created March 8, 2020 13:20
Script to create a docker container to build Spark
docker run -it \
-v "$(pwd)":/spark \
-v /c/path/to/.m2:/root/.m2 \
-v /c/path/to/.sbt:/root/.sbt \
-v /c/path/to/.ivy2:/root/.ivy2 \
-w /spark \
--network host \
openjdk:8 bash
@hsm207
hsm207 / copy_redis.sh
Last active February 5, 2020 17:33
Script to migrate all keys inside a redis instance into another redis instance
# source: https://stackoverflow.com/questions/37166947/copying-all-keys-in-redis-database-using-migrate
# make sure the Redis versions on SOURCE_HOST and DESTINATION_HOST are identical!
SOURCE_HOST=127.0.0.1
SOURCE_PORT=6381
DESTINATION_HOST=foo.bar
DESTINATION_PORT=6379
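The preview stops after the connection settings. A minimal sketch of how a key-by-key copy could look, assuming `redis-cli` is on the PATH and a Redis version that supports the COPY option of `MIGRATE host port key destination-db timeout`. The names `migrate_all_keys` and `REDIS_CLI` are hypothetical, and this is not necessarily the loop the gist uses.

```shell
SOURCE_HOST=127.0.0.1
SOURCE_PORT=6381
DESTINATION_HOST=foo.bar
DESTINATION_PORT=6379
# Hypothetical override hook so the client binary can be swapped out.
REDIS_CLI="${REDIS_CLI:-redis-cli}"

# Hypothetical helper: copy every key from the source to the destination.
migrate_all_keys() {
  # --scan iterates keys with SCAN, which avoids blocking the server like KEYS.
  "$REDIS_CLI" -h "$SOURCE_HOST" -p "$SOURCE_PORT" --scan |
  while IFS= read -r key; do
    # Destination db 0, 5000 ms timeout; COPY leaves the key on the source.
    "$REDIS_CLI" -h "$SOURCE_HOST" -p "$SOURCE_PORT" \
      MIGRATE "$DESTINATION_HOST" "$DESTINATION_PORT" "$key" 0 5000 COPY </dev/null
  done
}

# Usage: migrate_all_keys
```

Reading keys line by line means key names that contain newlines would not be handled by this sketch.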
@hsm207
hsm207 / working_with_random_samples.html
Last active December 24, 2019 12:43
Working with random samples on Spark
<!DOCTYPE html>
<html>
<head>
<meta name="databricks-html-version" content="1">
<title>working_with_random_samples - Databricks</title>
<meta charset="utf-8">
<meta name="google" content="notranslate">
<meta name="robots" content="nofollow">
<meta http-equiv="Content-Language" content="en">
@hsm207
hsm207 / keras_estimator_feature_column_bug.ipynb
Created October 22, 2019 15:56
keras_estimator_feature_column_bug.ipynb
@hsm207
hsm207 / bert_pretraining_share.ipynb
Created October 1, 2019 23:50
BERT_pretraining_share.ipynb
@hsm207
hsm207 / getAnalytics.scala
Last active September 16, 2019 13:27
implementation of getAnalytics
def getAnalytics(bucketName: String, brand: String, bucketYear: String, bucketMonth: String, bucketDay: String, candidate_field: String=candidateField, groups: String=Groups): DataFrame = {
val rankingOrderedIds = Window.partitionBy("c12").orderBy("id")
val s3PathAnalytics = getS3Path(bucketName, brand, bucketFolder, year=bucketYear, month=bucketMonth, day=bucketDay)
readJSON(s3PathAnalytics)
.distinct
.withColumn("x", explode($"payload"))
// a few more calls to withColumn to create columns
.withColumn("c10", explode(when(size(col("x1")) > 0, col("x1")).otherwise(array(lit(null).cast("string")))))
// a few more calls to withColumn to create columns
.withColumn("id", monotonically_increasing_id)