Shukri hsm207

View GitHub Profile
@hsm207
hsm207 / subreddit_latest.py
Created March 11, 2017 01:33 — forked from dangayle/subreddit_latest.py
Get all available submissions within a subreddit newer than x
import sys
from datetime import datetime, timedelta
import praw
user_agent = "hot test 1.0 by /u/dangayle"
r = praw.Reddit(user_agent=user_agent)
class SubredditLatest(object):
"""Get all available submissions within a subreddit newer than x."""
from __future__ import division
import urlparse
import os
import numpy
import boto3
import tensorflow
from tensorflow.python.keras._impl import keras
from tensorflow.python.estimator.export.export_output import PredictOutput
hsm207 / rpr_with_einsum_gpu.ipynb
Last active April 21, 2019 11:08
Code to accompany my blog post at https://bit.ly/2PmRjiC
hsm207 / getAnalytics.scala
Last active September 16, 2019 13:27
implementation of getAnalytics
def getAnalytics(bucketName: String, brand: String, bucketYear: String, bucketMonth: String, bucketDay: String, candidate_field: String=candidateField, groups: String=Groups): DataFrame = {
  // window over each "c12" group, ordered by the generated id (never reassigned, so a val)
  val rankingOrderedIds = Window.partitionBy("c12").orderBy("id")
  val s3PathAnalytics = getS3Path(bucketName, brand, bucketFolder, year=bucketYear, month=bucketMonth, day=bucketDay)
  readJSON(s3PathAnalytics)
    .distinct
    .withColumn("x", explode($"payload"))
    // a few more calls to withColumn to create columns
    // explode x1, substituting a single null element when the array is empty so the row is kept
    .withColumn("c10", explode(when(size(col("x1")) > 0, col("x1")).otherwise(array(lit(null).cast("string")))))
    // a few more calls to withColumn to create columns
    .withColumn("id", monotonically_increasing_id)
hsm207 / keras_estimator_feature_column_bug.ipynb
Created October 22, 2019 15:56
keras_estimator_feature_column_bug.ipynb
hsm207 / working_with_random_samples.html
Last active December 24, 2019 12:43
Working with random samples on Spark
<!DOCTYPE html>
<html>
<head>
<meta name="databricks-html-version" content="1">
<title>working_with_random_samples - Databricks</title>
<meta charset="utf-8">
<meta name="google" content="notranslate">
<meta name="robots" content="nofollow">
<meta http-equiv="Content-Language" content="en">
hsm207 / copy_redis.sh
Last active February 5, 2020 17:33
Script to migrate all keys inside a redis instance into another redis instance
# source: https://stackoverflow.com/questions/37166947/copying-all-keys-in-redis-database-using-migrate
# make sure the Redis versions on SOURCE_HOST and DESTINATION_HOST are identical!
SOURCE_HOST=127.0.0.1
SOURCE_PORT=6381
DESTINATION_HOST=foo.bar
DESTINATION_PORT=6379
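The preview ends after the host and port variables, so the migration loop itself is not shown. As a rough sketch of how a `redis-cli` invocation of Redis's `MIGRATE` command could be assembled from those four values: the helper `build_migrate_command`, its defaults (database 0, a 5000 ms timeout, `COPY` enabled), and the key names in the usage line are assumptions for illustration, not taken from the gist.

```python
def build_migrate_command(src_host, src_port, dst_host, dst_port, keys,
                          dst_db=0, timeout_ms=5000, copy=True):
    """Build a redis-cli argv that MIGRATEs `keys` from source to destination.

    Hypothetical helper: it only builds the argument list and runs nothing.
    """
    if not keys:
        raise ValueError("at least one key is required")
    argv = [
        "redis-cli", "-h", src_host, "-p", str(src_port),
        "MIGRATE", dst_host, str(dst_port),
        "",  # empty key argument: the keys are passed via the KEYS option
        str(dst_db), str(timeout_ms),
    ]
    if copy:
        argv.append("COPY")  # leave the keys in place on the source instance
    argv += ["KEYS", *keys]
    return argv


if __name__ == "__main__":
    # Values mirror the variables in the script above; the keys are made up.
    print(build_migrate_command("127.0.0.1", 6381, "foo.bar", 6379, ["k1", "k2"]))
```

The resulting list can be handed to `subprocess.run(argv, check=True)`; batching several keys per call via `KEYS` avoids one round trip per key.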
hsm207 / create_build_env.sh
Created March 8, 2020 13:20
Script to create a docker container to build Spark
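# mount the Spark checkout plus the local Maven, sbt and Ivy caches so builds reuse downloaded artifacts
docker run -it \
  -v "$(pwd)":/spark \
  -v /c/path/to/.m2:/root/.m2 \
  -v /c/path/to/.sbt:/root/.sbt \
  -v /c/path/to/.ivy2:/root/.ivy2 \
  -w /spark \
  --network host \
  openjdk:8 bash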
hsm207 / find_host_ip.sh
Created March 8, 2020 14:04
Command to find the host's ip address from inside a container
# Thanks to: https://forums.docker.com/t/accessing-host-machine-from-within-docker-container/14248/10
docker run --network host \
  --rm \
  openjdk:8 bash -c "apt update; apt install -y net-tools; route | awk '/^default/ { print \$2 }' | grep -v 0.0.0.0"
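The `awk` and `grep` stages pick the gateway column out of the `default` rows of the `route` table. A small sketch of roughly the same selection in Python, for cases where the output is captured and processed in a script: `default_gateways` and `SAMPLE_ROUTE_OUTPUT` are hypothetical, and the sample table is made up for illustration rather than captured from a real host.

```python
# Hypothetical `route` output; real tables differ per machine.
SAMPLE_ROUTE_OUTPUT = """\
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         192.168.65.1    0.0.0.0         UG    0      0        0 eth0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
"""


def default_gateways(route_output):
    """Return the gateway field of every `default` row in `route` output."""
    gateways = []
    for line in route_output.splitlines():
        fields = line.split()
        # Roughly: awk '/^default/ { print $2 }' | grep -v 0.0.0.0
        if line.startswith("default") and len(fields) >= 2 and fields[1] != "0.0.0.0":
            gateways.append(fields[1])
    return gateways


if __name__ == "__main__":
    print(default_gateways(SAMPLE_ROUTE_OUTPUT))
```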
hsm207 / auto_ssh_keys.sh
Created April 11, 2020 20:20
Create ssh keys without user input
#!/bin/bash
# from https://unix.stackexchange.com/questions/69314/automated-ssh-keygen-without-passphrase-how
cat /dev/zero |
ssh-keygen -q -N ""
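The script above relies on piped input to get past `ssh-keygen`'s prompts. An alternative sketch, assuming it is acceptable to name the output file explicitly: passing `-f` together with `-N ""` leaves nothing to ask as long as the target file does not already exist. The helpers `keygen_command` and `generate_key`, the default key type, and the example path are hypothetical choices, not part of the original script.

```python
import subprocess
from pathlib import Path


def keygen_command(key_path, key_type="ed25519"):
    """Build an ssh-keygen argv with an empty passphrase and explicit output file."""
    return ["ssh-keygen", "-q", "-t", key_type, "-N", "", "-f", str(key_path)]


def generate_key(key_path, key_type="ed25519"):
    """Create a key pair at `key_path`, refusing to touch an existing file."""
    key_path = Path(key_path)
    if key_path.exists():
        # ssh-keygen would otherwise stop and ask whether to overwrite.
        raise FileExistsError(f"{key_path} already exists")
    key_path.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(keygen_command(key_path, key_type), check=True)
    return key_path


if __name__ == "__main__":
    # Only prints the command; call generate_key(...) to actually create keys.
    print(keygen_command("/tmp/example_key"))
```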