jose-goncabel

PySpark Row Modification via Dicts
from pyspark.sql import Row

my_df_schema = my_df.schema

def replace_content(a_row):
    a_row_dict = a_row.asDict()
    # Modify the contents of the dict
    a_row_dict["key"] = "new value"
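The preview stops before the function returns anything. As a minimal sketch of how the dict round trip might be completed (an assumption, since the rest of the gist is not shown), the helper below copies the row into a dict, applies the changes, and rebuilds a row of the same class. The `updates` parameter and the commented usage with `spark.createDataFrame` are hypothetical; the helper itself relies only on `asDict()` and construction from keyword arguments, both of which `pyspark.sql.Row` supports.

```python
def replace_content(a_row, updates):
    """Return a copy of a_row with the fields named in `updates` overwritten."""
    a_row_dict = a_row.asDict()        # row -> plain, mutable dict
    a_row_dict.update(updates)         # modify the contents of the dict
    return type(a_row)(**a_row_dict)   # dict -> new row of the same class


# Hypothetical usage, assuming a DataFrame `my_df` and an active `spark` session:
#   my_df_schema = my_df.schema
#   new_rdd = my_df.rdd.map(lambda r: replace_content(r, {"key": "new value"}))
#   new_df = spark.createDataFrame(new_rdd, my_df_schema)
```

Rows are immutable tuples, which is why the sketch builds a new row rather than assigning to a field of the old one.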
jose-goncabel / volume-ec2-backup.sh
Created Jun 21, 2018
Backup Script for EC2 EBS Volumes - The script takes snapshots of volumes that are marked with certain tags and deletes old snapshots once they are older than the specified number of days. Perfect for Jenkins.
#!/bin/bash
# The script generates new snapshots
# on every execution. It does not look at the number
# of snapshots, only at their age.
# ---------- CONSTANTS ----------
retention_period_days=30
last_date_to_back=$(date --date "$retention_period_days days ago")
last_date_to_back_seconds=$(date +%s --date "$retention_period_days days ago")
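The age rule described above (delete by age, never by count) can be sketched in Python as follows. This is an illustration under stated assumptions, not the script's actual deletion logic, which the preview does not show: the function names `cutoff_seconds` and `is_expired` are hypothetical, and snapshot start times are assumed to be timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone

RETENTION_PERIOD_DAYS = 30  # mirrors retention_period_days in the script


def cutoff_seconds(now, retention_days=RETENTION_PERIOD_DAYS):
    """Epoch seconds of the oldest moment a snapshot may date from."""
    return (now - timedelta(days=retention_days)).timestamp()


def is_expired(snapshot_start, now, retention_days=RETENTION_PERIOD_DAYS):
    """True if the snapshot started before the retention cutoff."""
    return snapshot_start.timestamp() < cutoff_seconds(now, retention_days)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    old_snapshot = now - timedelta(days=45)
    print(is_expired(old_snapshot, now))
```

Comparing epoch seconds, as `last_date_to_back_seconds` suggests the script does, avoids parsing and comparing formatted date strings.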
jose-goncabel / multiple-gpu-example.py
Created May 21, 2018
Keras + TensorFlow + Spark - A PySpark script showing how to use multiple GPUs for prediction within a Spark environment by loading a pre-trained Keras model on each worker.
#####
# IMPORTS
#####
from pyspark import TaskContext
import os
#####
# PATHS
#####
path_model = "/path/to/pretrained/model.h5"
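The preview does not show how each worker picks a GPU. One common approach, offered here purely as an assumption about what the `TaskContext` and `os` imports might be used for, is to map the Spark partition id onto a GPU index and expose only that device through `CUDA_VISIBLE_DEVICES` before the model is loaded. The function names and the `num_gpus` parameter are hypothetical.

```python
import os


def gpu_index_for_partition(partition_id, num_gpus):
    """Spread partitions over the available GPUs round-robin."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return partition_id % num_gpus


def pin_gpu(partition_id, num_gpus, environ=os.environ):
    """Make a single GPU visible to this process and return its index."""
    gpu = gpu_index_for_partition(partition_id, num_gpus)
    # This has to happen before TensorFlow initializes CUDA in the process.
    environ["CUDA_VISIBLE_DEVICES"] = str(gpu)
    return gpu


# Hypothetical use inside a function passed to rdd.mapPartitions:
#   partition_id = TaskContext.get().partitionId()
#   pin_gpu(partition_id, num_gpus=4)
#   ...then load the Keras model from path_model and predict on the partition
```

Keeping the index calculation separate from the environment change makes the mapping easy to check without a cluster.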