Maria Karanasou mkaranasou

@mkaranasou
mkaranasou / bbtree.py
Created March 17, 2017 10:29 — forked from olomix/bbtree.py
Balanced binary tree in Python
#!/usr/bin/env python2.7
import random
import subprocess
class Node(object):
    def __init__(self, key, value):
        self.key = key
        self.value = value
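The preview stops at the Node constructor; below is a minimal illustrative sketch of a self-balancing (AVL-style) insert in the spirit of the gist's title, not the forked gist's own code:

class Node(object):
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.left = None
        self.right = None
        self.height = 1

def height(node):
    return node.height if node else 0

def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    y.height = 1 + max(height(y.left), height(y.right))
    x.height = 1 + max(height(x.left), height(x.right))
    return x

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    x.height = 1 + max(height(x.left), height(x.right))
    y.height = 1 + max(height(y.left), height(y.right))
    return y

def insert(node, key, value):
    # returns the new subtree root, e.g. root = insert(root, 5, 'five')
    if node is None:
        return Node(key, value)
    if key < node.key:
        node.left = insert(node.left, key, value)
    else:
        node.right = insert(node.right, key, value)
    node.height = 1 + max(height(node.left), height(node.right))
    balance = height(node.left) - height(node.right)
    if balance > 1:   # left-heavy
        if key >= node.left.key:      # left-right case
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if balance < -1:  # right-heavy
        if key < node.right.key:      # right-left case
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node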
@mkaranasou
mkaranasou / relations.py
Created May 22, 2018 14:18 — forked from cyrexcyborg/relations.py
Flask-Admin-SQLAlchemy one-to-one, one-to-many between two tables
# -*- coding: utf-8 -*-
# Many thanks to http://stackoverflow.com/users/400617/davidism
# This code under "I don't care" license
# Take it, use it, learn from it, make it better.
# Run this from cmd or shell or whatever
# Then point your favourite browser at localhost:5000/admin
import sys
from flask import Flask
# note: flask.ext.* imports were removed in Flask 1.0; use the package imports
from flask_sqlalchemy import SQLAlchemy
from flask_admin import Admin
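A condensed sketch of the pattern the forked gist demonstrates; the model names and admin wiring below are illustrative, not the original gist's code:

from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from flask_admin import Admin
from flask_admin.contrib.sqla import ModelView

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite://'
app.config['SECRET_KEY'] = 'dev'  # Flask-Admin needs a secret key for flashing
db = SQLAlchemy(app)

class Person(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(64))
    # one-to-one: each person has exactly one passport
    passport = db.relationship('Passport', backref='owner', uselist=False)
    # one-to-many: a person can have many pets
    pets = db.relationship('Pet', backref='owner')

class Passport(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    number = db.Column(db.String(32))
    owner_id = db.Column(db.Integer, db.ForeignKey('person.id'))

class Pet(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(64))
    owner_id = db.Column(db.Integer, db.ForeignKey('person.id'))

admin = Admin(app)
admin.add_view(ModelView(Person, db.session))
admin.add_view(ModelView(Passport, db.session))
admin.add_view(ModelView(Pet, db.session))

if __name__ == '__main__':
    with app.app_context():
        db.create_all()
    app.run()  # then browse to localhost:5000/admin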
@mkaranasou
mkaranasou / pyspark_uneven_df_union.py
Last active November 2, 2018 11:57
Function to union pyspark data frames with different columns
from pyspark.sql import functions as F

def union_uneven(df_base, df_new, default=None):
    """
    Union dfs with different columns
    :param pyspark.sql.DataFrame df_base: the dataframe to union onto
    :param pyspark.sql.DataFrame df_new: the dataframe to be unioned
    :param default: the value used to fill the columns missing from either side
    :return: the union of the two dataframes, having the missing columns filled with the default value
    :rtype: pyspark.sql.DataFrame
    """
    base_columns = set(df_base.columns)
    df_new_columns = set(df_new.columns)
    # add whatever columns each side is missing, filled with the default
    for col in df_new_columns - base_columns:
        df_base = df_base.withColumn(col, F.lit(default))
    for col in base_columns - df_new_columns:
        df_new = df_new.withColumn(col, F.lit(default))
    # unionByName aligns the columns by name (Spark 2.3+)
    return df_base.unionByName(df_new)
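A quick usage sketch with two toy dataframes (the column names are illustrative):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df_a = spark.createDataFrame([(1, 'x')], ['id', 'a_only'])
df_b = spark.createDataFrame([(2, 'y')], ['id', 'b_only'])
# rows from each side get null (the default) in the columns they were missing
union_uneven(df_a, df_b).show()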
@mkaranasou
mkaranasou / pyspark_parse_json_and_expand_into_columns.py
Last active March 1, 2019 17:21
Parse a json column in pyspark and expand the dict into columns
from pyspark.sql import functions as F

json_col = 'json_col'
# either infer the schema of the json column from the data
# (or define it explicitly as a StructType):
schema = spark.read.json(df.select(json_col).rdd.map(lambda x: x[0])).schema
# parse the json string into a struct with that schema
df = df.withColumn(json_col, F.from_json(F.col(json_col), schema))
# access the nested fields by name
df.select(F.col(json_col)['some_key']).show()
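A self-contained sketch of the same steps end to end, assuming a local spark session and a toy dataframe (names and keys here are illustrative):

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [('{"some_key": 1, "other_key": "a"}',), ('{"some_key": 2, "other_key": "b"}',)],
    ['json_col']
)
schema = spark.read.json(df.select('json_col').rdd.map(lambda x: x[0])).schema
df = df.withColumn('json_col', F.from_json(F.col('json_col'), schema))
# expand every field of the parsed struct into its own top-level column
df.select('json_col.*').show()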
@mkaranasou
mkaranasou / ImageTools.js
Created April 6, 2019 08:08 — forked from SagiMedina/ImageTools.js
Resize and crop images in the Browser with orientation fix using exif
import EXIF from 'exif-js';
const hasBlobConstructor = typeof (Blob) !== 'undefined' && (function checkBlobConstructor() {
    try {
        return Boolean(new Blob());
    } catch (error) {
        return false;
    }
}());
@mkaranasou
mkaranasou / pyspark_simple_read_text_file.py
Last active October 4, 2019 14:16
Use pyspark to read a text file and identify a term
from pyspark import SparkConf
from pyspark.sql import SparkSession, functions as F
conf = SparkConf()
# optional, but it is good to set the amount of RAM the driver can use to a
# reasonable amount (relative to the size of the file we want to read), so
# that we don't get an OOM exception
conf.set('spark.driver.memory', '6G')
# create a spark session - nothing can be done without this:
spark = SparkSession.builder \
    .config(conf=conf) \
    .getOrCreate()
# read the file as a dataframe of lines and keep the ones that contain the
# term we are after (the path and the term here are illustrative)
df = spark.read.text('/path/to/file.txt')
df.filter(F.col('value').contains('some term')).show()
@mkaranasou
mkaranasou / pyspark_simple_file_read_short.py
Last active October 4, 2019 13:48
Read a txt file with pyspark
from pyspark import SparkConf
from pyspark.sql import SparkSession, functions as F
conf = SparkConf()
# optional: give the driver enough RAM (relative to the file size) so that we
# don't get an OOM exception
conf.set('spark.driver.memory', '6G')
spark = SparkSession.builder \
    .config(conf=conf) \
    .getOrCreate()
# read the file as a dataframe of lines (the path is illustrative)
df = spark.read.text('/path/to/file.txt')
@mkaranasou
mkaranasou / pyspark_autoincrement_ids_rdd_version.py
Last active September 23, 2021 02:02
Add auto-increment ids to a pyspark data frame using RDDs
>>> from pyspark.sql import SparkSession, functions as F
>>> from pyspark import SparkConf
>>> conf = SparkConf()
>>> spark = SparkSession.builder \
...     .config(conf=conf) \
...     .appName('Dataframe with Indexes') \
...     .getOrCreate()
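The preview stops after the session setup; a minimal sketch of the RDD-based approach the title refers to, via zipWithIndex (the toy dataframe and column names are illustrative):

>>> df = spark.createDataFrame([('a',), ('b',), ('c',)], ['value'])
>>> # zipWithIndex pairs every row with a consecutive, gap-free index
>>> rdd_with_idx = df.rdd.zipWithIndex()
>>> # flatten the (Row, idx) pairs back into columns, index first
>>> df_with_idx = rdd_with_idx.map(lambda pair: (pair[1],) + tuple(pair[0])) \
...     .toDF(['id'] + df.columns)
>>> df_with_idx.show()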
@mkaranasou
mkaranasou / python_yaml_environment_variables.py
Last active May 14, 2024 16:33
Python Load a yaml configuration file and resolve any environment variables
import os
import re
import yaml
def parse_config(path=None, data=None, tag='!ENV'):
    """
    Load a yaml configuration file and resolve any environment variables.
    The environment variables must have !ENV before them and be in this
    format to be parsed: ${VAR_NAME}.
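    """
    # (a condensed sketch of how the full gist continues; the implicit-resolver
    # wiring below follows the approach the docstring describes)
    pattern = re.compile(r'.*?\$\{([^}^{]+)\}.*?')  # matches ${VAR_NAME}
    loader = yaml.SafeLoader
    # any scalar that matches the pattern implicitly gets the !ENV tag
    loader.add_implicit_resolver(tag, pattern, None)

    def constructor_env_variables(loader, node):
        value = loader.construct_scalar(node)
        # substitute each ${VAR} with its environment value (default: '')
        for group in pattern.findall(value):
            value = value.replace('${%s}' % group, os.environ.get(group, ''))
        return value

    loader.add_constructor(tag, constructor_env_variables)

    if path:
        with open(path) as conf_data:
            return yaml.load(conf_data, Loader=loader)
    elif data:
        return yaml.load(data, Loader=loader)
    else:
        raise ValueError('Either a path or data should be defined as input')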
@mkaranasou
mkaranasou / use_env_variables_in_config_example.py
Last active December 17, 2020 01:41
Example of using parse_config
# To run this:
#   export DB_PASS=very_secret_and_complex
#   python use_env_variables_in_config_example.py -c /path/to/yaml
import argparse

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='My awesome script')
    parser.add_argument(
        "-c", "--conf", action="store", dest="conf_file",
        help="Path to config file"
    )
    args = parser.parse_args()
    conf = parse_config(path=args.conf_file)  # parse_config from the gist above
    # do stuff with conf, e.g. access the database password like this:
    # conf['database']['DB_PASS']