
robenalt / pedantically_commented_playbook.yml
Created October 19, 2017 16:12 — forked from marktheunissen/pedantically_commented_playbook.yml
Insanely complete Ansible playbook, showing off all the options
---
# ^^^ YAML documents must begin with the document separator "---"
#
#### Example docblock, I like to put a descriptive comment at the top of my
#### playbooks.
#
# Overview: Playbook to bootstrap a new host for configuration management.
# Applies to: production
# Description:
# Ensures that a host is configured for management with Ansible.
robenalt / helloworld-win32-service.py
Created September 23, 2017 00:29 — forked from drmalex07/helloworld-win32-service.py
An example Windows service implemented with pywin32 wrappers. #python #windows-service #pywin32
import win32serviceutil
import win32service
import win32event
import servicemanager
import socket
import time
import logging
logging.basicConfig(
    filename='c:\\Temp\\hello-service.log',
    # level and format are assumed values that close out the truncated call
    level=logging.DEBUG,
    format='%(asctime)s %(levelname)s %(message)s',
)
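The preview ends before the service class itself; below is a minimal sketch of the conventional pywin32 skeleton such a gist builds on, assuming the imports above (the HelloWorldSvc name and the 5-second heartbeat are illustrative, not taken from the original):

class HelloWorldSvc(win32serviceutil.ServiceFramework):
    # service names are illustrative placeholders
    _svc_name_ = 'HelloWorldSvc'
    _svc_display_name_ = 'Hello World Service'

    def __init__(self, args):
        win32serviceutil.ServiceFramework.__init__(self, args)
        # event handle used to signal the main loop to stop
        self.stop_event = win32event.CreateEvent(None, 0, 0, None)
        socket.setdefaulttimeout(60)

    def SvcStop(self):
        self.ReportServiceStatus(win32service.SERVICE_STOP_PENDING)
        win32event.SetEvent(self.stop_event)

    def SvcDoRun(self):
        servicemanager.LogMsg(servicemanager.EVENTLOG_INFORMATION_TYPE,
                              servicemanager.PYS_SERVICE_STARTED,
                              (self._svc_name_, ''))
        # log a heartbeat every 5 seconds until the stop event fires
        while win32event.WaitForSingleObject(self.stop_event, 5000) == win32event.WAIT_TIMEOUT:
            logging.info('Hello at %s', time.ctime())

if __name__ == '__main__':
    # install/start/stop/remove the service from the command line
    win32serviceutil.HandleCommandLine(HelloWorldSvc)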
robenalt / read-flowfile-contents.py
Created September 1, 2017 20:16 — forked from ijokarumawak/read-flowfile-contents.py
Example Python script for the NiFi ExecuteScript processor that reads the first line from an incoming flow file.
from org.apache.nifi.processors.script import ExecuteScript
from org.apache.nifi.processor.io import InputStreamCallback
from java.io import BufferedReader, InputStreamReader
class ReadFirstLine(InputStreamCallback):
    __line = None

    def process(self, inputStream):
        # read only the first line of the incoming flow file's content
        reader = BufferedReader(InputStreamReader(inputStream))
        self.__line = reader.readLine()

    def getLine(self):
        return self.__line
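A minimal sketch of driving the callback from the body of an ExecuteScript script, assuming the session and REL_SUCCESS variables that ExecuteScript binds into the script (the first.line attribute name is illustrative):

flowFile = session.get()
if flowFile is not None:
    callback = ReadFirstLine()
    # run the callback against the flow file's content stream
    session.read(flowFile, callback)
    firstLine = callback.getLine()
    if firstLine is not None:
        # attribute name is an illustrative choice
        flowFile = session.putAttribute(flowFile, 'first.line', firstLine)
    session.transfer(flowFile, REL_SUCCESS)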
robenalt / pspark_config.py
Created November 9, 2016 18:37
Sample PySpark context setup with configuration parameters
# Set up Spark configuration
from pyspark import SparkConf, SparkContext

conf = SparkConf().setMaster("yarn-client").setAppName("sparK-mer")
# conf = SparkConf().setMaster("local[16]").setAppName("sparK-mer")
conf.set("yarn.nodemanager.resource.cpu-vcores", args.C)
# Saturate with executors
conf.set("spark.executor.instances", executorInstances)
conf.set("spark.executor.heartbeatInterval", "5s")
# Cores per executor
conf.set("spark.executor.cores", args.E)
# Driver cores (args.D is assumed here, by analogy with args.C and args.E)
conf.set("spark.driver.cores", args.D)
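The configuration object does nothing on its own; a minimal follow-on sketch, assuming the conf built above and that executorInstances and args come from the surrounding script:

# Build the context from the configuration above
sc = SparkContext(conf=conf)
# Spot-check that a setting took effect
print(sc.getConf().get("spark.executor.cores"))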
robenalt / iterm2-solarized.md
Created July 18, 2016 12:50 — forked from kevin-smets/iterm2-solarized.md
iTerm2 + oh my zsh + solarized + Meslo powerline font (OS X / macOS)

Solarized

robenalt / sklearn_randomforest_snip.py
Created June 10, 2016 19:38
sklearn random forest template
%matplotlib inline
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in modern scikit-learn
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.metrics import recall_score
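The preview shows only the imports; a minimal sketch of the template that would follow, assuming a feature matrix X and label vector y are already loaded (names are illustrative):

# Hold out a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Fit a random forest with a modest number of trees
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# Evaluate on the held-out set
y_pred = clf.predict(X_test)
print(accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))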
robenalt / save_dataframe_pyspark.py
Last active June 10, 2016 19:30
pyspark save dataframe
#from pyspark.sql import HiveContext
#sqlContext = HiveContext(sc)
query = """
select * from db.sometable where col>50
"""
results = sqlContext.sql(query)
# DataFrame.write returns a DataFrameWriter; no need to construct one by hand
results.write.saveAsTable('db.new_table_name', format='parquet',
                          mode='overwrite', path='/path/to/new/data/files')
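A quick sanity check on the write, assuming the same sqlContext session and the table name used above:

# Read the table back and confirm schema and row count
saved = sqlContext.table('db.new_table_name')
saved.printSchema()
print(saved.count())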
robenalt / scala_spark_logger.scala
Last active June 10, 2016 19:18
Set the Spark logging level in Scala
// Set logging level for Spark in Scala
import org.apache.log4j.{Level, Logger}

Logger.getLogger("org").setLevel(Level.WARN)
Logger.getLogger("akka").setLevel(Level.WARN)
robenalt / read_avro_spark_1.3.0.py
Last active June 10, 2016 19:08
Read Avro files with PySpark 1.3.0
# pyspark --packages com.databricks:spark-avro_2.10:1.0.0
# read avro files from 1.3.0 spark
df = sqlCtx.load("/path/to/my_avro", "com.databricks.spark.avro")
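A minimal follow-on, assuming the df loaded above (the temp table name is illustrative; registerTempTable and show are the Spark 1.3-era API):

# Inspect the schema, then query the Avro data with SQL
df.printSchema()
df.registerTempTable("my_avro")
sqlCtx.sql("SELECT COUNT(*) FROM my_avro").show()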
robenalt / run_commandline.scala
Last active June 10, 2016 19:01
Run a command line and capture its output
import scala.sys.process._
// .!! runs the command and returns its stdout as a String (it throws on a non-zero exit code)
val result = "ls -la".!!