Al algarecu

## spark-gcs-example.py
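import os
import sys

# Location of the local Spark installation; adjust to match your environment.
spark_home = '/usr/local/spark'

# Make the bundled PySpark and py4j packages importable.
# The py4j version in the filename must match the one shipped with your Spark release.
sys.path.insert(0, os.path.join(spark_home, 'python'))
sys.path.insert(0, os.path.join(spark_home, 'python/lib/py4j-0.10.1-src.zip'))

# Pass the GCS connector jar to the JVM that PySpark launches.
os.environ['PYSPARK_SUBMIT_ARGS'] = '--jars gcs-connector-latest-hadoop2.jar pyspark-shell'

# Import PySpark only after sys.path and PYSPARK_SUBMIT_ARGS are set.
from pyspark import SparkContext, SparkConf
from pyspark.sql import SparkSession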
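
`PYSPARK_SUBMIT_ARGS` is a single string: an optional `--jars` option carrying a comma-separated list of jars, followed by the `pyspark-shell` resource. A minimal sketch of building that string from a list, assuming a hypothetical helper `build_submit_args` that is not part of the original gist:

```python
def build_submit_args(jars):
    """Build a PYSPARK_SUBMIT_ARGS value from a list of jar paths (hypothetical helper)."""
    parts = []
    if jars:
        # spark-submit expects --jars as one comma-separated list.
        parts += ["--jars", ",".join(jars)]
    # The trailing "pyspark-shell" resource comes last.
    parts.append("pyspark-shell")
    return " ".join(parts)


print(build_submit_args(["gcs-connector-latest-hadoop2.jar"]))
# --jars gcs-connector-latest-hadoop2.jar pyspark-shell
```

The result could be assigned to `os.environ['PYSPARK_SUBMIT_ARGS']` before importing `pyspark`, as the snippet above does with a literal string.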

## gitflow-breakdown.md

      
JamesMGreene / gitflow-breakdown.md (last active April 16, 2024 16:36)

`git flow` vs. `git`: A comparison of using `git flow` commands versus raw `git` commands.

### Initialize

| gitflow | git |
| --- | --- |
| `git flow init` | `git init` |
| | `git commit --allow-empty -m "Initial commit"` |
| | `git checkout -b develop master` |
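
The raw `git` equivalents of `git flow init` listed above can be scripted. A minimal sketch in Python, assuming hypothetical helper names (`gitflow_init_commands`, `run_commands`) and that `git` is on the `PATH` when the commands are actually run:

```python
import shlex
import subprocess


def gitflow_init_commands(main_branch="master", develop_branch="develop"):
    """Return the raw git commands listed above as argument lists."""
    return [
        ["git", "init"],
        ["git", "commit", "--allow-empty", "-m", "Initial commit"],
        ["git", "checkout", "-b", develop_branch, main_branch],
    ]


def run_commands(commands, cwd):
    """Run each command inside cwd, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, cwd=cwd, check=True)


# Print the commands without executing them.
for argv in gitflow_init_commands():
    print(shlex.join(argv))
```

Calling `run_commands(gitflow_init_commands(), cwd=some_empty_directory)` would execute them. If your Git is configured with a different `init.defaultBranch`, pass that name as `main_branch`.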


### Connect to the remote repository


## application-notes-zero-downtime-deploys-unicorn-nginx-runit-rvm-chef.md

      
algarecu / application-notes-zero-downtime-deploys-unicorn-nginx-runit-rvm-chef.md (created April 12, 2014 08:02, forked from czarneckid/application-notes-zero-downtime-deploys-unicorn-nginx-runit-rvm-chef.md)

### Zero downtime deploys with unicorn + nginx + runit + rvm + chef

Below are the actual files we use in one of our latest production applications at Agora Games to achieve zero-downtime deploys with Unicorn. You've probably already read the GitHub blog post on Unicorn and would like to try zero-downtime deploys for your application. I hope these files and notes help. I am happy to update these files or notes if there are comments or questions. YMMV (of course).

Other application notes:

- Our application uses MongoDB, so we don't have database migrations to worry about as we would with MySQL or PostgreSQL. That does not mean we are free of database concerns, such as indexes being built in MongoDB.
- We use Capistrano for deployment.

Salient points for each file:

  
## pandas_dbms.py
# -*- coding: utf-8 -*-
"""
LICENSE: BSD (same as pandas)
example use of pandas with oracle mysql postgresql sqlite
    - updated 9/18/2012 with better column name handling; couple of bug fixes.
    - used ~20 times for various ETL jobs.  Mostly MySQL, but some Oracle.

    to do:
            save/restore index (how to check table existence? just do select count(*)?),
            finish odbc,
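
The to-do above asks how to check whether a table exists. A minimal sketch for the SQLite case only, assuming the standard-library `sqlite3` module and a hypothetical helper name `table_exists` (not part of the original gist); other backends expose their catalogs differently:

```python
import sqlite3


def table_exists(conn, table_name):
    """Return True if a table named table_name exists in this SQLite database.

    Hypothetical helper, not from the original gist. It queries SQLite's
    sqlite_master catalog instead of running SELECT COUNT(*) and catching errors.
    """
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wages (name TEXT, hourly_wage REAL)")
print(table_exists(conn, "wages"))    # True
print(table_exists(conn, "missing"))  # False
```

Querying the catalog avoids scanning the table, which a `select count(*)` probe would do on large tables.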

## gist:716254
# ATOMIC DATA TYPES IN R

# Character
first.name <- "Kirk"

# Integer
age <- 25L  # the L suffix makes this an integer; a bare 25 is numeric (double)

# Numeric
hourly_wage <- 27.25