@igama
igama / mysql_to_big_query.sh
Created October 5, 2015 11:05 — forked from shantanuo/mysql_to_big_query.sh
Copy a MySQL table to BigQuery. To copy all tables, use the loop given at the end. Exits with error code 3 if blob or text columns are found. The CSV files are first copied to Google Cloud Storage before being imported into BigQuery.
#!/bin/sh
TABLE_SCHEMA=$1
TABLE_NAME=$2
mytime=$(date '+%y%m%d%H%M')
hostname=$(hostname | tr 'A-Z' 'a-z')
file_prefix="trimax$TABLE_NAME$mytime$TABLE_SCHEMA"
bucket_name=$file_prefix
splitat="4000000000"
bulkfiles=200
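The description above says the script exits with code 3 when a table has blob or text columns. As a rough illustration of that check only (this is a Python sketch, not the gist's shell code), the snippet below flags such columns. The names `blob_or_text_columns` and `exit_code_for` are hypothetical, and the type list assumes MySQL's standard BLOB and TEXT families.

```python
# MySQL's BLOB and TEXT type families, as they appear in the DATA_TYPE
# column of information_schema.COLUMNS.
BLOB_TEXT_TYPES = {
    "tinyblob", "blob", "mediumblob", "longblob",
    "tinytext", "text", "mediumtext", "longtext",
}


def blob_or_text_columns(columns):
    """Return the names of columns whose data type is a BLOB or TEXT variant.

    `columns` is an iterable of (column_name, data_type) pairs, for example
    rows selected from information_schema.COLUMNS (an assumption; the gist
    may gather this information differently).
    """
    return [name for name, data_type in columns
            if data_type.lower() in BLOB_TEXT_TYPES]


def exit_code_for(columns):
    """Return 3 if any BLOB/TEXT column is present, otherwise 0."""
    return 3 if blob_or_text_columns(columns) else 0
```

A caller would pass the result to `sys.exit(exit_code_for(columns))` before starting the CSV export, so that unsupported tables are rejected early.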
igama / bigquery_schema.py
Created October 20, 2015 10:32 — forked from danielecook/bigquery_schema.py
Sense / infer / generate a BigQuery schema string for import #bigquery
import mimetypes
import sys
from collections import OrderedDict
filename = sys.argv[1]
def file_type(filename):
    # guess_type returns a (type, encoding) tuple, e.g. ('text/csv', None)
    type = mimetypes.guess_type(filename)
    return type
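The preview above is cut off before any schema inference happens. As a minimal sketch of what inferring a schema string could look like, assuming CSV input with a header row and only three candidate types, the snippet below produces the inline `name:TYPE,name:TYPE` form. The functions `infer_type` and `schema_string` are hypothetical and are not taken from the gist.

```python
import csv
import io


def infer_type(values):
    """Pick the narrowest of INTEGER, FLOAT, STRING that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "STRING"
    for bq_type, cast in (("INTEGER", int), ("FLOAT", float)):
        try:
            for v in non_empty:
                cast(v)
            return bq_type
        except ValueError:
            continue
    return "STRING"


def schema_string(csv_text):
    """Build a 'name:TYPE,...' schema string from CSV text with a header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    # Transpose rows into columns; with no data rows every column is STRING.
    columns = list(zip(*data)) if data else [() for _ in header]
    return ",".join(
        f"{name}:{infer_type(list(col))}" for name, col in zip(header, columns)
    )
```

For example, `schema_string("id,price,name\n1,2.5,a\n2,3,b\n")` yields a string describing an integer, a float, and a string column. Real data would need more types (timestamps, booleans) and sampling for large files.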
igama / json-bq-schema-generator.rb
Created October 20, 2015 10:32 — forked from igrigorik/json-bq-schema-generator.rb
BigQuery JSON schema generator
require 'open-uri'
require 'zlib'
require 'yajl'
# References
# - https://developers.google.com/bigquery/preparing-data-for-bigquery#dataformats
# - https://developers.google.com/bigquery/docs/data#nested
#
def type(t)
igama / haproxy.cfg
Created October 26, 2015 14:12 — forked from nakato/haproxy.cfg
A quickly thrown-together HAProxy config for RabbitMQ
global
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
stats socket /var/lib/haproxy/stats
tune.bufsize 128000
igama / nginx.conf
Created March 3, 2016 11:25 — forked from plentz/nginx.conf
Best nginx configuration for improved security (and performance). Complete blog post here: http://tautt.com/best-nginx-configuration-for-security/
# to generate your dhparam.pem file, run in the terminal
openssl dhparam -out /etc/nginx/ssl/dhparam.pem 2048
igama / docker-log-gist.md
Created March 10, 2016 15:27 — forked from afolarin/docker-log-gist.md
docker-logs
igama / README.md
Created April 22, 2016 18:11 — forked from dannguyen/README.md
Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.
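For orientation, here is a minimal sketch of a text-detection call against the Cloud Vision REST endpoint. It uses only the standard library (`urllib`) rather than Requests, which the README says it used, so it is not the author's code. The helper names `build_ocr_payload`, `full_text`, and `ocr_image` are hypothetical, and an API key is assumed as the authentication method.

```python
import base64
import json
import urllib.request

# Public REST endpoint for Cloud Vision image annotation.
VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_payload(image_bytes):
    """Build the JSON body for a single TEXT_DETECTION request."""
    return {
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [{"type": "TEXT_DETECTION"}],
        }]
    }


def full_text(response):
    """Return the full detected text from a parsed annotate response, or ''."""
    annotations = response["responses"][0].get("textAnnotations", [])
    # The first annotation holds the entire detected text block.
    return annotations[0]["description"] if annotations else ""


def ocr_image(path, api_key):
    """POST one image file to the API and return its text (needs network access)."""
    with open(path, "rb") as f:
        body = json.dumps(build_ocr_payload(f.read())).encode("utf-8")
    request = urllib.request.Request(
        f"{VISION_URL}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return full_text(json.load(resp))
```

Keeping payload construction and response parsing in separate functions makes both testable without network access or credentials.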

On the other hand, the OCR quality is pretty good if you just need to identify text anywhere in an image, without regard to its physical coordinates. I've included two examples:

###### 1. A low-resolution photo of road signs

igama / Dockerfile
Created May 13, 2016 10:15 — forked from yefim/Dockerrun.aws.json
Build a Docker image, push it to AWS EC2 Container Registry, then deploy it to AWS Elastic Beanstalk
# Example Dockerfile
FROM hello-world
igama / Jenkins_Protractor_Headless_Chrome_Setup_Ubuntu_14.04.md
Created May 19, 2016 09:17 — forked from praphull27/Jenkins_Protractor_Headless_Chrome_Setup_Ubuntu_14.04.md
Jenkins, Protractor and Headless Chrome Browser Setup on Ubuntu 14.04

Update Ubuntu

sudo apt-get update
sudo apt-get upgrade

Install Java

igama / ambari-service-move.md
Created December 5, 2016 10:55 — forked from pcheliniy/ambari-service-move.md
Manually move services in Ambari
Remove components
curl -u admin:admin -H "X-Requested-By: ambari" -X DELETE \
http://ambari:8080/api/v1/clusters/analytics/services/STORM/components/NIMBUS
Add new components not attached to a host
curl -u admin:admin -H "X-Requested-By: ambari" -X POST \
http://ambari:8080/api/v1/clusters/analytics/services/STORM/components/NIMBUS
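The two curl calls above differ only in HTTP method. As a sketch of the same requests from Python, assuming the same `admin:admin` credentials and URL shown in the commands, the snippet below builds them with the standard library. The helper `ambari_request` is a hypothetical name; the requests are constructed but not sent.

```python
import base64
import urllib.request

# Same component URL as in the curl commands above.
NIMBUS_URL = (
    "http://ambari:8080/api/v1/clusters/analytics/services/STORM/components/NIMBUS"
)


def ambari_request(method, url, user="admin", password="admin"):
    """Build a request with basic auth and the X-Requested-By header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        method=method,
        headers={
            "Authorization": f"Basic {token}",
            "X-Requested-By": "ambari",
        },
    )


# Remove the component, then register it again without a host.
delete_nimbus = ambari_request("DELETE", NIMBUS_URL)
create_nimbus = ambari_request("POST", NIMBUS_URL)

# To actually send one of them (requires a reachable Ambari server):
# urllib.request.urlopen(delete_nimbus)
```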