Tarashish Mishra sunu

## foo_names.md

      
Created July 26, 2016 13:04, forked from mihaitodor/foo_names.md

Map slugs to course names

https://archive.org/download/archiveteam_coursera_20160627114043/coursera_20160627114043.megawarc.warc.gz

bigdata = Web Intelligence and Big Data
clinical skills = Teaching and Assessing Clinical Skills
comp finance = Introduction to Computational Finance and Financial Econometrics
data sci = Introduction to Data Science
dmathgen = 离散数学概论 Discrete Mathematics Generality
global introuslaw = The Global Student's Introduction to U.S. Law
global theatre = Theatre and Globalization
inforiskman = Information Security and Risk Management in Context
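
A minimal sketch of how a list like the one above could be loaded into a lookup table, assuming every entry has the form `slug = course name`. The function name `parse_slug_map` and the `sample` text are hypothetical and not part of the original gist.

```python
def parse_slug_map(text):
    """Parse 'slug = course name' lines into a dict (hypothetical helper)."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and lines without the ' = ' separator (e.g. a URL).
        if " = " not in line:
            continue
        slug, name = line.split(" = ", 1)
        # Repeated slugs simply overwrite the earlier entry.
        mapping[slug.strip()] = name.strip()
    return mapping


sample = """\
bigdata = Web Intelligence and Big Data
data sci = Introduction to Data Science
global theatre = Theatre and Globalization
global theatre = Theatre and Globalization
"""

slug_map = parse_slug_map(sample)
print(slug_map["data sci"])
```

Splitting on the first ` = ` only keeps course names intact even if they contain an equals sign themselves.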


## README.md

      
Created March 25, 2016 05:55, forked from dannguyen/README.md

Using Google Cloud Vision API to OCR scanned documents to extract structured data

Using Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.
The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide them at the word or region level, which would be needed to calculate the data delimiters.
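
For orientation, here is a rough sketch of what such a test might look like with Requests. It is not the original author's script: the helper names (`build_ocr_request`, `ocr_image`, `extract_boxes`) are hypothetical, and the endpoint and JSON field names reflect my understanding of the Cloud Vision v1 REST API, so check them against the official reference before relying on them.

```python
import base64

# Cloud Vision v1 REST endpoint (assumed; verify against the official docs).
VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes):
    """Build the JSON body for a single TEXT_DETECTION request."""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


def ocr_image(image_bytes, api_key):
    """POST the image to Cloud Vision and return its text annotations."""
    # Imported here so the pure helpers work without Requests installed.
    import requests

    resp = requests.post(
        VISION_URL,
        params={"key": api_key},  # api_key is a placeholder you must supply
        json=build_ocr_request(image_bytes),
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["responses"][0].get("textAnnotations", [])


def extract_boxes(annotations):
    """Return (text, [(x, y), ...]) pairs from a list of text annotations."""
    boxes = []
    for ann in annotations:
        vertices = ann.get("boundingPoly", {}).get("vertices", [])
        # A vertex may lack a coordinate; default to 0 (assumption).
        points = [(v.get("x", 0), v.get("y", 0)) for v in vertices]
        boxes.append((ann.get("description", ""), points))
    return boxes
```

Something like `extract_boxes(ocr_image(open("scan.jpg", "rb").read(), API_KEY))` would then list each detected text with its polygon, where `scan.jpg` and `API_KEY` are stand-ins for your own file and key.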
On the other hand, the OCR quality is pretty good if you just need to identify text anywhere in an image, without regard to its physical coordinates. I've included two examples:
#### 1. A low-resolution photo of road signs

  
## 1_ubuntu_terminal_command
# to execute this gist, run the line below in terminal
\curl -L https://gist.githubusercontent.com/sunu/a3107443677231e815fa/raw/9f25268168fa8b37cd3b230956fd8f8d19dca069/install_source_code_pro.sh | sh

## gist:9599655

      
Last active August 29, 2015 13:57, forked from ygjb/gist:4543418

Description

Kitherder is a web application designed to facilitate participation in the Security Mentorships program. Note that while this program is currently limited to security projects, the goal of Kitherder is to provide the matchmaking and relationship management features required to open the program to the Mozilla community.
The requirements here are driven by the documentation from the mentorship program. It is expected that the system will leverage Mozillians.org accounts to reduce the amount of personal data stored in Kitherder, and will issue badges using the Mozilla Foundation badge system based on participation criteria.
Terms


Mozillian - a user with an account on Mozillians.org
Vouched Mozillian - a user who has been "vouched" on Mozillians.org


## flask_geventwebsocket_example.py
from geventwebsocket.handler import WebSocketHandler
from gevent.pywsgi import WSGIServer
from flask import Flask, request, render_template

app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

## snowjob.py
#!/usr/bin/env python
import os
import random
import time
import platform

snowflakes = {}

try:
    # Windows Support

## gist:4324693
# enable syntax completion
try:
    import readline
except ImportError:
    print("Module readline not available.")
else:
    import rlcompleter
    readline.parse_and_bind("tab: complete")