Brendan Bennett rldotai

## github-pandoc.css
/*! normalize.css v2.1.3 | MIT License | git.io/normalize */

/* ==========================================================================
   HTML5 display definitions
   ========================================================================== */

/**
 * Correct `block` display not defined in IE 8/9.
 */

## Flask-Restful_S3_File_Upload.py
# -*- coding: utf-8 -*-
"""
An example flask application showing how to upload a file to S3
while creating a REST API using Flask-Restful.

Note: This method of uploading files is fine for smaller file sizes,
      but uploads should be queued using something like celery for
      larger ones.
"""
from cStringIO import StringIO

## gist:8172796

      
debasishg / gist:8172796
1 file, 404 forks, 23 comments, 1649 stars. Last active August 24, 2024 13:55.

A collection of links for streaming algorithms and data structures

General Background and Overview


Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
Models and Issues in Data Stream Systems
Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that revisits some of the historical perspectives and how it all began with Flajolet.
Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
[Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&amp;rep=rep1&amp;t


## epub2pdf.sh
# depends on Calibre (http://calibre-ebook.com)
# the CSS snippet prevents images from filling the page
# adapt margins, page size and font size as needed
ebook-convert doc.epub doc.pdf  \
--smarten-punctuation \
--pretty-print \
--preserve-cover-aspect-ratio \
--insert-blank-line \
--margin-top 60 \
--margin-left 60 \

## dl-frameworks.rst

      
bartvm / dl-frameworks.rst
1 file, 15 forks, 2 comments, 56 stars. Last active December 7, 2020 18:18.

A comparison of deep learning frameworks

A comparison of Theano with other deep learning frameworks, highlighting a series of low-level design choices in no particular order.

Overview

Differentiation


Differentiation

Symbolic: Theano, CGT; Automatic: Torch, MXNet
Symbolic and automatic differentiation are often confused or used interchangeably, although their implementations are significantly different.
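
To make the distinction concrete, here is a minimal sketch of automatic differentiation in forward mode, using dual numbers. It is an illustration only and not how any of the frameworks listed above is implemented; the names `Dual`, `sin`, and `derivative` are hypothetical helpers introduced for this example. A symbolic system would instead build and transform an expression for the derivative, whereas the code below only propagates numeric derivative values alongside the ordinary computation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one input."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sin(x):
    x = _lift(x)
    # Chain rule: sin(u)' = cos(u) * u'
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def derivative(f, x):
    """Evaluate f'(x) by seeding the input derivative with 1."""
    return f(Dual(x, 1.0)).deriv


def f(x):
    return x * x * x + 2 * x  # f'(x) = 3x^2 + 2


print(derivative(f, 2.0))  # 14.0
```

No derivative expression is ever constructed here: each arithmetic operation carries its derivative value forward, which is why ordinary Python control flow inside `f` would also work unchanged.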


## rich-text-html-editors.md

      
manigandham / rich-text-html-editors.md
1 file, 25 forks, 7 comments, 391 stars. Last active July 23, 2024 12:07.

Rich text / HTML editors and frameworks

Strictly Frameworks


Mobiledoc - github.com/bustle/mobiledoc-kit - framework to build editors with a standardized JSON structure
ShareDB - github.com/share/sharedb - framework to sync any JSON document using operational transforms, add real-time collaborative editing to anything else
Bangle.dev - github.com/bangle-io/bangle.dev - toolkit for building editors, based on ProseMirror

Abstracted Editors

These use separate document structures instead of HTML; some are more modular libraries than full editors.

  
## EXAMPLE_WATSON_API_README.md

      
dannguyen / EXAMPLE_WATSON_API_README.md
4 files, 6 forks, 0 comments, 28 stars. Last active November 23, 2020 13:32.

Transcribing ProPublica podcast with Python and Watson Speech to Text API

Using IBM Watson Speech to Text API to transcribe a ProPublica podcast

An example of using the Watson Speech to Text API to transcribe a podcast from ProPublica: How a Reporter Pierced the Hype Behind Theranos
This is just a simpler demo of the same technique I demonstrate to make automated video supercuts in this repo: https://github.com/dannguyen/watson-word-watcher
The transcription takes just a few minutes (less if you parallelize the requests to IBM) and is free...but it isn't perfect by any means. It doesn't fare super well on proper nouns:

Charles Ornstein's last name is transcribed as Orenstein
John Carreyrou's last name becomes John Kerry Roo


## selenium-screenshotting.md

      
dannguyen / selenium-screenshotting.md
1 file, 19 forks, 4 comments, 70 stars. Last active February 15, 2023 15:59.

Using Selenium and Python to screenshot a javascript-heavy page

As websites become more JavaScript heavy, it's harder to automate things like screenshotting for archival purposes. I've seen examples and suggestions to use PhantomJS for visual testing/archiving of websites, but have run into issues such as the non-rendering of webfonts. I've never tried out Selenium until today...and while I'm not thinking about performance implications yet, Selenium seems far more accurate than PhantomJS...which makes sense since it actually opens a real browser. And it's not too hard to script to do complex interactions: here's an [example of how to log in to Twitter, write a tweet, upload an image, and send a tweet via Selenium and DOM element selection](https://gist.github.com/dannguyen/8a6fa49253c1d6a0eb92
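
The basic screenshot step described above can be sketched roughly as follows. This is a hedged illustration rather than the gist author's script: `capture_page` and its `settle_seconds` parameter are hypothetical names, and the fixed sleep is a crude stand-in for waiting until the page's JavaScript has finished rendering. The only Selenium calls it relies on are `driver.get()` and `driver.save_screenshot()`.

```python
import time


def capture_page(driver, url, out_path, settle_seconds=2.0):
    """Load `url` in an already-started WebDriver and save a PNG screenshot.

    `driver` is any Selenium WebDriver instance (for example one created
    with `webdriver.Firefox()`); it is passed in so the caller controls
    which browser is used and when it is closed.
    """
    driver.get(url)
    # Assumption: a short fixed pause is enough for scripts and webfonts
    # to finish loading. Real pages may need an explicit wait instead.
    time.sleep(settle_seconds)
    # save_screenshot returns False if the file could not be written.
    if not driver.save_screenshot(out_path):
        raise IOError("could not write screenshot to %s" % out_path)
    return out_path


# Example usage (requires the selenium package and a browser driver):
#
#     from selenium import webdriver
#     driver = webdriver.Firefox()
#     try:
#         capture_page(driver, "https://example.com", "page.png")
#     finally:
#         driver.quit()
```

Passing the driver in, rather than creating it inside the function, keeps the browser choice with the caller and lets one browser session be reused for several pages.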

  
## README.md

      
dannguyen / README.md
2 files, 69 forks, 9 comments, 406 stars. Last active July 6, 2024 16:36.

Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.
The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.
On the other hand, the OCR quality is pretty good if you just need to identify text anywhere in an image, without regard to its physical coordinates. I've included two examples:
###### 1. A low-resolution photo of road signs

  
## obreros-bottle.ipynb

      
disarticulate / obreros-bottle.ipynb
1 file, 0 forks, 1 comment, 3 stars. Last active May 7, 2016 22:55.