Skip to content

Instantly share code, notes, and snippets.

View josephmisiti's full-sized avatar

Joseph Misiti josephmisiti

View GitHub Profile
@josephmisiti
josephmisiti / AnkiMultipleChoiceTemplate.html
Created July 12, 2019 16:07 — forked from hgiesel/AnkiMultipleChoiceBackTemplate.html
A Multiple Choice Template for Anki Cards
<script>
// MULTIPLE CHOICE TEMPLATE v1.2 {{{
// https://gist.github.com/hgiesel/2e8361afccca5713414a6a4ee66b7ece
const query = 'div#thecard'
const colors = ['orange', 'olive', 'maroon', 'aqua', 'fuchsia', 'navy', 'lime']
const fieldPadding = '4px'
const syntax = {
openDelim: '(^',
closeDelim: '^)',
@josephmisiti
josephmisiti / README.md
Last active June 14, 2022 02:00 — forked from mango314/README.md
a map of all 2166 Census Tracts of New York City in Python Matplotlib

Census Tracts of New York City

Here at PyData NYC, I heard a tutorial of how to use numpy and iPython notebooks. In a previous gist, I wrote drew all the zip codes of the Bronx in d3.js

This would be great for reproducing inforgraphics like Educational Attainment in New York City -- Brooklyn which looks a bit like a jigsaw puzzle:

Where to Obtain the Data

@josephmisiti
josephmisiti / helloevolve.py
Created November 11, 2016 18:19
helloevolve.py - a simple genetic algorithm in Python
"""
helloevolve.py implements a genetic algorithm that starts with a base
population of randomly generated strings, iterates over a certain number of
generations while implementing 'natural selection', and prints out the most fit
string.
The parameters of the simulation can be changed by modifying one of the many
global variables. To change the "most fit" string, modify OPTIMAL. POP_SIZE
controls the size of each generation, and GENERATIONS is the amount of
generations that the simulation will loop through before returning the fittest
@josephmisiti
josephmisiti / chunkify.js
Created March 14, 2017 13:33 — forked from woollsta/chunkify.js
Fixes an issue with Google Chrome Speech Synthesis where long texts pause mid-speaking. The function takes in a speechUtterance object and intelligently chunks it into smaller blocks of text that are stringed together one after the other. Basically, you can play any length of text. See http://stackoverflow.com/questions/21947730/chrome-speech-sy…
/**
* Chunkify
* Google Chrome Speech Synthesis Chunking Pattern
* Fixes inconsistencies with speaking long texts in speechUtterance objects
* Licensed under the MIT License
*
* Peter Woolley and Brett Zamir
*/
var speechUtteranceChunker = function (utt, settings, callback) {

Text Classification

To demonstrate text classification with Scikit Learn, we'll build a simple spam filter. While the filters in production for services like Gmail will obviously be vastly more sophisticated, the model we'll have by the end of this chapter is effective and surprisingly accurate.

Spam filtering is the "hello world" of document classification, but something to be aware of is that we aren't limited to two classes. The classifier we will be using supports multi-class classification, which opens up vast opportunities like author identification, support email routing, etc… However, in this example we'll just stick to two classes: SPAM and HAM.

For this exercise, we'll be using a combination of the Enron-Spam data sets and the SpamAssassin public corpus. Both are publicly available for download and are retreived from the internet during the setup phase of the example code that goes with this chapter.

Loading Examples

@josephmisiti
josephmisiti / postgres-cheatsheet.md
Created June 23, 2017 09:56 — forked from Kartones/postgres-cheatsheet.md
PostgreSQL command line cheatsheet

PSQL

Magic words:

psql -U postgres

If run with -E flag, it will describe the underlaying queries of the \ commands (cool for learning!).

Most \d commands support additional param of __schema__.name__ and accept wildcards like *.*

@josephmisiti
josephmisiti / imagemagick.bash
Created May 22, 2019 14:14 — forked from bensie/imagemagick.bash
ImageMagick Static Binaries for AWS Lambda
#!/usr/bin/env bash
# Must be run on an Amazon Linux AMI that matches AWS Lambda's runtime which can be found at:
# https://docs.aws.amazon.com/lambda/latest/dg/current-supported-versions.html
#
# As of May 21, 2019, this is:
# Amazon Linux AMI 2018.03.0 (ami-0756fbca465a59a30)
#
# You need to prepend PATH with the folder containing these binaries in your Lambda function
# to ensure these newer binaries are used.
@josephmisiti
josephmisiti / StateBoundaries.sql
Created November 8, 2018 00:19 — forked from jakebathman/StateBoundaries.sql
The approximate max/min latitude and longitude for all states and major territories
-- Create the table
CREATE TABLE IF NOT EXISTS `StateBoundaries` (
`State` varchar(10) DEFAULT NULL,
`Name` varchar(255) DEFAULT NULL,
`MinLat` varchar(50) DEFAULT NULL,
`MaxLat` varchar(50) DEFAULT NULL,
`MinLon` varchar(50) DEFAULT NULL,
`MaxLon` varchar(50) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
@josephmisiti
josephmisiti / README.md
Created July 13, 2016 17:39 — forked from joshdover/README.md
Idiomatic React Testing Patterns

Idiomatic React Testing Patterns

Testing React components seems simple at first. Then you need to test something that isn't a pure interaction and things seem to break down. These 4 patterns should help you use a pattern that is repeatable and readable for the type of test you need.

Setup

I recommend doing all setup in the most functional way possible. If you can avoid it, don't set variables in a beforeEach. This will help ensure tests are isolated and make things a bit easier to reason about. I use a pattern

@josephmisiti
josephmisiti / spacy_intro.ipynb
Created February 21, 2018 18:28 — forked from aparrish/spacy_intro.ipynb
NLP Concepts with spaCy. Code examples released under CC0 https://creativecommons.org/choose/zero/, other text released under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.