Andrew Montalenti amontalenti

@amontalenti
amontalenti / sitemap_spider.py
Last active October 6, 2021 15:44
Simple script that uses BeautifulSoup, requests, and urlparse to spider a sitemap.xml file (CNN used as example)
import os
import requests
from BeautifulSoup import BeautifulSoup
from urlparse import urlparse
sitemap_xml = "http://www.cnn.com/sitemaps/sitemap-specials-2013-11.xml"
sitemap_response = requests.get(sitemap_xml)
soup = BeautifulSoup(sitemap_response.content)
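The preview above stops right after the response is parsed. As a rough illustration of one possible next step, the Python 3 sketch below pulls the `<loc>` entries out of a sitemap document and groups them by host. It swaps in the standard library's `xml.etree.ElementTree` and `urllib.parse` for BeautifulSoup and the Python 2 `urlparse` module, and the names `SAMPLE_SITEMAP`, `extract_urls`, and `group_by_host` are made up for this example, so it should not be read as the gist's actual continuation.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical sample document; a real sitemap would be fetched over HTTP.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/a.html</loc></url>
  <url><loc>http://www.example.com/b.html</loc></url>
  <url><loc>http://edition.example.com/c.html</loc></url>
</urlset>
"""

# Sitemap elements live in this XML namespace.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_urls(xml_text):
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def group_by_host(urls):
    """Group URLs by their network location (host name)."""
    grouped = defaultdict(list)
    for url in urls:
        grouped[urlparse(url).netloc].append(url)
    return dict(grouped)


for host, host_urls in group_by_host(extract_urls(SAMPLE_SITEMAP)).items():
    print(host, len(host_urls))
```

Parsing from a string keeps the sketch runnable offline; with `requests`, the text of the response would be passed to `extract_urls` instead.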
amontalenti / inlet.js
Last active December 27, 2015 19:19
simple bars
var data = [1, 2, 3, 4, 5];
var width = 200;
var height = 200;
var x = d3.scale
    .ordinal()
    .domain(data)
    .rangeBands([0, width]);
var y = d3.scale
Latency Comparison Numbers               Time   Light Distance   Approximate Light Distance
--------------------------               ----   --------------   --------------------------
L1 cache reference                     0.5 ns           0.15 m   Diagonal across your smartphone
Branch mispredict                        5 ns            1.5 m   Height of Natalie Portman
L2 cache reference                       7 ns            2.1 m   Height of Shaq
Mutex lock/unlock                       25 ns            7.5 m   Height of a school flag pole
Main memory reference                  100 ns             30 m   Half a Manhattan city block (North/South)
Compress 1K bytes with Zippy         3,000 ns            900 m   Width of Central Park
Send 1K bytes over 1 Gbps network   10,000 ns          3,000 m   Width of Manhattan
Read 4K randomly from SSD*         150,000 ns         45,000 m   NYC to Hempstead on Long Island
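The "Light Distance" column is consistent with multiplying each latency by the distance light covers in one nanosecond, roughly 0.3 m. The short Python sketch below reproduces that arithmetic for a few rows; the names `METRES_PER_NS`, `light_distance_m`, and `LATENCIES_NS` are illustrative, and the 0.3 m/ns rounding is an assumption inferred from the table's figures.

```python
# Light travels 299,792,458 m/s, i.e. about 0.3 m per nanosecond.
# The rounded value is assumed here because it matches the table.
METRES_PER_NS = 0.3


def light_distance_m(latency_ns):
    """Distance in metres that light covers in `latency_ns` nanoseconds."""
    return latency_ns * METRES_PER_NS


# A few rows copied from the table above (latency in nanoseconds).
LATENCIES_NS = {
    "L1 cache reference": 0.5,
    "Main memory reference": 100,
    "Send 1K bytes over 1 Gbps network": 10_000,
    "Read 4K randomly from SSD": 150_000,
}

for name, ns in LATENCIES_NS.items():
    print(f"{name:<35}{ns:>10,} ns{light_distance_m(ns):>14,.2f} m")
```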
amontalenti / bigram_freq.py
Created December 15, 2013 16:57
example of using nltk to get bigram frequencies
>>> from nltk import word_tokenize
>>> from nltk.collocations import BigramCollocationFinder
>>> text = "obama says that obama says that the war is happening"
>>> finder = BigramCollocationFinder.from_words(word_tokenize(text))
>>> finder.items()[0:5]
[(('obama', 'says'), 2),
(('says', 'that'), 2),
(('is', 'happening'), 1),
(('that', 'obama'), 1),
(('that', 'the'), 1)]
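The same counts can be cross-checked without nltk. The sketch below is a minimal stand-in that assumes plain whitespace splitting instead of `word_tokenize`, which is enough for this punctuation-free sentence; `bigram_counts` is a name invented for the example.

```python
from collections import Counter


def bigram_counts(text):
    """Count adjacent word pairs, splitting on whitespace only."""
    tokens = text.split()
    # zip pairs each token with its successor: (t0, t1), (t1, t2), ...
    return Counter(zip(tokens, tokens[1:]))


counts = bigram_counts("obama says that obama says that the war is happening")
for bigram, freq in counts.most_common(5):
    print(bigram, freq)
```

For real text, a proper tokenizer matters, since `str.split` leaves punctuation attached to words.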
amontalenti / JavaMapsAndLists.java
Created December 24, 2013 14:53
example illustrating the complete lack of lightweight data modelling in Java
import java.util.List;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Map;
import java.util.HashMap;
public class JavaMapsAndLists {
    public static void main(String[] args) {
        List<Integer> someItems = Arrays.asList(new Integer[] {1, 2, 3, 4});
        for (Integer item : someItems) {
amontalenti / python_maps_and_lists.py
Created December 24, 2013 14:56
Python example of lightweight data modelling
def main():
    some_items = [1, 2, 3, 4]
    for item in some_items:
        print item
    some_mapping = {"ST": "started", "IP": "in progress", "DN": "done"}
    for key, val in some_mapping.iteritems():
        print key, "=>", val

if __name__ == "__main__":
    main()
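The gist's `print` statements and `iteritems()` are Python 2 constructs. For readers on Python 3, a roughly equivalent sketch follows; the helper `describe` is an addition for this example and is not part of the original script.

```python
def describe(mapping):
    """Return one 'key => value' line per entry, in insertion order."""
    return [f"{key} => {val}" for key, val in mapping.items()]


def main():
    some_items = [1, 2, 3, 4]
    for item in some_items:
        print(item)
    some_mapping = {"ST": "started", "IP": "in progress", "DN": "done"}
    for line in describe(some_mapping):
        print(line)


if __name__ == "__main__":
    main()
```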
amontalenti / clojure_maps_and_lists.clj
Last active January 1, 2016 08:19
Clojure example of lightweight data modelling
(ns cljtests.data)

(defn main []
  (let [some-items [1 2 3 4]
        some-mapping {:ST "started", :IP "in progress", :DN "done"}]
    (doseq [item some-items]
      (println item))
    (doseq [[key val] some-mapping]
      (println (str key " => " val)))))
amontalenti / groovy_maps_and_lists.groovy
Last active January 2, 2016 15:19
Groovy example of lightweight data modelling
package org.pixelmonkey.groovytests;
someItems = [1, 2, 3, 4]
someItems.each {
    println it
}
someMapping = ["ST": "started", "IP": "in progress", "DN": "done"]
someMapping.each { key, val ->
    println "${key} => ${val}"
}
amontalenti / readtweets.clj
Last active January 3, 2016 17:39
reading tweets from a file using Clojure I/O, string utilities, and clojure.data.json.
(use 'clojure.java.io)
(use '[clojure.string :only (split)])
(require '[clojure.data.json :as json])
(defn read-tweets []
  ;; this opens a file
  (with-open [rdr (reader "data/tweets.log")]
    ;; `for` returns a lazy seq, which is a problem here: with-open closes
    ;; the reader before the seq is realized, so consuming it results in
    ;; IOException Stream closed
    (for [line (line-seq rdr)]
$ lein repl
nREPL server started on port 59905 on host 127.0.0.1
REPL-y 0.3.0
Clojure 1.5.1
    Docs: (doc function-name-here)
          (find-doc "part-of-name-here")
  Source: (source function-name-here)
 Javadoc: (javadoc java-object-or-class-here)
    Exit: Control+D or (exit) or (quit)
 Results: Stored in vars *1, *2, *3, an exception in *e
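The comments in `read-tweets` describe a pitfall that is not specific to Clojure: a lazily evaluated sequence escapes the scope that owns the file, and the file is closed before anything is read. In Clojure the usual remedy is to force the sequence inside `with-open`, for example with `doall`. The Python sketch below shows the analogous failure and fix with a generator. The function names and the two-line sample log are hypothetical, and one JSON object per line is assumed.

```python
import json
import os
import tempfile


def read_tweets_lazy(path):
    # Pitfall: the generator is returned unconsumed, and the `with` block
    # closes the file before the caller reads a single line.
    with open(path) as f:
        return (json.loads(line) for line in f)


def read_tweets_eager(path):
    # Building the list inside the `with` block reads every line while
    # the file is still open.
    with open(path) as f:
        return [json.loads(line) for line in f]


def demo():
    """Run both readers against a small temporary log file."""
    with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
        tmp.write('{"text": "hello"}\n{"text": "world"}\n')
    try:
        eager = read_tweets_eager(tmp.name)
        try:
            list(read_tweets_lazy(tmp.name))
            lazy_error = None
        except ValueError as exc:  # raised for I/O on a closed file
            lazy_error = exc
        return eager, lazy_error
    finally:
        os.remove(tmp.name)


eager, lazy_error = demo()
print(eager)
print(repr(lazy_error))
```

The eager version trades memory for safety. For large logs, an alternative is to keep the file open for as long as the caller iterates, for instance by making the reader itself a generator function that contains the `with` block.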