Paul Katsen pkpp1233

@pkpp1233
pkpp1233 / scraper.py
Created December 4, 2014 20:21
Scrape server
from urllib.request import urlopen  # Python 3 home of the old urllib.urlopen

from bs4 import BeautifulSoup
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route('/scrape')
def scrape():
    # Which ESPN section to scrape, e.g. /scrape?sport=nhl
    sport = request.args['sport']
    html = urlopen("http://www.espn.com/" + sport).read()
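The route above concatenates the `sport` query parameter straight into the ESPN URL. A minimal sketch of checking that value first is shown here. The helper name `build_espn_url` and the `ALLOWED_SPORTS` allowlist are assumptions for illustration and do not appear in the gist; only "nhl" and "nba" are taken from the gists on this page.

```python
# Hypothetical allowlist: "nhl" and "nba" come from the gists on this page,
# the other entries are illustrative assumptions.
ALLOWED_SPORTS = {"nba", "nhl", "nfl", "mlb"}


def build_espn_url(sport):
    """Return the ESPN section URL for `sport`, or raise ValueError."""
    key = sport.strip().lower()
    if key not in ALLOWED_SPORTS:
        raise ValueError("unsupported sport: %r" % sport)
    return "http://www.espn.com/" + key
```

Inside a Flask route, a `ValueError` raised here could be turned into a 400 response instead of fetching an arbitrary path.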
pkpp1233 / scraper.py
Created December 4, 2014 20:19
Scrape Function
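from urllib.request import urlopen  # Python 3 home of the old urllib.urlopen

from bs4 import BeautifulSoup

sport = "nhl"
html = urlopen("http://www.espn.com/" + sport).read()
# Name the parser explicitly so bs4 does not have to guess one.
soup = BeautifulSoup(html, "html.parser")
# Text of every <li> inside the first element with class "headlines".
headlines = [headline.get_text() for headline in soup.find(class_="headlines").find_all('li')]
print(headlines)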
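The scrape function above needs both network access and the third-party `bs4` package. As a rough offline illustration of the same extraction step (the text of each `<li>` inside the first element with class `headlines`), here is a sketch that uses only the standard library's `html.parser`. The names `HeadlineParser` and `extract_headlines` are hypothetical, and the sketch swaps in a hand-written parser rather than reproducing BeautifulSoup's behavior exactly (for example, nested `<li>` text is folded into the outer item).

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HeadlineParser(HTMLParser):
    """Collect <li> text inside the first element with class "headlines"."""

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._depth = 0       # nesting depth inside the container; 0 = outside
        self._current = None  # text fragments of the <li> being read
        self._done = False    # True once the first container has closed

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
            if tag == "li" and self._current is None:
                self._current = []
        elif not self._done:
            classes = (dict(attrs).get("class") or "").split()
            if "headlines" in classes:
                self._depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        if tag == "li" and self._current is not None:
            self.headlines.append("".join(self._current).strip())
            self._current = None
        self._depth -= 1
        if self._depth == 0:
            self._done = True

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)


def extract_headlines(html):
    """Return the headline strings found in an HTML string."""
    parser = HeadlineParser()
    parser.feed(html)
    return parser.headlines
```

`extract_headlines` expects a `str`; the bytes returned by `urlopen(...).read()` would need decoding first, for example with `.decode("utf-8", errors="replace")`.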
pkpp1233 / index.html
Created December 4, 2014 17:59
Forecaster
<script src="http://d3js.org/d3.v3.min.js" charset="utf-8"></script>
<script src="//ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<style> /* set the CSS */
  #forecastExampleContainer {
    font: 12px Arial;
  }
  #forecastExampleContainer path {
pkpp1233 / index.html
Created December 4, 2014 11:29
Forecaster
<html>
<head>
<style> /* set the CSS */
  #forecastExampleContainer {
    font: 12px Arial;
  }
  #forecastExampleContainer path {
    stroke: steelblue;
pkpp1233 / block.js
Last active August 29, 2015 14:10
With Blockspring
var request = require('request');
var cheerio = require('cheerio');
var blockspring = require('blockspring');

blockspring.define(function(req, res){
  request('http://www.espn.com/' + req.params['sport'], function(error, response, html){
    if(!error){
      var $ = cheerio.load(html);
      var headlines = [];
pkpp1233 / index.ejs
Last active August 29, 2015 14:10
Pull ESPN Headlines with UI
<html>
  <body>
    <form accept-charset="UTF-8" action="/scrape" method="post">
      <input autofocus="autofocus" placeholder="nba" type="text" name="sport">
      <input type="submit" value="Get my headlines!">
      <br>
    </form>
  </body>
</html>
pkpp1233 / package.json
Last active August 29, 2015 14:10
Pull ESPN Headlines Server
{
  "name" : "espn-scrape",
  "version" : "0.0.1",
  "description" : "Scrape espn headlines.",
  "main" : "server.js",
  "author" : "Blockspring",
  "dependencies" : {
    "express" : "latest",
    "request" : "latest",
    "cheerio" : "latest"
pkpp1233 / espn.js
Last active August 29, 2015 14:10
Pull ESPN Headlines Function
var request = require('request');
var cheerio = require('cheerio');

var sport = "nhl";
request('http://www.espn.com/' + sport, function(error, response, html){
  if(!error){
    var $ = cheerio.load(html);
    var headlines = [];
pkpp1233 / index.html
Created November 28, 2014 20:47
Grab Color Palette
<html>
<head>
<style> /* set the CSS */
  #paletteExampleContainer {
    font: 12px Arial;
  }
  .swatches {
    width: 100%;
pkpp1233 / fuzzy_match.py
Created November 26, 2014 17:34
Fuzzy match script
from fuzzywuzzy import fuzz
from fuzzywuzzy import process
import pandas as pd
import numpy as np
# inputs
fuzzy_match = [["My IDs"], ["Red"], ["Green"], ["Blue"], ["Black"], ["Yellow"], ["Pink"]]
match_against = [["Other IDs"], ["Rdd"], ["Grown"], ["Grodkj"], ["Blome"], ["Bluz"], ["Yell"], ["Yelloow$"], ["Punk"], ["Pank"], ["Other"]]
count_match = 2
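The inputs above suggest looking up, for each of "My IDs", the `count_match` closest entries among "Other IDs". The sketch below shows one way that could look. It is not the gist's own logic: it swaps in the standard library's `difflib.SequenceMatcher` for `fuzzywuzzy`, so scores are ratios between 0 and 1 and will differ from `fuzz` scores. The helper name `top_matches` and the `results` dictionary are hypothetical.

```python
from difflib import SequenceMatcher


def top_matches(query, candidates, count):
    """Return the `count` candidates most similar to `query` as (candidate, score) pairs."""
    scored = [
        (candidate, SequenceMatcher(None, query.lower(), candidate.lower()).ratio())
        for candidate in candidates
    ]
    # Highest similarity first; ties keep their original order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:count]


# Same inputs as the gist: the first row of each list is a header.
fuzzy_match = [["My IDs"], ["Red"], ["Green"], ["Blue"], ["Black"], ["Yellow"], ["Pink"]]
match_against = [["Other IDs"], ["Rdd"], ["Grown"], ["Grodkj"], ["Blome"], ["Bluz"],
                 ["Yell"], ["Yelloow$"], ["Punk"], ["Pank"], ["Other"]]
count_match = 2

ids = [row[0] for row in fuzzy_match[1:]]
others = [row[0] for row in match_against[1:]]

# Map each ID to its `count_match` best candidates.
results = {name: top_matches(name, others, count_match) for name in ids}

for name, matches in results.items():
    print(name, matches)
```

Lower-casing both strings before comparison is a design choice made here so that "Red" and "Rdd" are not penalized for case; drop it if case should count.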