Steven Almeroth (stav)

mpv --input-test --force-window --idle   # show the binding for each key you press
mpv --input-keylist                      # list the names of keys that can be bound
mpv --list-options                       # list all command-line options

Keyboard: `r` and `t` move subtitles up/down.
stav / find
Created May 15, 2016 21:56
Find files using a shorter syntax: `f .jpg`
#!/bin/bash
# If the pattern contains no '+', wrap it in wildcards; otherwise
# turn the first '+' into an inner wildcard.
if [ "$(expr index "$1" "\+")" = 0 ]; then
    PATTERN="*$1*"
else
    PATTERN="${1/+/*}"
fi
if [ -n "$2" ]; then
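The same shortcut can be sketched in Python with pathlib (a sketch, not the gist's code; the function name is mine, and the `+`-as-wildcard convention follows the script above):

```python
from pathlib import Path

def find(pattern, root="."):
    # Wrap the pattern in wildcards (`f .jpg` matches *.jpg*); the first
    # '+' becomes an inner wildcard, mirroring the ${1/+/*} substitution.
    glob_pattern = "*%s*" % pattern.replace("+", "*", 1)
    return sorted(str(p) for p in Path(root).rglob(glob_pattern))
```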
stav / clean
Created May 15, 2016 21:55
Remove Python bytecodes and deployment cache
#!/bin/bash
if [ -n "$1" ]; then
    TARGET="$1"
else
    TARGET="."
fi
# Remove Python compiled bytecode files
command="find $TARGET -name '*.pyc' -type f -delete -print 2>/dev/null"
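The bytecode cleanup step can be sketched in Python as well (my own sketch; the deployment-cache handling the `clean` script may also do is omitted):

```python
from pathlib import Path

def clean_pyc(target="."):
    # Delete compiled bytecode files under `target`; return what was removed.
    removed = []
    for pyc in Path(target).rglob("*.pyc"):
        pyc.unlink()
        removed.append(str(pyc))
    return removed
```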
stav / gist:4356269
Created December 21, 2012 22:24
Scrapy partial response downloader middleware
class PartialResponse(object):
    """Downloader middleware that returns only the first n bytes of a response.
    """
    def process_response(self, request, response, spider):
        max_size = getattr(spider, 'response_max_size', 0)
        if max_size and len(response.body) > max_size:
            h = response.headers.copy()
            h['Content-Length'] = str(max_size)
            # response.body is already bytes; slice it directly rather
            # than round-tripping through encode().
            response = response.replace(
                body=response.body[:max_size], headers=h)
        return response
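To use the middleware, register it in the project settings (the module path and priority below are assumptions) and set `response_max_size`, in bytes, as an attribute on any spider that should receive truncated responses:

```python
# settings.py fragment (module path is hypothetical):
DOWNLOADER_MIDDLEWARES = {
    'myproject.middlewares.PartialResponse': 543,
}
```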
stav / gist:5337954
Last active December 15, 2015 23:09
Scrapinghub API Job Log Sorter
import sys
import json
import argparse
from os.path import exists
from pprint import pprint
from urllib import urlencode, urlretrieve
from urllib2 import urlopen
from urlparse import urlsplit, parse_qs
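The gist targets Python 2; on Python 3 the same names live under urllib's submodules:

```python
# Python 3 locations of the urllib/urllib2/urlparse names imported above:
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import urlopen, urlretrieve
```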
stav / gist:5152476
Last active December 14, 2015 21:39
Crawler project running Scrapy from a script
# main.py:
from project.spiders.log_test import TestSpider as EstiloMASpider
from scrapy.xlib.pydispatch import dispatcher
from scrapy.crawler import Crawler
from twisted.internet import reactor
from scrapy.utils.project import get_project_settings
from scrapy import log, signals
stav / gist:5137869
Last active December 14, 2015 19:39
Scrapy blocking spider that renders JavaScript with PyQt4
from PyQt4.QtCore import QUrl
from PyQt4.QtGui import QApplication
from PyQt4.QtWebKit import QWebPage
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.spider import BaseSpider
from scrapy.http import HtmlResponse
class Render(QWebPage):
def __init__(self, url):
stav / gist:4191165
Created December 2, 2012 21:33
Generic PHP debug printer
<?php
/**
* generic debug printer
*
 * I never liked having to pass two arguments to a debug printer, namely the
 * evaluated and un-evaluated expressions, like: $baker->bread and "baker.bread";
 * I only wanted to pass the un-evaluated string and let the print routine do the
 * evaluating. This script does that procedurally, i.e., not in a function, so the
 * expression's scope is not changed.
*
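The same idea can be sketched in Python, where a function can reach the caller's scope through the frame stack, so scope is preserved without going procedural (a sketch; `dbg` and its output format are my own):

```python
import inspect

def dbg(expr):
    # Evaluate `expr` in the caller's scope so only the un-evaluated
    # string has to be passed in.
    frame = inspect.currentframe().f_back
    line = "%s = %r" % (expr, eval(expr, frame.f_globals, frame.f_locals))
    print(line)
    return line
```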
stav / gist:3520611
Created August 29, 2012 23:54
Google Places API Search
#!/usr/bin/python
# Google Places Search
#
# Use the Google Places API to text search for the supplied keywords and output
# the first result to standard out.
import sys
import json
import argparse
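The request the script builds can be sketched as follows (the endpoint is the Places API Text Search URL; the helper name is mine, and a real API key is required to actually fetch results):

```python
from urllib.parse import urlencode

PLACES_ENDPOINT = "https://maps.googleapis.com/maps/api/place/textsearch/json"

def places_search_url(query, key):
    # Build the text-search request URL; fetching it returns JSON whose
    # first "results" entry is what the script prints.
    return PLACES_ENDPOINT + "?" + urlencode({"query": query, "key": key})
```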
stav / RouteSpider.py
Created August 14, 2015 19:56
Scrapy route spider pseudo code
class MySpider(RouteSpider):
name = "example.com"
def start_routes(self, response):
for city in response.css('#location .dropdown-bg li'):
yield Route(
url=city.xpath('a/@href'),
callback=self.parse_city,