@zcorpan
zcorpan / gist:2c2b3114ddd636515b95
Created August 26, 2015 14:49
httparchive results for 'content' without :before/:after
SELECT page, url
FROM [httparchive:runs.2014_08_15_requests_body]
WHERE mimeType CONTAINS "text/css"
AND REGEXP_MATCH(LOWER(body), r"([;\{]|\s)content\s*:\s*[\"']")
AND NOT LOWER(body) CONTAINS ":before"
AND NOT LOWER(body) CONTAINS ":after"
page,url
http://www.orcsweb.com/,http://www.orcsweb.com/wp-content/plugins/wordpress-post-tabs/css/styles/default/style.css?ver=1.4
http://www.kabelbw.de/,http://www.kabelbw.de/content/www-kabelbw-de/templated/utilities/footnotes/footnotes-content/_jcr_content/template.a233dd4b6908ca42d0bd29dc31ccb0d9.css
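The body filter in the query above can be tried out locally on a single stylesheet. The following is a minimal sketch, not part of the original gist: the name `matches_query_filter` is hypothetical, it assumes `REGEXP_MATCH` performs a partial (search-style) match, and it does not reproduce the `mimeType` check.

```python
import re

# Same pattern as the REGEXP_MATCH call in the query: "content" preceded by
# ';', '{' or whitespace, followed by ':' and an opening quote.
CONTENT_DECL = re.compile(r"""([;{]|\s)content\s*:\s*["']""")


def matches_query_filter(body: str) -> bool:
    """Return True if a CSS body would pass the query's body conditions.

    Hypothetical helper: mirrors the three conditions on LOWER(body) only.
    """
    lowered = body.lower()
    return (
        CONTENT_DECL.search(lowered) is not None
        and ":before" not in lowered
        and ":after" not in lowered
    )


if __name__ == "__main__":
    print(matches_query_filter('a{content:"x"}'))
    print(matches_query_filter('a:before{content:"x"}'))
```

Like the query, this rejects any stylesheet that mentions `:before` or `:after` anywhere, so it only finds files where a quoted `content` value appears with no pseudo-element rules at all.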
uaua.info
descargasnsn.com
evgenyfireform.com
migente.com
tumejortv.com
medya73.com
propertyroom.com
reverb.com
kosmetista.ru
rayfile.com
zcorpan / foster-parent.txt
Created May 6, 2014 23:14
Rough search for foster parenting in the http://webdevdata.org/ data set (2013-09-01, 102,000 pages)
./00/aawsat.com_001b36ab685661428c656f1098c6dba3.html.txt: <td><INPUT type=image src='/01common/pix/go.gif' name=I4 ></td><INPUT type=hidden name='cache' value='0'></form>
./00/penoestribo.com.br_00bc19cab0dae2549442afc6d59c5e37.html.txt:</td><br>
./01/erotema.ru_0164e896c185b678118ddf0b033a4482.html.txt: <tr><noindex>
./01/erotema.ru_0164e896c185b678118ddf0b033a4482.html.txt: <td>&nbsp;</td><noindex>
./01/erotema.ru_0164e896c185b678118ddf0b033a4482.html.txt: <td>&nbsp;</td><noindex>
./02/kingsex.eu_02edff6fbbd667807153b0d5f534ebe5.html.txt:</tr> <div class="clear"></div>
./02/mybabyclothes.com_028919e368cd0a19e815e4a2ef80ef2b.html.txt: <tr><td></td></tr><div class="titleMain">featured items</div><tr><td class="main"><!-- D featured_products_mainpage updated vinod //-->
./02/supplementwarehouse.com_023afe2df21244ed4d10152504abe976.html.txt: </DIV></TD><div id="autocomplete_choices" class="autocomplete"></div>
./03/laserpointerforums.com_03b35d25fa9b3e46e202e466d76d58e4.html
zcorpan / fetch-img.py
Last active August 29, 2015 14:05
Exif research
import re
from HTMLParser import HTMLParser
from urlparse import urljoin
import os
src = ''
width = False
height = False
class MyHTMLParser(HTMLParser):
zcorpan / gist:5f0e36efdd35b800a6be
Created September 2, 2015 08:19
WHATWG HTML svn revision to git commit hash
revs = {
'8891': '80ff74291cf0f97e98e00298e4f463ac63ac2139',
'8890': 'b67efcb6068b71915a37896300987e903be0e632',
'8889': '4fb1fd704451cf9ca4008ded53b080c3b85e99ce',
'8888': '6dd7f19415c2f04388b57073f2617556b446a3d9',
'8887': '91799e2e56aa82d40de08ac5aece4d844e5c5447',
'8886': 'e2f88d6709a85741c4c57c8f65b119df9530f819',
'8885': '07084548449afb4a3a867cc2782f91095bae1e89',
'8884': 'ddbab10c90ab369e6c23e750315b02a5edc48cc2',
'8883': '61777f9d2457eead549bce1a16448d194041d3a9',
<!doctype html>
<meta charset=utf-8>
<script src=https://w3c-test.org/resources/testharness.js></script>
<script>
function collectCharacters(input, pos, chars) {
var startPos = pos;
while (chars.indexOf(input[pos]) != -1) {
pos++;
if (input[pos] === undefined) {
break;
<!doctype html>
<script src=bliss.js></script>
<script>
onload = function() {
var start = performance.now();
var rows = [];
for (var i = 0; i <= 50000; ++i) {
rows.push($.create({
./aa/wetter.com_aa72bc7f58f52a956aefd7930808f787.html.txt:<area class="ForecastCityOverlay" alt="Berlin" title="Berlin" shape="poly" coords="264.788457758,119.230041959,296.788457758,119.230041959,296.788457758,105.230041959,264.788457758,105.230041959,264.788457758,119.230041959" href="/deutschland/berlin/DE0001020.html" >
./aa/wetter.com_aa72bc7f58f52a956aefd7930808f787.html.txt:<area class="ForecastCityOverlay" alt="Hamburg" title="Hamburg" shape="poly" coords="129.189202721,75.7755327482,181.189202721,75.7755327482,181.189202721,61.7755327482,129.189202721,61.7755327482,129.189202721,75.7755327482" href="/deutschland/hamburg/DE0004130.html" >
./aa/wetter.com_aa72bc7f58f52a956aefd7930808f787.html.txt:<area class="ForecastCityOverlay" alt="München" title="München" shape="poly" coords="220.179682511,307.487537772,280.179682511,307.487537772,280.179682511,293.487537772,220.179682511,293.487537772,220.179682511,307.487537772" href="/deutschland/muenchen/DE0006515.html" >
./aa/wetter.com_aa72bc7f58f52a956aefd79
<!doctype html>
<meta charset=utf-8>
<title>coords</title>
<style>
table { table-layout:fixed; width:100%; border-collapse:collapse }
td { max-width:25%; overflow:hidden; border:2px solid gray; padding:0.5em; font-family:monospace }
</style>
<table>
<tr><th>test<th>old parser<th>new parser (POC)<th>new parser (new-spec-compliant)
<script>