List of commits in Jerry's branch that need to be processed into readability_lxml
commit cec2d35c55cc8b94f0f6ff582cf700dab377af8a
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Tue Sep 13 10:38:13 2011 -0700
Increment version to 0.07dev
setup.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
commit b646ea1ceacf66cc9d32baa432c08a3b5e7fc085
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Tue Sep 13 10:33:33 2011 -0700
Use UrlFetch to read the URL given on the command line
UrlFetch conveniently sets a user-agent, so it works more reliably than a raw
urlopen.
readability/readability.py | 24 +++++++++++++++---------
1 file changed, 15 insertions(+), 9 deletions(-)
commit 3a1eae78efd5d4df763229b014c461ee510d9f6d
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Tue Sep 13 10:19:26 2011 -0700
Add docstring for Document class
readability/readability.py | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)
commit 70afb04de3f448302ad21abc9d09809d14283924
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Tue Sep 13 10:18:40 2011 -0700
Add user-agent to default urlopen
Some sites reject HTTP requests without a user-agent.
readability/urlfetch.py | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
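
The change described in this commit can be illustrated with a minimal sketch. It assumes Python 3's `urllib.request` rather than whatever `urlfetch.py` actually used; the names `DEFAULT_USER_AGENT`, `build_request`, and `fetch` and the user-agent string itself are hypothetical and are not taken from the branch.

```python
import urllib.request

# Hypothetical default; the actual string used by urlfetch.py is not shown in this log.
DEFAULT_USER_AGENT = "Mozilla/5.0 (compatible; readability-sketch)"


def build_request(url, user_agent=DEFAULT_USER_AGENT):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Open the URL with the User-Agent set and return the raw response bytes."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

Building the request separately from opening it keeps the header logic checkable without any network access.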
commit ce4273bc96af59cfc1bed4bfededbf3f301215b3
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Mon Sep 12 10:48:54 2011 -0700
Update version to 0.06dev
This version incorporates the limiting of the number of pages that can be
followed when running on a multi-page document.
setup.py | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
commit 1b08ac47ce5823c9e5aeeae70cdcf6709cf01c11
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Sep 9 14:38:28 2011 -0700
Fix comment for clean_segment
readability/multi_page.py | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
commit 874d6a0f398dba23a54440c60ba3070670969805
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Sep 9 14:33:29 2011 -0700
Add comment for PAGE_CLASS
readability/multi_page.py | 3 +++
1 file changed, 3 insertions(+)
commit f3b498d8d5672ff39c7cf4657f951f83de22e85f
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Sep 9 14:28:25 2011 -0700
Limit the number of pages to append to 10
There are cases where the algorithm incorrectly identifies next page links that
would lead it to crawl many, many, many pages.
readability/multi_page.py | 8 ++++++++
1 file changed, 8 insertions(+)
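
A page cap of this kind might look roughly like the sketch below. The names `MAX_PAGES`, `collect_pages`, `fetch`, and `find_next_url` are hypothetical, and whether the real limit counts the first page is not stated in the commit; here the cap is assumed to apply to the total. The `seen` set is an extra loop guard that the commit does not mention.

```python
MAX_PAGES = 10  # cap named in the commit message; the constant name is an assumption


def collect_pages(first_url, fetch, find_next_url, max_pages=MAX_PAGES):
    """Follow next-page links, stopping once max_pages pages have been collected.

    fetch(url) returns the page content and find_next_url(content) returns the
    next URL or None. Both are hypothetical callables supplied by the caller.
    """
    pages = []
    seen = set()  # extra guard against link cycles, not part of the commit
    url = first_url
    while url is not None and url not in seen and len(pages) < max_pages:
        seen.add(url)
        content = fetch(url)
        pages.append(content)
        url = find_next_url(content)
    return pages
```

With the cap in place, a misidentified next-page link costs at most a bounded number of fetches instead of an open-ended crawl.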
commit 7b4776ae49ed7904a4d8b23a9946afc0e97de173
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Sep 9 13:58:25 2011 -0700
Refactor multi-page functionality into own file
It was a big enough piece to warrant separation. This also does some
refactoring of some shared bits.
readability/htmls.py | 22 ++-
readability/multi_page.py | 388 +++++++++++++++++++++++++++++++++++++++++
readability/readability.py | 414 +-------------------------------------------
readability/regexes.py | 26 +++
4 files changed, 439 insertions(+), 411 deletions(-)
commit bc05708f7c4605cf36491dcec8e7d461718f7582
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Sep 9 12:28:26 2011 -0700
Slightly improve logging
Logging when looking for next page links is slightly improved.
readability/readability.py | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
commit 8c361122b12602dfe07a1466f9f5b598c15eb3b9
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Sep 8 14:28:28 2011 -0700
Link debug log level to --verbose option
readability/readability.py | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
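
One way to tie the log level to a `--verbose` flag is sketched below. It assumes `argparse` and the standard `logging` module; the `-v` short form, the function names, and the choice of `INFO` as the non-verbose default are assumptions, not details from the commit.

```python
import argparse
import logging


def build_parser():
    """Build a parser with a --verbose flag (the -v short form is an assumption)."""
    parser = argparse.ArgumentParser(description="readability command-line sketch")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="emit debug-level log messages")
    return parser


def log_level_for(args):
    """Map the parsed flag to a logging level; INFO as the default is an assumption."""
    return logging.DEBUG if args.verbose else logging.INFO


def configure_logging(argv=None):
    """Parse argv and configure the root logger accordingly."""
    args = build_parser().parse_args(argv)
    logging.basicConfig(level=log_level_for(args))
    return args
```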
commit e2c4a1291645900a311b0e7c76186c688983144a
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Sep 8 14:13:54 2011 -0700
Improve command-line readability invocation
Now can generate web page and open in a browser for inspection.
readability/readability.py | 73 +++++++++++++++++++++++++++++++++++++++++---
1 file changed, 68 insertions(+), 5 deletions(-)
commit f199af27a8d1a84f14e69e7fad477fcc8b0e18a8
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Sep 8 12:15:29 2011 -0700
Clean up argument parsing; add -o option
Clean up argument parsing a bit. Add a currently non-functional
--open-browser/-o option for opening the result in a browser.
readability/readability.py | 55 +++++++++++++++++++++++++++++++-------------
1 file changed, 39 insertions(+), 16 deletions(-)
commit cb783e6d9a4a45f5fd375507a654ec3aed98f715
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Aug 19 11:36:08 2011 -0700
Add more comments describing regression test usage
regression_test.py | 39 ++++++++++++++++++++++++++++++++++++++-
1 file changed, 38 insertions(+), 1 deletion(-)
commit c1cb627d3cb4bf1e15580c881012c0a4e09f267a
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Aug 18 17:16:08 2011 -0700
Start adding documentation for regression_test.py
This is mostly just a checkpoint.
regression_test.py | 46 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 46 insertions(+)
commit 32e0c07862ff4ef45831eea8d7e44afaf0a52cd8
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Aug 18 16:27:24 2011 -0700
Fix handling of missing title element
No longer throw an exception from get_title if the document has no title
element. Instead, return an empty string indicating no title.
readability/htmls.py | 17 ++++++++++-------
readability/htmls_test.py | 29 +++++++++++++++++++++++++++++
setup.py | 2 +-
3 files changed, 40 insertions(+), 8 deletions(-)
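
The behavior described here, returning an empty string instead of raising when there is no title element, can be sketched as follows. This is not the code from `htmls.py`: it uses the standard library's `html.parser` instead of lxml so that it runs without extra dependencies, and the class name `_TitleParser` is hypothetical.

```python
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collect the text of the first <title> element, if any."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.found = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Only the first title element is considered.
        if tag == "title" and not self.found:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.found = True

    def handle_data(self, data):
        if self._in_title:
            self.parts.append(data)


def get_title(html):
    """Return the document title, or an empty string when no title element exists."""
    parser = _TitleParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

Callers can then treat `""` as "no title" without wrapping every call in exception handling.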
commit 7bf4ce519092983243fffcf7b956aa00e7148199
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Mon Aug 8 14:19:07 2011 -0700
Increment version number to 0.4dev
setup.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
commit e3ddae4ef04bec4e0fac586b22a25860bad669db
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Mon Aug 8 13:13:20 2011 -0700
Improve logging
readability/readability.py | 31 +++++++++++++------------------
1 file changed, 13 insertions(+), 18 deletions(-)
commit b916c494b5b3b5c7663599d5f6fe1d3d5d655109
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Aug 5 15:44:22 2011 -0700
Add ability to run a subset of regression tests
regression_test.py | 34 ++++++++++++++++++++++++++--------
1 file changed, 26 insertions(+), 8 deletions(-)
commit 47b5bbd17448541ac931a65eb78fc6e1fe9fe795
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Aug 5 12:21:36 2011 -0700
Add known issues about wget
README | 11 +++++++++++
1 file changed, 11 insertions(+)
commit f053c4d2d9f956ca877d5f64d963801798f07815
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Aug 5 12:20:46 2011 -0700
Stop adding CSS to original version of page
When writing test output, stop adding our CSS for pretty readability display to
the original page version.
regression_test.py | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
commit aa4245e6892467592bc12afaf3725fe8ba58f09f
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Aug 5 11:30:55 2011 -0700
Fix typo in test assertion text
readability/readability_test.py | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
commit c87a773922c700f321892d531c2e3c1cf1ebe181
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Aug 4 17:01:29 2011 -0700
Move unit tests into own module
The unit tests were getting a bit unwieldy in the main source file.
readability/readability.py | 309 -------------------------------------
readability/readability_test.py | 321 +++++++++++++++++++++++++++++++++++++++
2 files changed, 321 insertions(+), 309 deletions(-)
commit 44eee68e4d64b90fb6c5be860af930137390362e
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Aug 4 16:50:37 2011 -0700
Add regression tests for Slate content
These are pretty broken right now, but this at least gets them into the suite.
regression_test_data/slate-000.yaml | 68 +
.../ajax/libs/jquery/1.5.1/jquery.min.js | 16 +
.../slate-000/ajax.googleapis.com/robots.txt | 212 +
.../slate/prod/1.5.1/js/id.slate-bottom.full.js | 49 +
.../identity/slate/prod/1.5.1/js/wapo_identity.js | 79 +
.../wapolabs/1.4.2/js/wapolabs.full.js | 6414 ++++++++++++++
.../slate-000/cache-01.cleanprint.net/robots.txt | 4 +
.../cdn.echoenabled.com/clientapps/v2.4.10/auth.js | 1355 +++
.../clientapps/v2.4.10/backplane.js | 274 +
.../clientapps/v2.4.10/curation.js | 1887 ++++
.../clientapps/v2.4.10/jquery-plugins.js | 309 +
.../clientapps/v2.4.10/plugins/community-flag.js | 75 +
.../clientapps/v2.4.10/plugins/like.js | 118 +
.../clientapps/v2.4.10/plugins/reply.js | 196 +
.../clientapps/v2.4.10/stream.js | 3127 +++++++
.../clientapps/v2.4.10/submit.js | 1797 ++++
.../downloads.mailchimp.com/js/jquery.form.js | 872 ++
.../downloads.mailchimp.com/js/jquery.validate.js | 1118 +++
.../slate-000/edge.quantserve.com/robots.txt | 2 +
.../images/redesign2008/search_button.jpg | Bin 0 -> 592 bytes
.../slate-000/img.slate.com/js/slate_dom_lib.js | 239 +
.../img.slate.com/js/slate_js_main_2008.js | 692 ++
.../slate-000/img.slate.com/robots.txt | 18 +
.../slate-000/media.washingtonpost.com/robots.txt | 53 +
.../wp-srv/ad/slate_ad2.js | 492 ++
.../wp-srv/ad/wpni_generic_ad.js | 1341 +++
.../slate-000/pixel.quantserve.com/robots.txt | 2 +
.../slate-000/platform.twitter.com/widgets.js | 13 +
.../Slatest/120283811.jpg.CROP.thumbnail-small.jpg | Bin 0 -> 70887 bytes
.../designs/slatest/images/Logo_63x23_White.png | Bin 0 -> 517 bytes
.../etc/designs/slatest/images/ad_option_icon.gif | Bin 0 -> 3776 bytes
.../designs/slatest/images/article_top_wedge.gif | Bin 0 -> 141 bytes
.../etc/designs/slatest/images/dot-for-border.png | Bin 0 -> 194 bytes
.../etc/designs/slatest/images/flyoutnotch.gif | Bin 0 -> 79 bytes
.../etc/designs/slatest/images/header_tnc.gif | Bin 0 -> 1704 bytes
.../etc/designs/slatest/images/header_video.gif | Bin 0 -> 651 bytes
.../designs/slatest/images/more-stories-header.png | Bin 0 -> 9055 bytes
.../etc/designs/slatest/images/newsletter_go.png | Bin 0 -> 2522 bytes
.../designs/slatest/images/slateLogo_search.gif | Bin 0 -> 445 bytes
.../slatest/images/slate_stream_loading.gif | Bin 0 -> 14420 bytes
.../designs/slatest/images/slst-header-logo.png | Bin 0 -> 2967 bytes
.../designs/slatest/images/slst-sprite-opaque.png | Bin 0 -> 174070 bytes
.../slatest/images/sprite_slatest_headersDiv.gif | Bin 0 -> 18005 bytes
.../slatest/images/sprite_slatest_iconsCTA.gif | Bin 0 -> 5885 bytes
.../etc/designs/slatest/images/streamer_sprite.png | Bin 0 -> 3153 bytes
.../etc/designs/slatest/images/submit_btn.jpg | Bin 0 -> 5384 bytes
.../designs/slatest/images/toobar_background.png | Bin 0 -> 55274 bytes
.../etc/designs/slatest/images/toolbar_divider.gif | Bin 0 -> 95 bytes
.../etc/designs/slatest/js/misc.js | 177 +
.../etc/designs/slatest/js/slatest_identity.js | 175 +
.../etc/designs/slatest/js/slatest_streamer.js | 113 +
.../slatest.slate.com/etc/designs/slatest/lib.css | 1618 ++++
.../slate-000/slatest.slate.com/favicon.ico | Bin 0 -> 5222 bytes
.../images/109695949.jpg.CROP.rectangle-small.jpg | Bin 0 -> 16495 bytes
.../images/120340318.jpg.CROP.rectangle-small.jpg | Bin 0 -> 11504 bytes
.../images/72667859.jpg.CROP.rectangle-small.jpg | Bin 0 -> 9407 bytes
.../images/73950809.jpg.CROP.rectangle-small.jpg | Bin 0 -> 11055 bytes
...debt_compromise_lawmakers_find_a_new_topic.html | 879 ++
...compromise_lawmakers_find_a_new_topic.html.rdbl | 12 +
.../widgets.outbrain.com/outbrainWidget.js | 270 +
.../slate-000/www.facebook.com/robots.txt | 124 +
.../slate-000/www.slate.com/id/2205171/index.html | 3 +
.../slate-000/www.slate.com/js/s_code.js | 475 +
.../www.slate.com/jsmenus_partner.aspx?id=2065896 | 19 +
.../slate-000/www.slate.com/robots.txt | 18 +
regression_test_data/slate-001.yaml | 82 +
.../ajax/libs/jquery/1.5.1/jquery.min.js | 16 +
.../slate-001/ajax.googleapis.com/robots.txt | 212 +
.../slate/prod/1.5.1/js/id.slate-bottom.full.js | 49 +
.../identity/slate/prod/1.5.1/js/wapo_identity.js | 79 +
.../wapolabs/1.4.2/js/wapolabs.full.js | 6414 ++++++++++++++
.../slate-001/cache-01.cleanprint.net/robots.txt | 4 +
.../slate-001/edge.quantserve.com/robots.txt | 2 +
.../6023/6006197788_3527c09ca1.jpg | Bin 0 -> 32508 bytes
.../slate-001/farm7.static.flickr.com/robots.txt | 2 +
.../images/redesign/article_top_wedge.gif | Bin 0 -> 141 bytes
.../images/redesign2008/ad_option_icon.gif | Bin 0 -> 3776 bytes
.../images/redesign2008/search_button.jpg | Bin 0 -> 592 bytes
.../slate-001/img.slate.com/js/slate_dom_lib.js | 239 +
.../img.slate.com/js/slate_js_main_2008.js | 692 ++
.../2300447/2300862/110803_ARCH_stepsTT.jpg | Bin 0 -> 6894 bytes
.../2300447/2300862/110803_SBOX_dollarstore_TT.jpg | Bin 0 -> 16625 bytes
.../2300862/110803_SCOCCA_chineseWomanTT.jpg | Bin 0 -> 6894 bytes
.../2279636/2300447/2300862/110803_TR_bezosTT.jpg | Bin 0 -> 5813 bytes
.../slate-001/img.slate.com/robots.txt | 18 +
.../slate-001/media.washingtonpost.com/robots.txt | 53 +
.../wp-srv/ad/slate_ad2.js | 492 ++
.../wp-srv/ad/wpni_generic_ad.js | 1341 +++
.../slate-001/pixel.quantserve.com/robots.txt | 2 +
.../slate-001/upload.wikimedia.org/robots.txt | 3 +
.../wikipedia/commons/8/81/FL03_109.gif | Bin 0 -> 39622 bytes
.../slate-001/wamo.info/pa/110708_privatejobs.jpg | Bin 0 -> 13064 bytes
.../widgets.outbrain.com/outbrainWidget.js | 270 +
.../slate-001/www.facebook.com/robots.txt | 124 +
.../weigel/2011/08/03/bachmann_vs_romney.html | 1026 +++
.../weigel/2011/08/03/bachmann_vs_romney.html.rdbl | 6 +
...eal_the_72_hour_rule_or_amend_it_at_least_.html | 1036 +++
.../blogs/weigel/2011/08/03/undead_birtherism.html | 996 +++
.../blogs/weigel/jcr:content/sprite.gif | Bin 0 -> 7066 bytes
.../www.slate.com/content/dam/slate/favicon.ico | Bin 0 -> 5222 bytes
...0728_FOOD_Nutella.jpg.CROP.thumbnail-xsmall.jpg | Bin 0 -> 6996 bytes
..._SNUT_seatsIlloTN.jpg.CROP.thumbnail-xsmall.jpg | Bin 0 -> 7855 bytes
...OX_dollarstore_TN.jpg.CROP.thumbnail-xsmall.jpg | Bin 0 -> 8831 bytes
...dEX_pomegraniteTN.jpg.CROP.thumbnail-xsmall.jpg | Bin 0 -> 4793 bytes
...110803_BI_obamaTN.jpg.CROP.thumbnail-xsmall.jpg | Bin 0 -> 8268 bytes
...ScoccaPortrait_TN.jpg.CROP.thumbnail-xsmall.jpg | Bin 0 -> 7021 bytes
...howLaunchTemplate.jpg.CROP.thumbnail-xsmall.jpg | Bin 0 -> 7407 bytes
.../2011/08/03/bolting_the_gop_in_florida.html | 999 +++
.../weigel/2011/08/03/the_narrowing_of_cpac.html | 1007 +++
...hink_debt_deal_will_be_bad_for_the_economy.html | 994 +++
.../slate/blogs/weigel/jcr:content/sprite.gif | Bin 0 -> 7066 bytes
.../www.slate.com/etc/designs/slate/css/blogs.css | 1716 ++++
.../etc/designs/slate/images/0t5_1.png | Bin 0 -> 204 bytes
.../etc/designs/slate/images/Logo_63x23_White.png | Bin 0 -> 517 bytes
.../etc/designs/slate/images/flyoutnotch.gif | Bin 0 -> 79 bytes
.../designs/slate/images/sl-sprite-rightrail.png | Bin 0 -> 109746 bytes
.../designs/slate/images/sl-sprite-toolbars.gif | Bin 0 -> 14654 bytes
.../etc/designs/slate/images/slateLogo_search.gif | Bin 0 -> 445 bytes
.../etc/designs/slate/images/slate_logo.gif | Bin 0 -> 1894 bytes
.../designs/slate/images/slb-sprite-browbeat.gif | Bin 0 -> 7497 bytes
.../etc/designs/slate/images/slb-sprite-scocca.gif | Bin 0 -> 3590 bytes
.../etc/designs/slate/images/slb-sprite-tnc.gif | Bin 0 -> 12898 bytes
.../etc/designs/slate/images/slb-sprite-weigel.gif | Bin 0 -> 5148 bytes
.../designs/slate/images/slb-sprite-xxfactor.gif | Bin 0 -> 4488 bytes
.../etc/designs/slate/lib.js?version=1.0.10 | 9138 ++++++++++++++++++++
.../etc/designs/slatest/images/dot-for-border.png | Bin 0 -> 194 bytes
.../designs/slatest/images/slst-sprite-opaque.png | Bin 0 -> 174070 bytes
.../www.slate.com/etc/designs/slatest/js/misc.js | 199 +
.../etc/designs/slatest/js/slatest_identity.js | 177 +
.../etc/designs/slatest/js/slatest_streamer.js | 113 +
.../slate-001/www.slate.com/id/2205171/index.html | 3 +
.../slate-001/www.slate.com/js/s_code.js | 475 +
.../www.slate.com/jsmenus_partner.aspx?id=2065896 | 19 +
.../slate-001/www.slate.com/robots.txt | 18 +
.../slate-001/www.youtube.com/robots.txt | 24 +
.../v/9bmetqshLNs?version=3&hl=en_US | Bin 0 -> 3199 bytes
.../v/DjOfuPo_vzU?version=3&hl=en_US | Bin 0 -> 3223 bytes
137 files changed, 52745 insertions(+)
commit e3d6c2a47c16761d673107253b554b0086c39ed4
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Aug 4 16:49:43 2011 -0700
Improve handling of relative links
Relative links may still appear in the original version of pages. This handles
them a bit better.
regression_test.py | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
commit d0a0c9e7a16e1c7877363eb97ff2b4a49f8fc64d
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Aug 4 15:46:05 2011 -0700
Show original in regression test with local files
The original page in the regression test was still using external references.
This changes it to use internal references like the rest of the output from the
regression tests.
regression_test.py | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
commit 47e75886d1c69aaa449a3847d33ab2cb5e68f6ef
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Aug 4 15:17:30 2011 -0700
Move CSS used in regression test to own module
regression_test.py | 65 +-----------------------------------------------
regression_test_css.py | 63 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 64 insertions(+), 64 deletions(-)
commit 3224e272fc401ca2ba4ffa709c04457587a3c3bd
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Aug 4 14:37:12 2011 -0700
Add note about failing washingtonpost article
regression_test_data/washingtonpost-001.yaml | 1 +
1 file changed, 1 insertion(+)
commit f3117f57f1afa4aebc9cfc0c56a5f44002386653
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Aug 4 14:30:42 2011 -0700
Add comments to describe double-break code
readability/readability.py | 99 +++++++++++++++++++++++++++++++++++++++-----
1 file changed, 88 insertions(+), 11 deletions(-)
commit 21741963dff0b12b8fd72016567fc33331107ba7
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Aug 4 11:16:22 2011 -0700
Clean up double-break finding code
The code was very confusing before. It is much better now. I still have some
comments to write.
readability/readability.py | 348 ++++++++++++++------
test_data/double-breaks-some-headers-expected.html | 3 +-
2 files changed, 247 insertions(+), 104 deletions(-)
commit bcda170d31fef858ff03a6a63687a403a8017ae8
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Wed Aug 3 14:52:57 2011 -0700
Remove dead code
readability/readability.py | 22 ----------------------
1 file changed, 22 deletions(-)
commit db5f7111ff75dc75610ea7f91b1cd76eb61d9ef7
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Wed Aug 3 13:11:02 2011 -0700
Fix handling of breaks that delineate paragraphs
This replaces some code that was attempting to construct paragraphs within
divs that were possibly delineated by breaks. That code was choking on some
common cases. For example, if a div had paragraphs delineated by breaks, links
within the paragraphs would be broken out into their own paragraphs, even
though they are obviously meant to be in-line.
This implements a slightly more sophisticated algorithm that explicitly looks
for double-breaks and constructs paragraphs in the appropriate places. It also
tries to leave existing block display elements alone.
This adds some unit tests and updates the regression tests for this improved
behavior.
The code is still a bit ugly, but it works. It will be cleaned up in a future
commit.
readability/readability.py | 221 +++++-
regression_test.py | 10 +-
...owser-stats-rapid-release-edition.ars.html.rdbl | 53 +-
.../compare-recommendation-systems-0708.html.rdbl | 4 +-
regression_test_data/washingtonpost-001.yaml | 14 +-
.../2011/06/26/AGtmeftH_story.html | 202 ++---
.../2011/06/26/AGtmeftH_story.html.rdbl | 65 +-
.../2011/06/26/AGtmeftH_story_1.html | 202 ++---
.../2011/06/26/AGtmeftH_story_2.html | 644 ++++++++-------
.../2011/06/26/AGtmeftH_story_3.html | 642 ++++++++-------
...e&m=false&context=wp-static&r=%2Fad%2Faudsci.js | 20 +
...r=%2Fad%2Fwpni_generic_ad.js&r=%2Fad%2Fwp_ad.js | 833 ++++++++++++++++++++
test_data/double-breaks-basic-expected.html | 77 ++
test_data/double-breaks-basic-original.html | 77 ++
test_data/double-breaks-mit-expected.html | 112 +++
test_data/double-breaks-mit-original.html | 115 +++
.../double-breaks-proper-paragraphs-expected.html | 76 ++
.../double-breaks-proper-paragraphs-original.html | 76 ++
test_data/double-breaks-some-headers-expected.html | 80 ++
test_data/double-breaks-some-headers-original.html | 83 ++
20 files changed, 2677 insertions(+), 929 deletions(-)
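
The double-break idea from this commit can be shown in a much-simplified form. The sketch below works on a string of inner HTML with a regular expression, whereas the real code operates on a parsed tree and also leaves existing block elements alone; the names `DOUBLE_BREAK`, `split_on_double_breaks`, and `to_paragraphs` are hypothetical.

```python
import re

# Two or more consecutive <br> tags, allowing whitespace between them.
DOUBLE_BREAK = re.compile(r"(?:<br\s*/?>\s*){2,}", re.IGNORECASE)


def split_on_double_breaks(inner_html):
    """Split inner HTML into chunks wherever a run of two or more <br> tags occurs."""
    chunks = DOUBLE_BREAK.split(inner_html)
    return [chunk.strip() for chunk in chunks if chunk.strip()]


def to_paragraphs(inner_html):
    """Wrap each chunk in <p>; single breaks and inline links stay inside their chunk."""
    return "".join("<p>%s</p>" % chunk for chunk in split_on_double_breaks(inner_html))
```

Because only runs of two or more breaks act as separators, a link in the middle of a sentence stays in-line instead of being broken out into its own paragraph, which is the failure the commit message describes.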
commit 001b89d3f3f1111eb94ad1b253edd99c32627682
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Jul 29 15:52:37 2011 -0700
Fix links in regression tests
Previous to this change, we used the wget --convert-links option to create
versions of pages that could be fully displayed locally. However, this is a
problem for some multi-page articles. Some sites have the later pages of an
article in a different directory from the original page. For example, the
first page of an ArsTechnica article could be:
http://www.arstechnica.com/section/article.ars
The second page of this article would be:
http://www.arstechnica.com/section/article.ars/2
Image links in the first page would look something like
"../images/header.jpg". However, in the second page, they would look like
"../../images/header.jpg".
When we bring the elements of the second page into our readable version, which
we place alongside the original version of the first page, the paths to these
resources will be wrong.
Instead of having wget convert the links, we leave the links alone. As part of
the readability pass, we convert everything to absolute links. That way, when
we pull in elements from other pages, the links remain correct. Finally, we
use the URL map, which maps every prerequisite URL downloaded by wget to its
local path, to turn the absolute URLs into local paths for the presentation of
results.
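
The two steps described above, making links absolute and then mapping them to local paths, can be sketched with the standard library. The function names and the local path in the usage note are hypothetical; the real URL map is built by `wget_parser.py` from wget's output, which is not reproduced here.

```python
from urllib.parse import urljoin


def absolutize(page_url, link):
    """Resolve a possibly relative link against the URL of the page it came from."""
    return urljoin(page_url, link)


def localize(absolute_url, url_map):
    """Return the local path for a downloaded URL, or the URL itself if unmapped."""
    return url_map.get(absolute_url, absolute_url)
```

Using the ArsTechnica example from the message, `absolutize("http://www.arstechnica.com/section/article.ars", "../images/header.jpg")` and `absolutize("http://www.arstechnica.com/section/article.ars/2", "../../images/header.jpg")` resolve to the same absolute URL, so elements pulled in from the second page keep working links. A map entry such as `{"http://www.arstechnica.com/images/header.jpg": "arstechnica.com/images/header.jpg"}` (an illustrative local path) then turns that URL into a local reference.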
readability/readability.py | 5 +
readability/urlfetch.py | 33 +-
readability/wget_parser.py | 223 +++
regression_test.py | 20 +-
regression_test_data/arstechnica-000.yaml | 108 +-
.../dragons/brains.gif?id=51247&791674727 | Bin 0 -> 43 bytes
.../public/shared/scripts/da-1.5.js | Bin 11719 -> 4013 bytes
...ne-browser-stats-rapid-release-edition.ars.html | 262 ++--
...owser-stats-rapid-release-edition.ars.html.rdbl | 14 +-
.../public/v6/footer.html?1311799944.html | 229 ++++
.../public/v6/scripts/site.min.js?1311799944 | 71 +
.../light/images/masthead/logo.png?1311799945 | Bin 0 -> 3460 bytes
.../v6/styles/light/light.c.css?1311799945.css | 1 +
.../v6/styles/print/print.css?1311799945.css | 1 +
regression_test_data/arstechnica-001.yaml | 105 ++
.../dragons/brains.gif?id=50988&167540464 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=50988&520836144 | Bin 0 -> 43 bytes
.../public/shared/scripts/da-1.5.js | Bin 11719 -> 4013 bytes
.../1.html | 253 ++--
.../1.html.rdbl | 10 +-
.../2.html | 268 ++--
.../public/v6/footer.html?1311799944.html | 229 ++++
.../public/v6/scripts/site.min.js?1311799944 | 71 +
.../light/images/masthead/logo.png?1311799945 | Bin 0 -> 3460 bytes
.../v6/styles/light/light.c.css?1311799945.css | 1 +
.../v6/styles/print/print.css?1311799945.css | 1 +
regression_test_data/arstechnica-002.yaml | 216 +++
.../apple/reviews/2011/07/mac-os-x-10-7.ars.1.html | 753 ++++++++++
.../reviews/2011/07/mac-os-x-10-7.ars.1.html.rdbl | 1449 ++++++++++++++++++++
.../apple/reviews/2011/07/mac-os-x-10-7.ars.html | 762 ++++++++++
.../reviews/2011/07/mac-os-x-10-7.ars.html.rdbl | 1449 ++++++++++++++++++++
.../reviews/2011/07/mac-os-x-10-7.ars/10.html | 716 ++++++++++
.../reviews/2011/07/mac-os-x-10-7.ars/11.html | 696 ++++++++++
.../reviews/2011/07/mac-os-x-10-7.ars/12.html | 706 ++++++++++
.../reviews/2011/07/mac-os-x-10-7.ars/13.html | 887 ++++++++++++
.../reviews/2011/07/mac-os-x-10-7.ars/14.html | 733 ++++++++++
.../reviews/2011/07/mac-os-x-10-7.ars/15.html | 699 ++++++++++
.../reviews/2011/07/mac-os-x-10-7.ars/16.html | 704 ++++++++++
.../reviews/2011/07/mac-os-x-10-7.ars/17.html | 709 ++++++++++
.../reviews/2011/07/mac-os-x-10-7.ars/18.html | 706 ++++++++++
.../reviews/2011/07/mac-os-x-10-7.ars/19.html | 682 +++++++++
.../apple/reviews/2011/07/mac-os-x-10-7.ars/2.html | 754 ++++++++++
.../apple/reviews/2011/07/mac-os-x-10-7.ars/3.html | 781 +++++++++++
.../apple/reviews/2011/07/mac-os-x-10-7.ars/4.html | 713 ++++++++++
.../apple/reviews/2011/07/mac-os-x-10-7.ars/5.html | 723 ++++++++++
.../apple/reviews/2011/07/mac-os-x-10-7.ars/6.html | 716 ++++++++++
.../apple/reviews/2011/07/mac-os-x-10-7.ars/7.html | 747 ++++++++++
.../apple/reviews/2011/07/mac-os-x-10-7.ars/8.html | 709 ++++++++++
.../apple/reviews/2011/07/mac-os-x-10-7.ars/9.html | 721 ++++++++++
.../ars/theme/images/login_register/button-bg.png | Bin 0 -> 280 bytes
...Int(Math.random()*99999999, 10)).toString() + ' | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1085522495 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1088032360 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1123524866 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&118639069 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1247869715 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1260350867 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1264800290 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1280492493 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1454413109 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1457135610 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1505282952 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1665307579 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1688225497 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&169720170 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1769904917 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1786189710 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1786932375 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1840678375 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1947431770 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&1965584909 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&2004572007 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&2029085635 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&2123559212 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&2124948112 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&297091925 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&322808089 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&348077856 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&35653510 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&516381647 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&543726557 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&618087003 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&690216831 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&701920133 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&715583863 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&729545956 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&812084847 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&883885930 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51310&991256018 | Bin 0 -> 43 bytes
.../public/shared/scripts/da-1.5.js | Bin 0 -> 4013 bytes
.../v6/vendor/mediaelement/flashmediaelement.swf | Bin 0 -> 16405 bytes
.../arstechnica-002/arstechnica.com/robots.txt | 14 +
.../b.scorecardresearch.com/robots.txt | 2 +
.../static.addtoany.com/buttons/favicon.png | Bin 0 -> 270 bytes
.../05936218856bfda2a2b187b00e5cc496.png | Bin 0 -> 15153 bytes
.../2011/07/04/lion/address-book.png | Bin 0 -> 124986 bytes
.../2011/07/04/lion/auto-correction-menu.png | Bin 0 -> 68710 bytes
.../2011/07/04/lion/auto-correction.png | Bin 0 -> 30080 bytes
.../2011/07/04/lion/autosave-timeout.png | Bin 0 -> 27542 bytes
.../2011/07/04/lion/cat-names.png | Bin 0 -> 70743 bytes
.../2011/07/04/lion/classic-mac-scroll-bars-2.png | Bin 0 -> 1231 bytes
.../2011/07/04/lion/dock-indicator-lights-pref.png | Bin 0 -> 11092 bytes
.../2011/07/04/lion/dock.png | Bin 0 -> 102508 bytes
.../2011/07/04/lion/emoji.png | Bin 0 -> 110124 bytes
.../07/04/lion/file-vault-apple-recovery-key.png | Bin 0 -> 124346 bytes
.../2011/07/04/lion/file-vault-enable-user.png | Bin 0 -> 26756 bytes
.../2011/07/04/lion/file-vault-recovery-key.png | Bin 0 -> 46162 bytes
.../2011/07/04/lion/file-vault.png | Bin 0 -> 32800 bytes
.../2011/07/04/lion/finder-all-my-files.png | Bin 0 -> 117837 bytes
.../2011/07/04/lion/finder-no-scroll-bars.png | Bin 0 -> 71372 bytes
.../2011/07/04/lion/finder-overlay-scroll-bars.png | Bin 0 -> 71513 bytes
.../2011/07/04/lion/finder-search-tokens.png | Bin 0 -> 26744 bytes
.../07/04/lion/finder-sidebar-folder-icons.png | Bin 0 -> 96392 bytes
.../2011/07/04/lion/finder-sidebar.png | Bin 0 -> 13918 bytes
.../2011/07/04/lion/finder-view-options.png | Bin 0 -> 7531 bytes
.../2011/07/04/lion/hidpi-display-modes.png | Bin 0 -> 26166 bytes
.../2011/07/04/lion/hypercard-home-icons.png | Bin 0 -> 758 bytes
.../2011/07/04/lion/ical.png | Bin 0 -> 91460 bytes
.../2011/07/04/lion/installer-icon.png | Bin 0 -> 22167 bytes
.../2011/07/04/lion/installer.png | Bin 0 -> 73115 bytes
.../2011/07/04/lion/ios-scroll-bar.png | Bin 0 -> 14328 bytes
.../2011/07/04/lion/launchpad-download.png | Bin 0 -> 21020 bytes
.../2011/07/04/lion/launchpad-folder.png | Bin 0 -> 258265 bytes
.../2011/07/04/lion/launchpad.png | Bin 0 -> 241533 bytes
.../2011/07/04/lion/lion-gui.png | Bin 0 -> 50138 bytes
.../2011/07/04/lion/lion-window-widgets.png | Bin 0 -> 3165 bytes
.../2011/07/04/lion/lock-screen-big.png | Bin 0 -> 972976 bytes
.../2011/07/04/lion/login-screen.png | Bin 0 -> 335335 bytes
.../2011/07/04/lion/mail-accordion.png | Bin 0 -> 281602 bytes
.../2011/07/04/lion/mail-rich-text.png | Bin 0 -> 63521 bytes
.../2011/07/04/lion/mail-three-pane.png | Bin 0 -> 296607 bytes
.../2011/07/04/lion/mail.png | Bin 0 -> 336654 bytes
.../2011/07/04/lion/mission-control.png | Bin 0 -> 340178 bytes
.../2011/07/04/lion/previous-versions.png | Bin 0 -> 210085 bytes
.../2011/07/04/lion/quicktime-xpc.png | Bin 0 -> 61048 bytes
.../2011/07/04/lion/resize-widget-11A511.png | Bin 0 -> 5866 bytes
.../2011/07/04/lion/safari-popover.png | Bin 0 -> 108626 bytes
.../2011/07/04/lion/safari-reading-list.png | Bin 0 -> 194366 bytes
.../2011/07/04/lion/sandboxed-apps.png | Bin 0 -> 77320 bytes
.../2011/07/04/lion/save-a-version.png | Bin 0 -> 20707 bytes
.../2011/07/04/lion/scroll-bar-10.6.png | Bin 0 -> 2267 bytes
.../2011/07/04/lion/scroll-bar-10.7-11A511.png | Bin 0 -> 476 bytes
.../04/lion/scroll-bar-accessory-view-11A511.png | Bin 0 -> 14286 bytes
.../2011/07/04/lion/scroll-bar-dp3.png | Bin 0 -> 2277 bytes
.../2011/07/04/lion/scroll-bar-itunes-10.2.2.png | Bin 0 -> 1423 bytes
.../2011/07/04/lion/scroll-bar-preference.png | Bin 0 -> 7811 bytes
.../07/04/lion/scrolling-direction-preference.png | Bin 0 -> 5763 bytes
.../2011/07/04/lion/snow-leopard-gui.png | Bin 0 -> 49134 bytes
.../07/04/lion/snow-leopard-window-widgets.png | Bin 0 -> 2272 bytes
.../2011/07/04/lion/system-7-extensions.png | Bin 0 -> 1515 bytes
.../2011/07/04/lion/system-info-displays.png | Bin 0 -> 59547 bytes
.../2011/07/04/lion/system-info-memory.png | Bin 0 -> 52749 bytes
.../2011/07/04/lion/system-info-overview.png | Bin 0 -> 78539 bytes
.../2011/07/04/lion/system-info-service.png | Bin 0 -> 83737 bytes
.../2011/07/04/lion/system-info-storage.png | Bin 0 -> 91102 bytes
.../2011/07/04/lion/system-info-support.png | Bin 0 -> 50686 bytes
.../04/lion/system-preferences-accounts-other.png | Bin 0 -> 80297 bytes
.../07/04/lion/system-preferences-accounts.png | Bin 0 -> 104935 bytes
.../07/04/lion/system-preferences-customize.png | Bin 0 -> 155758 bytes
.../07/04/lion/system-preferences-gestures.png | Bin 0 -> 200336 bytes
.../2011/07/04/lion/system-preferences-menu.png | Bin 0 -> 117800 bytes
.../07/04/lion/system-preferences-time-zone.png | Bin 0 -> 67781 bytes
.../2011/07/04/lion/system-preferences.png | Bin 0 -> 139511 bytes
.../2011/07/04/lion/terminal-blur.png | Bin 0 -> 172716 bytes
.../2011/07/04/lion/textedit-hidpi.png | Bin 0 -> 53464 bytes
.../2011/07/04/lion/unzoom-widget.png | Bin 0 -> 6107 bytes
.../2011/07/04/lion/versions-menu.png | Bin 0 -> 63829 bytes
.../2011/07/04/lion/window-resizing-11A511.png | Bin 0 -> 108030 bytes
.../07/04/lion/xcode-arc-setting-highlighted.png | Bin 0 -> 16938 bytes
.../2011/07/04/lion/zoom-widget.png | Bin 0 -> 11338 bytes
.../274ea6386720b4f5d44d303aac087806.jpg | Bin 0 -> 29479 bytes
.../07/lion-review-intro-thumb-640xauto-23318.jpg | Bin 0 -> 68277 bytes
.../static.arstechnica.net/favicon.ico | Bin 0 -> 22382 bytes
.../plugins/ArsTheme/images/masthead-short.png | Bin 0 -> 14181 bytes
.../shared/images/Game-Stat-Box-Sprites.png?v3 | Bin 0 -> 3974 bytes
.../public/shared/images/activity.gif | Bin 0 -> 4178 bytes
.../public/shared/images/game-box-tall-bkg.png | Bin 0 -> 20719 bytes
.../shared/images/reviewed-platform-icon.png | Bin 0 -> 323 bytes
.../public/v6/footer.html?1311799944.html | 229 ++++
.../public/v6/scripts/site.min.js?1311799944 | 71 +
.../v6/styles/light/images/content/arrow-bg.png | Bin 0 -> 183 bytes
.../light/images/content/author-bubble-bg.png | Bin 0 -> 1258 bytes
.../v6/styles/light/images/content/badge.png | Bin 0 -> 3817 bytes
.../light/images/content/comments-bar-bg.png?2 | Bin 0 -> 6791 bytes
.../styles/light/images/content/etc-bubble-bg.png | Bin 0 -> 1248 bytes
.../light/images/content/etc-category-sprite.png | Bin 0 -> 2152 bytes
.../public/v6/styles/light/images/content/etc.png | Bin 0 -> 376 bytes
.../styles/light/images/content/follow-button.png | Bin 0 -> 1184 bytes
.../v6/styles/light/images/content/plus-large.png | Bin 0 -> 119 bytes
.../images/content/premier_key.png?1311799945 | Bin 0 -> 414 bytes
.../images/content/pullquote-presidential-bg.png | Bin 0 -> 551 bytes
.../light/images/content/pullquote-rules-bg.png | Bin 0 -> 425 bytes
.../images/content/read-more-comment-sprite.png | Bin 0 -> 3723 bytes
.../light/images/content/silo-headers/apple.png | Bin 0 -> 18330 bytes
.../light/images/content/silo-headers/ask-ars.png | Bin 0 -> 11565 bytes
.../light/images/content/silo-headers/att.png | Bin 0 -> 5894 bytes
.../light/images/content/silo-headers/bieb.png | Bin 0 -> 7152 bytes
.../light/images/content/silo-headers/business.png | Bin 0 -> 30278 bytes
.../content/silo-headers/columbia-header.png | Bin 0 -> 19102 bytes
.../light/images/content/silo-headers/features.png | Bin 0 -> 12664 bytes
.../images/content/silo-headers/future-cars.png | Bin 0 -> 15463 bytes
.../images/content/silo-headers/future-of-tv.png | Bin 0 -> 5777 bytes
.../light/images/content/silo-headers/gadgets.png | Bin 0 -> 31088 bytes
.../light/images/content/silo-headers/gaming.png | Bin 0 -> 9750 bytes
.../silo-headers/gift-guide-header-gadgets.jpg | Bin 0 -> 22331 bytes
.../silo-headers/gift-guide-header-gaming.jpg | Bin 0 -> 20602 bytes
.../silo-headers/gift-guide-header-hdtv.jpg | Bin 0 -> 22372 bytes
.../silo-headers/gift-guide-header-staff.jpg | Bin 0 -> 26511 bytes
.../light/images/content/silo-headers/guides.png | Bin 0 -> 13250 bytes
.../light/images/content/silo-headers/hardware.png | Bin 0 -> 21687 bytes
.../content/silo-headers/ht-samsung-header.png | Bin 0 -> 14745 bytes
.../content/silo-headers/ibm_lotus_collab.png | Bin 0 -> 5308 bytes
.../light/images/content/silo-headers/ipad.png | Bin 0 -> 7517 bytes
.../images/content/silo-headers/last-mile.png | Bin 0 -> 14608 bytes
.../light/images/content/silo-headers/media.png | Bin 0 -> 22170 bytes
.../images/content/silo-headers/microsoft.png | Bin 0 -> 23794 bytes
.../content/silo-headers/msft_cloud_header.jpg | Bin 0 -> 21450 bytes
.../light/images/content/silo-headers/netapp.png | Bin 0 -> 5423 bytes
.../images/content/silo-headers/open-source.png | Bin 0 -> 22404 bytes
.../images/content/silo-headers/planet_cloud.jpg | Bin 0 -> 18995 bytes
.../light/images/content/silo-headers/premier.jpg | Bin 0 -> 13745 bytes
.../light/images/content/silo-headers/raise-iq.png | Bin 0 -> 15030 bytes
.../light/images/content/silo-headers/reviews.png | Bin 0 -> 11984 bytes
.../light/images/content/silo-headers/science.png | Bin 0 -> 28485 bytes
.../light/images/content/silo-headers/security.png | Bin 0 -> 16844 bytes
.../light/images/content/silo-headers/software.png | Bin 0 -> 21915 bytes
.../light/images/content/silo-headers/staff.png | Bin 0 -> 22155 bytes
.../images/content/silo-headers/system-guides.png | Bin 0 -> 18702 bytes
.../images/content/silo-headers/tech-policy.png | Bin 0 -> 21229 bytes
.../content/silo-headers/technopaedia-header.png | Bin 0 -> 33016 bytes
.../light/images/content/silo-headers/telecom.png | Bin 0 -> 21697 bytes
.../light/images/content/silo-headers/web.png | Bin 0 -> 21263 bytes
.../light/images/content/technopaedia-footer.png | Bin 0 -> 4986 bytes
.../light/images/content/technopaedia-header.png | Bin 0 -> 432 bytes
.../light/images/content/technopaedia-min-max.png | Bin 0 -> 602 bytes
.../light/images/footer/footer-mobile-badge.png | Bin 0 -> 6635 bytes
.../light/images/masthead/logo.png?1311799945 | Bin 0 -> 3460 bytes
.../images/masthead/masthead-bg-premier-banned.png | Bin 0 -> 30698 bytes
.../images/masthead/masthead-bg-premier-tent.png | Bin 0 -> 31059 bytes
.../light/images/masthead/masthead-bg-premier.png | Bin 0 -> 28571 bytes
.../styles/light/images/masthead/masthead-bg.png | Bin 0 -> 75864 bytes
.../navigation/auxiliary-navigation-sprite.png | Bin 0 -> 804 bytes
.../light/images/navigation/more-arrow-sprite.png | Bin 0 -> 222 bytes
.../light/images/navigation/nav-bar-bkg.png?v=2 | Bin 0 -> 591 bytes
.../light/images/news-bar/tabs-icons-sprite.png | Bin 0 -> 4129 bytes
.../light/images/news-bar/tabs-left-sprite.png | Bin 0 -> 588 bytes
.../light/images/news-bar/tabs-right-sprite.png | Bin 0 -> 595 bytes
.../v6/styles/light/images/search/button-bg.png | Bin 0 -> 492 bytes
.../v6/styles/light/images/search/field-bg.png | Bin 0 -> 165 bytes
.../light/images/search/search-rollover-bkg.png | Bin 0 -> 269 bytes
.../light/images/sidebar/arrow-bullets-sprite.png | Bin 0 -> 141 bytes
.../light/images/sidebar/bottom-bubble-bg.png | Bin 0 -> 1904 bytes
.../light/images/sidebar/bottom-bubble-top-bg.png | Bin 0 -> 151 bytes
.../sidebar/categories-sprite-vertical.png?2 | Bin 0 -> 3890 bytes
.../light/images/sidebar/categories-sprite.png | Bin 0 -> 2088 bytes
.../styles/light/images/sidebar/links-copyedit.png | Bin 0 -> 1165 bytes
.../styles/light/images/sidebar/links-sprite.png | Bin 0 -> 9252 bytes
.../light/images/sidebar/misc-icons-sprite.png | Bin 0 -> 1402 bytes
.../styles/light/images/sidebar/top-bubble-bg.png | Bin 0 -> 1485 bytes
.../light/images/sidebar/top-bubble-bottom-bg.png | Bin 0 -> 151 bytes
.../v6/styles/light/light.c.css?1311799945.css | 1 +
.../v6/styles/light/light.c.css?1311799945orig | 1 +
.../v6/styles/print/print.css?1311799945.css | 1 +
regression_test_data/businessinsider-000.yaml | 183 ++-
.../connect.facebook.net/en_US/all.js | 12 +-
...Account=clusterstock&Module=snapshot2&Output=JS | 66 +-
...rstock&Module=stockquote5&Ticker=MSFT&Output=JS | 8 +-
.../platform.linkedin.com/in.js | 2 +-
.../vat/mon/vt.js?1311871997 | 1 +
.../assets/css/min-all.css?1311871997.css | 1 +
.../assets/css/min-print.css?1311871997.css | 1 +
.../obama-boehner-pelosi.jpg | Bin 0 -> 1873 bytes
.../4e1f5c4349e2aed54c2f0000-60-45/larry-page.jpg | Bin 0 -> 1612 bytes
.../hose-fire-water.jpg | Bin 0 -> 1906 bytes
.../4e31bb8a69beddf466000031-60-45/turlington.jpg | Bin 0 -> 1642 bytes
.../image/4cf42da149e2aecf03040000-50-sq/image.jpg | Bin 1577 -> 1868 bytes
.../image/4d937b57ccd1d5b2351c0000-50-sq/image.jpg | Bin 1473 -> 1712 bytes
.../4da3244b49e2ae6a141a0000-60-45/uncle-sam.jpg | Bin 0 -> 1403 bytes
.../4de66159cadcbb7155130000-60-45/jim-demint.jpg | Bin 0 -> 1786 bytes
.../harry-mccracken.jpg | Bin 0 -> 1675 bytes
.../mozambique-trash-land.jpg | Bin 0 -> 1787 bytes
.../4e31c61c6bb3f7f31d000004-50-50/seth-levine.jpg | Bin 0 -> 1434 bytes
.../vcs-editorial-sidebar.jpg | Bin 0 -> 25814 bytes
.../4b2a8b9d000000000006efaf-90-90/chris-dixon.jpg | Bin 0 -> 3734 bytes
.../4d55d947ccd1d5e1550c0000-70-70/matt-rosoff.jpg | Bin 1688 -> 2022 bytes
...and-ron-paul-should-not-both-be-republicans.jpg | Bin 0 -> 5689 bytes
.../william-lazonick.jpg | Bin 0 -> 1684 bytes
.../john-boehner.jpg | Bin 0 -> 1916 bytes
.../assets/js/min.js?1311871997 | 49 +
.../4d55d947ccd1d5e1550c0000-50-50/matt-rosoff.jpg | Bin 1291 -> 1491 bytes
...t-microsoft-too-understood-touch-interfaces.jpg | Bin 1747 -> 2070 bytes
..._section=sai&display_method=default&version=2.0 | 2 +-
...nx[vertical]=sai&openx[author]=Matt+Rosoff.html | 50 +
...nx[vertical]=sai&openx[author]=Matt+Rosoff.html | 27 +
.../www.businessinsider.com/partner/fc/iframe.html | 2 +-
...rosoft-ui-ideas-that-never-took-off-2011-7.html | 966 ++++++-------
regression_test_data/cnet-000.yaml | 159 +++
.../index.html | 162 +--
.../index.html.rdbl | 10 +-
.../cnwk.1d/css/rb/Build/8300/8300.0.0.css | 2 +-
.../cnwk.1d/css/rb/Build/8300/8300.39.0.css | 2 +-
.../cnwk.1d/css/rb/Build/global/matrix.site39.css | 2 +-
.../cnwk.1d/css/rb/Build/print/print.css | 2 +-
.../cnwk.1d/css/rb/tron/comments/newsComments.css | 20 +-
.../cnwk.1d/css/rb/tron/ipadOverwrite.css | 4 +-
.../rb/js/tron/news/news.tron.c3p0.compressed.js | 2 +-
.../2011/07/27/07_28_11_Word_formatting1_60x60.jpg | Bin 0 -> 2091 bytes
.../07/27/Facebook_anonymous_follow_1_60x60.png | Bin 0 -> 5066 bytes
.../i/tim/2011/07/27/google+_logo_60x60.png | Bin 0 -> 3412 bytes
.../i/tim/2011/07/27/iPhoto_backup_4_60x60.png | Bin 0 -> 5726 bytes
.../i/tim/2011/07/27/iPhoto_multi_60x60.png | Bin 0 -> 3528 bytes
.../i/tim/2011/07/28/123700-1-google_60x60.jpg | Bin 0 -> 2415 bytes
.../i/tim/2011/07/28/iPhoto_Picasa_60x60.png | Bin 0 -> 4418 bytes
.../i/tim/2011/07/28/iPhoto_slideshow_1_60x60.png | Bin 0 -> 5792 bytes
.../cnwk.1d/i/tim/2011/07/28/youtube_60x60.png | Bin 0 -> 7799 bytes
.../i/tim/2011/07/29/Slide3_ReadLater_60x60.png | Bin 0 -> 4020 bytes
.../tim/2011/07/29/iPhone_-_open_device_60x60.jpg | Bin 0 -> 1436 bytes
.../cnet-000/platform.linkedin.com/in.js | 2 +-
.../cnet-000/platform.twitter.com/widgets.js | 6 +-
regression_test_data/cnet-001.yaml | 158 ++-
.../cnwk.1d/css/rb/Build/8301/8301.3.0.css | 2 +-
.../cnwk.1d/css/rb/Build/global/matrix.site3.css | 2 +-
.../cnwk.1d/css/rb/Build/print/print.css | 2 +-
.../cnwk.1d/css/rb/tron/comments/newsComments.css | 20 +-
.../cnwk.1d/css/rb/tron/ipadOverwrite.css | 4 +-
.../cnwk.1d/css/rb/tron/news/riverWidget.css | 4 +-
.../rb/js/tron/news/news.tron.c3p0.compressed.js | 2 +-
.../cnwk.1d/i/ne/pg/fd_2011/newsiPhonefor350.jpg | Bin 0 -> 11954 bytes
.../cnwk.1d/i/tim/2011/07/27/4_on_184x138.jpg | Bin 0 -> 8295 bytes
...l-geographic-world-championship-3243_120x90.jpg | Bin 0 -> 17686 bytes
.../cnwk.1d/i/tim/2011/07/28/McNamee1_120x90.JPG | Bin 0 -> 4063 bytes
.../index.html | 407 +++---
.../index.html.rdbl | 35 +-
.../cnet-001/platform.linkedin.com/in.js | 2 +-
.../cnet-001/platform.twitter.com/widgets.js | 6 +-
regression_test_data/deadspin-000.yaml | 128 ++
.../base.v10.static/img/footer/deadspin_logo.png | Bin 0 -> 2658 bytes
.../base.v10.static/img/footer/gawker_logo.png | Bin 0 -> 3089 bytes
.../base.v10.static/img/footer/gizmodo_logo.png | Bin 0 -> 2689 bytes
.../assets/base.v10.static/img/footer/io9_logo.png | Bin 0 -> 1165 bytes
.../base.v10.static/img/footer/jalopnik_logo.png | Bin 0 -> 2032 bytes
.../base.v10.static/img/footer/jezebel_logo.png | Bin 0 -> 1773 bytes
.../base.v10.static/img/footer/kotaku_logo.png | Bin 0 -> 2422 bytes
.../base.v10.static/img/footer/lifehacker_logo.png | Bin 0 -> 1882 bytes
.../assets/base.v10.static/img/icon-play-large.png | Bin 0 -> 11305 bytes
.../assets/base.v10.static/img/icons/comment.png | Bin 0 -> 396 bytes
.../assets/base.v10.static/img/icons/flame.png | Bin 0 -> 332 bytes
.../base.v10.static/img/icons/icon.classic.png | Bin 0 -> 417 bytes
.../assets/base.v10.static/img/icons/icons.png | Bin 0 -> 10614 bytes
.../assets/base.v10.static/img/icons/quicklink.png | Bin 0 -> 219 bytes
.../assets/base.v10.static/img/icons/star.png | Bin 0 -> 976 bytes
.../img/interstitial-bottom-gradient.png | Bin 0 -> 493 bytes
.../assets/base.v10.static/img/lytebox/blank.gif | Bin 0 -> 43 bytes
.../base.v10.static/img/lytebox/close_grey.png | Bin 0 -> 1715 bytes
.../assets/base.v10.static/img/lytebox/loading.gif | Bin 0 -> 2767 bytes
.../base.v10.static/img/lytebox/next_grey.gif | Bin 0 -> 731 bytes
.../base.v10.static/img/lytebox/pause_grey.png | Bin 0 -> 1282 bytes
.../base.v10.static/img/lytebox/play_grey.png | Bin 0 -> 1178 bytes
.../base.v10.static/img/lytebox/prev_grey.gif | Bin 0 -> 748 bytes
.../base.v10.static/img/share/share_icons.png | Bin 0 -> 8780 bytes
.../assets/base.v10.static/img/ui/button-icons.png | Bin 0 -> 414 bytes
.../assets/base.v10.static/img/ui/icon-b.png | Bin 0 -> 281 bytes
.../assets/base.v10.static/img/ui/icon-cog.png | Bin 0 -> 1035 bytes
.../assets/base.v10.static/img/ui/icon-edit.png | Bin 0 -> 1074 bytes
.../assets/base.v10.static/img/ui/icon-expand.png | Bin 0 -> 1193 bytes
.../assets/base.v10.static/img/ui/icon-f.png | Bin 0 -> 287 bytes
.../assets/base.v10.static/img/ui/icon-ff.png | Bin 0 -> 294 bytes
.../assets/base.v10.static/img/ui/icon-heart.png | Bin 0 -> 822 bytes
.../assets/base.v10.static/img/ui/icon-mail.png | Bin 0 -> 285 bytes
.../base.v10.static/img/ui/icon-play-gray.png | Bin 0 -> 489 bytes
.../assets/base.v10.static/img/ui/icon-reply.png | Bin 0 -> 422 bytes
.../assets/base.v10.static/img/ui/icon-rw.png | Bin 0 -> 292 bytes
.../base.v10.static/img/ui/icon-thumbdown.png | Bin 0 -> 1040 bytes
.../assets/base.v10.static/img/ui/icon-thumbup.png | Bin 0 -> 950 bytes
.../base.v10.static/js/scripts.js?rev=o20110729 | 193 +++
.../static/base.v10.static.framework.o20110729.js | 5 +
.../static/base.v10.static.jquery.o20110729.js | 42 +
.../base.v10.static.jqueryplugin.o20110729.js | 18 +
.../static/base.v10.static.misc.o20110729.js | 31 +
.../static/base.v10.static.o20110729.css | 12 +
.../static/base.v10.static.widget.o20110729.js | 35 +
.../assets/base.v10/img/attribution-arrow.png | Bin 0 -> 1170 bytes
.../assets/base.v10/img/footer/mini-deadspin.png | Bin 0 -> 7502 bytes
.../assets/base.v10/img/footer/mini-gawker.png | Bin 0 -> 4111 bytes
.../assets/base.v10/img/footer/mini-gizmodo.png | Bin 0 -> 2712 bytes
.../assets/base.v10/img/footer/mini-io9.png | Bin 0 -> 1919 bytes
.../assets/base.v10/img/footer/mini-jalopnik.png | Bin 0 -> 4285 bytes
.../assets/base.v10/img/footer/mini-jezebel.png | Bin 0 -> 1920 bytes
.../assets/base.v10/img/footer/mini-kotaku.png | Bin 0 -> 2421 bytes
.../assets/base.v10/img/footer/mini-lifehacker.png | Bin 0 -> 2667 bytes
.../assets/base.v10/img/greyvertical.png | Bin 0 -> 163 bytes
.../assets/base.v10/img/icons/rightbar.comment.png | Bin 0 -> 396 bytes
.../assets/base.v10/img/icons/rightbar.flame.png | Bin 0 -> 332 bytes
.../img/indicator/progressIndicator_roller.gif | Bin 0 -> 1877 bytes
.../assets/base.v10/img/ui/arrow-down.png | Bin 0 -> 207 bytes
.../assets/base.v10/img/ui/arrow-right.png | Bin 0 -> 202 bytes
.../assets/base.v10/img/ui/icon-delete.png | Bin 0 -> 276 bytes
.../assets/base.v10/img/ui/icon-private.png | Bin 0 -> 441 bytes
.../assets/base.v10/img/ui/icon-video.png | Bin 0 -> 672 bytes
.../assets/base.v10/img/ui/userpost-image-sm.png | Bin 0 -> 1266 bytes
.../assets/base.v10/img/ui/userpost-text-sm.png | Bin 0 -> 584 bytes
.../assets/base.v10/img/ui/userpost-video-sm.png | Bin 0 -> 931 bytes
.../css/static.css?rev=o20110729.css | 40 +
.../v10.deadspin.com/img/apple-touch-icon.png | Bin 0 -> 8400 bytes
.../assets/v10.deadspin.com/img/bullet.png | Bin 0 -> 279 bytes
.../assets/v10.deadspin.com/img/logo-deadspin.png | Bin 0 -> 46196 bytes
...would-you-kill-a-stranger-to-save-football.html | 1086 +++++++++------
...-you-kill-a-stranger-to-save-football.html.rdbl | 11 +-
.../deadspin-000/deadspin.com/at.js.php | 2 +-
.../images/11/2011/07/micro_baddestbear-2.jpg | Bin 0 -> 9134 bytes
.../11/2011/07/xsmall_118730145_crop_340x234.jpg | Bin 0 -> 5984 bytes
.../images/11/2011/07/xsmall_740-388x230.jpg | Bin 0 -> 38659 bytes
.../images/11/2011/07/xsmall_ap110710029973.jpg | Bin 0 -> 8114 bytes
.../assets/images/11/2011/07/xsmall_berman.jpg | Bin 0 -> 5042 bytes
.../11/2011/07/xsmall_champishere-tattoo_01.jpg | Bin 0 -> 5578 bytes
.../images/11/2011/07/xsmall_donovandavid2.jpg | Bin 0 -> 9989 bytes
.../images/11/2011/07/xsmall_exclusive-helmet.jpg | Bin 0 -> 5606 bytes
.../11/2011/07/xsmall_funbag_deathbutton.jpg | Bin 0 -> 4730 bytes
.../assets/images/11/2011/07/xsmall_irvinout3.jpg | Bin 0 -> 11684 bytes
.../images/11/2011/07/xsmall_marlins_game.jpg | Bin 0 -> 4151 bytes
.../images/11/2011/07/xsmall_mccourt_divorce.jpg | Bin 0 -> 5611 bytes
.../images/11/2011/07/xsmall_nc3vb10djg0.jpg | Bin 0 -> 4908 bytes
.../images/11/2011/07/xsmall_oav3p.st.71.jpg | Bin 0 -> 33102 bytes
.../images/11/2011/07/xsmall_tglzd8qc9nk.jpg | Bin 0 -> 5135 bytes
.../images/11/2011/07/xsmall_yoh3b8fyuwu_01.jpg | Bin 0 -> 4200 bytes
.../2011/07/micro_explore_jalopnik_videos_216.jpg | Bin 0 -> 5658 bytes
.../2011/02/explore_gawkersales_videos_65.jpg | Bin 0 -> 34283 bytes
.../2011/02/xsmall_scott_listfield640x360b.jpg | Bin 0 -> 25550 bytes
.../images/13255/2011/03/xsmall_alienpepcid.jpg | Bin 0 -> 28739 bytes
.../13255/2011/04/xsmall_bg2040nr01_hrsmall.jpg | Bin 0 -> 5710 bytes
.../13255/2011/05/xsmall_deadspin-640x360.jpg | Bin 0 -> 3512 bytes
.../05/xsmall_explore_gawkersales_videos_91.jpg | Bin 0 -> 4083 bytes
.../images/13255/2011/05/xsmall_norelcoscreen.jpg | Bin 0 -> 8200 bytes
.../images/13255/2011/05/xsmall_stubble2.jpg | Bin 0 -> 3876 bytes
.../images/13255/2011/06/xsmall_blinditems.jpg | Bin 0 -> 8573 bytes
.../13255/2011/06/xsmall_deadspin-640x360.jpg | Bin 0 -> 3512 bytes
.../images/13255/2011/06/xsmall_pepsi_cars.jpg | Bin 0 -> 8680 bytes
.../07/xsmall_explore_gawkersales_videos_104.jpg | Bin 0 -> 3042 bytes
.../07/xsmall_explore_gawkersales_videos_106.jpg | Bin 0 -> 3042 bytes
.../17/2011/07/micro_today_in_lifehacker.jpg | Bin 0 -> 4921 bytes
.../images/39/2011/07/micro_mixedbag22511_03.jpg | Bin 0 -> 9985 bytes
.../images/4/2011/07/micro_best-apps-general.jpg | Bin 0 -> 10129 bytes
.../assets/images/7/2011/07/micro_0729_foxnews.jpg | Bin 0 -> 10805 bytes
...cil_starwars_line___flickr_-_photo_sharing_.jpg | Bin 0 -> 14530 bytes
.../assets/images/9/2011/07/micro_freeds_01.jpg | Bin 0 -> 10059 bytes
.../deadspin-000/www.facebook.com/robots.txt | 124 ++
.../deadspin-000/www.google.com/jsapi | 4 +-
.../deadspin-000/www.google.com/robots.txt | 2 +
regression_test_data/espn-000.yaml | 96 ++
.../combiner/c?css=espn.teams.r4i.css | 2 +-
...plane.0.0.0.js,community%2Fecho%2Fauth.0.0.8.js | 33 +
.../prod/styles/legacy.min.200811061403.css | 2 +-
.../a.espncdn.com/prod/styles/playerpopup1.css | 6 +-
...lb%2Fnews%2Fstory?id=6760720&style=compact.html | 12 +-
.../mlb/news/story?id=6760720.html | 144 +-
.../mlb/news/story?id=6760720.html.rdbl | 4 +-
regression_test_data/mit-000.yaml | 47 +-
.../images/article_images/tn/20110728162532-3.jpg | Bin 0 -> 27123 bytes
.../s7.addthis.com/js/250/addthis_widget.js | 2 +-
.../2011/compare-recommendation-systems-0708.html | 48 +-
...ss.php?css=c02feab0fb9955599676411caef15f7d.css | 2 +-
regression_test_data/nytimes-000.yaml | 65 +
.../css/0.1/screen/common/global.css | 52 +-
.../css/0.1/screen/common/layout.css | 12 +-
.../css/0.1/screen/common/modules.css | 2 +-
.../css/0.1/screen/common/modules/rss.css | 4 +-
.../css/0.1/screen/common/modules/sharetools.css | 10 +-
.../css/0.1/screen/common/shell.css | 2 +-
.../css/0.1/screen/common/util/tooltip.css | 2 +-
.../us/politics/specialseason/subNavigation.css | 22 +-
.../css/blogs/3.1/screen/community/comments.css | 2 +-
.../css/blogs/3.1/screen/modules/common.css | 16 +-
.../css/blogs/3.1/screen/modules/sharetools.css | 36 +-
.../blogs/3.1/screen/themes/universal/comments.css | 2 +-
.../blogs/3.1/screen/themes/universal/entry.css | 18 +-
.../blogs/3.1/screen/themes/universal/layout.css | 18 +-
.../themes/universal/style.css?v=06-03-2011.css | 32 +-
.../index.html?hp.html | 193 ++-
.../index.html?hp.html.rdbl | 2 +-
regression_test_data/nytimes-001.yaml | 100 +-
.../marketing/mm09/verticalst/verticals_movies.gif | Bin 0 -> 430 bytes
.../ads/marketing/mm11/dealbook_072911.jpg | Bin 0 -> 13034 bytes
.../ads/marketing/mm11/movies_072911.jpg | Bin 0 -> 16113 bytes
.../ADS/24/31/ad.243137/KN_Briefcase_336x79.gif | Bin 0 -> 9434 bytes
.../adx/images/ADS/26/68/ad.266879/120x60_10k.gif | Bin 0 -> 9801 bytes
.../ADS/27/11/ad.271168/NYTStore_Red_336x79.gif | Bin 0 -> 6460 bytes
.../ad.271444/11-0220_AudienceDev_86x60_farm.jpg | Bin 0 -> 3326 bytes
.../css/0.1/screen/article/abstract.css | 2 +-
.../css/0.1/screen/article/upnext.css | 2 +-
.../css/0.1/screen/build/article/2.0/styles.css | 36 +-
.../css/0.1/screen/common/article.css | 30 +-
.../css/0.1/screen/common/global.css | 52 +-
.../css/0.1/screen/common/layout.css | 12 +-
.../css/0.1/screen/common/modules.css | 2 +-
.../css/0.1/screen/common/modules/articletools.css | 16 +-
.../0.1/screen/common/modules/readercomments.css | 6 +-
.../css/0.1/screen/common/modules/sharetools.css | 10 +-
.../css/0.1/screen/common/mostpopular.css | 2 +-
.../css/0.1/screen/common/shell.css | 2 +-
.../0.1/screen/section/travel/modules/expedia.css | 2 +-
.../css/standalone/regilite/screen/regiLite.css | 2 +-
.../29MOTH_FLEISS/29MOTH_FLEISS-moth.jpg | Bin 0 -> 13298 bytes
.../29MOTH_INTERRUPT/29MOTH_INTERRUPT-moth.jpg | Bin 0 -> 55334 bytes
.../29/nyregion/29moth_cross/29moth_cross-moth.jpg | Bin 0 -> 9283 bytes
.../07/29/opinion/29moth_rfd/29moth_rfd-moth.jpg | Bin 0 -> 8331 bytes
.../29/realestate/29moth-tour/29moth-tour-moth.jpg | Bin 0 -> 10568 bytes
.../the-dark-art-of-breaking-bad.html?_r=1.html | 184 ++-
...he-dark-art-of-breaking-bad.html?_r=1.html.rdbl | 6 +-
...art-of-breaking-bad.html?_r=2&pagewanted=2.html | 172 ++-
...art-of-breaking-bad.html?_r=3&pagewanted=3.html | 176 ++-
...art-of-breaking-bad.html?_r=4&pagewanted=4.html | 181 ++-
...litesub_insert.html?product=LT&size=336X90.html | 8 +-
...art-of-breaking-bad.html?_r=5&pagewanted=5.html | 133 +-
...art-of-breaking-bad.html?_r=6&pagewanted=2.html | 1011 ++++++++++++++
regression_test_data/washingtonpost-000.yaml | 46 +
.../wp-srv/ad/textlinks/style/textlinks.css | 8 +-
.../2011/07/11/gIQA0XDg9H_story.html?hpid=z1.html | 600 ++++----
...e&m=false&context=wp-static&r=%2Fad%2Faudsci.js | 20 +
...r=%2Fad%2Fwpni_generic_ad.js&r=%2Fad%2Fwp_ad.js | 833 +++++++++++
regression_test_data/washingtonpost-001.yaml | 53 +-
.../wp-srv/ad/textlinks/style/textlinks.css | 8 +-
.../2011/06/26/AGtmeftH_story.html | 530 +++----
.../2011/06/26/AGtmeftH_story.html.rdbl | 105 +-
.../2011/06/26/AGtmeftH_story_1.html | 522 +++----
...e&m=false&context=wp-static&r=%2Fad%2Faudsci.js | 20 +
...r=%2Fad%2Fwpni_generic_ad.js&r=%2Fad%2Fwp_ad.js | 833 +++++++++++
test_data/nytimes-wget-log.txt | 709 ++++++++++
527 files changed, 28031 insertions(+), 3366 deletions(-)
commit 0d1fe06a3e15c3d526f7e1e561ca59222b3be638
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Jul 28 16:49:58 2011 -0700
Remove log message used for quick debugging
readability/readability.py | 1 -
1 file changed, 1 deletion(-)
commit 02fbb2ca7f07d351d2cf4e6e1b5c12e6c60f2dbb
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Jul 28 12:25:30 2011 -0700
Add regression test for multi-page washingtonpost
regression_test_data/washingtonpost-001.yaml | 11 +
.../b.scorecardresearch.com/robots.txt | 2 +
...788;rand='+TWP.StaticMethods.getUniqueToken()+' | Bin 0 -> 43 bytes
.../media.washingtonpost.com/robots.txt | 53 +
.../wp-srv/ad/textlinks/js/utilsTextLinksXML.js | 67 +
.../wp-srv/ad/textlinks/style/textlinks.css | 297 ++++
.../wp-srv/images/bullet_3x3_999999.gif | Bin 0 -> 44 bytes
.../pixel.quantserve.com/robots.txt | 2 +
.../content/images/iconSprite.png | Bin 0 -> 3689 bytes
.../content/images/slidingDoors.png | Bin 0 -> 1425 bytes
.../content/images/spark-plcHolder.png | Bin 0 -> 809 bytes
.../content/images/sprite.png | Bin 0 -> 3926 bytes
.../content/styles/WSODModules.css | 911 +++++++++++
.../washpost.bloomberg.com/modules/key-rates.html | 11 +
.../modules/other-market-data.html | 11 +
.../modules/world-markets.html | 11 +
.../2011/06/26/AGtmeftH_story.html | 1722 ++++++++++++++++++++
.../2011/06/26/AGtmeftH_story.html.rdbl | 50 +
.../2011/06/26/AGtmeftH_story_1.html | 1702 +++++++++++++++++++
.../2011/06/26/AGtmeftH_story_2.html | 1700 +++++++++++++++++++
.../2011/06/26/AGtmeftH_story_3.html | 1704 +++++++++++++++++++
.../www.washingtonpost.com/favicon.ico | Bin 0 -> 24038 bytes
.../Business/Videos/06302011-17v/06302011-17v.jpg | Bin 0 -> 7046 bytes
.../2011-07-01/w-debtservicersPROMO--296x195.jpg | Bin 0 -> 33102 bytes
.../www.washingtonpost.com/robots.txt | 53 +
.../Images/11-257 PromoBox_BOTH_335x100.jpg | Bin 0 -> 48140 bytes
.../_module-content/trove-right-rail-business.png | Bin 0 -> 9596 bytes
.../gov-showdown-promo-updated-333x100.jpg | Bin 0 -> 20192 bytes
.../twpweb/img/bkgds/overlay-for-296-graphics.png | Bin 0 -> 3746 bytes
.../rw/sites/twpweb/img/icons/icon-minus.png | Bin 0 -> 3132 bytes
.../rw/sites/twpweb/img/icons/icon-plus.png | Bin 0 -> 3144 bytes
.../rw/sites/twpweb/js/conf.js | 10 +
.../rw/sites/twpweb/js/site_traffic/comscore.js | 8 +
.../rw/sites/twpweb/js/wp_omniture.js | 1069 ++++++++++++
.../wp-srv/ad/textlinks/images/dash.gif | Bin 0 -> 46 bytes
.../wp-srv/ad/textlinks/images/dot.gif | Bin 0 -> 424 bytes
...e&m=false&context=wp-static&r=%2Fad%2Faudsci.js | 20 +
...r=%2Fad%2Fwpni_generic_ad.js&r=%2Fad%2Fwp_ad.js | 833 ++++++++++
38 files changed, 10247 insertions(+)
commit 3ec236909dca5603e142907c2bce4f8046a9ffc2
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Jul 28 12:16:28 2011 -0700
Fix bug with write_spec
In the previous refactoring, some renaming broke the write_spec function. This
fixes it.
gen_test.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
commit 9aadf93048518302a8be9b7de9a526717776d007
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Jul 28 11:19:01 2011 -0700
Refactor to improve urlfetch and gen_test code
The way URL fetching worked was confusing. The mapping from URLs to paths on
disk was created relative to the test data directory, but the MockUrlFetch
implementation needed paths relative to the project root directory. This fixes
that by making url_maps always relative to the test data directory.
This change also allows gen_test to be refactored to be shorter and clearer.
gen_test.py | 100 +++++++++++++++++++----------------------------
readability/urlfetch.py | 25 +++++++-----
regression_test.py | 23 ++++-------
3 files changed, 64 insertions(+), 84 deletions(-)
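A minimal sketch of the idea in the refactoring commit above: keep every
url_map entry relative to the test data directory and join it with that
directory in one place. The class name MockUrlFetchSketch, its methods, and the
example URL are hypothetical; the real code in readability/urlfetch.py is not
shown in this log and may look different.

```python
import os
import tempfile


class MockUrlFetchSketch:
    """Serve URL contents from local files (hypothetical, not the project's class)."""

    def __init__(self, test_data_dir, url_map):
        # Assumption: url_map values are always relative to test_data_dir,
        # so callers never need to know where the project root is.
        self._test_data_dir = test_data_dir
        self._url_map = url_map

    def resolve(self, url):
        # The only place where a relative path becomes a real path on disk.
        return os.path.join(self._test_data_dir, self._url_map[url])

    def urlread(self, url):
        # Read the local copy instead of going out to the network.
        with open(self.resolve(url), "rb") as f:
            return f.read()


def demo():
    # Build a throwaway test data directory holding one saved page.
    with tempfile.TemporaryDirectory() as test_data_dir:
        rel_path = os.path.join("example-000", "index.html")
        full_path = os.path.join(test_data_dir, rel_path)
        os.makedirs(os.path.dirname(full_path))
        with open(full_path, "wb") as f:
            f.write(b"<html>page one</html>")
        fetcher = MockUrlFetchSketch(test_data_dir, {"http://example.com/": rel_path})
        return fetcher.urlread("http://example.com/")
```

Because the join happens only in resolve(), the same url_map works whether the
test runner starts from the project root or from somewhere else.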
commit 601895cc3edb259fb9f627422a18a97bd5ba623a
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Wed Jul 27 16:43:32 2011 -0700
Make genbench work again
This makes genbench work again but in a very ugly way. A forthcoming commit
will make this much nicer.
gen_test.py | 31 +++++++++++++++++++++----------
readability/urlfetch.py | 4 ++--
2 files changed, 23 insertions(+), 12 deletions(-)
commit 73af69f8b5021f3a95c92b18c47918f1a8708a3f
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Wed Jul 27 15:14:39 2011 -0700
Rework regression test to run without internet
The pages and data used for the regression tests are now fully self-contained.
Previously, we were using local HTML files, but things like images were still
being referenced out on the internet. We now make full local copies of the
pages and their prerequisites so that the entire test can run without the
internet.
This required a lot of tweaking, as we now use wget to download the page and
prerequisites.
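As a rough illustration of the wget step described above, the sketch below
builds a command line that fetches a page together with its prerequisites into
a local directory and writes a log. The function names and the choice of
flags are assumptions; the actual invocation in gen_test.py is not shown in
this log.

```python
import subprocess


def build_wget_command(url, dest_dir, log_path):
    """Return a wget argument list for a self-contained local copy (sketch)."""
    return [
        "wget",
        "--page-requisites",               # also fetch images, CSS, scripts
        "--span-hosts",                    # requisites often live on other hosts
        "--directory-prefix=" + dest_dir,  # save everything under dest_dir
        "--output-file=" + log_path,       # write wget's log to a file
        url,
    ]


def download_page(url, dest_dir, log_path):
    # Hypothetical helper: runs wget and returns its exit status. wget can
    # exit non-zero when a single requisite fails, so the status is returned
    # to the caller instead of raising.
    return subprocess.run(build_wget_command(url, dest_dir, log_path)).returncode
```

Keeping the command construction separate from the call that runs it makes the
flag choice easy to inspect without touching the network.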
gen_test.py | 65 +-
readability/readability.py | 17 +-
readability/urlfetch.py | 70 +-
regression_test.py | 85 +-
regression_test_data/arstechnica-000-orig.html | 664 ----
regression_test_data/arstechnica-000-rdbl.html | 53 -
regression_test_data/arstechnica-000.yaml | 5 +-
.../ars/theme/images/login_register/button-bg.png | Bin 0 -> 280 bytes
...Int(Math.random()*99999999, 10)).toString() + ' | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51247&1215625555 | Bin 0 -> 43 bytes
.../public/shared/scripts/da-1.5.js | 382 +++
.../arstechnica-000/arstechnica.com/robots.txt | 14 +
.../arstechnica-000/arstechnica.com/web/index.html | 1415 ++++++++
...ne-browser-stats-rapid-release-edition.ars.html | 668 ++++
...owser-stats-rapid-release-edition.ars.html.rdbl | 54 +
.../b.scorecardresearch.com/robots.txt | 2 +
.../static.addtoany.com/buttons/favicon.png | Bin 0 -> 270 bytes
.../browsers-june-2011/ars-browser-share.png | Bin 0 -> 36454 bytes
.../browsers-june-2011/chrome-transition.png | Bin 0 -> 34061 bytes
.../browsers-june-2011/firefox-transition.png | Bin 0 -> 20769 bytes
.../browsers-june-2011/global-browser-share.png | Bin 0 -> 53197 bytes
.../internet-explorer-transition.png | Bin 0 -> 20450 bytes
.../browsers-june-2011/thumb-sogou-chrome.png | Bin 0 -> 98522 bytes
.../browsers-june-2011/thumb-sogou-ie.png | Bin 0 -> 107126 bytes
.../05936218856bfda2a2b187b00e5cc496.png | Bin 0 -> 15153 bytes
.../274ea6386720b4f5d44d303aac087806.jpg | Bin 0 -> 29479 bytes
.../03/firefox-09-small-thumb-230x130-20442-f.jpg | Bin 0 -> 13358 bytes
...oogle-plus-arrows2-sq-thumb-280x280-22992-f.jpg | Bin 0 -> 22079 bytes
...thumb_ER_tubes_flickr-thumb-230x130-22873-f.jpg | Bin 0 -> 15994 bytes
...ental-4e2ee28-listing-thumb-230x130-24059-f.jpg | Bin 0 -> 39649 bytes
.../peter-chatting-small-thumb-230x130-23314-f.jpg | Bin 0 -> 11995 bytes
...light-4e2dadf-listing-thumb-230x130-24014-f.jpg | Bin 0 -> 28018 bytes
...ty_flow-4e20ec5-intro-thumb-230x130-23652-f.jpg | Bin 0 -> 15366 bytes
.../static.arstechnica.net/favicon.ico | Bin 0 -> 22382 bytes
.../plugins/ArsTheme/images/masthead-short.png | Bin 0 -> 14181 bytes
.../opensource/firefox-09-small.jpg | Bin 0 -> 17065 bytes
.../shared/images/Game-Stat-Box-Sprites.png?v3 | Bin 0 -> 3974 bytes
.../public/shared/images/activity.gif | Bin 0 -> 4178 bytes
.../public/shared/images/game-box-tall-bkg.png | Bin 0 -> 20719 bytes
.../shared/images/reviewed-platform-icon.png | Bin 0 -> 323 bytes
.../public/v6/footer.html?1311610759.html | 229 ++
.../public/v6/scripts/site.min.js?1311610759 | 71 +
.../v6/styles/light/images/content/arrow-bg.png | Bin 0 -> 183 bytes
.../light/images/content/author-bubble-bg.png | Bin 0 -> 1258 bytes
.../v6/styles/light/images/content/badge.png | Bin 0 -> 3817 bytes
.../light/images/content/comments-bar-bg.png?2 | Bin 0 -> 6791 bytes
.../styles/light/images/content/etc-bubble-bg.png | Bin 0 -> 1248 bytes
.../light/images/content/etc-category-sprite.png | Bin 0 -> 2152 bytes
.../public/v6/styles/light/images/content/etc.png | Bin 0 -> 376 bytes
.../styles/light/images/content/etc.png?1311610759 | Bin 0 -> 376 bytes
.../light/images/content/feature.png?1311610759 | Bin 0 -> 1418 bytes
.../styles/light/images/content/follow-button.png | Bin 0 -> 1184 bytes
.../v6/styles/light/images/content/plus-large.png | Bin 0 -> 119 bytes
.../images/content/pullquote-presidential-bg.png | Bin 0 -> 551 bytes
.../light/images/content/pullquote-rules-bg.png | Bin 0 -> 425 bytes
.../images/content/read-more-comment-sprite.png | Bin 0 -> 3723 bytes
.../light/images/content/silo-headers/apple.png | Bin 0 -> 18330 bytes
.../light/images/content/silo-headers/ask-ars.png | Bin 0 -> 11565 bytes
.../light/images/content/silo-headers/att.png | Bin 0 -> 5894 bytes
.../light/images/content/silo-headers/bieb.png | Bin 0 -> 7152 bytes
.../light/images/content/silo-headers/business.png | Bin 0 -> 30278 bytes
.../content/silo-headers/columbia-header.png | Bin 0 -> 19102 bytes
.../light/images/content/silo-headers/features.png | Bin 0 -> 12664 bytes
.../images/content/silo-headers/future-cars.png | Bin 0 -> 15463 bytes
.../images/content/silo-headers/future-of-tv.png | Bin 0 -> 5777 bytes
.../light/images/content/silo-headers/gadgets.png | Bin 0 -> 31088 bytes
.../light/images/content/silo-headers/gaming.png | Bin 0 -> 9750 bytes
.../silo-headers/gift-guide-header-gadgets.jpg | Bin 0 -> 22331 bytes
.../silo-headers/gift-guide-header-gaming.jpg | Bin 0 -> 20602 bytes
.../silo-headers/gift-guide-header-hdtv.jpg | Bin 0 -> 22372 bytes
.../silo-headers/gift-guide-header-staff.jpg | Bin 0 -> 26511 bytes
.../light/images/content/silo-headers/guides.png | Bin 0 -> 13250 bytes
.../light/images/content/silo-headers/hardware.png | Bin 0 -> 21687 bytes
.../content/silo-headers/ht-samsung-header.png | Bin 0 -> 14745 bytes
.../content/silo-headers/ibm_lotus_collab.png | Bin 0 -> 5308 bytes
.../light/images/content/silo-headers/ipad.png | Bin 0 -> 7517 bytes
.../images/content/silo-headers/last-mile.png | Bin 0 -> 14608 bytes
.../light/images/content/silo-headers/media.png | Bin 0 -> 22170 bytes
.../images/content/silo-headers/microsoft.png | Bin 0 -> 23794 bytes
.../content/silo-headers/msft_cloud_header.jpg | Bin 0 -> 21450 bytes
.../light/images/content/silo-headers/netapp.png | Bin 0 -> 5423 bytes
.../images/content/silo-headers/open-source.png | Bin 0 -> 22404 bytes
.../images/content/silo-headers/planet_cloud.jpg | Bin 0 -> 18995 bytes
.../light/images/content/silo-headers/premier.jpg | Bin 0 -> 13745 bytes
.../light/images/content/silo-headers/raise-iq.png | Bin 0 -> 15030 bytes
.../light/images/content/silo-headers/reviews.png | Bin 0 -> 11984 bytes
.../light/images/content/silo-headers/science.png | Bin 0 -> 28485 bytes
.../light/images/content/silo-headers/security.png | Bin 0 -> 16844 bytes
.../light/images/content/silo-headers/software.png | Bin 0 -> 21915 bytes
.../light/images/content/silo-headers/staff.png | Bin 0 -> 22155 bytes
.../images/content/silo-headers/system-guides.png | Bin 0 -> 18702 bytes
.../images/content/silo-headers/tech-policy.png | Bin 0 -> 21229 bytes
.../content/silo-headers/technopaedia-header.png | Bin 0 -> 33016 bytes
.../light/images/content/silo-headers/telecom.png | Bin 0 -> 21697 bytes
.../light/images/content/silo-headers/web.png | Bin 0 -> 21263 bytes
.../light/images/content/technopaedia-footer.png | Bin 0 -> 4986 bytes
.../light/images/content/technopaedia-header.png | Bin 0 -> 432 bytes
.../light/images/content/technopaedia-min-max.png | Bin 0 -> 602 bytes
.../light/images/footer/footer-mobile-badge.png | Bin 0 -> 6635 bytes
.../light/images/masthead/logo.png?1311610759 | Bin 0 -> 3460 bytes
.../images/masthead/masthead-bg-premier-banned.png | Bin 0 -> 30698 bytes
.../images/masthead/masthead-bg-premier-tent.png | Bin 0 -> 31059 bytes
.../light/images/masthead/masthead-bg-premier.png | Bin 0 -> 28571 bytes
.../styles/light/images/masthead/masthead-bg.png | Bin 0 -> 75864 bytes
.../navigation/auxiliary-navigation-sprite.png | Bin 0 -> 804 bytes
.../light/images/navigation/more-arrow-sprite.png | Bin 0 -> 222 bytes
.../light/images/navigation/nav-bar-bkg.png?v=2 | Bin 0 -> 591 bytes
.../light/images/news-bar/tabs-icons-sprite.png | Bin 0 -> 4129 bytes
.../light/images/news-bar/tabs-left-sprite.png | Bin 0 -> 588 bytes
.../light/images/news-bar/tabs-right-sprite.png | Bin 0 -> 595 bytes
.../v6/styles/light/images/search/button-bg.png | Bin 0 -> 492 bytes
.../v6/styles/light/images/search/field-bg.png | Bin 0 -> 165 bytes
.../light/images/search/search-rollover-bkg.png | Bin 0 -> 269 bytes
.../light/images/sidebar/arrow-bullets-sprite.png | Bin 0 -> 141 bytes
.../light/images/sidebar/bottom-bubble-bg.png | Bin 0 -> 1904 bytes
.../light/images/sidebar/bottom-bubble-top-bg.png | Bin 0 -> 151 bytes
.../sidebar/categories-sprite-vertical.png?2 | Bin 0 -> 3890 bytes
.../light/images/sidebar/categories-sprite.png | Bin 0 -> 2088 bytes
.../styles/light/images/sidebar/links-copyedit.png | Bin 0 -> 1165 bytes
.../styles/light/images/sidebar/links-sprite.png | Bin 0 -> 9252 bytes
.../light/images/sidebar/misc-icons-sprite.png | Bin 0 -> 1402 bytes
.../styles/light/images/sidebar/top-bubble-bg.png | Bin 0 -> 1485 bytes
.../light/images/sidebar/top-bubble-bottom-bg.png | Bin 0 -> 151 bytes
.../v6/styles/light/light.c.css?1311610759.css | 1 +
.../v6/styles/light/light.c.css?1311610759orig | 1 +
.../v6/styles/print/print.css?1311610759.css | 1 +
regression_test_data/arstechnica-001.yaml | 5 +
.../ars/theme/images/login_register/button-bg.png | Bin 0 -> 280 bytes
...Int(Math.random()*99999999, 10)).toString() + ' | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=50988&1547967201 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=50988&450609258 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=50988&487992712 | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=50988&768661508 | Bin 0 -> 43 bytes
...Int(Math.random()*99999999, 10)).toString() + ' | Bin 0 -> 43 bytes
.../dragons/brains.gif?id=51247&739048522 | Bin 0 -> 43 bytes
.../public/shared/scripts/da-1.5.js | 382 +++
.../arstechnica-001/arstechnica.com/robots.txt | 14 +
.../1.html | 706 ++++
.../1.html.rdbl | 184 +
.../2.html | 745 ++++
.../arstechnica-001/arstechnica.com/web/index.html | 1408 ++++++++
...ne-browser-stats-rapid-release-edition.ars.html | 664 ++++
.../b.scorecardresearch.com/robots.txt | 2 +
.../static.addtoany.com/buttons/favicon.png | Bin 0 -> 270 bytes
.../static.arstechnica.com/06-17-2011/figure1.jpg | Bin 0 -> 196891 bytes
.../static.arstechnica.com/06-17-2011/figure2.jpg | Bin 0 -> 99691 bytes
.../static.arstechnica.com/06-17-2011/figure3.jpg | Bin 0 -> 58325 bytes
.../static.arstechnica.com/06-17-2011/figure4.jpg | Bin 0 -> 425720 bytes
.../browsers-june-2011/ars-browser-share.png | Bin 0 -> 36454 bytes
.../browsers-june-2011/chrome-transition.png | Bin 0 -> 34061 bytes
.../browsers-june-2011/firefox-transition.png | Bin 0 -> 20769 bytes
.../browsers-june-2011/global-browser-share.png | Bin 0 -> 53197 bytes
.../internet-explorer-transition.png | Bin 0 -> 20450 bytes
.../browsers-june-2011/thumb-sogou-chrome.png | Bin 0 -> 98522 bytes
.../browsers-june-2011/thumb-sogou-ie.png | Bin 0 -> 107126 bytes
.../05936218856bfda2a2b187b00e5cc496.png | Bin 0 -> 15153 bytes
.../274ea6386720b4f5d44d303aac087806.jpg | Bin 0 -> 29479 bytes
.../03/firefox-09-small-thumb-230x130-20442-f.jpg | Bin 0 -> 13358 bytes
...oogle-plus-arrows2-sq-thumb-280x280-22992-f.jpg | Bin 0 -> 22079 bytes
...thumb_ER_tubes_flickr-thumb-230x130-22873-f.jpg | Bin 0 -> 15994 bytes
...ental-4e2ee28-listing-thumb-230x130-24059-f.jpg | Bin 0 -> 39649 bytes
...ame-over-4e29f36-intro-thumb-640xauto-23967.jpg | Bin 0 -> 135777 bytes
.../peter-chatting-small-thumb-230x130-23314-f.jpg | Bin 0 -> 11995 bytes
...light-4e2dadf-listing-thumb-230x130-24014-f.jpg | Bin 0 -> 28018 bytes
...ty_flow-4e20ec5-intro-thumb-230x130-23652-f.jpg | Bin 0 -> 15366 bytes
.../static.arstechnica.net/favicon.ico | Bin 0 -> 22382 bytes
.../plugins/ArsTheme/images/masthead-short.png | Bin 0 -> 14181 bytes
.../opensource/firefox-09-small.jpg | Bin 0 -> 17065 bytes
.../shared/images/Game-Stat-Box-Sprites.png?v3 | Bin 0 -> 3974 bytes
.../public/shared/images/activity.gif | Bin 0 -> 4178 bytes
.../public/shared/images/game-box-tall-bkg.png | Bin 0 -> 20719 bytes
.../shared/images/reviewed-platform-icon.png | Bin 0 -> 323 bytes
.../public/v6/footer.html?1311610759.html | 229 ++
.../public/v6/scripts/site.min.js?1311610759 | 71 +
.../v6/styles/light/images/content/arrow-bg.png | Bin 0 -> 183 bytes
.../light/images/content/author-bubble-bg.png | Bin 0 -> 1258 bytes
.../v6/styles/light/images/content/badge.png | Bin 0 -> 3817 bytes
.../light/images/content/comments-bar-bg.png?2 | Bin 0 -> 6791 bytes
.../styles/light/images/content/etc-bubble-bg.png | Bin 0 -> 1248 bytes
.../light/images/content/etc-category-sprite.png | Bin 0 -> 2152 bytes
.../public/v6/styles/light/images/content/etc.png | Bin 0 -> 376 bytes
.../styles/light/images/content/etc.png?1311610759 | Bin 0 -> 376 bytes
.../light/images/content/feature.png?1311610759 | Bin 0 -> 1418 bytes
.../styles/light/images/content/follow-button.png | Bin 0 -> 1184 bytes
.../v6/styles/light/images/content/plus-large.png | Bin 0 -> 119 bytes
.../images/content/pullquote-presidential-bg.png | Bin 0 -> 551 bytes
.../light/images/content/pullquote-rules-bg.png | Bin 0 -> 425 bytes
.../images/content/read-more-comment-sprite.png | Bin 0 -> 3723 bytes
.../light/images/content/silo-headers/apple.png | Bin 0 -> 18330 bytes
.../light/images/content/silo-headers/ask-ars.png | Bin 0 -> 11565 bytes
.../light/images/content/silo-headers/att.png | Bin 0 -> 5894 bytes
.../light/images/content/silo-headers/bieb.png | Bin 0 -> 7152 bytes
.../light/images/content/silo-headers/business.png | Bin 0 -> 30278 bytes
.../content/silo-headers/columbia-header.png | Bin 0 -> 19102 bytes
.../light/images/content/silo-headers/features.png | Bin 0 -> 12664 bytes
.../images/content/silo-headers/future-cars.png | Bin 0 -> 15463 bytes
.../images/content/silo-headers/future-of-tv.png | Bin 0 -> 5777 bytes
.../light/images/content/silo-headers/gadgets.png | Bin 0 -> 31088 bytes
.../light/images/content/silo-headers/gaming.png | Bin 0 -> 9750 bytes
.../silo-headers/gift-guide-header-gadgets.jpg | Bin 0 -> 22331 bytes
.../silo-headers/gift-guide-header-gaming.jpg | Bin 0 -> 20602 bytes
.../silo-headers/gift-guide-header-hdtv.jpg | Bin 0 -> 22372 bytes
.../silo-headers/gift-guide-header-staff.jpg | Bin 0 -> 26511 bytes
.../light/images/content/silo-headers/guides.png | Bin 0 -> 13250 bytes
.../light/images/content/silo-headers/hardware.png | Bin 0 -> 21687 bytes
.../content/silo-headers/ht-samsung-header.png | Bin 0 -> 14745 bytes
.../content/silo-headers/ibm_lotus_collab.png | Bin 0 -> 5308 bytes
.../light/images/content/silo-headers/ipad.png | Bin 0 -> 7517 bytes
.../images/content/silo-headers/last-mile.png | Bin 0 -> 14608 bytes
.../light/images/content/silo-headers/media.png | Bin 0 -> 22170 bytes
.../images/content/silo-headers/microsoft.png | Bin 0 -> 23794 bytes
.../content/silo-headers/msft_cloud_header.jpg | Bin 0 -> 21450 bytes
.../light/images/content/silo-headers/netapp.png | Bin 0 -> 5423 bytes
.../images/content/silo-headers/open-source.png | Bin 0 -> 22404 bytes
.../images/content/silo-headers/planet_cloud.jpg | Bin 0 -> 18995 bytes
.../light/images/content/silo-headers/premier.jpg | Bin 0 -> 13745 bytes
.../light/images/content/silo-headers/raise-iq.png | Bin 0 -> 15030 bytes
.../light/images/content/silo-headers/reviews.png | Bin 0 -> 11984 bytes
.../light/images/content/silo-headers/science.png | Bin 0 -> 28485 bytes
.../light/images/content/silo-headers/security.png | Bin 0 -> 16844 bytes
.../light/images/content/silo-headers/software.png | Bin 0 -> 21915 bytes
.../light/images/content/silo-headers/staff.png | Bin 0 -> 22155 bytes
.../images/content/silo-headers/system-guides.png | Bin 0 -> 18702 bytes
.../images/content/silo-headers/tech-policy.png | Bin 0 -> 21229 bytes
.../content/silo-headers/technopaedia-header.png | Bin 0 -> 33016 bytes
.../light/images/content/silo-headers/telecom.png | Bin 0 -> 21697 bytes
.../light/images/content/silo-headers/web.png | Bin 0 -> 21263 bytes
.../light/images/content/technopaedia-footer.png | Bin 0 -> 4986 bytes
.../light/images/content/technopaedia-header.png | Bin 0 -> 432 bytes
.../light/images/content/technopaedia-min-max.png | Bin 0 -> 602 bytes
.../light/images/footer/footer-mobile-badge.png | Bin 0 -> 6635 bytes
.../light/images/masthead/logo.png?1311610759 | Bin 0 -> 3460 bytes
.../images/masthead/masthead-bg-premier-banned.png | Bin 0 -> 30698 bytes
.../images/masthead/masthead-bg-premier-tent.png | Bin 0 -> 31059 bytes
.../light/images/masthead/masthead-bg-premier.png | Bin 0 -> 28571 bytes
.../styles/light/images/masthead/masthead-bg.png | Bin 0 -> 75864 bytes
.../navigation/auxiliary-navigation-sprite.png | Bin 0 -> 804 bytes
.../light/images/navigation/more-arrow-sprite.png | Bin 0 -> 222 bytes
.../light/images/navigation/nav-bar-bkg.png?v=2 | Bin 0 -> 591 bytes
.../light/images/news-bar/tabs-icons-sprite.png | Bin 0 -> 4129 bytes
.../light/images/news-bar/tabs-left-sprite.png | Bin 0 -> 588 bytes
.../light/images/news-bar/tabs-right-sprite.png | Bin 0 -> 595 bytes
.../v6/styles/light/images/search/button-bg.png | Bin 0 -> 492 bytes
.../v6/styles/light/images/search/field-bg.png | Bin 0 -> 165 bytes
.../light/images/search/search-rollover-bkg.png | Bin 0 -> 269 bytes
.../light/images/sidebar/arrow-bullets-sprite.png | Bin 0 -> 141 bytes
.../light/images/sidebar/bottom-bubble-bg.png | Bin 0 -> 1904 bytes
.../light/images/sidebar/bottom-bubble-top-bg.png | Bin 0 -> 151 bytes
.../sidebar/categories-sprite-vertical.png?2 | Bin 0 -> 3890 bytes
.../light/images/sidebar/categories-sprite.png | Bin 0 -> 2088 bytes
.../styles/light/images/sidebar/links-copyedit.png | Bin 0 -> 1165 bytes
.../styles/light/images/sidebar/links-sprite.png | Bin 0 -> 9252 bytes
.../light/images/sidebar/misc-icons-sprite.png | Bin 0 -> 1402 bytes
.../styles/light/images/sidebar/top-bubble-bg.png | Bin 0 -> 1485 bytes
.../light/images/sidebar/top-bubble-bottom-bg.png | Bin 0 -> 151 bytes
.../v6/styles/light/light.c.css?1311610759.css | 1 +
.../v6/styles/light/light.c.css?1311610759orig | 1 +
.../v6/styles/print/print.css?1311610759.css | 1 +
regression_test_data/businessinsider-000-orig.html | 3602 -------------------
regression_test_data/businessinsider-000-rdbl.html | 11 -
regression_test_data/businessinsider-000.yaml | 2 +
...oves_and_jesus_hates_apple_computers_normal.png | Bin 0 -> 5679 bytes
.../b.scorecardresearch.com/robots.txt | 2 +
.../cdn.sailthru.com/scout/v1.js | 1 +
.../connect.facebook.net/en_US/all.js | 82 +
.../connect.facebook.net/robots.txt | 1 +
.../d.businessinsider.com/robots.txt | 7 +
...Account=clusterstock&Module=snapshot2&Output=JS | 987 ++++++
...rstock&Module=stockquote5&Ticker=MSFT&Output=JS | 903 +++++
.../jobs.businessinsider.com/robots.txt | 34 +
.../pixel.quantserve.com/robots.txt | 2 +
.../platform.linkedin.com/in.js | 68 +
.../anywhere.js?id=ZV0JHq7YJkjozsfohDIleQ&v=1 | 17 +
.../vat/mon/vt.js?1311566655 | 1 +
.../m1?ci=us-103525h&cg=0&cc=1&ts=noscript&ja=1 | Bin 0 -> 44 bytes
.../static.fmpub.net/robots.txt | 2 +
.../ui-bg_diagonals-thick_18_b81900_40x40.png | Bin 0 -> 260 bytes
.../ui-bg_diagonals-thick_20_666666_40x40.png | Bin 0 -> 251 bytes
.../css/images/ui-bg_flat_10_000000_40x100.png | Bin 0 -> 178 bytes
.../css/images/ui-bg_glass_100_f6f6f6_1x400.png | Bin 0 -> 104 bytes
.../css/images/ui-bg_glass_100_fdf5ce_1x400.png | Bin 0 -> 125 bytes
.../css/images/ui-bg_glass_65_ffffff_1x400.png | Bin 0 -> 105 bytes
.../images/ui-bg_gloss-wave_35_f6a828_500x100.png | Bin 0 -> 3762 bytes
.../ui-bg_highlight-soft_100_eeeeee_1x100.png | Bin 0 -> 90 bytes
.../ui-bg_highlight-soft_75_ffe45c_1x100.png | Bin 0 -> 129 bytes
.../assets/css/images/ui-icons_222222_256x240.png | Bin 0 -> 4369 bytes
.../assets/css/images/ui-icons_228ef1_256x240.png | Bin 0 -> 4369 bytes
.../assets/css/images/ui-icons_ef8c08_256x240.png | Bin 0 -> 4369 bytes
.../assets/css/images/ui-icons_ffd27a_256x240.png | Bin 0 -> 4369 bytes
.../assets/css/images/ui-icons_ffffff_256x240.png | Bin 0 -> 4369 bytes
.../assets/css/min-all.css?1311566655.css | 1 +
.../assets/css/min-all.css?1311566655orig | 1 +
.../assets/css/min-print.css?1311566655.css | 1 +
.../assets/css/min-print.css?1311566655orig | 1 +
.../ui-lightness/images/ui-anim_basic_16x16.gif | Bin 0 -> 1553 bytes
.../assets/images/back_black33.png | Bin 0 -> 109 bytes
.../assets/images/back_blue_tbi.png | Bin 0 -> 114 bytes
.../assets/images/back_breaking.png | Bin 0 -> 1605 bytes
.../assets/images/back_buttons.png | Bin 0 -> 33836 bytes
.../assets/images/back_buttons_slideshow.png | Bin 0 -> 12042 bytes
.../assets/images/back_chart_bar.png | Bin 0 -> 968 bytes
.../assets/images/back_countdown.png | Bin 0 -> 6656 bytes
.../assets/images/back_digg_count.png | Bin 0 -> 1156 bytes
.../assets/images/back_dotted_hor.gif | Bin 0 -> 43 bytes
.../assets/images/back_dotted_ver.gif | Bin 0 -> 43 bytes
.../assets/images/back_earnings_gradient.png | Bin 0 -> 244 bytes
.../assets/images/back_gradient_tbi.png | Bin 0 -> 144 bytes
.../assets/images/back_grey_icon_arrow_blue.png | Bin 0 -> 372 bytes
.../assets/images/back_header_tbi.gif | Bin 0 -> 546 bytes
.../assets/images/back_header_tbi_cs.gif | Bin 0 -> 305 bytes
.../assets/images/back_header_tbi_ent.gif | Bin 0 -> 304 bytes
.../assets/images/back_header_tbi_gbi.gif | Bin 0 -> 539 bytes
.../assets/images/back_header_tbi_law.gif | Bin 0 -> 926 bytes
.../assets/images/back_header_tbi_life.gif | Bin 0 -> 533 bytes
.../assets/images/back_header_tbi_mg.gif | Bin 0 -> 319 bytes
.../assets/images/back_header_tbi_misc.gif | Bin 0 -> 8365 bytes
.../assets/images/back_header_tbi_politics.gif | Bin 0 -> 314 bytes
.../assets/images/back_header_tbi_sai.gif | Bin 0 -> 553 bytes
.../assets/images/back_header_tbi_sp.gif | Bin 0 -> 301 bytes
.../assets/images/back_header_tbi_tools.gif | Bin 0 -> 1089 bytes
.../assets/images/back_header_tbi_tvl.gif | Bin 0 -> 3035 bytes
.../assets/images/back_header_tbi_tw.gif | Bin 0 -> 298 bytes
.../assets/images/back_jquery_dialogue.png | Bin 0 -> 13616 bytes
.../assets/images/back_liveplayer.png | Bin 0 -> 332 bytes
.../assets/images/back_liveplayer_botleft.png | Bin 0 -> 223 bytes
.../assets/images/back_liveplayer_botright.png | Bin 0 -> 209 bytes
.../assets/images/back_liveplayer_topleft.png | Bin 0 -> 233 bytes
.../assets/images/back_liveplayer_topright.png | Bin 0 -> 239 bytes
.../assets/images/back_rays.png | Bin 0 -> 86574 bytes
.../assets/images/back_right_gradient.png | Bin 0 -> 183 bytes
.../images/back_right_gradient_desaturated.png | Bin 0 -> 1163 bytes
.../assets/images/back_right_newsletter.png | Bin 0 -> 24808 bytes
.../assets/images/bi_mobile.png | Bin 0 -> 7293 bytes
.../assets/images/button_reply_left.png | Bin 0 -> 3492 bytes
.../assets/images/button_reply_right.png | Bin 0 -> 585 bytes
.../assets/images/button_thumbs_replies.png | Bin 0 -> 6712 bytes
.../assets/images/button_thumbs_right.png | Bin 0 -> 520 bytes
.../assets/images/button_thumbsdown_left.png | Bin 0 -> 4275 bytes
.../assets/images/button_thumbsup_left.png | Bin 0 -> 4020 bytes
.../assets/images/button_x.png | Bin 0 -> 906 bytes
.../assets/images/dot-black.png | Bin 0 -> 145 bytes
.../assets/images/gradient_background.png | Bin 0 -> 1158 bytes
.../assets/images/icons/icon_calendar.png | Bin 0 -> 4625 bytes
.../assets/images/icons/icon_disclaimer.png | Bin 0 -> 179 bytes
.../assets/images/icons/icon_email.png | Bin 0 -> 292 bytes
.../assets/images/icons/icon_email_15x10.gif | Bin 0 -> 1583 bytes
.../assets/images/icons/icon_ext_link.png | Bin 0 -> 300 bytes
.../assets/images/icons/icon_newalerts.png | Bin 0 -> 3942 bytes
.../assets/images/icons/icon_print.gif | Bin 0 -> 76 bytes
.../assets/images/icons/icon_required.png | Bin 0 -> 200 bytes
.../assets/images/icons/icon_rss.png | Bin 0 -> 1720 bytes
.../assets/images/icons/icon_twitter.png | Bin 0 -> 1628 bytes
.../images/icons/icon_vikings_watercooler.png | Bin 0 -> 14295 bytes
.../assets/images/icons/icons.png | Bin 0 -> 25110 bytes
.../assets/images/icons/list-arrow-orange.png | Bin 0 -> 2873 bytes
.../assets/images/icons/list_arrow_blue.png | Bin 0 -> 217 bytes
.../assets/images/icons/placeholder.png | Bin 0 -> 2074 bytes
.../images/lightbox/lightbox-ico-loading-blue.gif | Bin 0 -> 4586 bytes
.../images/lightbox/lightbox-ico-loading.gif | Bin 0 -> 3990 bytes
.../assets/images/pipeline/back_pipeline.png | Bin 0 -> 2088 bytes
.../images/pipeline/back_pipeline_newsletter.png | Bin 0 -> 25643 bytes
.../assets/images/poll-bg.png | Bin 0 -> 1082 bytes
.../assets/images/share/facebook.gif | Bin 0 -> 142 bytes
.../assets/images/strikes.png | Bin 0 -> 2060 bytes
.../assets/images/tab_selected.png | Bin 0 -> 1553 bytes
.../assets/images/twitter_popup_arrow.png | Bin 0 -> 1164 bytes
.../assets/images/vid_header_ooyala.png | Bin 0 -> 5795 bytes
.../obama-editorial-sidebar.jpg | Bin 0 -> 8894 bytes
...cs-were-an-early-touch-screen-pc-experiment.jpg | Bin 0 -> 2586 bytes
.../windows-8-start-screen.jpg | Bin 0 -> 54562 bytes
...me-out-way-back-in-2002-but-never-sold-well.jpg | Bin 0 -> 2000 bytes
.../danya-grayson.jpg | Bin 0 -> 1610 bytes
.../bomb-shelter.jpg | Bin 0 -> 1985 bytes
.../alexia-tsotsis.jpg | Bin 0 -> 1795 bytes
...-zune-laid-the-groundwork-for-windows-phone.jpg | Bin 0 -> 1748 bytes
.../static5.businessinsider.com/robots.txt | 6 +
.../assets/images/partners/catchpoint.png | Bin 0 -> 6017 bytes
.../assets/images/partners/ooyala.png | Bin 0 -> 3780 bytes
.../static6.businessinsider.com/assets/js/track.js | 35 +
.../caterina-fake.jpg | Bin 0 -> 3318 bytes
.../4cc9f7b64bd7c87a2e020000-60-45/family.jpg | Bin 0 -> 1991 bytes
.../image/4cf42da149e2aecf03040000-50-sq/image.jpg | Bin 0 -> 1577 bytes
.../image/4d937b57ccd1d5b2351c0000-50-sq/image.jpg | Bin 0 -> 1473 bytes
.../head-ceiling.jpg | Bin 0 -> 1486 bytes
...-came-from-now-check-out-what-it-looks-like.jpg | Bin 0 -> 1629 bytes
.../4e1320e4ccd1d5a331080000-60-60/jaeson.jpg | Bin 0 -> 1872 bytes
...ntually-became-the-ill-fated-windows-mobile.jpg | Bin 0 -> 1732 bytes
.../4e2c035149e2aebd09140000-60-45/vince-cable.png | Bin 0 -> 5489 bytes
...t-the-new-york-city-subway-was-like-in-1973.jpg | Bin 0 -> 1548 bytes
.../4e2ecf39ecad04de6f00001a-50-50/tim-carmody.jpg | Bin 0 -> 2186 bytes
.../static6.businessinsider.com/robots.txt | 6 +
.../assets/images/faviconBI.ico | Bin 0 -> 24776 bytes
.../assets/images/logos/tbi_print.jpg | Bin 0 -> 7750 bytes
.../assets/images/partners/ad-juster.png | Bin 0 -> 9226 bytes
.../assets/images/partners/datapipe.png | Bin 0 -> 12692 bytes
.../assets/images/partners/openx.png | Bin 0 -> 5093 bytes
.../dan-loeb-third-point.jpg | Bin 0 -> 1869 bytes
.../4d55d947ccd1d5e1550c0000-70-70/matt-rosoff.jpg | Bin 0 -> 1688 bytes
.../4d9b16cb4bd7c8f464130000-50-50/steve-blank.jpg | Bin 0 -> 1997 bytes
...ne-7-is-the-direct-predecessor-to-windows-8.jpg | Bin 0 -> 1453 bytes
...ntroduced-the-horizontal-and-vertical-menus.jpg | Bin 0 -> 1813 bytes
.../4e258541ccd1d58013270000-60-45/boehner.png | Bin 0 -> 6315 bytes
.../static7.businessinsider.com/robots.txt | 6 +
.../assets/images/logos/logo-bi-vertical.png | Bin 0 -> 3962 bytes
.../assets/images/partners/financial-content.png | Bin 0 -> 5300 bytes
.../assets/js/min.js?1311566655 | 49 +
.../obama-telephone.jpg | Bin 0 -> 1781 bytes
.../4d55d947ccd1d5e1550c0000-50-50/matt-rosoff.jpg | Bin 0 -> 1291 bytes
.../america-online.jpg | Bin 0 -> 1930 bytes
.../4e0cb0384bd7c81559050000-60-60/bensinger.png | Bin 0 -> 7692 bytes
...-off-microsofts-tablet-efforts-back-in-1991.jpg | Bin 0 -> 1782 bytes
...n-attempt-to-create-a-friendlier-pc-desktop.jpg | Bin 0 -> 2618 bytes
...r-took-the-menu-system-and-made-it-portable.jpg | Bin 0 -> 1960 bytes
...-if-the-treasury-has-to-prioritize-payments.jpg | Bin 0 -> 1562 bytes
...t-microsoft-too-understood-touch-interfaces.jpg | Bin 0 -> 1747 bytes
.../static8.businessinsider.com/robots.txt | 6 +
..._section=sai&display_method=default&version=2.0 | 95 +
.../Tracer.js?user=a-j2SKmdSr37y5adbiUzgI&s=142 | 34 +
.../businessinsider-000/tcr.tynt.com/robots.txt | 7 +
...nx[vertical]=sai&openx[author]=Matt+Rosoff.html | 27 +
...nx[vertical]=sai&openx[author]=Matt+Rosoff.html | 27 +
...nx[vertical]=sai&openx[author]=Matt+Rosoff.html | 50 +
.../www.businessinsider.com/partner/fc/iframe.html | 6 +
.../www.businessinsider.com/robots.txt | 6 +
...rosoft-ui-ideas-that-never-took-off-2011-7.html | 3626 ++++++++++++++++++++
...t-ui-ideas-that-never-took-off-2011-7.html.rdbl | 11 +
regression_test_data/cnet-000-orig.html | 777 -----
regression_test_data/cnet-000-rdbl.html | 9 -
regression_test_data/cnet-000.yaml | 4 +-
.../cnet-000/ad.doubleclick.net/robots.txt | 8 +
.../cnet-000/adlog.com.com/robots.txt | 201 ++
.../cnet-000/b.scorecardresearch.com/robots.txt | 2 +
regression_test_data/cnet-000/dw.com.com/js/dw.js | 82 +
.../cnet-000/dw.com.com/robots.txt | 45 +
.../index.html | 767 +++++
.../index.html.rdbl | 5 +
.../cnet-000/howto.cnet.com/robots.txt | 204 ++
.../i.i.com.com/cnwk.1d/Ads/common/adinfo_top.gif | Bin 0 -> 106 bytes
.../cnwk.1d/css/rb/Build/8300/8300.0.0.css | 1 +
.../cnwk.1d/css/rb/Build/8300/8300.39.0.css | 1 +
.../cnwk.1d/css/rb/Build/global/matrix.site39.css | 1 +
.../cnwk.1d/css/rb/Build/print/print.css | 1 +
.../cnwk.1d/css/rb/tron/comments/newsComments.css | 632 ++++
.../cnwk.1d/css/rb/tron/ipadOverwrite.css | 145 +
.../cnet-000/i.i.com.com/cnwk.1d/html/pt/pt2.js | 65 +
.../rb/js/tron/news/news.tron.c3p0.compressed.js | 6 +
.../rb/js/tron/news/news.tron.howto.compressed.js | 6 +
.../js/tron/news/news.tron.pagetools.compressed.js | 6 +
.../html/rb/js/tron/oreo.moo.rb.combined.js | 11 +
.../i/bn/mugs/blog_dennis_oreilly_60x60.png | Bin 0 -> 6793 bytes
.../i.i.com.com/cnwk.1d/i/cbs/print/logoDrkSm.gif | Bin 0 -> 3244 bytes
.../i.i.com.com/cnwk.1d/i/cnet/iconHeartBlue.png | Bin 0 -> 637 bytes
.../i.i.com.com/cnwk.1d/i/cnet/iconHeartGray.png | Bin 0 -> 569 bytes
.../i.i.com.com/cnwk.1d/i/cnet/latestYellowBG.png | Bin 0 -> 781 bytes
.../cnwk.1d/i/cnettv/mobileLive/CNETredball.png | Bin 0 -> 3234 bytes
.../cnwk.1d/i/com/cnet/news/total_user_comment.gif | Bin 0 -> 312 bytes
.../cnwk.1d/i/com/forums/2010/closeModal.jpg | Bin 0 -> 545 bytes
.../i.i.com.com/cnwk.1d/i/frm/ico_cnet.gif | Bin 0 -> 142 bytes
.../i.i.com.com/cnwk.1d/i/gl/icon/rss_22.png | Bin 0 -> 1400 bytes
.../i.i.com.com/cnwk.1d/i/gl/icon/twitter_22.png | Bin 0 -> 854 bytes
.../cnwk.1d/i/mooeditable/emoticon-sprite.gif | Bin 0 -> 4053 bytes
.../mooeditable-toolbarbuttons-silk.png | Bin 0 -> 8979 bytes
.../cnwk.1d/i/mooeditable/tango-toggleview.png | Bin 0 -> 617 bytes
.../cnwk.1d/i/ne/blogs/conversation/convoCta.gif | Bin 0 -> 23218 bytes
.../hdrs/2009/blog_hd_conversation_980x71.gif | Bin 0 -> 36612 bytes
.../i.i.com.com/cnwk.1d/i/ne/extra/poll_hed2.gif | Bin 0 -> 956 bytes
.../cnwk.1d/i/ne/rss/feed-icon-10x10.jpg | Bin 0 -> 893 bytes
.../i.i.com.com/cnwk.1d/i/rb/fb/CNETlogo36x36.png | Bin 0 -> 1932 bytes
.../cnwk.1d/i/rb/fb/cnet_redball_blue_s-36x36.jpg | Bin 0 -> 1770 bytes
.../i.i.com.com/cnwk.1d/i/tiburon/hh/187.gif | Bin 0 -> 53 bytes
.../i.i.com.com/cnwk.1d/i/tiburon/hh/catNav.png | Bin 0 -> 463 bytes
.../cnwk.1d/i/tiburon/hh/catNavArrow.gif | Bin 0 -> 53 bytes
.../cnwk.1d/i/tiburon/hh/catNavArrow2.gif | Bin 0 -> 53 bytes
.../i.i.com.com/cnwk.1d/i/tiburon/hh/catNavIE.jpg | Bin 0 -> 901 bytes
.../i.i.com.com/cnwk.1d/i/tiburon/hh/dot3.gif | Bin 0 -> 45 bytes
.../cnwk.1d/i/tiburon/hh/tptbCorners.gif | Bin 0 -> 445 bytes
.../cnwk.1d/i/tim/2011/07/01/Lion_60x45.png | Bin 0 -> 7010 bytes
.../i/tim/2011/07/04/07_05_11_Audacity1_60x45.jpg | Bin 0 -> 1823 bytes
.../i/tim/2011/07/10/07_11_11_DropBox1_610x399.jpg | Bin 0 -> 100598 bytes
.../2011/07/10/07_11_11_Malwarebytes4_60x45.jpg | Bin 0 -> 1864 bytes
.../22/factory_reset_blackberry_playbook_60x60.png | Bin 0 -> 6663 bytes
.../i/tim/2011/07/23/samsungtab_desktop_60x60.png | Bin 0 -> 8634 bytes
.../i/tim/2011/07/24/07_25_11_Outlook1_60x45.jpg | Bin 0 -> 1556 bytes
.../i/tim/2011/07/24/07_25_11_Outlook1_60x60.jpg | Bin 0 -> 2055 bytes
.../07/24/2_OSX_LION_SIGNATURE_PREVIEW_60x60.png | Bin 0 -> 3896 bytes
.../i/tim/2011/07/24/5_iOS_Voice_Memo_60x60.PNG | Bin 0 -> 2013 bytes
.../i/tim/2011/07/24/Plays_offline_60x60.png | Bin 0 -> 7623 bytes
.../07/24/samsungtab10.1_wallpaper13_60x60.png | Bin 0 -> 9130 bytes
.../2011/07/25/1_iOS5_Private_Browsing_60x60.png | Bin 0 -> 6664 bytes
.../i/tim/2011/07/25/Voyager-AVG-promo_60x60.jpg | Bin 0 -> 2039 bytes
.../i/tim/2011/07/25/iPhoto_keywords_60x60.png | Bin 0 -> 3662 bytes
.../2011/07/26/ht_MCandLP-720_512x288_60x60.jpg | Bin 0 -> 7338 bytes
.../i/tim/2011/07/26/iPhoto_slideshow_1_60x60.png | Bin 0 -> 5792 bytes
.../cnwk.1d/i/tron/cnetToolbar/horizListLine.png | Bin 0 -> 313 bytes
.../cnwk.1d/i/tron/cnetToolbar/listItemBkg.png | Bin 0 -> 227 bytes
.../i/tron/cnetToolbar/listItemSelectBkg.png | Bin 0 -> 247 bytes
.../cnwk.1d/i/tron/cnetToolbar/popupArrow.png | Bin 0 -> 431 bytes
.../cnwk.1d/i/tron/cnetToolbar/popupBkg.png | Bin 0 -> 11169 bytes
.../cnwk.1d/i/tron/cnetToolbar/redball.png | Bin 0 -> 1585 bytes
.../cnwk.1d/i/tron/cnetToolbar/refreshIcon.gif | Bin 0 -> 13228 bytes
.../cnwk.1d/i/tron/cnetToolbar/scrollbarArrows.png | Bin 0 -> 407 bytes
.../cnwk.1d/i/tron/cnetToolbar/scrollbarHoriz.png | Bin 0 -> 783 bytes
.../cnwk.1d/i/tron/cnetToolbar/scrollbarVert.png | Bin 0 -> 877 bytes
.../cnwk.1d/i/tron/cnetToolbar/selectorsSprite.png | Bin 0 -> 601 bytes
.../cnwk.1d/i/tron/cnetToolbar/smlListBkg.png | Bin 0 -> 500 bytes
.../cnwk.1d/i/tron/cnetToolbar/toolbarAccents.png | Bin 0 -> 1074 bytes
.../cnwk.1d/i/tron/cnetToolbar/toolbarBkg2.png | Bin 0 -> 3618 bytes
.../cnwk.1d/i/tron/cnetToolbar/vertListBkg.png | Bin 0 -> 1127 bytes
.../cnwk.1d/i/tron/cnetToolbar/vertListLine.png | Bin 0 -> 200 bytes
.../cnwk.1d/i/tron/community/myListsSprite.gif | Bin 0 -> 7494 bytes
.../cnwk.1d/i/tron/community/myListsSprite.png | Bin 0 -> 7583 bytes
.../cnwk.1d/i/tron/community/scrollbarBg.png | Bin 0 -> 174 bytes
.../i.i.com.com/cnwk.1d/i/tron/fbFavIcon.png | Bin 0 -> 153 bytes
.../cnwk.1d/i/tron/features/ces10/nbt/pollBar.gif | Bin 0 -> 2229 bytes
.../cnwk.1d/i/tron/features/ces10/nbt/pollTip.gif | Bin 0 -> 494 bytes
.../cnwk.1d/i/tron/features/ces10/nbt/pollTip.png | Bin 0 -> 1079 bytes
.../i.i.com.com/cnwk.1d/i/tron/gallery/btns.png | Bin 0 -> 893 bytes
.../i.i.com.com/cnwk.1d/i/tron/gwhBkg2.gif | Bin 0 -> 1035 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/close_x.gif | Bin 0 -> 586 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/cnet16x16.gif | Bin 0 -> 650 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/collapse.gif | Bin 0 -> 533 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/comments.gif | Bin 0 -> 622 bytes
.../cnwk.1d/i/tron/icon/delicious_16x16.gif | Bin 0 -> 113 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/digg_16x16.gif | Bin 0 -> 247 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/email.gif | Bin 0 -> 356 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/expand.gif | Bin 0 -> 533 bytes
.../cnwk.1d/i/tron/icon/facebook_16x16.gif | Bin 0 -> 121 bytes
.../cnwk.1d/i/tron/icon/fontSizeLarge.gif | Bin 0 -> 1049 bytes
.../cnwk.1d/i/tron/icon/fontSizeSmall.gif | Bin 0 -> 1046 bytes
.../cnwk.1d/i/tron/icon/googleig_16x16.gif | Bin 0 -> 676 bytes
.../cnwk.1d/i/tron/icon/linkedin_16x16.gif | Bin 0 -> 2184 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/listIcon.gif | Bin 0 -> 387 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/modalClose.gif | Bin 0 -> 789 bytes
.../cnwk.1d/i/tron/icon/n-users-large.gif | Bin 0 -> 1794 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/n-users-sm.gif | Bin 0 -> 1682 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/neo187.png | Bin 0 -> 165 bytes
.../cnwk.1d/i/tron/icon/newsvine_16x16.gif | Bin 0 -> 85 bytes
.../cnwk.1d/i/tron/icon/padlock_16x16.gif | Bin 0 -> 247 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/postTools.gif | Bin 0 -> 859 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/print.gif | Bin 0 -> 222 bytes
.../cnwk.1d/i/tron/icon/ratingStars.gif | Bin 0 -> 1755 bytes
.../cnwk.1d/i/tron/icon/ratingStarsSm.gif | Bin 0 -> 1398 bytes
.../cnwk.1d/i/tron/icon/reddit_16x16.gif | Bin 0 -> 603 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/share.gif | Bin 0 -> 340 bytes
.../cnwk.1d/i/tron/icon/stumble_16x16.gif | Bin 0 -> 432 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/thumbsUp.gif | Bin 0 -> 989 bytes
.../cnwk.1d/i/tron/icon/twitter_16x16.gif | Bin 0 -> 361 bytes
.../cnwk.1d/i/tron/icon/twitter_grey_16x16.gif | Bin 0 -> 591 bytes
.../cnwk.1d/i/tron/icon/yahoo_bkmks_16x16.gif | Bin 0 -> 346 bytes
.../cnwk.1d/i/tron/modals/modalIcons.png | Bin 0 -> 7941 bytes
.../cnwk.1d/i/tron/oreo/2011RbLogo.24.png | Bin 0 -> 4592 bytes
.../cnwk.1d/i/tron/oreo/2011RbLogo.8.png | Bin 0 -> 3024 bytes
.../cnwk.1d/i/tron/oreo/footerSponsorArw.png | Bin 0 -> 167 bytes
.../cnwk.1d/i/tron/oreo/footerSponsorArw8.png | Bin 0 -> 125 bytes
.../cnwk.1d/i/tron/oreo/hgg/BoldOgif.png | Bin 0 -> 158 bytes
.../i.i.com.com/cnwk.1d/i/tron/oreo/nav.png | Bin 0 -> 1091 bytes
.../i.i.com.com/cnwk.1d/i/tron/oreo/rbBodyBg.2.png | Bin 0 -> 378 bytes
.../i.i.com.com/cnwk.1d/i/tron/oreo/rbHow.png | Bin 0 -> 650 bytes
.../i.i.com.com/cnwk.1d/i/tron/oreo/sponsBlue.png | Bin 0 -> 310 bytes
.../cnwk.1d/i/tron/oreo/yodaGradients.png | Bin 0 -> 1131 bytes
.../cnwk.1d/i/tron/premiereUnits/gradientBG.gif | Bin 0 -> 1001 bytes
.../cnwk.1d/i/tron/premiereUnits/machoGradient.jpg | Bin 0 -> 368 bytes
.../cnwk.1d/i/tron/premiereUnits/miniMachoBG.gif | Bin 0 -> 965 bytes
.../cnwk.1d/i/tron/reviews/FD/reviewsSprite.gif | Bin 0 -> 1046 bytes
.../cnwk.1d/i/tron/reviews/whiteArrows.png | Bin 0 -> 1286 bytes
.../i/tron/sb_sweepstakes/sb_confHeader.jpg | Bin 0 -> 22294 bytes
.../i.i.com.com/cnwk.1d/i/tron/shareBg.png | Bin 0 -> 179 bytes
.../i.i.com.com/cnwk.1d/i/tron/shareBgBtm.png | Bin 0 -> 233 bytes
.../i.i.com.com/cnwk.1d/i/tron/site1catNav.gif | Bin 0 -> 3009 bytes
.../i.i.com.com/cnwk.1d/i/tron/site1catNav.png | Bin 0 -> 2678 bytes
.../cnwk.1d/i/tron/site7catNav-wide-tan.png | Bin 0 -> 394 bytes
.../i.i.com.com/cnwk.1d/i/tron/statusOr.jpg | Bin 0 -> 1194 bytes
.../cnet-000/i.i.com.com/cnwk.1d/i/tron/tips.png | Bin 0 -> 1058 bytes
.../i.i.com.com/cnwk.1d/i/tron/tipsWide.png | Bin 0 -> 1329 bytes
.../cnwk.1d/i/tron/userLists/newlist_bg.jpg | Bin 0 -> 15098 bytes
.../cnwk.1d/i/tron/vader/dottedLine.gif | Bin 0 -> 45 bytes
.../cnwk.1d/i/tron/vader/rblogoFooter.gif | Bin 0 -> 1307 bytes
.../cnwk.1d/i/tron/vader/rssFeeds/bloglines.gif | Bin 0 -> 1874 bytes
.../cnwk.1d/i/tron/vader/rssFeeds/google.gif | Bin 0 -> 1958 bytes
.../cnwk.1d/i/tron/vader/rssFeeds/msn.gif | Bin 0 -> 1850 bytes
.../cnwk.1d/i/tron/vader/rssFeeds/newsgator.gif | Bin 0 -> 1821 bytes
.../cnwk.1d/i/tron/vader/rssFeeds/yahoo.gif | Bin 0 -> 1734 bytes
.../cnwk.1d/i/tron/vader/siteId3headerBar.gif | Bin 0 -> 388 bytes
.../cnwk.1d/i/tron/yoda/yodaPopupDivider.gif | Bin 0 -> 43 bytes
.../cnwk.1d/i/tron/yoda/yodaPopupDividerDark.gif | Bin 0 -> 56 bytes
.../cnet-000/platform.linkedin.com/in.js | 68 +
.../cnet-000/platform.twitter.com/widgets.js | 13 +
.../m1?ci=us-304254h&cg=0&cc=1&ts=noscript&ja=1 | Bin 0 -> 44 bytes
.../cnet-000/www.facebook.com/robots.txt | 124 +
regression_test_data/cnet-001.yaml | 5 +
.../cnet-001/adlog.com.com/robots.txt | 201 ++
.../cnet-001/b.scorecardresearch.com/robots.txt | 2 +
regression_test_data/cnet-001/dw.com.com/js/dw.js | 82 +
.../cnet-001/dw.com.com/robots.txt | 45 +
.../i.i.com.com/cnwk.1d/Ads/common/adinfo_top.gif | Bin 0 -> 106 bytes
.../cnwk.1d/Ads/common/manta/adFunctions-cnet.js | 2 +
.../cnwk.1d/css/news/gradient-white.png | Bin 0 -> 145 bytes
.../cnwk.1d/css/rb/Build/8301/8301.3.0.css | 1 +
.../cnwk.1d/css/rb/Build/8601/8601.3.0.css | 1 +
.../cnwk.1d/css/rb/Build/global/matrix.site3.css | 1 +
.../cnwk.1d/css/rb/Build/print/print.css | 1 +
.../cnwk.1d/css/rb/tron/comments/newsComments.css | 632 ++++
.../cnwk.1d/css/rb/tron/ipadOverwrite.css | 145 +
.../cnwk.1d/css/rb/tron/news/riverWidget.css | 101 +
.../cnet-001/i.i.com.com/cnwk.1d/html/pt/pt2.js | 65 +
.../cnwk.1d/html/rb/js/tron/cbsi/head.min.js | 8 +
.../rb/js/tron/news/news.tron.c3p0.compressed.js | 6 +
.../js/tron/news/news.tron.pagetools.compressed.js | 6 +
.../html/rb/js/tron/oreo.moo.rb.combined.js | 11 +
.../i/News/FrontDoor/about_box_iconsspriteNew.gif | Bin 0 -> 6927 bytes
.../i.i.com.com/cnwk.1d/i/cbs/print/logoDrkSm.gif | Bin 0 -> 3244 bytes
.../cnwk.1d/i/cnettv/mobileLive/CNETredball.png | Bin 0 -> 3234 bytes
.../cnwk.1d/i/com/cnet/news/total_user_comment.gif | Bin 0 -> 312 bytes
.../cnwk.1d/i/com/forums/2010/closeModal.jpg | Bin 0 -> 545 bytes
.../i.i.com.com/cnwk.1d/i/frm/ico_cnet.gif | Bin 0 -> 142 bytes
.../cnwk.1d/i/gl/icon/feed-icon-28x28.png | Bin 0 -> 1737 bytes
.../cnwk.1d/i/ne/blogs/conversation/convoCta.gif | Bin 0 -> 23218 bytes
.../hdrs/2009/blog_hd_conversation_980x71.gif | Bin 0 -> 36612 bytes
.../hdrs/2009/blog_hd_deeptech_complex_980x70.jpg | Bin 0 -> 54870 bytes
.../i.i.com.com/cnwk.1d/i/ne/extra/poll_hed2.gif | Bin 0 -> 956 bytes
.../cnwk.1d/i/ne/pg/fd_2008/081104_iphone.jpg | Bin 0 -> 14432 bytes
.../cnwk.1d/i/ne/pg/fd_2011/newsLGThrill3d.jpg | Bin 0 -> 12595 bytes
.../cnwk.1d/i/ne/rss/feed-icon-10x10.jpg | Bin 0 -> 893 bytes
.../i.i.com.com/cnwk.1d/i/rb/fb/CNETlogo36x36.png | Bin 0 -> 1932 bytes
.../cnwk.1d/i/rb/fb/cnet_redball_blue_s-36x36.jpg | Bin 0 -> 1770 bytes
.../cnwk.1d/i/river/v2/icons-sprite.gif | Bin 0 -> 2945 bytes
.../cnwk.1d/i/river/v3/rightRail-tools.gif | Bin 0 -> 3618 bytes
.../i.i.com.com/cnwk.1d/i/tiburon/hh/187.gif | Bin 0 -> 53 bytes
.../i.i.com.com/cnwk.1d/i/tiburon/hh/catNav.png | Bin 0 -> 463 bytes
.../cnwk.1d/i/tiburon/hh/catNavArrow.gif | Bin 0 -> 53 bytes
.../cnwk.1d/i/tiburon/hh/catNavArrow2.gif | Bin 0 -> 53 bytes
.../i.i.com.com/cnwk.1d/i/tiburon/hh/catNavIE.jpg | Bin 0 -> 901 bytes
.../i.i.com.com/cnwk.1d/i/tiburon/hh/dot3.gif | Bin 0 -> 45 bytes
.../cnwk.1d/i/tiburon/hh/tptbCorners.gif | Bin 0 -> 445 bytes
.../cnwk.1d/i/tim/2009/12/21/arrow_news.gif | Bin 0 -> 343 bytes
.../2011/04/04/blog_stephen_shankland_60x60.png | Bin 0 -> 6910 bytes
.../2011/07/12/Knol-Google-verified-account.jpg | Bin 0 -> 25652 bytes
.../i/tim/2011/07/12/Lady-Gaga-Google+_270x197.png | Bin 0 -> 19021 bytes
.../2011/07/21/Classroom_in_Accra_7_2_120x90.JPG | Bin 0 -> 11348 bytes
.../i/tim/2011/07/23/NCISpanelSTILL_120x90.jpg | Bin 0 -> 11783 bytes
.../i/tim/2011/07/23/PanelNCISstill_120x90.jpg | Bin 0 -> 11183 bytes
.../2011/07/24/First_BMW_car_3-15_PS_120x80.jpg | Bin 0 -> 15894 bytes
.../cnwk.1d/i/tron/cnetToolbar/horizListLine.png | Bin 0 -> 313 bytes
.../cnwk.1d/i/tron/cnetToolbar/listItemBkg.png | Bin 0 -> 227 bytes
.../i/tron/cnetToolbar/listItemSelectBkg.png | Bin 0 -> 247 bytes
.../cnwk.1d/i/tron/cnetToolbar/popupArrow.png | Bin 0 -> 431 bytes
.../cnwk.1d/i/tron/cnetToolbar/popupBkg.png | Bin 0 -> 11169 bytes
.../cnwk.1d/i/tron/cnetToolbar/redball.png | Bin 0 -> 1585 bytes
.../cnwk.1d/i/tron/cnetToolbar/refreshIcon.gif | Bin 0 -> 13228 bytes
.../cnwk.1d/i/tron/cnetToolbar/scrollbarArrows.png | Bin 0 -> 407 bytes
.../cnwk.1d/i/tron/cnetToolbar/scrollbarHoriz.png | Bin 0 -> 783 bytes
.../cnwk.1d/i/tron/cnetToolbar/scrollbarVert.png | Bin 0 -> 877 bytes
.../cnwk.1d/i/tron/cnetToolbar/selectorsSprite.png | Bin 0 -> 601 bytes
.../cnwk.1d/i/tron/cnetToolbar/smlListBkg.png | Bin 0 -> 500 bytes
.../cnwk.1d/i/tron/cnetToolbar/toolbarAccents.png | Bin 0 -> 1074 bytes
.../cnwk.1d/i/tron/cnetToolbar/toolbarBkg2.png | Bin 0 -> 3618 bytes
.../cnwk.1d/i/tron/cnetToolbar/vertListBkg.png | Bin 0 -> 1127 bytes
.../cnwk.1d/i/tron/cnetToolbar/vertListLine.png | Bin 0 -> 200 bytes
.../cnwk.1d/i/tron/community/myListsSprite.gif | Bin 0 -> 7494 bytes
.../cnwk.1d/i/tron/community/myListsSprite.png | Bin 0 -> 7583 bytes
.../cnwk.1d/i/tron/community/scrollbarBg.png | Bin 0 -> 174 bytes
.../i.i.com.com/cnwk.1d/i/tron/fbFavIcon.png | Bin 0 -> 153 bytes
.../cnwk.1d/i/tron/features/ces10/nbt/pollBar.gif | Bin 0 -> 2229 bytes
.../cnwk.1d/i/tron/features/ces10/nbt/pollTip.gif | Bin 0 -> 494 bytes
.../cnwk.1d/i/tron/features/ces10/nbt/pollTip.png | Bin 0 -> 1079 bytes
.../cnwk.1d/i/tron/gallery/arrowsLeft.png | Bin 0 -> 451 bytes
.../cnwk.1d/i/tron/gallery/arrowsRight.png | Bin 0 -> 450 bytes
.../i.i.com.com/cnwk.1d/i/tron/gallery/btns.png | Bin 0 -> 893 bytes
.../i.i.com.com/cnwk.1d/i/tron/gwhBkg2.gif | Bin 0 -> 1035 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/close_x.gif | Bin 0 -> 586 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/cnet16x16.gif | Bin 0 -> 650 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/collapse.gif | Bin 0 -> 533 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/comments.gif | Bin 0 -> 622 bytes
.../cnwk.1d/i/tron/icon/delicious_16x16.gif | Bin 0 -> 113 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/digg_16x16.gif | Bin 0 -> 247 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/email.gif | Bin 0 -> 356 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/expand.gif | Bin 0 -> 533 bytes
.../cnwk.1d/i/tron/icon/facebook_16x16.gif | Bin 0 -> 121 bytes
.../cnwk.1d/i/tron/icon/fontSizeLarge.gif | Bin 0 -> 1049 bytes
.../cnwk.1d/i/tron/icon/fontSizeSmall.gif | Bin 0 -> 1046 bytes
.../cnwk.1d/i/tron/icon/googleig_16x16.gif | Bin 0 -> 676 bytes
.../cnwk.1d/i/tron/icon/linkedin_16x16.gif | Bin 0 -> 2184 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/listIcon.gif | Bin 0 -> 387 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/modalClose.gif | Bin 0 -> 789 bytes
.../cnwk.1d/i/tron/icon/n-users-large.gif | Bin 0 -> 1794 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/n-users-sm.gif | Bin 0 -> 1682 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/neo187.png | Bin 0 -> 165 bytes
.../cnwk.1d/i/tron/icon/newsvine_16x16.gif | Bin 0 -> 85 bytes
.../cnwk.1d/i/tron/icon/padlock_16x16.gif | Bin 0 -> 247 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/postTools.gif | Bin 0 -> 859 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/print.gif | Bin 0 -> 222 bytes
.../cnwk.1d/i/tron/icon/ratingStars.gif | Bin 0 -> 1755 bytes
.../cnwk.1d/i/tron/icon/ratingStarsSm.gif | Bin 0 -> 1398 bytes
.../cnwk.1d/i/tron/icon/reddit_16x16.gif | Bin 0 -> 603 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/share.gif | Bin 0 -> 340 bytes
.../cnwk.1d/i/tron/icon/stumble_16x16.gif | Bin 0 -> 432 bytes
.../i.i.com.com/cnwk.1d/i/tron/icon/thumbsUp.gif | Bin 0 -> 989 bytes
.../cnwk.1d/i/tron/icon/twitter_16x16.gif | Bin 0 -> 361 bytes
.../cnwk.1d/i/tron/icon/twitter_grey_16x16.gif | Bin 0 -> 591 bytes
.../cnwk.1d/i/tron/icon/yahoo_bkmks_16x16.gif | Bin 0 -> 346 bytes
.../cnwk.1d/i/tron/modals/modalIcons.png | Bin 0 -> 7941 bytes
.../cnwk.1d/i/tron/news/inTheNews_btm.jpg | Bin 0 -> 454 bytes
.../cnwk.1d/i/tron/news/inTheNews_top.jpg | Bin 0 -> 5771 bytes
.../cnwk.1d/i/tron/oreo/2011RbLogo.24.png | Bin 0 -> 4592 bytes
.../cnwk.1d/i/tron/oreo/2011RbLogo.8.png | Bin 0 -> 3024 bytes
.../cnwk.1d/i/tron/oreo/footerSponsorArw.png | Bin 0 -> 167 bytes
.../cnwk.1d/i/tron/oreo/footerSponsorArw8.png | Bin 0 -> 125 bytes
.../cnwk.1d/i/tron/oreo/hgg/BoldOgif.png | Bin 0 -> 158 bytes
.../i.i.com.com/cnwk.1d/i/tron/oreo/nav.png | Bin 0 -> 1091 bytes
.../i.i.com.com/cnwk.1d/i/tron/oreo/rbBodyBg.2.png | Bin 0 -> 378 bytes
.../i.i.com.com/cnwk.1d/i/tron/oreo/rbNews.png | Bin 0 -> 836 bytes
.../i.i.com.com/cnwk.1d/i/tron/oreo/sponsBlue.png | Bin 0 -> 310 bytes
.../cnwk.1d/i/tron/oreo/yodaGradients.png | Bin 0 -> 1131 bytes
.../cnwk.1d/i/tron/premiereUnits/gradientBG.gif | Bin 0 -> 1001 bytes
.../cnwk.1d/i/tron/premiereUnits/machoGradient.jpg | Bin 0 -> 368 bytes
.../cnwk.1d/i/tron/premiereUnits/miniMachoBG.gif | Bin 0 -> 965 bytes
.../cnwk.1d/i/tron/reviews/FD/reviewsSprite.gif | Bin 0 -> 1046 bytes
.../i/tron/sb_sweepstakes/sb_confHeader.jpg | Bin 0 -> 22294 bytes
.../i.i.com.com/cnwk.1d/i/tron/shareBg.png | Bin 0 -> 179 bytes
.../i.i.com.com/cnwk.1d/i/tron/shareBgBtm.png | Bin 0 -> 233 bytes
.../cnwk.1d/i/tron/site7catNav-wide-tan.png | Bin 0 -> 394 bytes
.../i.i.com.com/cnwk.1d/i/tron/statusOr.jpg | Bin 0 -> 1194 bytes
.../cnet-001/i.i.com.com/cnwk.1d/i/tron/tips.png | Bin 0 -> 1058 bytes
.../i.i.com.com/cnwk.1d/i/tron/tipsWide.png | Bin 0 -> 1329 bytes
.../cnwk.1d/i/tron/userLists/newlist_bg.jpg | Bin 0 -> 15098 bytes
.../cnwk.1d/i/tron/vader/dottedLine.gif | Bin 0 -> 45 bytes
.../cnwk.1d/i/tron/vader/rblogoFooter.gif | Bin 0 -> 1307 bytes
.../cnwk.1d/i/tron/vader/rssFeeds/bloglines.gif | Bin 0 -> 1874 bytes
.../cnwk.1d/i/tron/vader/rssFeeds/google.gif | Bin 0 -> 1958 bytes
.../cnwk.1d/i/tron/vader/rssFeeds/msn.gif | Bin 0 -> 1850 bytes
.../cnwk.1d/i/tron/vader/rssFeeds/newsgator.gif | Bin 0 -> 1821 bytes
.../cnwk.1d/i/tron/vader/rssFeeds/yahoo.gif | Bin 0 -> 1734 bytes
.../cnwk.1d/i/tron/vader/siteId3headerBar.gif | Bin 0 -> 388 bytes
.../cnwk.1d/i/tron/yoda/yodaPopupDivider.gif | Bin 0 -> 43 bytes
.../cnwk.1d/i/tron/yoda/yodaPopupDividerDark.gif | Bin 0 -> 56 bytes
.../cnet-001/js.admeld.com/meld120.js | 102 +
.../cnet-001/js.admeld.com/meld128.js | 119 +
.../index.html | 2262 ++++++++++++
.../index.html.rdbl | 54 +
...yId=2140&targetCommunityId=2140&blogId=264.html | 364 ++
.../cnet-001/news.cnet.com/favicon.ico | Bin 0 -> 24838 bytes
.../cnet-001/news.cnet.com/robots.txt | 204 ++
.../cnet-001/platform.linkedin.com/in.js | 68 +
.../cnet-001/platform.twitter.com/widgets.js | 13 +
.../m1?ci=us-304254h&cg=0&cc=1&ts=noscript&ja=1 | Bin 0 -> 44 bytes
.../Tracer.js?user=cry3Q6LBqr37zJadbi-bnq | 34 +
.../cnet-001/tcr.tynt.com/robots.txt | 7 +
.../cnet-001/twitter.com/favicon.ico | Bin 0 -> 1150 bytes
.../cnet-001/twitter.com/robots.txt | 24 +
.../i/tron/features/ces10/river/picBgSm.gif | Bin 0 -> 464 bytes
.../cnet-001/www.cnet.com/robots.txt | 204 ++
.../cnet-001/www.facebook.com/robots.txt | 124 +
regression_test_data/deadspin-000-orig.html | 1011 ------
regression_test_data/deadspin-000-rdbl.html | 9 -
regression_test_data/deadspin-000.yaml | 2 +
.../b.scorecardresearch.com/robots.txt | 2 +
.../assets/base.v9/img/icons/icons.png | Bin 0 -> 14274 bytes
.../assets/images/11/2011/07/medium_003.jpg | Bin 0 -> 37273 bytes
.../11/2011/07/medium_funbag_deathbutton.jpg | Bin 0 -> 32520 bytes
.../images/11/2011/07/medium_gc_5q2--zfe.jpg | Bin 0 -> 11202 bytes
.../assets/images/11/2011/07/medium_horrible.png | Bin 0 -> 63323 bytes
.../assets/images/11/2011/07/medium_imag0157.jpg | Bin 0 -> 27403 bytes
.../assets/images/11/2011/07/medium_photo_02.jpg | Bin 0 -> 34664 bytes
.../assets/images/commenter/1610000/1610103_32.jpg | Bin 0 -> 1447 bytes
.../deadspin-000/cache.gawkerassets.com/robots.txt | 1 +
...would-you-kill-a-stranger-to-save-football.html | 1083 ++++++
...-you-kill-a-stranger-to-save-football.html.rdbl | 9 +
.../assets/base/img/favicon/deadspin.ico | Bin 0 -> 9062 bytes
.../deadspin-000/deadspin.com/at.js.php | 1 +
.../deadspin-000/deadspin.com/robots.txt | 9 +
.../deadspin-000/edge.quantserve.com/robots.txt | 2 +
.../base.v10.static/img/chrome-arrow-sol.png | Bin 0 -> 989 bytes
.../base.v10.static/img/footer/deadspin_logo.png | Bin 0 -> 2658 bytes
.../base.v10.static/img/footer/gawker_logo.png | Bin 0 -> 3089 bytes
.../base.v10.static/img/footer/gizmodo_logo.png | Bin 0 -> 2689 bytes
.../assets/base.v10.static/img/footer/io9_logo.png | Bin 0 -> 1165 bytes
.../base.v10.static/img/footer/jalopnik_logo.png | Bin 0 -> 2032 bytes
.../base.v10.static/img/footer/jezebel_logo.png | Bin 0 -> 1773 bytes
.../base.v10.static/img/footer/kotaku_logo.png | Bin 0 -> 2422 bytes
.../base.v10.static/img/footer/lifehacker_logo.png | Bin 0 -> 1882 bytes
.../assets/base.v10.static/img/icon-play-large.png | Bin 0 -> 2552 bytes
.../assets/base.v10.static/img/icons/comment.png | Bin 0 -> 396 bytes
.../assets/base.v10.static/img/icons/flame.png | Bin 0 -> 332 bytes
.../base.v10.static/img/icons/icon.classic.png | Bin 0 -> 417 bytes
.../assets/base.v10.static/img/icons/icons.png | Bin 0 -> 10614 bytes
.../assets/base.v10.static/img/icons/quicklink.png | Bin 0 -> 219 bytes
.../assets/base.v10.static/img/icons/star.png | Bin 0 -> 976 bytes
.../img/interstitial-bottom-gradient.png | Bin 0 -> 493 bytes
.../assets/base.v10.static/img/lytebox/blank.gif | Bin 0 -> 43 bytes
.../base.v10.static/img/lytebox/close_grey.png | Bin 0 -> 1715 bytes
.../assets/base.v10.static/img/lytebox/loading.gif | Bin 0 -> 2767 bytes
.../base.v10.static/img/lytebox/next_grey.gif | Bin 0 -> 731 bytes
.../base.v10.static/img/lytebox/pause_grey.png | Bin 0 -> 1282 bytes
.../base.v10.static/img/lytebox/play_grey.png | Bin 0 -> 1178 bytes
.../base.v10.static/img/lytebox/prev_grey.gif | Bin 0 -> 748 bytes
.../base.v10.static/img/share/share_icons.png | Bin 0 -> 8780 bytes
.../assets/base.v10.static/img/top-chrome.png | Bin 0 -> 410 bytes
.../assets/base.v10.static/img/ui/button-icons.png | Bin 0 -> 414 bytes
.../assets/base.v10.static/img/ui/icon-b.png | Bin 0 -> 281 bytes
.../assets/base.v10.static/img/ui/icon-cog.png | Bin 0 -> 1035 bytes
.../assets/base.v10.static/img/ui/icon-edit.png | Bin 0 -> 1074 bytes
.../assets/base.v10.static/img/ui/icon-expand.png | Bin 0 -> 1193 bytes
.../assets/base.v10.static/img/ui/icon-f.png | Bin 0 -> 287 bytes
.../assets/base.v10.static/img/ui/icon-ff.png | Bin 0 -> 294 bytes
.../assets/base.v10.static/img/ui/icon-heart.png | Bin 0 -> 822 bytes
.../assets/base.v10.static/img/ui/icon-home.png | Bin 0 -> 270 bytes
.../base.v10.static/img/ui/icon-play-gray.png | Bin 0 -> 489 bytes
.../assets/base.v10.static/img/ui/icon-popular.png | Bin 0 -> 346 bytes
.../assets/base.v10.static/img/ui/icon-reply.png | Bin 0 -> 422 bytes
.../assets/base.v10.static/img/ui/icon-rw.png | Bin 0 -> 292 bytes
.../assets/base.v10.static/img/ui/icon-search.png | Bin 0 -> 557 bytes
.../base.v10.static/img/ui/icon-thumbdown.png | Bin 0 -> 1040 bytes
.../assets/base.v10.static/img/ui/icon-thumbup.png | Bin 0 -> 950 bytes
.../base.v10.static/js/scripts.js?rev=20110726 | 159 +
.../static/base.v10.static.20110726.css | 11 +
.../static/base.v10.static.framework.20110726.js | 5 +
.../static/base.v10.static.jquery.20110726.js | 42 +
.../base.v10.static.jqueryplugin.20110726.js | 18 +
.../static/base.v10.static.misc.20110726.js | 31 +
.../static/base.v10.static.widget.20110726.js | 36 +
.../assets/base.v10/img/bg-chrome.jpg | Bin 0 -> 3618 bytes
.../assets/base.v10/img/greyvertical.png | Bin 0 -> 163 bytes
.../assets/base.v10/img/icons/rightbar.flame.png | Bin 0 -> 332 bytes
.../img/indicator/progressIndicator_roller.gif | Bin 0 -> 1877 bytes
.../assets/base.v10/img/ui/arrow-down.png | Bin 0 -> 207 bytes
.../assets/base.v10/img/ui/arrow-right.png | Bin 0 -> 202 bytes
.../assets/base.v10/img/ui/icon-delete.png | Bin 0 -> 276 bytes
.../assets/base.v10/img/ui/icon-private.png | Bin 0 -> 441 bytes
.../assets/base.v10/img/ui/userpost-image-sm.png | Bin 0 -> 1266 bytes
.../assets/base.v10/img/ui/userpost-text-sm.png | Bin 0 -> 584 bytes
.../assets/base.v10/img/ui/userpost-video-sm.png | Bin 0 -> 931 bytes
.../images/11/2011/07/funbag_deathbutton.jpg | Bin 0 -> 267730 bytes
.../images/11/2011/07/small_funbag_deathbutton.jpg | Bin 0 -> 15596 bytes
.../assets/images/11/2011/07/small_irvinout3.jpg | Bin 0 -> 21380 bytes
.../images/11/2011/07/small_mccourt_divorce.jpg | Bin 0 -> 15426 bytes
.../13255/2011/05/small_deadspin-640x360.jpg | Bin 0 -> 10204 bytes
.../13255/2011/06/small_deadspin-640x360.jpg | Bin 0 -> 10204 bytes
.../images/13255/2011/06/small_pepsi_cars.jpg | Bin 0 -> 18497 bytes
.../css/static.css?rev=20110726.css | 22 +
.../css/static.css?rev=20110726orig | 22 +
.../v10.deadspin.com/img/apple-touch-icon.png | Bin 0 -> 8400 bytes
.../assets/v10.deadspin.com/img/logo-deadspin.png | Bin 0 -> 46196 bytes
.../fastcache.gawkerassets.com/robots.txt | 1 +
.../deadspin-000/fonts.gawker.com/tbb4flu.js | 63 +
.../deadspin-000/platform.tumblr.com/v1/share.js | 1 +
.../images/pdsimple-votebutton.gif | Bin 0 -> 540 bytes
.../deadspin-000/www.google.com/jsapi | 39 +
.../deadspin-000/www.google.com/robots.txt | 271 ++
regression_test_data/espn-000-orig.html | 993 ------
regression_test_data/espn-000-rdbl.html | 31 -
regression_test_data/espn-000.yaml | 2 +
.../combiner/c?css=community%2Fecho.1.2.7.css | 1 +
.../combiner/c?css=espn.teams.r4i.css | 1 +
...plane.0.0.0.js,community%2Fecho%2Fauth.0.0.7.js | 33 +
.../boston/prod/assets/mod_alsosee.gif | Bin 0 -> 6138 bytes
.../newyork/prod/assets/mod_alsosee.gif | Bin 0 -> 8061 bytes
.../a.espncdn.com/i/columnists/munson_lester_m.jpg | Bin 0 -> 3175 bytes
.../i/mlb/playerpopup/popup_middle.png | Bin 0 -> 404 bytes
.../a.espncdn.com/photo/2010/0116/quinn_tj_m.jpg | Bin 0 -> 3454 bytes
.../prod/styles/legacy.min.200811061403.css | 1 +
.../a.espncdn.com/prod/styles/playerpopup1.css | 70 +
.../espn-000/a.espncdn.com/robots.txt | 24 +
.../i/teamlogos/nfl/scoreboard/nfc.png | Bin 0 -> 1018 bytes
.../prod/assets/memberservices/ms-bg-fave.gif | Bin 0 -> 1539 bytes
.../mlb/logo-mlb-teams-medium-vert-ie6.png | Bin 0 -> 30490 bytes
.../mlb/logo-mlb-teams-small-vert-ie6.png | Bin 0 -> 11166 bytes
.../teamlogos/mls/logo-mls-teams-40-vert.png | Bin 0 -> 13470 bytes
.../teamlogos/nba/logo-nba-teams-40-vert.png | Bin 0 -> 87029 bytes
.../nba/logo-nba-teams-large-vert-ie6.png | Bin 0 -> 66706 bytes
.../teamlogos/nfl/logo-nfl-teams-40-vert-ie6.png | Bin 0 -> 18118 bytes
.../teamlogos/nfl/logo-nfl-teams-medium-vert.png | Bin 0 -> 113857 bytes
.../teamlogos/nfl/logo-nfl-teams-small-vert.png | Bin 0 -> 49874 bytes
.../teamlogos/nhl/logo-nhl-teams-large-vert.png | Bin 0 -> 287310 bytes
.../teamlogos/wnba/logo-wnba-teams-40-vert.png | Bin 0 -> 32231 bytes
.../teamlogos/wnba/logo-wnba-teams-large-vert.png | Bin 0 -> 87702 bytes
.../espn-000/a1.espncdn.com/robots.txt | 24 +
.../gamepackage10/i/logo-nfl-teams-medium-bg.png | Bin 0 -> 64562 bytes
.../i/teamlogos/nfl/scoreboard/afc.png | Bin 0 -> 877 bytes
.../a2.espncdn.com/prod/assets/btn-blue-bg.gif | Bin 0 -> 115 bytes
.../teamlogos/mlb/logo-mlb-teams-large-vert.png | Bin 0 -> 292971 bytes
.../teamlogos/mls/logo-mls-teams-large-vert.png | Bin 0 -> 33767 bytes
.../teamlogos/mls/logo-mls-teams-medium-vert.png | Bin 0 -> 16707 bytes
.../teamlogos/mls/logo-mls-teams-small-vert.png | Bin 0 -> 7589 bytes
.../teamlogos/nba/logo-nba-teams-40-vert-ie6.png | Bin 0 -> 21987 bytes
.../teamlogos/nba/logo-nba-teams-medium-vert.png | Bin 0 -> 122377 bytes
.../teamlogos/nba/logo-nba-teams-small-vert.png | Bin 0 -> 41389 bytes
.../nfl/logo-nfl-teams-medium-vert-ie6.png | Bin 0 -> 24478 bytes
.../nfl/logo-nfl-teams-small-vert-ie6.png | Bin 0 -> 9408 bytes
.../teamlogos/nhl/logo-nhl-teams-40-vert.png | Bin 0 -> 100250 bytes
.../nhl/logo-nhl-teams-large-vert-ie6.png | Bin 0 -> 57901 bytes
.../teamlogos/nhl/logo-nhl-teams-medium-bg.png | Bin 0 -> 111783 bytes
.../teamlogos/wnba/logo-wnba-teams-40-vert-ie6.png | Bin 0 -> 10077 bytes
.../wnba/logo-wnba-teams-large-vert-ie6.png | Bin 0 -> 26604 bytes
.../espn-000/a2.espncdn.com/robots.txt | 24 +
.../teamlogos/mlb/logo-mlb-teams-40-vert.png | Bin 0 -> 113441 bytes
.../mlb/logo-mlb-teams-large-vert-ie6.png | Bin 0 -> 60843 bytes
.../nba/logo-nba-teams-medium-vert-ie6.png | Bin 0 -> 31060 bytes
.../nba/logo-nba-teams-small-vert-ie6.png | Bin 0 -> 10778 bytes
.../teamlogos/nfl/logo-nfl-teams-large-vert.png | Bin 0 -> 220150 bytes
.../teamlogos/nhl/logo-nhl-teams-40-vert-ie6.png | Bin 0 -> 20446 bytes
.../teamlogos/nhl/logo-nhl-teams-medium-vert.png | Bin 0 -> 132346 bytes
.../teamlogos/nhl/logo-nhl-teams-small-vert.png | Bin 0 -> 46094 bytes
.../teamlogos/wnba/logo-wnba-teams-medium-vert.png | Bin 0 -> 44324 bytes
.../teamlogos/wnba/logo-wnba-teams-small-vert.png | Bin 0 -> 17579 bytes
.../espn-000/a3.espncdn.com/robots.txt | 24 +
.../ncaa/scoreboard/trans/ncaa_logo_sprite.png | Bin 0 -> 403207 bytes
.../teamlogos/mlb/logo-mlb-teams-40-vert-ie6.png | Bin 0 -> 22046 bytes
.../teamlogos/mlb/logo-mlb-teams-medium-vert.png | Bin 0 -> 139082 bytes
.../teamlogos/mlb/logo-mlb-teams-small-vert.png | Bin 0 -> 52056 bytes
.../teamlogos/nba/logo-nba-teams-large-vert.png | Bin 0 -> 274859 bytes
.../teamlogos/nfl/logo-nfl-teams-40-vert.png | Bin 0 -> 89354 bytes
.../nfl/logo-nfl-teams-large-vert-ie6.png | Bin 0 -> 48333 bytes
.../nhl/logo-nhl-teams-medium-vert-ie6.png | Bin 0 -> 28117 bytes
.../nhl/logo-nhl-teams-small-vert-ie6.png | Bin 0 -> 10163 bytes
.../wnba/logo-wnba-teams-medium-vert-ie6.png | Bin 0 -> 13784 bytes
.../wnba/logo-wnba-teams-small-vert-ie6.png | Bin 0 -> 15162 bytes
.../espn-000/a4.espncdn.com/robots.txt | 24 +
.../espn-000/ad.doubleclick.net/robots.txt | 8 +
.../espn-000/assets.espn.go.com/i/fp/07/bull.gif | Bin 0 -> 56 bytes
.../espn-000/assets.espn.go.com/i/in.gif | Bin 0 -> 110 bytes
.../i/story/design07/byLineRule.gif | Bin 0 -> 46 bytes
.../i/story/design07/jumperBg.jpg | Bin 0 -> 636 bytes
.../i/story/design07/notebook_bg_black.gif | Bin 0 -> 84 bytes
.../i/story/design07/notebook_rail_bg.gif | Bin 0 -> 68 bytes
.../i/story/design07/pt_comment.gif | Bin 0 -> 103 bytes
.../i/story/design07/pt_email2.gif | Bin 0 -> 898 bytes
.../i/story/design07/pt_print2.gif | Bin 0 -> 883 bytes
.../i/story/design07/pt_share2.gif | Bin 0 -> 865 bytes
.../i/story/design07/pt_share_drop_bg2.jpg | Bin 0 -> 265 bytes
.../i/story/design07/sideHeaderBg.gif | Bin 0 -> 119 bytes
.../espn-000/assets.espn.go.com/icons/live.gif | Bin 0 -> 142 bytes
.../espn-000/assets.espn.go.com/icons/watch.png | Bin 0 -> 390 bytes
.../photo/2008/0227/mlb_a_clemens_65.jpg | Bin 0 -> 2818 bytes
.../espn-000/assets.espn.go.com/robots.txt | 24 +
.../espn-000/cache-01.cleanprint.net/robots.txt | 4 +
.../espn-000/content.dl-rms.com/robots.txt | 2 +
.../espn-000/js.adsonar.com/js/adsonar.js | 1 +
...lb%2Fnews%2Fstory?id=6760720&style=compact.html | 34 +
.../espn-000/pro.tweetmeme.com/robots.txt | 2 +
.../mlb/news/story?id=6760720.html | 935 +++++
.../mlb/news/story?id=6760720.html.rdbl | 39 +
.../tweetmeme.s3.amazonaws.com/demo/tc-widget.gif | Bin 0 -> 530 bytes
.../espn-000/zulu.tweetmeme.com/button_ajax3.js | 1 +
.../espn-000/zulu.tweetmeme.com/button_loader.gif | Bin 0 -> 1849 bytes
.../zulu.tweetmeme.com/compactbutton_loader.gif | Bin 0 -> 1849 bytes
.../espn-000/zulu.tweetmeme.com/widget.gif | Bin 0 -> 536 bytes
regression_test_data/mit-000-orig.html | 246 --
regression_test_data/mit-000-rdbl.html | 5 -
regression_test_data/mit-000.yaml | 2 +
.../images/article_images/tn/20110725153133-1.jpg | Bin 0 -> 21837 bytes
.../mitnews/1/H.23.3--NS/0?AQB=1&pccr=true&AQE=1 | Bin 0 -> 43 bytes
.../mit-000/mitnewsoffice.122.2o7.net/robots.txt | 2 +
.../s7.addthis.com/js/250/addthis_widget.js | 2 +
.../mit-000/s7.addthis.com/robots.txt | 3 +
.../2011/compare-recommendation-systems-0708.html | 244 ++
.../compare-recommendation-systems-0708.html.rdbl | 5 +
.../templates/SSlide-emotop/images/dot.gif | Bin 0 -> 801 bytes
.../templates/SSlide-emotop/images/dotv.gif | Bin 0 -> 809 bytes
.../templates/SSlide-emotop/images/html.png | Bin 0 -> 1168 bytes
.../templates/SSlide-emotop/images/mailgreen.jpg | Bin 0 -> 623 bytes
.../templates/SSlide-emotop/images/mailred.jpg | Bin 0 -> 639 bytes
.../templates/SSlide-emotop/images/user.png | Bin 0 -> 810 bytes
.../templates/SSlide-emotop/images/voting_no.png | Bin 0 -> 490 bytes
.../templates/SSlide-emotop/images/voting_yes.png | Bin 0 -> 505 bytes
.../newsoffice/images/M_images/searchButton.gif | Bin 0 -> 323 bytes
.../images/article_images/20110707114444-1.jpg | Bin 0 -> 80341 bytes
...ss.php?css=c02feab0fb9955599676411caef15f7d.css | 1 +
.../js.php?js=9e9ae0680c37e64c1e6523aab2eac8d3.js | 1245 +++++++
.../newsoffice/templates/mit/favicon.ico | Bin 0 -> 14846 bytes
.../newsoffice/templates/mit/images/GiveButton.png | Bin 0 -> 1134 bytes
.../templates/mit/images/footer-logo.gif | Bin 0 -> 1167 bytes
.../newsoffice/templates/mit/images/loader.gif | Bin 0 -> 10453 bytes
.../newsoffice/templates/mit/images/newslogo.gif | Bin 0 -> 2550 bytes
.../images/prettyPhoto/dark_rounded/btnNext.png | Bin 0 -> 1411 bytes
.../prettyPhoto/dark_rounded/btnPrevious.png | Bin 0 -> 1442 bytes
.../prettyPhoto/dark_rounded/contentPattern.png | Bin 0 -> 121 bytes
.../mit/images/prettyPhoto/dark_rounded/loader.gif | Bin 0 -> 2545 bytes
.../mit/images/prettyPhoto/dark_rounded/sprite.png | Bin 0 -> 3838 bytes
.../mit/images/prettyPhoto/dark_square/btnNext.png | Bin 0 -> 1411 bytes
.../images/prettyPhoto/dark_square/btnPrevious.png | Bin 0 -> 1442 bytes
.../mit/images/prettyPhoto/dark_square/sprite.png | Bin 0 -> 3303 bytes
.../images/prettyPhoto/light_rounded/btnNext.png | Bin 0 -> 1270 bytes
.../prettyPhoto/light_rounded/btnPrevious.png | Bin 0 -> 1442 bytes
.../images/prettyPhoto/light_rounded/loader.gif | Bin 0 -> 2545 bytes
.../images/prettyPhoto/light_rounded/sprite.png | Bin 0 -> 4008 bytes
.../images/prettyPhoto/light_square/btnNext.png | Bin 0 -> 1411 bytes
.../prettyPhoto/light_square/btnPrevious.png | Bin 0 -> 1442 bytes
.../mit/images/prettyPhoto/light_square/sprite.png | Bin 0 -> 3303 bytes
.../templates/mit/images/story_comment.gif | Bin 0 -> 118 bytes
.../templates/mit/images/story_email.gif | Bin 0 -> 88 bytes
.../templates/mit/images/story_print.gif | Bin 0 -> 80 bytes
.../templates/mit/images/story_share.gif | Bin 0 -> 297 bytes
.../newsoffice/templates/mit/images/topbar.gif | Bin 0 -> 1095 bytes
.../newsoffice/templates/mit/js/s_code.js | 171 +
.../mit-000/web.mit.edu/robots.txt | 10 +
regression_test_data/nytimes-000-orig.html | 115 -
regression_test_data/nytimes-000-rdbl.html | 1 -
regression_test_data/nytimes-000.yaml | 4 +-
.../bcvideo/1.0/iframe/embed.js.html | 178 +
.../css/0.1/screen/common/global.css | 591 ++++
.../css/0.1/screen/common/layout.css | 508 +++
.../css/0.1/screen/common/masthead.css | 80 +
.../css/0.1/screen/common/modules.css | 213 ++
.../css/0.1/screen/common/modules/rss.css | 22 +
.../css/0.1/screen/common/modules/sharetools.css | 125 +
.../css/0.1/screen/common/shell.css | 181 +
.../css/0.1/screen/common/subNavigation.css | 137 +
.../css/0.1/screen/common/util/tooltip.css | 50 +
.../us/politics/specialseason/subNavigation.css | 151 +
.../css/blogs/3.1/screen/community/comments.css | 98 +
.../css/blogs/3.1/screen/modules/common.css | 926 +++++
.../css/blogs/3.1/screen/modules/sharetools.css | 172 +
.../blogs/3.1/screen/themes/thecaucus/style.css | 29 +
.../blogs/3.1/screen/themes/universal/archives.css | 61 +
.../blogs/3.1/screen/themes/universal/comments.css | 213 ++
.../blogs/3.1/screen/themes/universal/entry.css | 505 +++
.../blogs/3.1/screen/themes/universal/layout.css | 943 +++++
.../themes/universal/style.css?v=06-03-2011.css | 28 +
.../themes/universal/style.css?v=06-03-2011orig | 28 +
.../css/common/screen/navigation.css | 253 ++
.../12caucus-mcconnell-kyl-debt-blog480.jpg | Bin 0 -> 40353 bytes
.../images/blogs/dealbook/db_icon-bb.gif | Bin 0 -> 1543 bytes
.../images/blogs_v3/thecaucus/thecaucus_post.png | Bin 0 -> 8740 bytes
.../images/global/buttons/go.gif | Bin 0 -> 186 bytes
.../images/misc/nytlogo153x23.gif | Bin 0 -> 1877 bytes
.../images/section/us/politics/divider55.gif | Bin 0 -> 53 bytes
.../us/politics/icons/election_calendar.png | Bin 0 -> 1096 bytes
.../section/us/politics/icons/fivethirtyeight.png | Bin 0 -> 3285 bytes
.../images/section/us/politics/icons/governors.png | Bin 0 -> 698 bytes
.../section/us/politics/icons/h_2012watch.png | Bin 0 -> 2864 bytes
.../us/politics/icons/h_keyvotesincongress.png | Bin 0 -> 2044 bytes
.../images/section/us/politics/icons/h_polls.png | Bin 0 -> 925 bytes
.../images/section/us/politics/icons/house.png | Bin 0 -> 1241 bytes
.../section/us/politics/icons/politics_home.png | Bin 0 -> 2236 bytes
.../images/section/us/politics/icons/senate.png | Bin 0 -> 1328 bytes
.../images/section/us/politics/icons/thecaucus.png | Bin 0 -> 1573 bytes
.../images/section/us/politics/icons/video.png | Bin 0 -> 1309 bytes
.../js/app/analytics/trackingTags_v1.1.js | 308 ++
.../genericContentExpander/contentexpander.js | 7 +
.../js/app/lib/NYTD/0.0.1/tabset.js | 72 +
.../js/article/articleShare.js | 5 +
.../js/blogs_v3/nyt_universal/js/blogShare.js | 9 +
.../js/blogs_v3/nyt_universal/js/blogscrnr.js | 158 +
.../js/blogs_v3/nyt_universal/js/common.js | 265 ++
.../js/blogs_v3/nyt_universal/js/memberTools.js | 66 +
.../nyt_universal/tabsetoverlayrevealer.js | 80 +
.../nytimes-000/graphics8.nytimes.com/js/common.js | 376 ++
.../js/common/screen/altClickToSearch.js | 422 +++
.../js/common/screen/modifyNavigationDisplay.js | 42 +
.../graphics8.nytimes.com/js/print_todays_date.js | 56 +
.../graphics8.nytimes.com/js/util/tooltip.js | 73 +
.../graphics8.nytimes.com/robots.txt.html | 28 +
.../index.html?hp.html | 902 +++++
.../index.html?hp.html.rdbl | 17 +
.../thecaucus.blogs.nytimes.com/favicon.ico | Bin 0 -> 894 bytes
.../thecaucus.blogs.nytimes.com/robots.txt.html | 2 +
...u=' + encodeURIComponent(document.location) + ' | Bin 0 -> 43 bytes
...v=0&dcsuri=%2Fnojavascript&WT.js=No&WT.tv=1.0.7 | Bin 0 -> 67 bytes
.../blogs_v3/fivethirtyeight/fivethirtyeight75.gif | Bin 0 -> 2961 bytes
.../nytimes-000/www.nytimes.com/robots.txt | 27 +
regression_test_data/nytimes-001-orig-2.html | 941 -----
regression_test_data/nytimes-001-orig-3.html | 934 -----
regression_test_data/nytimes-001-orig-4.html | 944 -----
regression_test_data/nytimes-001-orig-5.html | 865 -----
regression_test_data/nytimes-001-orig.html | 957 ------
regression_test_data/nytimes-001-rdbl.html | 134 -
regression_test_data/nytimes-001.yaml | 12 +-
.../nytimes-001/ad.doubleclick.net/robots.txt | 8 +
.../ads/marketing/mm09/verticalst/nytimes.gif | Bin 0 -> 572 bytes
.../mm09/verticalst/verticals_dealbook.gif | Bin 0 -> 1896 bytes
.../mm09/verticalst/verticals_opinion.gif | Bin 0 -> 441 bytes
.../ads/marketing/mm11/dealbook_072711.jpg | Bin 0 -> 14586 bytes
.../ads/marketing/mm11/opinion_072711.jpg | Bin 0 -> 6728 bytes
.../images/ADS/24/26/ad.242614/90x79_newspaper.gif | Bin 0 -> 2017 bytes
.../ADS/25/86/ad.258614/Times_Limited_86x60.gif | Bin 0 -> 3439 bytes
.../images/ADS/26/57/ad.265768/MMMM_120X60_b.gif | Bin 0 -> 11284 bytes
.../ad.265876/101452_SomePromiseHD_336x79_sf.jpg | Bin 0 -> 15167 bytes
.../13/ad.271323/11-0920_HDWeekender2_336x79.gif | Bin 0 -> 5922 bytes
...11-0220_AudienceDev_336x79_pingpong_revised.jpg | Bin 0 -> 11580 bytes
.../11-0220_AudienceDev_86x60_bonnaroo.jpg | Bin 0 -> 3268 bytes
.../ad.271446/11-0220_AudienceDev_86x60_gluten.jpg | Bin 0 -> 3449 bytes
.../images/ADS/27/15/ad.271582/120x60_NP_IG002.gif | Bin 0 -> 7895 bytes
.../css/0.1/screen/article/abstract.css | 171 +
.../css/0.1/screen/article/upnext.css | 62 +
.../css/0.1/screen/build/article/2.0/styles.css | 23 +
.../css/0.1/screen/common/ads.css | 504 +++
.../css/0.1/screen/common/article.css | 679 ++++
.../css/0.1/screen/common/global.css | 591 ++++
.../css/0.1/screen/common/googleads.css | 116 +
.../css/0.1/screen/common/insideNYTimes.css | 181 +
.../css/0.1/screen/common/layout.css | 508 +++
.../css/0.1/screen/common/macros.css | 114 +
.../css/0.1/screen/common/masthead.css | 80 +
.../css/0.1/screen/common/modules.css | 213 ++
.../css/0.1/screen/common/modules/articletools.css | 116 +
.../0.1/screen/common/modules/readercomments.css | 57 +
.../css/0.1/screen/common/modules/sharetools.css | 125 +
.../css/0.1/screen/common/mostpopular.css | 83 +
.../css/0.1/screen/common/navigation.css | 208 ++
.../css/0.1/screen/common/shell.css | 181 +
.../0.1/screen/section/travel/modules/expedia.css | 236 ++
.../graphics8.nytimes.com/css/common/global.css | 69 +
.../css/standalone/regilite/screen/regiLite.css | 173 +
.../10bad1/mag-10Bad-t_CA1-articleInline.jpg | Bin 0 -> 26210 bytes
.../07/10/magazine/10bad2/10bad2-thumbWide.jpg | Bin 0 -> 13418 bytes
.../10bad_span/10bad_span-articleLarge.jpg | Bin 0 -> 68698 bytes
.../images/global/buttons/go.gif | Bin 0 -> 186 bytes
.../images/membercenter/icon_delivers.png | Bin 0 -> 3273 bytes
.../images/membercenter/signup.png | Bin 0 -> 1406 bytes
.../images/misc/nytlogo152x23.gif | Bin 0 -> 1110 bytes
.../js/app/analytics/trackingTags_v1.1.js | 308 ++
.../js/app/article/articleCommentCount.js | 129 +
.../js/app/article/outbrain.js | 16 +
.../graphics8.nytimes.com/js/app/article/upNext.js | 6 +
.../app/recommendations/recommendationsModule.js | 482 +++
.../js/article/articleShare.js | 5 +
.../js/article/comments/crnrXHR.js | 53 +
.../nytimes-001/graphics8.nytimes.com/js/common.js | 376 ++
.../js/common/screen/DropDown.js | 50 +
.../js/common/screen/altClickToSearch.js | 422 +++
.../graphics8.nytimes.com/js/util/tooltip.js | 73 +
.../graphics8.nytimes.com/robots.txt.html | 28 +
.../27/arts/27moth_spatial/27moth_spatial-moth.jpg | Bin 0 -> 9493 bytes
.../27/health/27physed_MOTH/27physed_MOTH-moth.jpg | Bin 0 -> 14841 bytes
.../27/nyregion/27moth_about/27moth_about-moth.jpg | Bin 0 -> 8878 bytes
.../07/27/opinion/27moth_rfd/27moth_rfd-moth.jpg | Bin 0 -> 9649 bytes
.../07/27/world/27moth_swim/27moth_swim-moth.jpg | Bin 0 -> 9351 bytes
.../images/global/buttons/moth_forward.gif | Bin 0 -> 130 bytes
.../images/global/buttons/moth_reverse.gif | Bin 0 -> 132 bytes
.../nytimes-001/i1.nyt.com/robots.txt.html | 28 +
.../nytimes-001/js.nyt.com/js/app/moth/moth.js | 4 +
.../nytimes-001/js.nyt.com/robots.txt.html | 28 +
.../pagead2.googlesyndication.com/robots.txt | 4 +
...gazine%2Fthe-dark-art-of-breaking-bad.html?_r=1 | Bin 0 -> 43 bytes
...dark-art-of-breaking-bad.html?_r=2&pagewanted=2 | Bin 0 -> 43 bytes
...dark-art-of-breaking-bad.html?_r=3&pagewanted=3 | Bin 0 -> 43 bytes
...dark-art-of-breaking-bad.html?_r=4&pagewanted=4 | Bin 0 -> 43 bytes
...v=0&dcsuri=%2Fnojavascript&WT.js=No&WT.tv=1.0.7 | Bin 0 -> 67 bytes
.../the-dark-art-of-breaking-bad.html?_r=1.html | 1030 ++++++
...he-dark-art-of-breaking-bad.html?_r=1.html.rdbl | 139 +
...art-of-breaking-bad.html?_r=2&pagewanted=2.html | 1015 ++++++
...art-of-breaking-bad.html?_r=3&pagewanted=3.html | 1013 ++++++
...art-of-breaking-bad.html?_r=4&pagewanted=4.html | 1017 ++++++
...litesub_insert.html?product=LT&size=336X90.html | 36 +
.../nytimes-001/www.nytimes.com/robots.txt | 27 +
...art-of-breaking-bad.html?_r=5&pagewanted=5.html | 939 +++++
.../nytimes-001/www10.nytimes.com/robots.txt | 4 +
regression_test_data/washingtonpost-000-orig.html | 1802 ----------
regression_test_data/washingtonpost-000-rdbl.html | 6 -
regression_test_data/washingtonpost-000.yaml | 4 +-
.../b.scorecardresearch.com/robots.txt | 2 +
...788;rand='+TWP.StaticMethods.getUniqueToken()+' | Bin 0 -> 43 bytes
.../media.washingtonpost.com/robots.txt | 53 +
.../wp-srv/ad/textlinks/js/utilsTextLinksXML.js | 67 +
.../wp-srv/ad/textlinks/style/textlinks.css | 297 ++
.../wp-srv/images/bullet_3x3_999999.gif | Bin 0 -> 44 bytes
.../pixel.quantserve.com/robots.txt | 2 +
.../www.washingtonpost.com/favicon.ico | Bin 0 -> 24038 bytes
.../2011/07/11/gIQA0XDg9H_story.html?hpid=z1.html | 1783 ++++++++++
.../07/11/gIQA0XDg9H_story.html?hpid=z1.html.rdbl | 6 +
.../Graphics/toles06222011forweb.jpg | Bin 0 -> 33288 bytes
.../Videos/07112011-43v/07112011-43v.jpg | Bin 0 -> 12757 bytes
.../Images/Standing Art/eugene_robinson_silo.jpg | Bin 0 -> 3961 bytes
.../www.washingtonpost.com/robots.txt | 53 +
.../Staff-Bio/Images/eugene-robinson-114x80.png | Bin 0 -> 13867 bytes
...trove-right-rail-promo-any-device-devices-R.png | Bin 0 -> 10299 bytes
.../rw/sites/twpweb/img/blogs/spacer.gif | Bin 0 -> 43 bytes
.../rw/sites/twpweb/img/icons/icon-apple-lrg.gif | Bin 0 -> 1577 bytes
.../sites/twpweb/img/icons/icon-excpoint-lrg.gif | Bin 0 -> 1636 bytes
.../sites/twpweb/img/icons/icon-facebook-lrg.gif | Bin 0 -> 1553 bytes
.../rw/sites/twpweb/img/icons/icon-minus.png | Bin 0 -> 3132 bytes
.../rw/sites/twpweb/img/icons/icon-mobile-lrg.gif | Bin 0 -> 1611 bytes
.../rw/sites/twpweb/img/icons/icon-plus.png | Bin 0 -> 3144 bytes
.../rw/sites/twpweb/img/icons/icon-rss-lrg.gif | Bin 0 -> 1674 bytes
.../rw/sites/twpweb/img/icons/icon-shadow.gif | Bin 0 -> 352 bytes
.../rw/sites/twpweb/img/icons/icon-twitter-lrg.gif | Bin 0 -> 1601 bytes
.../rw/sites/twpweb/js/conf.js | 10 +
.../rw/sites/twpweb/js/site_traffic/comscore.js | 8 +
.../rw/sites/twpweb/js/wp_omniture.js | 1069 ++++++
.../wp-srv/ad/textlinks/images/dash.gif | Bin 0 -> 46 bytes
.../wp-srv/ad/textlinks/images/dot.gif | Bin 0 -> 424 bytes
...e&m=false&context=wp-static&r=%2Fad%2Faudsci.js | 20 +
...r=%2Fad%2Fwpni_generic_ad.js&r=%2Fad%2Fwp_ad.js | 833 +++++
1183 files changed, 49969 insertions(+), 14189 deletions(-)
commit 11c4d9541133996522d46b9ee1989f486bdb699b
Author: Yuri Baburov <burchik@gmail.com>
Date: Wed Jul 27 01:56:17 2011 +0700
Fixed indentation, encoding issue and README bug. Thanks to Greg Jastrab. Bump version to 0.2.3
README | 4 ++--
readability/debug.py | 1 -
readability/encoding.py | 1 -
readability/readability.py | 8 ++++----
setup.py | 2 +-
5 files changed, 7 insertions(+), 9 deletions(-)
commit 6bf4948e69664adbc35b0506b1d5db98980d0cea
Author: Yuri Baburov <burchik@gmail.com>
Date: Tue Jul 26 13:40:53 2011 +0700
More README fixes for PyPI and GitHub. Bump to version 0.2.2
README | 6 ++++++
setup.py | 2 +-
2 files changed, 7 insertions(+), 1 deletion(-)
commit 9869e9e196a80dc877decf08293d0a1e0533d525
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Jul 22 11:34:24 2011 -0700
Implement duplicate page detection
This adds detection of duplicate pages to avoid adding duplicate pages to a
multi-page article. It adds a simple unit test and regenerates the nytimes
regression test with the new, and more correct, result. Previously, we were
including page 2 again after page 5.
readability/readability.py | 84 ++++++++++++++++++++++++----
regression_test_data/nytimes-001-rdbl.html | 35 ++----------
regression_test_data/nytimes-001.yaml | 2 +-
test_data/duplicate-page-article.html | 48 ++++++++++++++++
test_data/duplicate-page-duplicate.html | 25 +++++++++
test_data/duplicate-page-unique.html | 20 +++++++
6 files changed, 172 insertions(+), 42 deletions(-)
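Note: as a rough illustration of the duplicate-page idea in the commit above (not the code that landed in readability.py), a minimal sketch could fingerprint each page's normalized text and skip any page whose fingerprint has already been seen. The names `page_fingerprint` and `append_unique_pages` are hypothetical, and exact-match hashing is an assumption; the actual commit may compare pages differently.

```python
import hashlib
import re


def page_fingerprint(text):
    """Return a hash of the page text with whitespace and case normalized."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def append_unique_pages(pages):
    """Keep pages in their original order, dropping any text already seen."""
    seen = set()
    unique = []
    for text in pages:
        fingerprint = page_fingerprint(text)
        if fingerprint in seen:
            # For example, page 2 being reached again after the last page.
            continue
        seen.add(fingerprint)
        unique.append(text)
    return unique
```

A hash only catches pages whose text is identical after normalization; pages that differ in ads or timestamps would need a fuzzier comparison.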
commit 168262b03b6422617b02564b0c49507608b8b2cc
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Jul 21 14:41:01 2011 -0700
Add a regression for a multi-page nytimes article
It does not quite work yet, as we wrongly pull in page 2 at the end of the
article due to yet-to-be-implemented duplicate avoidance.
gen_test.py | 52 +-
readability/readability.py | 14 +
regression_test.py | 8 +-
regression_test_data/nytimes-001-orig-2.html | 941 +++++++++++++++++++++++++
regression_test_data/nytimes-001-orig-3.html | 934 +++++++++++++++++++++++++
regression_test_data/nytimes-001-orig-4.html | 944 +++++++++++++++++++++++++
regression_test_data/nytimes-001-orig-5.html | 865 +++++++++++++++++++++++
regression_test_data/nytimes-001-orig.html | 957 ++++++++++++++++++++++++++
regression_test_data/nytimes-001-rdbl.html | 157 +++++
regression_test_data/nytimes-001.yaml | 10 +-
10 files changed, 4856 insertions(+), 26 deletions(-)
commit 0a4487495bd48309b31f8b62a828b8bf1e9afc07
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Jul 21 10:38:10 2011 -0700
Improve unit test for basic multi-page handling
The test now actually asserts something instead of just printing some stuff out
for manual inspection.
readability/readability.py | 17 ++++-
test_data/basic-multi-page-expected.html | 123 ++++++++++++++++++++++++++++++
2 files changed, 138 insertions(+), 2 deletions(-)
commit aab3d29729cfa9fe8dce928058304edb256dd582
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Jul 21 10:10:07 2011 -0700
Add --debug flag for unit testing
Running readability.py test --debug will turn on debug logging.
readability/readability.py | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
commit 9d490ec3ff858c7ca92432ab112ca202a414925d
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Jul 21 09:56:04 2011 -0700
Checkpoint multi-page readability work
Restructured code to better support multi-page readability. Improved tests.
readability/readability.py | 115 +++++++++++++++++++++++--------------
regression_test.py | 38 ++++++++----
test_data/basic-multi-page-3.html | 60 +++++++++++++++++++
3 files changed, 160 insertions(+), 53 deletions(-)
commit 74dd7a45b905e6df4523e50627596ff5de2a5ee8
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Tue Jul 19 17:46:51 2011 -0700
Refactor code for easier testing
readability/readability.py | 917 ++++++++++++++++++++-----------------
test_data/basic-multi-page-2.html | 52 +++
test_data/basic-multi-page.html | 60 +++
3 files changed, 612 insertions(+), 417 deletions(-)
commit 1d1511551094c0a483e1ce3db8a79175eed75cb7
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Jul 15 17:08:56 2011 -0700
Add scoring of next page link ancestry and href
This adds the scoring of next page link candidates' ancestry and href values
from the readability algorithm.
readability/readability.py | 37 +++++++++++++++++++++++++++++++++++--
1 file changed, 35 insertions(+), 2 deletions(-)
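Note: the kind of scoring described in the commit above might look roughly like the sketch below. It is an assumption-laden illustration, not the readability.py implementation: the regular expressions, the weights, and the function name `score_candidate` are all hypothetical. Ancestors are represented simply as strings holding their class and id values.

```python
import re

# Hypothetical hint patterns; the real ones live in readability.py.
NEXT_HINT = re.compile(r"next|continue|page", re.I)
NEGATIVE_HINT = re.compile(r"prev|comment|share|print|login", re.I)
PAGE_NUMBER = re.compile(r"(?:pagewanted|page|p)[=/](\d+)", re.I)


def score_candidate(href, ancestor_attrs):
    """Score a next-page link candidate from its href and its ancestors."""
    score = 0
    # Ancestry: reward containers that look like pagination, penalize others.
    for attrs in ancestor_attrs:
        if NEXT_HINT.search(attrs):
            score += 25
        if NEGATIVE_HINT.search(attrs):
            score -= 25
    # Href: a page number is a good sign, unless it points back to page 1.
    match = PAGE_NUMBER.search(href)
    if match:
        page = int(match.group(1))
        score += -10 if page == 1 else max(0, 10 - page)
    if NEGATIVE_HINT.search(href):
        score -= 25
    return score
```

The highest-scoring candidate would then be treated as the next page, provided its score clears some threshold.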
commit bf6954708b4db3c6776950b3a124d09a41f93171
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Jul 15 16:31:18 2011 -0700
First working find_next_page_link case
We have a find_next_page_link that works for a nytimes article. There is a
small unit test for this.
I renamed the regression test, test.py, to regression_test.py, as it is a
little more informative. I also renamed the data directories used by that test
accordingly.
readability/readability.py | 167 +-
regression_test.py | 307 ++
regression_test_data/arstechnica-000-orig.html | 664 ++++
regression_test_data/arstechnica-000-rdbl.html | 53 +
regression_test_data/arstechnica-000.yaml | 2 +
regression_test_data/businessinsider-000-orig.html | 3602 ++++++++++++++++++++
regression_test_data/businessinsider-000-rdbl.html | 11 +
regression_test_data/businessinsider-000.yaml | 3 +
regression_test_data/cnet-000-orig.html | 777 +++++
regression_test_data/cnet-000-rdbl.html | 9 +
regression_test_data/cnet-000.yaml | 2 +
regression_test_data/deadspin-000-orig.html | 1011 ++++++
regression_test_data/deadspin-000-rdbl.html | 9 +
regression_test_data/deadspin-000.yaml | 2 +
regression_test_data/espn-000-orig.html | 993 ++++++
regression_test_data/espn-000-rdbl.html | 31 +
regression_test_data/espn-000.yaml | 2 +
regression_test_data/mit-000-orig.html | 246 ++
regression_test_data/mit-000-rdbl.html | 5 +
regression_test_data/mit-000.yaml | 3 +
regression_test_data/nytimes-000-orig.html | 115 +
regression_test_data/nytimes-000-rdbl.html | 1 +
regression_test_data/nytimes-000.yaml | 2 +
regression_test_data/nytimes-001.yaml | 9 +
regression_test_data/washingtonpost-000-orig.html | 1802 ++++++++++
regression_test_data/washingtonpost-000-rdbl.html | 6 +
regression_test_data/washingtonpost-000.yaml | 2 +
regression_test_output/.gitignore | 2 +
test.py | 307 --
test_data/arstechnica-000-orig.html | 664 ----
test_data/arstechnica-000-rdbl.html | 53 -
test_data/arstechnica-000.yaml | 2 -
test_data/businessinsider-000-orig.html | 3602 --------------------
test_data/businessinsider-000-rdbl.html | 11 -
test_data/businessinsider-000.yaml | 3 -
test_data/cnet-000-orig.html | 777 -----
test_data/cnet-000-rdbl.html | 9 -
test_data/cnet-000.yaml | 2 -
test_data/deadspin-000-orig.html | 1011 ------
test_data/deadspin-000-rdbl.html | 9 -
test_data/deadspin-000.yaml | 2 -
test_data/espn-000-orig.html | 993 ------
test_data/espn-000-rdbl.html | 31 -
test_data/espn-000.yaml | 2 -
test_data/mit-000-orig.html | 246 --
test_data/mit-000-rdbl.html | 5 -
test_data/mit-000.yaml | 3 -
test_data/nytimes-000-orig.html | 115 -
test_data/nytimes-000-rdbl.html | 1 -
test_data/nytimes-000.yaml | 2 -
test_data/nytimes-001.yaml | 9 -
test_data/nytimes-next-page.html | 975 ++++++
test_data/washingtonpost-000-orig.html | 1802 ----------
test_data/washingtonpost-000-rdbl.html | 6 -
test_data/washingtonpost-000.yaml | 2 -
test_output/.gitignore | 2 -
56 files changed, 10794 insertions(+), 9690 deletions(-)
commit 50fc147c38b388c94613b5d9d013ef76145dde52
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Jul 14 11:15:36 2011 -0700
Add cleaning of short segments
readability/readability.py | 54 ++++++++++++++++++++++++++++++++++----------
1 file changed, 42 insertions(+), 12 deletions(-)
commit 0d6eb52e0f19c8db8bd4be30294adbb752041d6e
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Thu Jul 14 10:32:44 2011 -0700
Add cleaning of 'index' segments
readability/readability.py | 25 ++++++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)
commit 1759d9ac8d98aa55398a1dd65f58b4b1b662f12f
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Wed Jul 13 17:31:51 2011 -0700
Checkpoint of multi-page article work
This implements some basic tools needed by the multi-page article algorithm.
readability/readability.py | 214 +++++++++++++++++++++++++++++++++++++++++++-
readability/urlfetch.py | 21 +++++
2 files changed, 231 insertions(+), 4 deletions(-)
commit 82564cb0c7eff526e3b1b5e478e30c28480928fe
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Wed Jul 13 10:40:35 2011 -0700
Add comment for read_orig
gen_test.py | 6 ++++++
1 file changed, 6 insertions(+)
commit d922637f45b855331e55d1cc355e932b74d411b6
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Tue Jul 12 17:37:25 2011 -0700
Add subcommand parsing to gen_test
There are now subcommands to generate new tests or just regenerate readable
versions of old tests.
gen_test.py | 103 ++++++++++++++++++++++++++++++++++++++++++++---------------
1 file changed, 77 insertions(+), 26 deletions(-)
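Note: subcommand parsing of the sort described in the commit above is usually done with `argparse` subparsers. The sketch below shows the general shape only; the subcommand names (`create`, `regen`), their arguments, and the `--no-yaml` flag are hypothetical stand-ins, since the log does not show the actual gen_test.py interface.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per mode of operation."""
    parser = argparse.ArgumentParser(description="Generate regression test data")
    subparsers = parser.add_subparsers(dest="command")

    # Hypothetical subcommand: fetch a URL and write a brand new test.
    create = subparsers.add_parser("create", help="generate a new test")
    create.add_argument("url")
    create.add_argument("test_name")
    create.add_argument("--no-yaml", action="store_true",
                        help="skip writing the YAML specification")

    # Hypothetical subcommand: rebuild the readable output of an old test.
    regen = subparsers.add_parser("regen", help="regenerate readable output")
    regen.add_argument("test_name")
    return parser
```

Calling `build_parser().parse_args()` and dispatching on `args.command` keeps each mode's options separate.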
commit ff185f71c0db3caf0f139b4b4844737603ceaba1
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Tue Jul 12 16:32:11 2011 -0700
Add option to not generate yaml file
Sometimes you just want to generate the data files without the YAML
specification. This change lets you do that. In doing so, I switched to use
the argparse module for argument parsing.
gen_test.py | 57 +++++++++++++++++++++++++++++++++++++++------------------
1 file changed, 39 insertions(+), 18 deletions(-)
commit e0f81c88bab8bcd6d36c85473069e25144fe2860
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Tue Jul 12 14:48:39 2011 -0700
Reorganize constants
test.py | 130 +++++++++++++++++++++++++++++++--------------------------------
1 file changed, 65 insertions(+), 65 deletions(-)
commit fa93a27d087eb29b92158ee8688b86ef64242021
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Tue Jul 12 14:36:09 2011 -0700
Add docstring briefly describing gen_test program
gen_test.py | 5 +++++
1 file changed, 5 insertions(+)
commit f3f3e35e89a1826f8bf35d0b086be595750635d8
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Tue Jul 12 14:32:20 2011 -0700
Add regression tests for readability results
These test cases provide a baseline from which we can start improving the
readability algorithm and making sure that we do not horribly break anything.
gen_test.py | 71 +
test.py | 39 +-
test_data/businessinsider-000-orig.html | 3602 +++++++++++++++++++++++++++++++
test_data/businessinsider-000-rdbl.html | 11 +
test_data/businessinsider-000.yaml | 3 +
test_data/cnet-000-orig.html | 777 +++++++
test_data/cnet-000-rdbl.html | 9 +
test_data/cnet-000.yaml | 2 +
test_data/deadspin-000-orig.html | 1011 +++++++++
test_data/deadspin-000-rdbl.html | 9 +
test_data/deadspin-000.yaml | 2 +
test_data/espn-000-orig.html | 993 +++++++++
test_data/espn-000-rdbl.html | 31 +
test_data/espn-000.yaml | 2 +
test_data/mit-000-orig.html | 246 +++
test_data/mit-000-rdbl.html | 5 +
test_data/mit-000.yaml | 3 +
test_data/nytimes-000-orig.html | 115 +
test_data/nytimes-000-rdbl.html | 1 +
test_data/nytimes-000.yaml | 2 +
test_data/nytimes-001.yaml | 9 +
test_data/washingtonpost-000-orig.html | 1802 ++++++++++++++++
test_data/washingtonpost-000-rdbl.html | 6 +
test_data/washingtonpost-000.yaml | 2 +
24 files changed, 8744 insertions(+), 9 deletions(-)
commit 9bc56745debb330867f6db65440f2b58302d6d61
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Mon Jul 11 17:47:57 2011 -0700
Add summary page for test results
test.py | 138 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 123 insertions(+), 15 deletions(-)
commit 95f94e84aee37ed1de3a66d22861e09a88b8480e
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Mon Jul 11 17:46:19 2011 -0700
Remove obsolete code
test.py | 10 ----------
1 file changed, 10 deletions(-)
commit eb31e1857effecc21c42160ac60ed0bbd5c34456
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Mon Jul 11 14:21:47 2011 -0700
Remove obsolete code
test.py | 86 ---------------------------------------------------------------
1 file changed, 86 deletions(-)
commit 5a6ba0194e3ba7731c0010ebe21a4c663736514c
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Mon Jul 11 14:10:12 2011 -0700
Add reading of test information from YAML file
This sets up for more sophisticated tests (like multi-page tests).
test.py | 51 +++++++++++++++++++++++++++++-----------
test_data/arstechnica-000.yaml | 2 ++
2 files changed, 39 insertions(+), 14 deletions(-)
commit 304b3e0fd0e6c97f083f62596ad45ba14059c32a
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Jul 8 16:18:38 2011 -0700
Add test data and writing of both result and diff
test.py | 29 +-
test_data/arstechnica-000-orig.html | 664 +++++++++++++++++++++++++++++++++++
test_data/arstechnica-000-rdbl.html | 53 +++
3 files changed, 737 insertions(+), 9 deletions(-)
commit cc0af7a105eee6f4e8ffbf6debf7fad7fd37e559
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Fri Jul 8 15:55:30 2011 -0700
Add beginnings of regression tests
.gitignore | 1 +
readability/readability.py | 5 +-
test.py | 240 ++++++++++++++++++++++++++++++++++++++++++++
test_output/.gitignore | 2 +
4 files changed, 247 insertions(+), 1 deletion(-)
commit 82eabfc6b1dba0ea2708ae29ea19c676691ecb3f
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Tue Jul 5 17:23:05 2011 -0700
Bump version number
setup.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
commit cba19f209bad8a58bbc502b8c8ea7f5cdc9f6b6c
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Tue Jul 5 17:17:38 2011 -0700
Fix issue with trying to drop root node
remove_unlikely_candidates would try to drop_tree the root node if it deemed it
an unlikely candidate. This prevents that from happening.
readability/readability.py | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
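Note: the guard described in the commit above can be illustrated with a small sketch. It uses the standard library's `xml.etree.ElementTree` instead of lxml so that it runs without extra dependencies, and the `UNLIKELY` pattern and `remove_unlikely` name are hypothetical. It is not the actual remove_unlikely_candidates code.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern for class/id values that rarely hold article content.
UNLIKELY = re.compile(r"comment|sidebar|footer|sponsor", re.I)


def remove_unlikely(root):
    """Remove elements that look unlikely, but never the root itself."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for elem in list(root.iter()):
        if elem is root:
            # The root has no parent to be removed from; skip it even if
            # its attributes match the pattern.
            continue
        attrs = "%s %s" % (elem.get("class", ""), elem.get("id", ""))
        if UNLIKELY.search(attrs):
            parent_of[elem].remove(elem)
    return root
```

Unlike lxml's `drop_tree`, `Element.remove` also discards the removed element's tail text, so this stand-in is lossier than the real call.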
commit 18fa6b51466a0305779fa20fee6de2e32e3ea2eb
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Tue Jul 5 13:36:58 2011 -0700
Bump version number for external use
setup.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
commit cdd30f625eaedbaf47e11385666199245f31a309
Author: Jerry Charumilind <git@jcharum.fastmail.net>
Date: Tue Jul 5 13:35:36 2011 -0700
Return confidence level when retrieving summary
readability/readability.py | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)