@jgrahamc
Created February 17, 2014 15:27
Script to take the files generated by wget -mkp --save-headers, remove headers that confuse quic_server, and add the X-Original-Url header.
# quic-cleanup.pl
#
# When running the Google experimental quic_server it expects to find
# sites to serve in /tmp/quic-data. Under that there will be a
# directory per site.
#
# For example this was used to mirror the CloudFlare web site for
# QUIC testing.
#
# mkdir /tmp/quic-data
# cd /tmp/quic-data
# wget -mkp --save-headers https://www.cloudflare.com/
#
# The saved files contain headers that confuse quic_server (such as
# Transfer-Encoding: chunked when the entire content has been saved)
# and they need X-Original-Url added. This script is used to clean
# up the downloaded files.
#
# cd /tmp/quic-data
# perl quic-cleanup.pl
#
# Copyright (c) 2014 CloudFlare, Inc.
use strict;
use warnings;
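# Recursively rewrite the saved headers of every file under $dir.
sub cleanup {
    my ($dir) = @_;
    my @files = glob "$dir/*";
    foreach my $f (@files) {
        if (-d $f) {
            cleanup($f);
            next;
        }
        print "Processing $f...\n";
        my $g = "$f.bak";
        open(my $in,  '<', $f) or die "Trouble opening $f: $!\n";
        open(my $out, '>', $g) or die "Trouble opening $g: $!\n";
        my $in_headers = 1;
        while (my $line = <$in>) {
            if ($in_headers) {
                if ($line =~ /^\r\n$/) {
                    # A blank line ends the headers. Add X-Original-Url,
                    # built from the site directory and the path below it.
                    # Files that are not inside a site directory get none.
                    if ($f =~ /^\.\/([^\/]*)\/(.+)/) {
                        print $out "X-Original-Url: http://$1/$2\r\n";
                    }
                    $in_headers = 0;
                } elsif ($line !~ /^(HTTP|Server:|Content-Type:|Date:|Content-Length:)/i) {
                    # Drop every header that is not on the keep list.
                    next;
                }
                if ($line =~ /^Server:/i) {
                    $line = "Server: cloudflare-quic\r\n";
                }
            }
            print $out $line;
        }
        close $out;
        close $in;
        rename($g, $f) or die "Trouble renaming $g to $f: $!\n";
    }
}
cleanup('.');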