Last active
August 8, 2018 22:09
I expected the ioloop method to win here, and it does hold its ground when run on my local machine.
But I'm surprised by how much the ordering changes when run on a server on GCP:
the naive method is still slow, but it finishes in about a third of the time it took on my local machine, which clocked in at 120 seconds.
I'm honestly surprised that the naive HTTP/2 method is so slow. Maybe it's because of the overhead of the HTTP/2 library I use.
As usual, the threaded method was the winner here. I'm guessing it's because requests' built-in libraries are fast enough for HEADs. The HTTP/2 method would probably win if we actually fetched the content.
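Since the threaded approach is the one that wins, here is a minimal Python 3 sketch of that pattern on its own. It is not the script's code: `count_ok`, `fake_fetch`, and the `example.invalid` URLs are hypothetical names introduced for illustration, and `fetch` is an injectable callable so the sketch runs without network access.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def count_ok(fetch, urls, workers=3):
    """Call fetch(url) for every URL on a small thread pool and
    return how many calls reported HTTP 200.

    fetch is any callable that takes a URL and returns a status code;
    it stands in for something like session.head(url).status_code.
    """
    ok = 0
    lock = threading.Lock()

    def handle(url):
        nonlocal ok
        status = fetch(url)
        if status == 200:
            with lock:  # the counter is shared across worker threads
                ok += 1

    # The context manager shuts the pool down once the work is done.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(handle, urls))  # consume the iterator to surface exceptions
    return ok


def fake_fetch(url):
    # Hypothetical offline stand-in: URLs ending in "missing" return 404.
    return 404 if url.endswith("missing") else 200


if __name__ == "__main__":
    urls = ["https://example.invalid/a"] * 5 + ["https://example.invalid/missing"]
    print(count_ok(fake_fetch, urls))
```

To measure real requests, one would presumably pass something like `lambda u: session.head(u).status_code` with a shared `requests.Session()`, which is what the threaded test in the script does.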
#!/usr/bin/env python
import timeit

import requests
from hyper.contrib import HTTP20Adapter
from multiprocessing.dummy import Pool  # a thread pool, despite the module name
from tornado.httpclient import AsyncHTTPClient
from tornado import gen, ioloop
from multiprocessing import Value

url = "https://storage.googleapis.com/studio-content/storage/0/0/000021c2b641f5fcc7ca1fb605af4460.png"
urls = [url] * 100

print("This test requests a URL from GCS, 100 times per run. We do 10 runs, so a total of 1000 HEAD requests.\n\n\n")
print("Running the naive, synchronous technique. This one makes a HEAD request serially, without any session reuse. Our baseline.")
c = Value("i", 0)  # shared counter; it is not reset between the 10 timeit runs


def test_synchronous():
    for url in urls:
        r = requests.head(url)
        if r.status_code == 200:
            c.value += 1  # no need to acquire lock, since it's synchronous
    print("Number of successful calls: {}".format(c.value))


print("time for naive synchronous method: {}\n\n\n".format(timeit.timeit(test_synchronous, number=10)))
print("Running the HTTP2.0, synchronous method. This reuses sessions across requests, and also implements HTTP2.0 for faster downloading.")
c = Value("i", 0)


def test_synchronous_http2():
    # A fresh session is created for each timeit run and reused for its 100 requests.
    session = requests.Session()
    session.mount("https://storage.googleapis.com", HTTP20Adapter())
    for url in urls:
        r = session.head(url)
        if r.status_code == 200:
            c.value += 1  # no need to acquire lock, since it's synchronous
    print("Number of successful calls: {}".format(c.value))


print("time for http2 synchronous method: {}\n\n\n".format(timeit.timeit(test_synchronous_http2, number=10)))
print("Running the threaded method. This splits the request calls into three threads, and reuses sessions too.")
c = Value("i", 0)
session = requests.Session()  # one session shared by all worker threads


def handle_url(url):
    resp = session.head(url)
    if resp.status_code == 200:
        with c.get_lock():
            c.value += 1


def test_multiprocessing():
    pool = Pool(3)
    pool.map(handle_url, urls)
    # Release the worker threads so they do not pile up across the 10 runs.
    pool.close()
    pool.join()
    print("Number of successful calls: {}".format(c.value))


print("time for threaded method: {}\n\n\n".format(timeit.timeit(test_multiprocessing, number=10)))
print("Running the threaded HTTP2.0 method. This splits requests into three threads, and makes requests use HTTP2.0.")
c = Value("i", 0)
session = requests.Session()
session.mount("https://storage.googleapis.com", HTTP20Adapter())


# Redefined on purpose: this version picks up the HTTP2.0-mounted session above.
def handle_url(url):
    resp = session.head(url)
    if resp.status_code == 200:
        with c.get_lock():
            c.value += 1


def test_http2_multiprocessing():
    pool = Pool(3)
    pool.map(handle_url, urls)
    # Release the worker threads so they do not pile up across the 10 runs.
    pool.close()
    pool.join()
    print("Number of successful calls: {}".format(c.value))


# Unlike the other tests, this one re-enables garbage collection during timing.
print("time for threaded http2 method: {}\n\n\n".format(timeit.timeit(test_http2_multiprocessing, "gc.enable()", number=10)))
print("Running the IO loop method. This uses coroutines instead of separate OS threads.")
http_client = AsyncHTTPClient()
c = Value("i", 0)


@gen.coroutine
def async_fetch_gen(url):
    # NOTE: fetch() issues a GET by default, so unlike the tests above this one
    # downloads the response body. Passing method="HEAD" would compare like for like.
    response = yield http_client.fetch(url)
    raise gen.Return(response.code)


@gen.coroutine
def async_main():
    futures = []
    for url in urls:
        futures.append(async_fetch_gen(url))
    results = yield gen.multi(futures)
    for r in results:
        if r == 200:
            c.value += 1


def test_ioloop():
    io_loop = ioloop.IOLoop.current()
    io_loop.run_sync(async_main)
    print("Number of successful calls: {}".format(c.value))


print("time for ioloop method: {}\n\n\n".format(timeit.timeit(test_ioloop, number=10)))
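Each `timeit.timeit(..., number=10)` call in the script reports a single total for all 10 runs. If per-run numbers are wanted, a minimal sketch using `timeit.repeat` might look like the following; `per_run_times`, `summarize`, and `workload` are hypothetical names, and `workload` is only a stand-in for one of the `test_*` functions.

```python
import timeit


def per_run_times(func, runs=10):
    """Time func once per run and return the per-run durations in seconds."""
    # repeat=runs with number=1 yields one timing per run instead of one total.
    return timeit.repeat(func, repeat=runs, number=1)


def summarize(times):
    """Return (total, best, mean) for a list of per-run durations."""
    total = sum(times)
    return total, min(times), total / len(times)


if __name__ == "__main__":
    def workload():
        # Hypothetical stand-in for one of the test_* functions.
        sum(range(10000))

    total, best, mean = summarize(per_run_times(workload, runs=10))
    print("total: {:.4f}s  best: {:.4f}s  mean: {:.4f}s".format(total, best, mean))
```

The total stays comparable to what the script prints, while the best and mean make it easier to spot a single slow run, for example a cold connection on the first iteration.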
requests
hyper
tornado
singledispatch
backports_abc
This test requests a URL from GCS, 100 times per run. We do 10 runs, so a total of 1000 HEAD requests.

Running the naive, synchronous technique. This one makes a HEAD request serially, without any session reuse. Our baseline.
Number of successful calls: 100
Number of successful calls: 200
Number of successful calls: 300
Number of successful calls: 400
Number of successful calls: 500
Number of successful calls: 600
Number of successful calls: 700
Number of successful calls: 800
Number of successful calls: 900
Number of successful calls: 1000
time for naive synchronous method: 36.1137590408

Running the HTTP2.0, synchronous method. This reuses sessions across requests, and also implements HTTP2.0 for faster downloading.
Number of successful calls: 100
Number of successful calls: 200
Number of successful calls: 300
Number of successful calls: 400
Number of successful calls: 500
Number of successful calls: 600
Number of successful calls: 700
Number of successful calls: 800
Number of successful calls: 900
Number of successful calls: 1000
time for http2 synchronous method: 57.045787096

Running the threaded method. This splits the request calls into three threads, and reuses sessions too.
Number of successful calls: 100
Number of successful calls: 200
Number of successful calls: 300
Number of successful calls: 400
Number of successful calls: 500
Number of successful calls: 600
Number of successful calls: 700
Number of successful calls: 800
Number of successful calls: 900
Number of successful calls: 1000
time for threaded method: 6.38562583923

Running the threaded HTTP2.0 method. This splits requests into three threads, and makes requests use HTTP2.0.
Number of successful calls: 100
Number of successful calls: 200
Number of successful calls: 300
Number of successful calls: 400
Number of successful calls: 500
Number of successful calls: 600
Number of successful calls: 700
Number of successful calls: 800
Number of successful calls: 900
Number of successful calls: 1000
time for threaded http2 method: 11.7339289188

Running the IO loop method. This uses coroutines instead of separate OS threads.
Number of successful calls: 100
Number of successful calls: 200
Number of successful calls: 300
Number of successful calls: 400
Number of successful calls: 500
Number of successful calls: 600
Number of successful calls: 700
Number of successful calls: 800
Number of successful calls: 900
Number of successful calls: 1000
time for ioloop method: 17.4569571018
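To make the totals above easier to compare, here is a small sketch that converts each one into milliseconds per request and a speedup relative to the naive baseline. The timings are copied from the output; the `rank_methods` helper and the dictionary layout are assumptions made for illustration.

```python
# Totals reported in the output, in seconds, each covering 1000 requests.
totals = {
    "naive synchronous": 36.1137590408,
    "http2 synchronous": 57.045787096,
    "threaded": 6.38562583923,
    "threaded http2": 11.7339289188,
    "ioloop": 17.4569571018,
}
REQUESTS = 1000


def rank_methods(totals, baseline="naive synchronous", requests=REQUESTS):
    """Return rows of (name, ms per request, speedup vs baseline), fastest first."""
    base = totals[baseline]
    rows = [
        (name, secs / requests * 1000.0, base / secs)
        for name, secs in totals.items()
    ]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, ms, speedup in rank_methods(totals):
        print("{:<20} {:6.1f} ms/request  {:4.2f}x baseline".format(name, ms, speedup))
```

On these numbers the threaded method comes out at roughly 6.4 ms per request, a bit under six times faster than the baseline, while the synchronous HTTP2.0 method is the only one slower than the baseline.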