@rodneyrehm
Last active January 30, 2019 21:28
execute batches of HTTP requests in parallel

Parallel HTTP request execution

  • Takes values from a text file and converts them into HTTP requests (against an echo service with a request execution time of 150ms - 200ms).
  • Splits the text file into chunks of $CHUNKS items each; every chunk is executed in sequence by a single curl invocation that reuses the connection (reducing TCP and TLS overhead).
  • Executes $PARALLEL chunks concurrently via sem (part of GNU parallel) - see the minimal sketch after this list.
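
Both building blocks can be tried in isolation. A minimal sketch - the echo service URL matches the scripts below, the item values a and b are just placeholders, and sem requires GNU parallel to be installed:

# two requests executed in sequence over a single reused TCP/TLS connection
curl --silent "https://postman-echo.com/get?item=a" \
  --next --silent "https://postman-echo.com/get?item=b"

# two jobs queued into 2 slots, executed concurrently
sem -j 2 'sleep 1 && echo first done'
sem -j 2 'sleep 1 && echo second done'
sem --wait   # block until all queued jobs have finished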

Example

The demo generates 100 requests. Each request's response is stored to disk (because why not).

run-sequentially.sh executes the 100 requests in sequential curl invocations. Each invocation fires up a new process, establishes a new TCP connection and does the TLS handshake anew.

run-parallel.sh executes 5 request batches concurrently. Each batch consists of 10 sequential requests (re)using a single TCP/TLS connection.
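
With CHUNKS=10, the split call in run-parallel.sh turns the 100 lines of data.txt into 10 chunk files of 10 items each, named with split's default alphabetic suffixes:

split -l 10 data.txt 'chunks/chunk_'
ls chunks
# chunk_aa  chunk_ab  chunk_ac  chunk_ad  chunk_ae
# chunk_af  chunk_ag  chunk_ah  chunk_ai  chunk_aj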

time ./run-parallel.sh
…
./run-parallel.sh  1.05s user 0.80s system 30% cpu 6.152 total
time ./run-sequentially.sh
…
./run-sequentially.sh  3.10s user 0.70s system 1% cpu 4:07.13 total

As those numbers show (6 seconds vs. 4 minutes), the parallel execution with reused connections plays in a different league - it's roughly 40 times faster for 100 requests (247s / 6.15s ≈ 40). For 200 requests the parallel execution finished after 10 seconds - guess how long the sequentially executed requests took…
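
A back-of-the-envelope sanity check, assuming the echo service's stated 150ms - 200ms per request (~0.175s on average) and the demo's 10 chunks of 10 items:

parallel:   10 chunks / 5 slots = 2 waves; 2 × 10 × ~0.175s ≈ 3.5s of request time → ~6s observed
sequential: 100 × ~0.175s ≈ 17.5s of request time; the remaining ~230s of the observed
            4:07 are per-invocation overhead (process startup, DNS, TCP and TLS setup)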

Notes

  • The chunk size is restricted by the executing machine's resources - that's the one sending the requests.
  • The number of parallel executions is restricted by the host machine's resources - that's the one you're bombarding with requests.
data.txt

alpha-0
bravo-0
charlie-0
delta-0
echo-0
foxtrot-0
golf-0
hotel-0
india-0
juliet-0
kilo-0
lima-0
mike-0
november-0
oskar-0
papa-0
quebec-0
romeo-0
sierra-0
tango-0
uniform-0
victor-0
whiskey-0
xray-0
yankee-0
zulu-0
alpha-1
bravo-1
charlie-1
delta-1
echo-1
foxtrot-1
golf-1
hotel-1
india-1
juliet-1
kilo-1
lima-1
mike-1
november-1
oskar-1
papa-1
quebec-1
romeo-1
sierra-1
tango-1
uniform-1
victor-1
whiskey-1
xray-1
yankee-1
zulu-1
alpha-2
bravo-2
charlie-2
delta-2
echo-2
foxtrot-2
golf-2
hotel-2
india-2
juliet-2
kilo-2
lima-2
mike-2
november-2
oskar-2
papa-2
quebec-2
romeo-2
sierra-2
tango-2
uniform-2
victor-2
whiskey-2
xray-2
yankee-2
zulu-2
alpha-3
bravo-3
charlie-3
delta-3
echo-3
foxtrot-3
golf-3
hotel-3
india-3
juliet-3
kilo-3
lima-3
mike-3
november-3
oskar-3
papa-3
quebec-3
romeo-3
sierra-3
tango-3
uniform-3
victor-3
run-parallel.sh

#!/bin/bash
set -e

# number of items per batch
CHUNKS=10
# number of parallel batch executions
PARALLEL=5

# split huge data set into smaller chunks
rm -rf chunks || true; mkdir 'chunks'
split -l "${CHUNKS}" data.txt 'chunks/chunk_'

# container to store responses
rm -rf results || true; mkdir results

# build one curl invocation per chunk; --next separates the requests
# so they run in sequence on a single (reused) connection
function execute_batch () {
  ARGS=()
  BATCH=$(cat "${1}/chunks/${2}")
  for ITEM in ${BATCH}; do
    ARGS+=( --next --silent --output "${1}/results/${ITEM}" --show-error --fail "https://postman-echo.com/get?item=${ITEM}" )
  done
  echo "executing batch ${2}"
  curl "${ARGS[@]}"
}
export -f execute_batch

# queue one batch per chunk file; sem limits concurrency to $PARALLEL jobs
for FILE in ./chunks/chunk_*; do
  CHUNK=$(basename "${FILE}")
  sem -j "${PARALLEL}" "execute_batch \"${PWD}\" \"${CHUNK}\""
done

# block until all queued batches have finished
sem --wait
echo "all done"
run-sequentially.sh

#!/bin/bash
set -e

# container to store responses
rm -rf results || true; mkdir results

# one curl invocation (process, TCP connection, TLS handshake) per item
BATCH=$(cat "./data.txt")
for ITEM in ${BATCH}; do
  echo "executing ${ITEM}"
  curl --silent --output "./results/${ITEM}" --show-error --fail "https://postman-echo.com/get?item=${ITEM}"
done
echo "all done"