[submodule "libuv"]
path = libuv
url = https://github.com/joyent/libuv.git

Benchmark Challenge

As I work on several platforms that are often used as web frameworks, the first example application is always an HTTP server that serves the static string "Hello World" to every client. These servers are then benchmarked against each other to prove which platform is the best at being web scale.

Then comes a flood of comments about how this benchmark is meaningless for any real-world work. So I hereby propose a new standard for HTTP benchmarks. Using this standard data, platforms can implement the challenge and see how well they perform.

The Database

Most real apps have some sort of persistent database. Here I'll have two "tables" named "users" and "sessions". Their contents are simple. This can be implemented using an in-memory object like I've done here, or it can live in Redis, Riak, or some other system; whatever makes sense for your platform.

var users = {
  creationix: {
    name: "Tim Caswell",
    twitter: "creationix",
    github: "creationix",
    irc: "creationix",
    projects: ["node", "Luvit", "Luvmonkey", "candor.io", "vfs", "architect", "wheat", "step"],
    websites: ["http://howtonode.org/", "http://creationix.com/", "http://nodebits.org/"]
  },
};

var sessions = {
  eo299pqyw9791jie7yp: {
    username: "creationix",
    pageViews: 0,
  },
};

The Request

There is one page in this made-up system. To render it, the user data needs to be merged with some HTML.

function myprofile(user) {
  var html = '<h2>' + user.name + '</h2>\n';
  html += '<ul class="links">\n';
  if (user.twitter) {
    html += '  <li><a href="https://twitter.com/' + user.twitter + '/">Twitter</a></li>\n';
  }
  if (user.github) {
    html += '  <li><a href="https://github.com/' + user.github + '/">Github</a></li>\n';
  }
  if (user.websites) {
    user.websites.forEach(function (website) {
      html += '  <li><a href="' + website + '/">' + website + '</a></li>\n';
    });
  }
  html += '</ul>\n';
  return html;
}

The HTTP request should have a cookie called "SESSID" (or similar) that contains the session key "eo299pqyw9791jie7yp". To keep the benchmark fair, no caching is allowed in the session or database lookups (if you use just one process, then an in-process object or table is fair game). This simulates the requests not all coming from the same user and session. The HTML template, however, can be compiled and saved at startup, since a site usually has a finite number of templates.

After getting the session ID out of the cookie, the server needs to look up the session data, then load the HTML template and merge it with the user's data from the database. The template itself can pull from the db, or you can pull from the db and pass the data to the template.
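
For example, in Node the lookup-and-render step could be sketched like this (a minimal sketch, assuming the in-memory sessions/users objects and the myprofile function above; renderProfile is a hypothetical helper, and sessionId is whatever came out of the SESSID cookie):

function renderProfile(sessionId) {
  var session = sessions[sessionId];   // session lookup -- no caching allowed
  if (!session) return null;           // unknown session: the request should be rejected
  var user = users[session.username];  // user lookup from the "database"
  return myprofile(user);              // merge the user data into the template
}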

It's up to you whether to use a fixed Content-Length or chunked encoding, but the following headers must be present in the response.

Date: the current timestamp
Server: A simple string identifying your server, eg "node"
Content-Type: "text/html"
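
In Node, for instance, the response could be written roughly like this (a sketch only; html is the rendered profile from the previous step, and "node" as the Server string is just an example):

// Node adds a Date header automatically, but it is spelled out here to make the requirement explicit.
res.writeHead(200, {
  "Date": new Date().toUTCString(),
  "Server": "node",
  "Content-Type": "text/html",
  "Content-Length": Buffer.byteLength(html)  // or drop this and use chunked encoding
});
res.end(html);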

The Client

To test this, the client needs to send the proper headers. With apache-bench this can be done using the -H flag to pass in the cookie header. The server should reject invalid requests.
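
If you would rather drive the test from Node instead of apache-bench, a single request carrying the session cookie looks roughly like this (a sketch; the port and path are assumptions, and the SESSID value is the one from the sessions table above):

var http = require('http');

// Hypothetical smoke test: one request carrying the required session cookie.
http.get({
  host: '127.0.0.1',
  port: 8000,
  path: '/',
  headers: { 'Cookie': 'SESSID=eo299pqyw9791jie7yp' }
}, function (res) {
  var body = '';
  res.on('data', function (chunk) { body += chunk; });
  res.on('end', function () {
    console.log(res.statusCode, body);
  });
});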

Points to Ponder

Notice that I said it's legal to store the data in memory. This means a simple node server can implement this without doing any I/O to gather the data. But if you want to scale across CPU cores for maximum speed, then a shared data store is required. This is a tradeoff in real applications. Scaling to threads within a process, then to processes within a machine, then across machines and data centers, is hard; the larger the cluster, the harder it is to share data. Since the data here is read-only, this isn't too hard, but I don't want to make the benchmark too complicated.
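
As a sketch of the multi-process variant, Node's cluster module can fork one worker per core; each worker then has to query a shared store (Redis, the raw libuv server below, etc.) instead of an in-process object. The port and the placeholder handler are assumptions:

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Fork one worker per CPU core.
  for (var i = 0; i < numCPUs; i++) cluster.fork();
} else {
  // Each worker serves requests; session/user lookups must go to a shared store here.
  http.createServer(function (req, res) {
    res.writeHead(200, { 'Content-Type': 'text/html' });
    res.end('<h2>Hello from worker ' + process.pid + '</h2>\n');
  }).listen(8000);
}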

/* db.c: a tiny "database" server built directly on libuv.
 * The protocol is trivial: the client writes "table/key\0" and the server
 * answers with a JSON blob followed by a NUL byte. Only the two hard-coded
 * keys from the challenge are understood; anything else closes the
 * connection. (Written against the older libuv API of the time:
 * uv_last_error, an on_alloc callback that returns uv_buf_t, etc.) */
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

#include "uv.h"

typedef enum {
  START = 0,
  USERNAME,
  SESSID
} state_t;

typedef struct {
  uv_tcp_t handle;
  state_t state;
  int offset;
} client_t;

/* Canned request keys and JSON responses. The lengths include the trailing
 * NUL, which doubles as the protocol's message terminator. */
static const char user_key[] = "users/creationix";
static int user_key_len = sizeof(user_key);
static char user[] = \
  "{\"name\":\"Tim Caswell\"" \
  ",\"twitter\":\"creationix\"" \
  ",\"github\":\"creationix\"" \
  ",\"irc\":\"creationix\"" \
  ",\"projects\":[\"node\",\"Luvit\",\"Luvmonkey\",\"candor.io\",\"vfs\",\"architect\",\"wheat\",\"step\"]"\
  ",\"websites\":[\"http://howtonode.org/\",\"http://creationix.com/\",\"http://nodebits.org/\"]"\
  "}";
static int user_len = sizeof(user);

static const char session_key[] = "sessions/eo299pqyw9791jie7yp";
static int session_key_len = sizeof(session_key);
static char session[] = \
  "{\"username\": \"creationix\"" \
  ",\"pageViews\": 0" \
  "}";
static int session_len = sizeof(session);

static uv_buf_t on_alloc(uv_handle_t* handle, size_t suggested_size) {
  uv_buf_t buf;
  buf.base = malloc(suggested_size);
  buf.len = suggested_size;
  return buf;
}

static void on_close(uv_handle_t* handle) {
  printf("%p: Handle closed.\n", handle);
  free(handle);
}

static void after_write(uv_write_t* req, int status) {
  /* printf("%p: after_write\n", req->handle); */
  free(req);
}

/* Incrementally match incoming bytes against the two known keys and reply
 * with the corresponding JSON blob; any mismatch closes the connection. */
static void on_read(uv_stream_t* socket, ssize_t nread, uv_buf_t buf) {
  client_t* client = socket->data;
  if (nread > 0) {
    int i;
    for (i = 0; i < nread; i++) {
      char c = buf.base[i];
      switch (client->state) {
      case START:
        if (c == user_key[0]) {
          client->state = USERNAME;
          client->offset = 1;
        }
        else if (c == session_key[0]) {
          client->state = SESSID;
          client->offset = 1;
        }
        else {
          uv_close((uv_handle_t*)socket, on_close);
          i = nread;
        }
        break;
      case USERNAME:
        if (user_key[client->offset] != c) {
          uv_close((uv_handle_t*)socket, on_close);
          i = nread;
          break;
        }
        if (client->offset == user_key_len - 1) {
          /* Full key (including the NUL terminator) matched: send the user JSON. */
          uv_write_t* req = (uv_write_t*)malloc(sizeof(uv_write_t));
          uv_buf_t data[] = {{ .base = user, .len = user_len }};
          uv_write(req, socket, data, 1, after_write);
          client->state = START;
          break;
        }
        client->offset++;
        break;
      case SESSID:
        if (session_key[client->offset] != c) {
          uv_close((uv_handle_t*)socket, on_close);
          i = nread;
          break;
        }
        if (client->offset == session_key_len - 1) {
          /* Full key matched: send the session JSON. */
          uv_write_t* req = (uv_write_t*)malloc(sizeof(uv_write_t));
          uv_buf_t data[] = {{ .base = session, .len = session_len }};
          uv_write(req, socket, data, 1, after_write);
          client->state = START;
          break;
        }
        client->offset++;
        break;
      }
    }
  }
  free(buf.base);
  if (nread < 0) {
    uv_err_t err = uv_last_error(uv_default_loop());
    if (err.code != UV_EOF) {
      fprintf(stderr, "%p: %s: %s\n", socket, uv_err_name(err), uv_strerror(err));
    }
    uv_close((uv_handle_t*)socket, on_close);
  }
}

static void on_connection(uv_stream_t* server, int status) {
  client_t* client = malloc(sizeof(client_t));
  client->state = START;
  client->offset = 0;
  uv_tcp_t* socket = &client->handle;
  uv_tcp_init(uv_default_loop(), socket);
  socket->data = client;
  printf("%p: New Client.\n", socket);
  if (uv_accept(server, (uv_stream_t*)socket)) {
    uv_err_t err = uv_last_error(uv_default_loop());
    fprintf(stderr, "%p: accept: %s\n", socket, uv_strerror(err));
    exit(-1);
  }
  uv_read_start((uv_stream_t*)socket, on_alloc, on_read);
}

int main() {
  uv_tcp_t server;
  uv_tcp_init(uv_default_loop(), &server);
  struct sockaddr_in address = uv_ip4_addr("0.0.0.0", 5555);
  if (uv_tcp_bind((uv_tcp_t*)&server, address)) {
    uv_err_t err = uv_last_error(uv_default_loop());
    fprintf(stderr, "%p: bind: %s\n", &server, uv_strerror(err));
    return -1;
  }
  if (uv_listen((uv_stream_t*)&server, 128, on_connection)) {
    uv_err_t err = uv_last_error(uv_default_loop());
    fprintf(stderr, "%p: listen: %s\n", &server, uv_strerror(err));
    return -1;
  }
  printf("%p: Raw C database listening on port 5555\n", &server);
  /* Block in the main loop */
  uv_run(uv_default_loop());
  return 0;
}

# Makefile for the raw libuv db server above; libuv is built from the submodule.
OS_NAME=$(shell uname -s)
LIBS+=-lm -ldl -lpthread
ifeq (${OS_NAME},Darwin)
LDFLAGS+=-framework CoreServices
else ifeq (${OS_NAME},Linux)
LDFLAGS+=-Wl,-E
LIBS+=-lrt
endif

all: db

libuv/uv.a:
	$(MAKE) -C libuv

db.o: db.c
	$(CC) --std=c89 -D_GNU_SOURCE -g -Wall -Werror -c db.c -o db.o -I libuv/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64

db: db.o libuv/uv.a
	$(CC) db.o libuv/uv.a $(LIBS) -o db

clean:
	rm -f db db.o
	$(MAKE) -C libuv clean

// Node client that hammers the db server above: it looks up the session,
// then the user, in a tight loop and reports completed cycles per second.
var net = require('net');
var EventEmitter = require('events').EventEmitter;

function connect(port, callback) {
  var callbacks = [];
  var db = new EventEmitter();
  db.query = function (table, key, callback) {
    callbacks.push(callback);
    socket.write(table + "/" + key + "\0");
  };
  db.close = function () {
    socket.end();
  };
  var socket = net.connect(port, function() {
    callback(null, db);
  });
  // parse responses
  socket.on("data", function (chunk) {
    // TODO: Don't assume packets always contain exactly one response.
    callbacks.shift()(null, JSON.parse(chunk.toString('ascii', 0, chunk.length - 1)));
  });
  socket.on("close", function () {
    throw new Error("Connection closed");
  });
  socket.on("error", function (err) {
    db.close();
    throw err;
  });
}

var done = 0;

function client() {
  connect(5555, function (err, db) {
    if (err) throw err;
    next();
    function next() {
      db.query("sessions", "eo299pqyw9791jie7yp", function (err, session) {
        if (err) throw err;
        // console.log({session: session});
        db.query("users", session.username, function (err, user) {
          if (err) throw err;
          // console.log({user: user});
          done++;
          next();
        });
      });
    }
  });
}

// Report throughput once a second.
var before = Date.now();
setInterval(function () {
  var now = Date.now();
  var delta = now - before;
  before = now;
  var speed = done * 1000 / delta;
  console.log("%s cycles in %sms (%s/second)", done, delta, speed);
  done = 0;
}, 1000);

// Run two concurrent request pipelines.
for (var i = 0; i < 2; i++) {
  client();
}

-- Luvit (Lua) client equivalent of the Node client above: same protocol,
-- same session -> user lookup loop, reporting cycles per second.
local net = require('net')
local Emitter = require('core').Emitter
local JSON = require('json')
local setInterval = require('timer').setInterval

-- Minimal FIFO queue used to match responses to pending callbacks.
local Queue = {}

function Queue.new()
  return {first = 0, last = -1}
end

function Queue.push(queue, value)
  local last = queue.last + 1
  queue.last = last
  queue[last] = value
end

function Queue.shift(queue)
  local first = queue.first
  if first > queue.last then error("queue is empty") end
  local value = queue[first]
  queue[first] = nil
  queue.first = first + 1
  return value
end

local function connect(port, callback)
  local callbacks = Queue.new()
  local db = Emitter:new()
  local socket
  function db.query(table, key, callback)
    socket:write(table .. "/" .. key .. "\0")
    Queue.push(callbacks, callback)
  end
  function db.close()
    socket:close()
  end
  socket = net.createConnection(port, function ()
    callback(nil, db)
  end)
  -- parse responses
  socket:on("data", function (chunk)
    -- TODO: Don't assume packets always contain exactly one response.
    Queue.shift(callbacks)(nil, JSON.parse(chunk:sub(1, #chunk - 1)))
  end)
  socket:on("close", function ()
    error("Connection Closed")
  end)
  socket:on("error", function (err)
    db.close()
    error(err)
  end)
end

local done = 0

local function client()
  connect(5555, function (err, db)
    if err then error(err) end
    local function next()
      db.query("sessions", "eo299pqyw9791jie7yp", function (err, session)
        if err then error(err) end
        -- p({session=session})
        db.query("users", session.username, function (err, user)
          if err then error(err) end
          -- p({user=user})
          done = done + 1
          next()
        end)
      end)
    end
    next()
  end)
end

-- Report throughput once a second.
local hrtime = require('uv_native').hrtime
local before = hrtime()
setInterval(1000, function ()
  local now = hrtime()
  local delta = (now - before) / 10
  before = now
  local speed = done * 1000 / delta
  print(done .. " cycles in " .. delta .. "ms (" .. speed .. "/second)")
  done = 0
end)

client()
client()