@lampeh
Created October 16, 2012 00:13
Le perroquet: push updated data directly into varnish

Varnish key/value store

From "Smart Pre-Fetching: Varnish @ Yakaz.com by Pierre-Gilles Mialon, Yakaz.com"

Use varnish like a memcached with an HTTP interface. Push data updates directly into the cache without having to buffer them outside varnish and pull them through a backend.

  • no backend! the content is echoed by another varnish thread through vcl_error()
  • the content lives only in the varnish cache until it expires or the cache is cleared
  • gzip compression is handled by varnish. add "FC-Content-Encoding: gzip" to the request for client-side compression
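A minimal sketch of the client-side compression mentioned above, assuming GNU gzip and base64 are installed: the value is gzipped first and then base64-encoded on a single line, and the request would additionally carry "FC-Content-Encoding: gzip". The helper names fc_encode_gzip and fc_decode_gzip are hypothetical and not part of the original script. Since gzip output is binary, the VCL side presumably needs one of the binary-safe decode variants described under the limits.

```shell
# Hypothetical helper: gzip stdin, then base64-encode it without line wraps
# so the result can be sent as the value of the Forcecontent header.
fc_encode_gzip() {
    gzip -c -9 | base64 -w0
}

# Inverse of fc_encode_gzip, for checking a payload locally before pushing it.
fc_decode_gzip() {
    base64 -d | gunzip -c
}
```

In the update script this would replace the plain base64 step, for example value="$(fc_encode_gzip)", together with an extra -H "FC-Content-Encoding: gzip" on the curl call.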

PoC Limits:

  • default usable data size: ~5kB before base64-encoding, after optional client-side gzip
  • default request header length limit: 8kB per header (run-time parameter http_req_hdr_len). exceeding the limit results in "413 Request Entity Too Large"
  • default total request size limit: 32kB (run-time parameter http_req_size). exceeding the limit results in a connection reset: "11 SessionClose c blast"
  • default session workspace limit: 64kB (run-time parameter sess_workspace). should be several times larger than the base64 data
  • binary data requires either digest.synthetic_base64_decode (a frankenfunction built from base64_decode and null.synth) or libvmod-null and an FC-Data-Length header
  • the request must not be "pass"ed in VCL or varnish won't cache the response
  • cache misses without Forcecontent header will be sent to the default backend unless handled elsewhere in your VCL
  • request body is not passed to the backend through a "miss", therefore special request headers must transport the base64-encoded content
  • request header size is always limited. future VCL body access could make it work with even larger objects regardless of http_req_(size|hdr_len).
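The size limits above come down to simple arithmetic: base64 turns every 3 raw bytes into 4 characters, so n bytes become 4 * ceil(n / 3) characters. The sketch below checks whether a payload would fit into a header budget before it is pushed. The function names b64_len and fits_header are hypothetical, the 8192-byte default is taken from the http_req_hdr_len figure listed above, and the check ignores any internal accounting varnish does, so treat the result as an estimate.

```shell
# Length of the base64 encoding (with padding) of $1 raw bytes: 4 * ceil(n / 3).
b64_len() {
    echo $(( ($1 + 2) / 3 * 4 ))
}

# Succeeds if $1 raw bytes, sent as "Forcecontent: <base64>", fit into the
# header budget given as $2 (assumed default: 8192 bytes).
fits_header() {
    name="Forcecontent: "
    [ $(( ${#name} + $(b64_len "$1") )) -le "${2:-8192}" ]
}
```

For a 5kB value, b64_len 5120 gives 6828 characters, which leaves room within 8kB; a 6kB value (6144 bytes) already encodes to 8192 characters and no longer fits once the header name is added.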

Setup

VCL

import std;
import digest;

## point to local varnish instance
backend localvarnish {
    .host = "127.0.0.1";
    .port = "80";
}

sub vcl_recv {
    if (req.http.Forcecontent) {
        ## check signature. use hmac_sha256 instead of hash_sha256 if possible
        ## TODO: make FC-Content-Type and FC-Cache-Control optional, include in signature if present
        if (req.http.FC-Auth && req.http.FC-TS && req.http.FC-Content-Type &&
            (now - std.duration(req.http.FC-TS + "s", 0s)) < 300s &&
            req.http.FC-Auth == digest.hash_sha256(req.http.FC-TS + req.http.Forcecontent + "s3cretf00" + req.http.FC-Content-Type)) {
                if (req.http.FC-Echo == "1") {
                    ## echo content
                    error 623;
                } else {
                    ## invoke varnish parrot
                    set req.http.FC-Echo = "1";
                    set req.hash_ignore_busy = true;
                    set req.hash_always_miss = true;
                    set req.backend = localvarnish;
                    return(lookup);
                }
        } else {
            error 403 "Unauthorized";
        }
    }
}

sub vcl_fetch {
    if (req.http.Forcecontent) {
        ## gzip the response before caching it
        ## TODO: maybe blacklist (image|video)/ instead and/or
        ## compress only small objects?
        if (beresp.http.Content-Type ~ "^(text|application)/") {
            set beresp.do_gzip = true;
        }
    }
}

sub vcl_error {
    ## the varnish parrot
    ## echo content passed in Forcecontent
    if (obj.status == 623) {
        set obj.status = 200;
        set obj.response = "Ok";

        ## set response Content-Type
        if (req.http.FC-Content-Type) {
            set obj.http.Content-Type = req.http.FC-Content-Type;
        } else {
            set obj.http.Content-Type = "application/octet-stream";
        }

        ## set response Content-Encoding
        if (req.http.FC-Content-Encoding) {
            set obj.http.Content-Encoding = req.http.FC-Content-Encoding;
        } 

        ## set response Cache-Control
        if (req.http.FC-Cache-Control) {
            set obj.http.Cache-Control = req.http.FC-Cache-Control;
        } else {
            ## v-maxage needs VCL support to set beresp.ttl
            set obj.http.Cache-Control = "v-maxage=4294967295, max-age=300";
        }

        if (req.http.FC-Base64) {
            ## synthetic() expects a null-terminated string!
            synthetic("" + digest.base64_decode(req.http.Forcecontent));
            ## for binary data, use synthetic_base64_decode:
            #digest.synthetic_base64_decode("" + req.http.Forcecontent);
            ## or use libvmod-null and pass the decoded length in a request header
            #null.synth(digest.base64_decode(req.http.Forcecontent), req.http.FC-Data-Length);
        } else {
            synthetic("" + req.http.Forcecontent);
        }

        return(deliver);
    }
}   
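The authorization check in vcl_recv can be reproduced on the client to debug a "403 Unauthorized". The sketch below assumes the same concatenation order as the VCL (timestamp, Forcecontent value, shared secret, content type) and the same 300-second window. The names fc_sign, fc_fresh and FC_SECRET are hypothetical, and the secret is simply the placeholder from the VCL above. Like the VCL expression, fc_fresh does not reject timestamps that lie in the future.

```shell
# Assumed shared secret, copied from the VCL; it has to match on both sides.
FC_SECRET="s3cretf00"

# Hex SHA-256 over timestamp ($1) + header value ($2) + secret + content type ($3),
# concatenated in the same order as in vcl_recv.
fc_sign() {
    printf '%s' "${1}${2}${FC_SECRET}${3}" | sha256sum | awk '{ print $1 }'
}

# Mirrors the "(now - FC-TS) < 300s" test. $1 is the FC-TS value,
# $2 an optional current Unix time (defaults to the system clock).
fc_fresh() {
    [ $(( ${2:-$(date +%s)} - $1 )) -lt 300 ]
}
```

If fc_sign "$ts" "$value" "$valuetype" does not match the FC-Auth header that was sent, the usual suspects are a trailing newline in the value or a secret that differs between client and VCL.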

Update key/value content

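#!/bin/bash

#set -x

key="$1"        ## URL of the key, e.g. http://parrot.example.com/foo
valuetype="$2"  ## Content-Type of the value

## read the value from stdin and base64-encode it on a single line
value="$(base64 -w0)"

ts="$(date +%s)"

## signature over timestamp + value + secret + content type, same order as in the VCL
auth="$(printf '%s' "${ts}${value}s3cretf00${valuetype}" | sha256sum | awk '{ print $1 }')"

## -I sends a HEAD request and prints only the response headers
curl -H "FC-Auth: ${auth}" \
     -H "FC-TS: ${ts}" \
     -H "FC-Base64: 1" \
     -H "FC-Content-Type: ${valuetype}" \
     -H "Forcecontent: ${value}" \
     -I "${key}"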

Usage

Update key

## text works well
echo "Updated at `date`" | ./fc.sh http://parrot.example.com/foo text/plain

## even binary content
convert bigpicture.jpg -resize 32x32 jpg:- | ./fc.sh http://parrot.example.com/bar image/jpeg

Fetch cached value

curl -i http://parrot.example.com/foo

pmialon commented Jan 4, 2013

Hi,

I have made some modifications in my fork. Now, when an object is updated in the cache, we ensure that the old version is removed from the cache storage within the next 4 minutes.
