Nginx modules for a memcached page cache cluster

Nginx already ships with a neat module for proxying requests to a memcached server (memcached_pass). Combine that with the upstream round-robin load balancer and you have the beginnings of a memcached page cache cluster for your nginx. However, trying each server in turn can become quite expensive if the key resides on one of the last servers.

Libmemcached has a few different ways of turning a bunch of memcached servers into a larger cluster. The easiest method to implement is modulo. With modulo, the key is hashed using, say, CRC32, the hash is divided by the number of memcached servers, and the remainder is the ‘index’ in the list of memcached servers where the key will reside. This method is fine for a small cluster, but as the cluster grows, adding or removing (or unexpectedly losing) a memcached server becomes an expensive process: the divisor changes, so most keys map to a different server and have to be cached again.
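To make the cost concrete, here is a rough Python sketch of modulo placement. It is an illustration only, not code from libmemcached or from the modules below; the fifth server address and the `/page/N` keys are made up for the example.

```python
import zlib

def modulo_server(key, servers):
    """Pick a server by CRC32(key) modulo the number of servers."""
    index = zlib.crc32(key.encode("utf-8")) % len(servers)
    return servers[index]

servers = ["10.0.1.2:11211", "10.0.1.3:11211", "10.0.1.2:11212", "10.0.1.3:11212"]
grown = servers + ["10.0.1.4:11211"]  # hypothetical fifth server

# Hypothetical page keys, just to have something to hash.
keys = [f"/page/{i}" for i in range(10000)]

# Count the keys whose server changes when the list grows from 4 to 5.
moved = sum(1 for k in keys if modulo_server(k, servers) != modulo_server(k, grown))
print(f"{moved / len(keys):.0%} of keys changed servers")
```

Going from four servers to five, a key keeps its server only when its hash leaves the same remainder under both divisors, which happens for about one hash in five, so roughly four out of five cached pages end up on the wrong server.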

So modulo isn’t very scalable. In steps the Ketama algorithm. It is a bit more complex, but it spreads the keys much more evenly across the cluster and moves far fewer of them when the server list changes. Each server is assigned a couple hundred points on a ring (continuum) of unsigned ints that ranges from 0 to 2^32 - 1. Each point is produced by hashing the server address plus a suffix and turning the hash into an unsigned int. To access data in the cluster, the key is hashed using the same algorithm and converted into an unsigned int, and the server owning the next largest point is used (or the first point on the ring if the key hashes past the highest one). Because each server’s points are scattered around the ring, adding or removing a server no longer costs you potentially huge chunks of contiguous keys.
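The ring lookup can be sketched in a few lines of Python. This is a simplified illustration of the idea and not the libmemcached or upstream_consistent implementation: the choice of MD5, the `-i` suffix format, the 160 points per server, and the fifth server address are all assumptions made for the example, and weights are left out.

```python
import bisect
import hashlib

POINTS_PER_SERVER = 160  # assumed value; the text only says "a couple hundred"

def hash_u32(value):
    """Hash a string and keep 4 bytes of the digest as an unsigned 32-bit int."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "little")

def build_ring(servers):
    """Place POINTS_PER_SERVER points per server on the ring, sorted by value."""
    pairs = sorted(
        (hash_u32(f"{server}-{i}"), server)
        for server in servers
        for i in range(POINTS_PER_SERVER)
    )
    points = [point for point, _ in pairs]
    owners = [server for _, server in pairs]
    return points, owners

def lookup(ring, key):
    """Return the server owning the first point at or after the key's hash."""
    points, owners = ring
    index = bisect.bisect_left(points, hash_u32(key))
    if index == len(points):  # past the highest point: wrap around
        index = 0
    return owners[index]

servers = ["10.0.1.2:11211", "10.0.1.3:11211", "10.0.1.2:11212", "10.0.1.3:11212"]
new_server = "10.0.1.4:11211"  # hypothetical fifth server

before = build_ring(servers)
after = build_ring(servers + [new_server])

keys = [f"/page/{i}" for i in range(10000)]
moved = [k for k in keys if lookup(before, k) != lookup(after, k)]
print(f"{len(moved) / len(keys):.0%} of keys changed servers")
```

In this sketch, the only keys that move when a server is added are the ones the new server takes over, which should be in the neighbourhood of one fifth of them here. Everything else stays where it already was.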

Here’s a graph from the Kai Project (which is a distributed key-value store) showing the reallocation ratio as more nodes are added to the cluster.

I implemented the weighted ketama algorithm found in libmemcached for my upstream module. This allows for a bit more fine tuning of the continuum if needed.

Another issue with using memcached as a page cache for nginx is that keys are limited to 250 bytes. If the page cache is keyed on the URI, this limit can easily be reached. To combat this, I took the memcached_pass module and added a step that hashes the key using SHA-256. This produces a nice fixed-length 64-byte key that is the base 16 (hex) representation of the hash.
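Whatever writes pages into memcached has to store them under the same hashed key that nginx will look up. A minimal Python sketch of that key derivation follows; it assumes the module hashes the raw bytes of the key with no prefix or salt and emits lowercase hex, which is worth checking against the module source. The function name `memcached_key` and the sample URI are made up for the example.

```python
import hashlib

def memcached_key(uri):
    """Hex-encoded SHA-256 of the URI: always 64 characters long."""
    return hashlib.sha256(uri.encode("utf-8")).hexdigest()

# A URI well past memcached's 250-byte key limit.
long_uri = "/search?q=" + "x" * 400
key = memcached_key(long_uri)

print(len(long_uri), len(key))  # 410 64
```

The key length no longer depends on the URI, so even very long query strings stay well under the limit.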

memcached_hash_pass on GitHub
Direct link: git://github.com/dctrwatson/nginx-memcached-hash-pass.git

upstream_consistent on GitHub
Direct link: git://github.com/dctrwatson/nginx-upstream-consistent.git

To use these modules, nginx has to be recompiled. A great guide to nginx modules can be found on Evan Miller’s site.

Download the latest Nginx source

The modules are quick and easy to compile in:

./configure --add-module=/path/to/ngx_http_memcached_hash_module \
            --add-module=/path/to/ngx_http_upstream_consistent_module
make
make install

To define the memcached cluster, just add ‘consistent’ to the upstream definition. If you want to change a server’s weight from the default of ‘1’, add the weight attribute to its server definition. A common approach is to weight each server by the amount of memory dedicated to its memcached instance.

upstream memcached_cluster {
    server 10.0.1.2:11211;
    server 10.0.1.3:11211;
    server 10.0.1.2:11212;
    server 10.0.1.3:11212;
    consistent;
}

The memcached_hash_pass module is analogous to memcached_pass, with the key set in $memcached_hash_key:

location / {
    set $memcached_hash_key $request_uri;
    memcached_hash_pass memcached_cluster;
    error_page 404 = @cache_miss;
}
location @cache_miss {
    internal;
    proxy_pass http://my-app-cluster;
}

  • http://www.facebook.com/artem.yankov Artem Yankov

    Cool stuff. Which version of nginx did you use?
    I tried to compile with nginx 0.6.39, but it throws errors
    ../nginx-memcached-hash-pass/ngx_http_memcached_hash_module.c:77: error: ‘ngx_http_upstream_bind_set_slot’ undeclared here (not in a function)

    • http://www.dctrwatson.com John Watson

      Originally developed against 0.8.x but I can confirm it works with 1.0.x as well.