The upcoming 0.4 release of the nginx-push-stream-module will support the Nginx gzip filter. Being able to gzip messages will free up bandwidth and decrease latency under high load. However, the default deflate settings Nginx uses are not ideal for the high concurrency and small messages typical of the push-stream module. By default, Nginx may allocate a relatively large chunk of memory (up to 264 KB) for zlib upfront for every request that supports gzip. This adds up fast when there are thousands of concurrent connections to Nginx.
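The 264 KB figure lines up with the memory formula documented in zlib's zconf.h for deflate, (1 << (windowBits + 2)) + (1 << (memLevel + 9)). The Python sketch below does that arithmetic. The window and memory-level defaults and the 8 KB of slack are assumptions about how Nginx sizes its preallocation, and the function name is purely illustrative.

```python
def deflate_bytes(window_bits=15, mem_level=8, slack=8192):
    """Estimate the bytes preallocated for one deflate stream.

    The two shifted terms come from zlib's documented deflate memory
    formula. `slack` is an assumed fixed overhead, not a documented value.
    """
    return slack + (1 << (window_bits + 2)) + (1 << (mem_level + 9))


def total_bytes(connections, **kwargs):
    """Scale the per-stream estimate to a number of concurrent connections."""
    return connections * deflate_bytes(**kwargs)


if __name__ == "__main__":
    per_stream = deflate_bytes()
    print(per_stream // 1024, "KB per connection with the assumed defaults")
    print(total_bytes(10000) / 2**30, "GiB for 10,000 connections")
    # Hypothetical smaller settings, to show how quickly the cost drops.
    print(deflate_bytes(window_bits=11, mem_level=4) // 1024, "KB per connection")
```

Since both terms are powers of two, lowering either setting by one halves its share of the allocation, which is why small-message workloads can afford much smaller values.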
Cassandra Metrics Graphite Reporter Agent
With the release of Cassandra 1.2, many new metrics were instrumented with Metrics (CASSANDRA-4009). However, getting those metrics into something like Graphite was still a polling process. Metrics does have Reporters that let Java agents push metrics stored in the registry to various datastores (Graphite, Ganglia, etc.). Currently, this requires writing the agent code, compiling it and loading it into Cassandra. Soon there will be a way to just configure these reporters using metrics-reporters-config (CASSANDRA-4430). For now, though, this simple agent will push metrics into Graphite while filtering out some noise.
Datastax has a blog post with a brief outline of how to enable the GraphiteReporter but it doesn’t go into much detail or release any code. This post augments it with the missing pieces.
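The agent itself is Java, but the push-and-filter idea is easy to sketch. The Python below is only an illustration of Graphite's plaintext protocol (one "path value timestamp" line per metric) with a simple substring filter; the function name, prefix and metric names are hypothetical and are not taken from the agent.

```python
import time


def graphite_lines(metrics, prefix, exclude=(), timestamp=None):
    """Format metrics as Graphite plaintext lines, skipping noisy names.

    `metrics` maps a dotted metric name to a numeric value. Any name
    containing one of the `exclude` substrings is dropped.
    """
    ts = int(time.time() if timestamp is None else timestamp)
    lines = []
    for name, value in sorted(metrics.items()):
        if any(part in name for part in exclude):
            continue
        lines.append(f"{prefix}.{name} {value} {ts}\n")
    return lines


if __name__ == "__main__":
    sample = {"ClientRequest.Read.Latency": 12, "Noise.Example": 1}
    for line in graphite_lines(sample, "cassandra.node1", exclude=("Noise",)):
        print(line, end="")
```

The resulting lines would be written to a TCP socket connected to the Graphite plaintext listener.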
Jenkins and Phabricator sitting in a tree
We’ve been using Phabricator for just about a year here at Disqus. It was originally created at Facebook and open sourced in Spring 2011. To sum it up using their own words: “Phabricator is a open source collection of web applications which make it easier to write, review, and share source code.” The small team working on it at Phacility (the SaaS company behind Phabricator) is constantly improving it so it’s on a continuous release cycle.
Jenkins has been used for continuous integration testing here for much longer. I’m not exactly sure how long, since it was set up before I started in September 2011. David Cramer has always been pushing for an ideal continuous integration/deployment system (e.g. here and here), so part of my duties has been to improve what we have to achieve that goal (we’re hiring).
Currently, there isn’t a direct CI hook into Phabricator that is as deep as, say, GitHub+Travis. However, with a little script and a simple event listener for Arcanist, we can replicate most of that functionality.
GNU Make, double quotes and lists
Our lead operations engineer, Scott, put together a nice system called fpm-recipes using Git, GNU Make and FPM to keep track of how we build DEB packages of various things at Disqus. Instead of each ops engineer having their own way of building packages, with the steps stored in various places (e.g. shell history), we now have a centralized and standardized system. No longer do we have to ask each other to update a package they maintain, or curse ourselves for not saving the steps somewhere organized and accessible.
In no time I was able to get erlang-nox and zeromq recipes written (since they haven’t been updated in Ubuntu 10.04 LTS (Lucid Lynx) in ages). However, when I went back and tried to add their dependencies, things got a little hairy. GNU Make’s foreach function assumes lists “are whitespace-separated words”, so something like DEPENDS := "libuuid1 (>= 2.16)" really doesn’t work as intended when passed to foreach. So I wrote a function, quoted_map, that maps another function over a quoted list of strings. In fpm-recipes, it adds the -d flag, makes sure the value is quoted (-d "libuuid1 (>= 2.16)") and appends it to the FPM args list.
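To make the splitting problem concrete, here is a rough Python analogue. It is not the Make function from fpm-recipes, only a sketch of the same idea using the standard shlex module; the second dependency (libc6) and the helper name fpm_depends_args are made up for illustration.

```python
import shlex


def quoted_map(func, quoted_list):
    """Apply func to each item of a list whose items may be double-quoted."""
    return [func(item) for item in shlex.split(quoted_list)]


def fpm_depends_args(quoted_list):
    """Build a flat ['-d', dep, '-d', dep, ...] argument list for FPM."""
    args = []
    for dep in shlex.split(quoted_list):
        args.extend(["-d", dep])
    return args


if __name__ == "__main__":
    depends = '"libuuid1 (>= 2.16)" "libc6"'
    # Plain whitespace splitting, like foreach, tears the first item apart.
    print(depends.split())
    # Quote-aware splitting keeps each dependency in one piece.
    print(fpm_depends_args(depends))
```

Passing the arguments as a list, rather than as one joined string, sidesteps the need to re-quote each dependency for the shell.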
mutt and gmail
On the recommendation of a neckbeard friend, Aaron, I set out to try Mutt as my email client. Since my email is hosted by Gmail, a little more configuration is needed than just setting up an IMAP inbox. Also, since people actually send multimedia emails, I wrote a small patch for Mutt that detects when it’s talking to a Gmail IMAP server and adds a couple of custom headers to the message, one of which is the permalink to the email so it can be easily opened in a browser if need be. I’m sure I am one of the few who actually like Google Contacts, so I use Goobook for address completion. And there’s no reason to go through all the trouble of setting up Mutt and not set up GPG for signing/encryption too. I am a fan of Ethan Schoonover’s Solarized color scheme, but I prefer a bit more contrast, so I modified the Solarized Dark 16-color Mutt theme to suit.
Latest versions of my conf/patch can be found at:
mutt conf GitHub repo
mutt gmail patch GitHub repo
limits.conf and daemons on Ubuntu
I was recently setting up a couple of ElasticSearch and RabbitMQ instances when I noticed RabbitMQ was still reporting an abysmally low fd limit in its log file at startup. I double-checked my /etc/security/limits.conf and, sure enough, the limits were properly set to 64000. Yet for some reason it was still only seeing a max of 1024.
It turns out that in Ubuntu 10.04, /etc/pam.d/common-session{,-noninteractive} does not contain:
session required pam_limits.so
Adding that solved my issue:
=INFO REPORT==== 1-Feb-2012::00:05:47 ===
Limiting to approx 63900 file handles (57508 sockets)
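If a daemon does not log its limit the way RabbitMQ does, one way to see what a process actually gets is to ask from inside the process. This is a small sketch using Python's standard resource module (Unix only); the helper names are illustrative. It only tells you something if it is launched the same way the daemon is, not from a login shell where the PAM limits already apply.

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, spelling out the 'no limit' sentinel."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"soft={describe(soft)} hard={describe(hard)}")
```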
UPDATE (Wed Apr 18 15:01:17 PDT 2012)
For RabbitMQ 2.8.x, the init script uses start-stop-daemon. Apply this patch:
--- /etc/init.d/rabbitmq-server.old 2012-04-18 21:54:05.852307662 +0000
+++ /etc/init.d/rabbitmq-server 2012-04-18 21:49:17.594182809 +0000
@@ -35,6 +35,8 @@
RETVAL=0
set -e
+[ -r /etc/default/${NAME} ] && . /etc/default/${NAME}
+
ensure_pid_dir () {
PID_DIR=`dirname ${PID_FILE}`
if [ ! -d ${PID_DIR} ] ; then
And then in /etc/default/rabbitmq-server:
ulimit -n 65000
Realtime Postfix stats aggregator
With a user base as large as Disqus’s, there are a ton of new comment and reply notification emails to send. Inevitably, there is a user who accidentally (maybe even purposefully) subscribes to extremely hot threads. When they start receiving a stream of emails, their email provider doesn’t appreciate the spike in traffic. Providers usually show their annoyance by temporarily blocking and then rate limiting our Postfix instance from relaying email to that inbox.
Unfortunately, the only decent Postfix stats aggregators I could find were written in Perl (pflogsumm.pl) and consumed log files for some ad-hoc stats generation. That said, I got lazy after trying a few variations and finding the same Perl tools over and over, so please leave a comment about your favorite Postfix stats aggregator.
I really didn’t need anything too fancy, so I decided to take a stab at it myself. After thinking about it for a few minutes, I decided to use Python threading to have a small pool of workers run some regexes on a queue of lines from syslog. All the stats are gathered in a dictionary and either written to stdout or served by a VERY simple TCP server thread, which listens for ‘stats’ or ‘prettystats’ and dumps the current cumulative stats as a JSON dictionary. The full readme can be found on the GitHub page. Best part: it requires no third-party libraries.
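The worker-pool idea can be sketched in a few lines of standard-library Python. This is not the postfix-stats code: the regex, the stat key format and the function names are assumptions chosen for illustration, and the TCP server is left out.

```python
import json
import queue
import re
import threading
from collections import Counter

# Assumed log shape: "... postfix/<daemon>[pid]: ... status=<status> ..."
STATUS_RE = re.compile(r"postfix/(?P<daemon>\w+)\[\d+\]: .*\bstatus=(?P<status>\w+)")


def worker(lines, stats, lock):
    """Pull lines off the queue until a None sentinel arrives."""
    while True:
        line = lines.get()
        if line is None:
            break
        match = STATUS_RE.search(line)
        if match:
            key = match.group("daemon") + "." + match.group("status")
            with lock:  # the Counter is shared by every worker
                stats[key] += 1


def aggregate(log_lines, num_workers=4):
    """Count delivery statuses in log_lines using a small thread pool."""
    lines = queue.Queue()
    stats = Counter()
    lock = threading.Lock()
    threads = [threading.Thread(target=worker, args=(lines, stats, lock))
               for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for line in log_lines:
        lines.put(line)
    for _ in threads:
        lines.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return dict(stats)


if __name__ == "__main__":
    sample = [
        "Feb  1 00:05:47 mail postfix/smtp[1234]: A1B2C3: to=<a@example.com>, status=sent (250 OK)",
        "Feb  1 00:05:48 mail postfix/smtp[1234]: A1B2C4: to=<b@example.com>, status=deferred (rate limited)",
    ]
    print(json.dumps(aggregate(sample), sort_keys=True))
```

A real aggregator would read from syslog continuously instead of joining the workers, but the queue, lock and shared dictionary would look much the same.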
postfix-stats on GitHub
postfix-stats on PyPI
django-celery, eventlet and debugging blocking
I recently wrote a couple Celery tasks that are purely IO bound. So instead of using the default multiprocessing execution pool, I used the Eventlet execution pool. With just a small change in Celery settings, I was off to the races.
Wrong! After some amount of time, the worker just sits at 100% CPU and no longer processes tasks. Unfortunately, Celery calls monkey_patch a little late when coupled with Django. Django does some magic of its own to various components, so monkey_patch needs to be called before Django does any initialization. After a little digging, I found I can set an environment variable to prevent Celery from doing the monkey patching and, at the same time, use it to signal manage.py to call monkey_patch before the initialization of my Django app.
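A minimal sketch of the manage.py side, assuming the environment variable is named EVENTLET_NOPATCH (the text above does not name it, so treat that as a guess). The helper name and its injectable patcher argument are illustrative and exist only so the logic can be exercised without Eventlet installed.

```python
import os


def patch_before_django(environ=None, flag="EVENTLET_NOPATCH", patcher=None):
    """Monkey patch early when the flag variable is set.

    Returns True if patching ran. The flag name is an assumption; the
    same variable is expected to tell Celery to skip its own patching.
    """
    environ = os.environ if environ is None else environ
    if not environ.get(flag):
        return False
    if patcher is None:
        import eventlet  # imported lazily so the module loads without it
        patcher = eventlet.monkey_patch
    patcher()
    return True


if __name__ == "__main__":
    # In manage.py this call would sit at the very top, before any
    # Django import or settings setup.
    print("patched early:", patch_before_django())
```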
OS X System Equalizer
I don’t have a music collection to listen to using iTunes. I listen to music through online streaming services like Pandora and Slacker. The downside is that Google Chrome doesn’t have a built-in audio equalizer like iTunes does. I have a decent pair of headphones, the Audio-Technica ATH-M50, and the default line-out from an Apple desktop machine has always sounded a little “empty” to me, as if it were slightly tuned for normal or smaller headphones. Today I finally figured out how to get a global system equalizer for OS X (for free) so I can push the bass up a little to compensate for the “emptiness”.
FiSH module for ZNC
So a bunch of us were sitting around talking one night and decided, just for the lulz, that we would start using FiSH encryption in one of our back channels on IRC. Not everyone in the channel is a neckbeard who uses irssi; some don’t even use XChat. To ease the transition for everyone, we set up a ZNC server as a bouncer that also handles the FiSH encryption (yes, this can defeat the purpose of FiSH, which is why everyone is required to use SSL when connecting).
As we were setting up, those of us who were already using FiSH couldn’t tell who was talking encrypted and who wasn’t. So @noah256 brought up the idea of prefixing encrypted messages. Later he realized this could be spoofed, so he suggested also prefixing unencrypted messages from targets expected to be encrypted. At the same time, we realized that when talking to people without FiSH installed, there was no easy way to temporarily send unencrypted messages, so we also added a prefix that tells the module not to encrypt a message.
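The three rules are simple enough to sketch. The module itself is C++ for ZNC; the Python below only illustrates the decision logic, and the marker strings and function names are placeholders, not the values hard coded in the module.

```python
ENCRYPTED_MARK = "(e) "      # hypothetical marker for decrypted messages
PLAINTEXT_MARK = "(!) "      # hypothetical marker for unexpected plaintext
SKIP_ENCRYPT_PREFIX = "``"   # hypothetical "do not encrypt" prefix


def label_incoming(text, was_encrypted, target_has_key):
    """Prefix an incoming message so its encryption state is visible."""
    if was_encrypted:
        return ENCRYPTED_MARK + text
    if target_has_key:
        # Plaintext from a target we expect to be encrypted, which also
        # exposes anyone who types the encrypted marker by hand.
        return PLAINTEXT_MARK + text
    return text


def prepare_outgoing(text, target_has_key):
    """Return (text_to_send, should_encrypt) for an outgoing message."""
    if text.startswith(SKIP_ENCRYPT_PREFIX):
        return text[len(SKIP_ENCRYPT_PREFIX):], False
    return text, target_has_key


if __name__ == "__main__":
    print(label_incoming("hello", was_encrypted=True, target_has_key=True))
    print(label_incoming("(e) spoofed", was_encrypted=False, target_has_key=True))
    print(prepare_outgoing("``plain for now", target_has_key=True))
```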
I used the fish.cpp from the fish – ZNC wiki page. For now, the behavior I added is hard coded. When I have free time again, I hope to make the prefixes and the disable flag configurable.
My modified module can be found in my znc-fish repo on GitHub.