Tropical Software Observations

13 June 2012

Posted by Yasith Fernando

at 3:14 PM

Message Queue Implementations Reviewed

After Googling around for a while, I decided to look at the following message queues: Delayed::Job, Starling, ActiveMQ, RabbitMQ, ZeroMQ, and Resque. I selected them based on factors such as support, community size and activity, and feature set. It is by no means an exhaustive list.

Terms and Definitions

Most of the message queue implementations reviewed here use the following terminology.

Jobs

Tasks that are queued for processing. For example, if you want to send an email you would enqueue the email for future processing by a worker. This is a job. It has all the information needed by the worker to process it. In this case, we're talking about things such as recipient address, subject, body, etc.

Workers

Modules of code that handle the processing of jobs. A worker will fetch jobs from a queue and process them. In the email example above, a worker would grab the job from the queue and process it by sending off the email.
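
To make the two terms concrete, here is a minimal sketch in Python using only the standard library. The names `EmailJob`, `worker`, and `sent` are hypothetical, the queue is an in-process stand-in for a real message queue, and the "sending" is simulated by recording the email in a list.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class EmailJob:
    # A job carries everything the worker needs to process it.
    recipient: str
    subject: str
    body: str


jobs = queue.Queue()  # in-process stand-in for a real message queue
sent = []             # records "delivered" emails instead of sending them


def worker():
    # Fetch jobs from the queue and process them until told to stop.
    while True:
        job = jobs.get()
        if job is None:  # sentinel value: no more work
            jobs.task_done()
            break
        sent.append((job.recipient, job.subject))  # simulated send
        jobs.task_done()


thread = threading.Thread(target=worker)
thread.start()
jobs.put(EmailJob("alice@example.com", "Welcome", "Hello Alice"))
jobs.put(EmailJob("bob@example.com", "Welcome", "Hello Bob"))
jobs.put(None)
jobs.join()
thread.join()
```

The code that enqueues the email returns immediately; the worker thread does the slow part later, which is the whole point of queueing the job.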

Delayed::Job

  • https://github.com/collectiveidea/delayed_job
  • A simple solution that will address the needs of roughly 80% of developers out there
  • Delayed::Job was initially extracted from the Shopify.com codebase and was used by GitHub before they moved on to Resque
  • Delayed::Job stores its jobs in a table in your database
  • You can write jobs as plain Ruby objects that respond to a perform method, or defer an existing method call with .delay
  • I find it to be a good solution if you need something very simple
  • Delayed::Job polls your database periodically for new jobs, which is not ideal but should not be an issue for most use cases
  • You can prioritise jobs and set the maximum number of retries per job
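
The combination of a jobs table, periodic polling, priorities, and a retry limit can be sketched as follows. This is an illustration in Python with an in-memory SQLite database, not delayed_job's actual schema or code. The table layout, `MAX_ATTEMPTS`, `enqueue`, and `poll_once` are all assumptions made for the example, as is the convention that lower priority numbers run first.

```python
import sqlite3

MAX_ATTEMPTS = 3  # assumed retry limit for this sketch

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE jobs ("
    " id INTEGER PRIMARY KEY,"
    " payload TEXT NOT NULL,"
    " priority INTEGER NOT NULL DEFAULT 0,"
    " attempts INTEGER NOT NULL DEFAULT 0,"
    " failed INTEGER NOT NULL DEFAULT 0)"
)


def enqueue(payload, priority=0):
    db.execute("INSERT INTO jobs (payload, priority) VALUES (?, ?)",
               (payload, priority))


def poll_once(handler):
    """Run one pending job, lowest priority number first.

    Returns False when there is nothing left to run.
    """
    row = db.execute(
        "SELECT id, payload, attempts FROM jobs WHERE failed = 0"
        " ORDER BY priority, id LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    job_id, payload, attempts = row
    try:
        handler(payload)
        db.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
    except Exception:
        # Count the failure; give up once the retry limit is reached.
        attempts += 1
        db.execute("UPDATE jobs SET attempts = ?, failed = ? WHERE id = ?",
                   (attempts, int(attempts >= MAX_ATTEMPTS), job_id))
    return True


processed = []


def handler(payload):
    if payload == "bad":
        raise ValueError("cannot process")
    processed.append(payload)


enqueue("newsletter", priority=5)
enqueue("password-reset", priority=0)
enqueue("bad", priority=1)
while poll_once(handler):  # a real worker would sleep between polls
    pass
```

The sleep between polls is where the latency cost of polling comes from: a job can wait up to one polling interval before any worker notices it.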

Starling

  • https://github.com/starling/starling
  • Starling was extracted out of the early Twitter codebase
  • Uses the memcached protocol
  • Jobs persisted to filesystem, so no database dependency
  • Lightweight
  • You need to write your own workers to perform polling and job execution
  • Polling means that Starling will be a bit slower than evented queues such as RabbitMQ
  • Last updated about a year ago at the time of writing
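
Because Starling speaks the memcached protocol, any memcached client can act as a producer or a worker. The sketch below builds the raw text-protocol commands by hand to show what goes over the wire. It assumes that a `set` on a key enqueues onto the queue of that name and a `get` dequeues from it, which is how I understand Starling to behave; the function names are hypothetical and nothing here connects to a server.

```python
def build_set(queue_name: str, data: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    # Memcached text protocol storage command:
    #   set <key> <flags> <exptime> <bytes>\r\n<data block>\r\n
    header = f"set {queue_name} {flags} {exptime} {len(data)}\r\n".encode("ascii")
    return header + data + b"\r\n"


def build_get(queue_name: str) -> bytes:
    # Retrieval command: get <key>\r\n
    return f"get {queue_name}\r\n".encode("ascii")


def parse_get_response(raw: bytes):
    # A hit looks like: VALUE <key> <flags> <bytes>\r\n<data block>\r\nEND\r\n
    # A miss (assumed here to mean an empty queue) is just: END\r\n
    if raw == b"END\r\n":
        return None
    header, rest = raw.split(b"\r\n", 1)
    _value, _key, _flags, length = header.split(b" ")
    return rest[: int(length)]


enqueue_command = build_set("emails", b"hello")
dequeue_command = build_get("emails")
```

A polling worker is then a loop that sends the `get` command, handles the payload if one comes back, and sleeps briefly when the response is a miss.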

ActiveMQ

  • http://activemq.apache.org/
  • Mature, stable product that has been used by a lot of folks for some time
  • A top-level Apache project 
  • You can use the STOMP gem if you want to implement your own messaging code
  • Or you can use ActiveMessaging from ThoughtWorks
  • Evented
  • Comes with a basic monitoring interface
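
For a feel of what a STOMP client sends, here is a small Python sketch that builds and parses a STOMP frame: a command line, `name:value` header lines, a blank line, the body, and a terminating NUL byte. The helper names are hypothetical, the destination `/queue/email` is an example, and the sketch ignores details such as header escaping and binary bodies.

```python
def build_frame(command: str, headers: dict, body: str = "") -> bytes:
    # COMMAND\nheader:value\n...\n\nbody\0
    lines = [command] + [f"{name}:{value}" for name, value in headers.items()]
    return ("\n".join(lines) + "\n\n" + body).encode("utf-8") + b"\x00"


def parse_frame(raw: bytes):
    # Inverse of build_frame for simple text frames.
    text = raw.rstrip(b"\x00").decode("utf-8")
    head, _, body = text.partition("\n\n")
    command, *header_lines = head.split("\n")
    headers = dict(line.split(":", 1) for line in header_lines)
    return command, headers, body


frame = build_frame(
    "SEND",
    {"destination": "/queue/email"},
    "to=alice@example.com;subject=Welcome",
)
```

Libraries such as the STOMP gem take care of this framing, plus the connection handshake and acknowledgements, so you rarely write it yourself.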

RabbitMQ

  • http://www.rabbitmq.com/
  • Uses AMQP (Advanced Message Queuing Protocol)
  • Written in Erlang
  • Mature and one of the better-known message queues around
  • Has lots of features, including file streaming 
  • Tons of clients/libraries for most of the major languages: http://www.rabbitmq.com/devtools.html
  • Fast 
  • Evented
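
"Evented" here means the broker pushes messages to consumers as they arrive, rather than consumers polling for them. The toy broker below illustrates the idea in plain Python. It is not RabbitMQ's API: `TinyBroker` and its methods are hypothetical, and the round-robin choice between consumers is a simplification.

```python
from collections import defaultdict, deque


class TinyBroker:
    """In-process toy: delivers each message to one consumer callback."""

    def __init__(self):
        self._consumers = defaultdict(list)
        self._pending = defaultdict(deque)
        self._next = defaultdict(int)

    def subscribe(self, queue_name, callback):
        self._consumers[queue_name].append(callback)
        self._drain(queue_name)  # deliver anything that was waiting

    def publish(self, queue_name, message):
        self._pending[queue_name].append(message)
        self._drain(queue_name)  # push immediately if someone is listening

    def _drain(self, queue_name):
        consumers = self._consumers[queue_name]
        pending = self._pending[queue_name]
        while consumers and pending:
            # Pick consumers in turn so the work is spread between them.
            callback = consumers[self._next[queue_name] % len(consumers)]
            self._next[queue_name] += 1
            callback(pending.popleft())


received = []
broker = TinyBroker()
broker.publish("email", "first")  # buffered: nobody is listening yet
broker.subscribe("email", lambda m: received.append(("a", m)))
broker.subscribe("email", lambda m: received.append(("b", m)))
broker.publish("email", "second")
broker.publish("email", "third")
```

No consumer ever asks "is there anything for me?"; the callback fires when a message is published, which is why evented queues avoid the latency of a polling interval.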

ZeroMQ

  • http://www.zeromq.org/
  • Minimal, lightweight messaging framework 
  • No central server 
  • Ultra fast
  • Focused religiously on transporting messages between n points and nothing else 
  • Quite different from other MQs to the point where it's questionable as to whether ZeroMQ is a message queue or not :) 
  • If you used it to solve the email example mentioned above, you would most likely have to handle serialization and deserialization of the email yourself, implement a background daemon that receives and processes email jobs, write the retry logic, etc.
  • Basically if you use ZeroMQ, you will have to reinvent some things that are provided out of the box in some of the solutions mentioned above
  • ZeroMQ is not meant to be RabbitMQ, Resque, etc. 
  • It does one job and it does it perfectly
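
To give a sense of the "reinvent some things" part, the sketch below hand-rolls job serialization and retry logic on top of a bare transport. It uses a local socket pair from Python's standard library as a stand-in for ZeroMQ, so none of these calls are ZeroMQ's API and all function names are hypothetical. ZeroMQ already delivers whole messages, so the length prefix is only needed because the stand-in is a raw stream socket.

```python
import json
import socket
import struct


def send_job(sock, job: dict):
    # Serialize the job ourselves; the transport only moves bytes.
    data = json.dumps(job).encode("utf-8")
    sock.sendall(struct.pack("!I", len(data)) + data)


def _recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def recv_job(sock) -> dict:
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))


def process_with_retry(job, handler, max_attempts=3):
    # Retry logic that a job queue would normally provide for us.
    for attempt in range(1, max_attempts + 1):
        try:
            handler(job)
            return attempt
        except Exception:
            if attempt == max_attempts:
                raise


producer, consumer = socket.socketpair()
send_job(producer, {"to": "alice@example.com", "subject": "Welcome"})
job = recv_job(consumer)

calls = []


def flaky_send(job):
    # Simulated mail delivery that fails on the first call only.
    calls.append(job["to"])
    if len(calls) < 2:
        raise RuntimeError("temporary failure")


attempts_used = process_with_retry(job, flaky_send)
producer.close()
consumer.close()
```

Persistence, monitoring, and a long-running worker daemon would all be further pieces to build on top of this.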

Resque

  • https://github.com/defunkt/resque
  • Developed by GitHub for their internal use
  • Uses Redis internally, and Redis is super fast
  • Feature rich
  • Persists jobs to Redis as JSON
  • Comes with a rich Sinatra app that provides insight into what's happening with any queue
  • Not evented 
  • Popular within the Ruby community 
  • Well supported, stable, and well tested 
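
Persisting jobs as JSON means a queued job is just a small document naming a job class and its arguments, appended to a Redis list. The Python sketch below mimics that shape with an in-memory dictionary standing in for Redis. The `resque:queue:<name>` key and the `class`/`args` payload layout reflect my understanding of Resque's format; `fake_redis`, `enqueue`, and `reserve` are hypothetical names.

```python
import json
from collections import defaultdict, deque

fake_redis = defaultdict(deque)  # stand-in for Redis lists, keyed by name


def enqueue(queue_name, job_class, *args):
    payload = json.dumps({"class": job_class, "args": list(args)})
    fake_redis[f"resque:queue:{queue_name}"].append(payload)  # like RPUSH


def reserve(queue_name):
    items = fake_redis[f"resque:queue:{queue_name}"]
    if not items:
        return None
    return json.loads(items.popleft())  # like LPOP


enqueue("mail", "EmailJob", "alice@example.com", "Welcome")
stored = list(fake_redis["resque:queue:mail"])
first = reserve("mail")
second = reserve("mail")
```

Since arguments travel as JSON, they have to be simple values such as strings, numbers, arrays, and hashes, so a job typically passes a record ID rather than the record itself.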

Conclusion

After going through the material I found online about the message queues listed above, I would use Delayed::Job for anything trivial, mainly because I have experience using it and it just works for most day-to-day tasks I've come across.

For use cases with high load and a need for stability, I would feel more comfortable going with Resque. It seems fairly straightforward to use and is quite popular, so it enjoys good support from the community. GitHub uses it! And to seal the deal, it seems to have the best management interface for monitoring queue internals. This is especially relevant since many message queue servers don't provide a query DSL, as databases typically do.

For instances where high performance and throughput are paramount, I would explore ZeroMQ. It almost certainly requires more work to get up and running (e.g. you'll need to implement most things needed for handling a message queue), but it offers great flexibility and speed.

If supporting multiple languages is a concern, RabbitMQ, ActiveMQ, and ZeroMQ get the nod due to their library availability for most popular languages and platforms.