Thursday, August 14, 2008

Monitoring BackgrounDRb workers with God

Updated 2008/09/24 for latest version of backgroundrb 1.0.4

The other day on our staging server, I noticed that the BackgrounDRb queue worker had died. As it turned out, it had died over three months earlier!

There was no cause for alarm, as the staging server isn't critical, but it did get me worrying. We needed a monitoring solution that verified not only that BackgrounDRb was running but also that particular workers were running.

As we had just implemented god monitoring with a custom condition for another issue, it was a slam dunk to do the same again. (Thanks to Jesse Newland and his god tutorial at AtlRUG.)

Here's the configuration file that got it done for us.


1 #run on command line with 'god -c backgroundrb.god -D'
2 RAILS_ROOT = '/var/www/rails/rollbook/current'
3
4 #load required rails and backgroundrb files
5 require File.dirname(__FILE__) + '/../boot'
6 require File.dirname(__FILE__) + '/../environment'
7 require 'erb'
8 $LOAD_PATH << "#{RAILS_ROOT}/vendor/plugins/backgroundrb/lib"
9 require "#{RAILS_ROOT}/vendor/plugins/backgroundrb/lib/backgroundrb.rb"
10
11 #create custom condition for checking that the queue_processing_worker is running
12 MiddleMan = BackgrounDRb::ClusterConnection.new
13 module God
14   module Conditions
15     class Backgroundrb < PollCondition
16       def initialize; super; end
17       def valid?; true; end
18
19       def test
20         begin
21           queue_worker = MiddleMan.all_worker_info.values.flatten.select { |w| :queue_processing_worker == w[:worker] }
22           queue_worker.empty? #true (worker missing) triggers the restart
23         rescue #if all_worker_info raises exception, then bdrb isn't running and we were unable to connect
24           true
25         end
26       end
27     end
28   end
29 end
30
31 God.watch do |w|
32   w.name = 'backgroundrb'
33   w.interval = 1.minute
34   w.restart = "cd #{RAILS_ROOT} && #{RAILS_ROOT}/script/backgroundrb -e production stop && #{RAILS_ROOT}/script/backgroundrb -e production start"
35   w.stop = "cd #{RAILS_ROOT} && #{RAILS_ROOT}/script/backgroundrb -e production stop"
36   w.start = "cd #{RAILS_ROOT} && #{RAILS_ROOT}/script/backgroundrb -e production start"
37   w.grace = 1.minute
38   w.pid_file = "#{RAILS_ROOT}/tmp/pids/backgroundrb_11000.pid"
39
40   w.start_if do |start|
41     start.condition(:process_running) do |c|
42       c.running = false
43     end
44   end
45
46   w.restart_if do |restart|
47     restart.condition(:backgroundrb) do |c|
48       #just restart it
49     end
50   end
51 end


In the select call on line 21, you can change the block to match on :job_key or :status as well. You will also need to change RAILS_ROOT to suit your own setup.
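
Here is a minimal sketch of what matching on those extra keys could look like, pulled out into a plain method so you can try it without a running BackgrounDRb server. The sample hash, the server address, the job key values and the worker_missing? helper are all made up for illustration; I'm assuming all_worker_info hands back a hash of server => array of worker hashes carrying the :worker, :job_key and :status keys mentioned above, so check what your version actually returns before relying on this.

```ruby
# Hypothetical sample data, shaped the way the listing above assumes
# MiddleMan.all_worker_info is shaped: server address => array of worker hashes.
SAMPLE_WORKER_INFO = {
  '0.0.0.0:11000' => [
    { :worker => :log_worker, :job_key => nil, :status => :running },
    { :worker => :queue_processing_worker, :job_key => 'nightly', :status => :running }
  ]
}

# Hypothetical helper: returns true when no worker matches every key/value
# pair in criteria, i.e. when the god condition should trigger a restart.
def worker_missing?(worker_info, criteria)
  workers = worker_info.values.flatten.compact
  workers.none? { |w| criteria.all? { |key, value| w[key] == value } }
end

# Same check as the listing: is any queue_processing_worker present?
puts worker_missing?(SAMPLE_WORKER_INFO, { :worker => :queue_processing_worker })
# => false

# Narrower check: the worker must also carry a particular job key.
puts worker_missing?(SAMPLE_WORKER_INFO, { :worker => :queue_processing_worker, :job_key => 'hourly' })
# => true
```

Inside the custom condition, the body of test would then boil down to calling something like this with MiddleMan.all_worker_info, still wrapped in the rescue so that a failed connection also counts as "missing".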

If you have suggestions for improvement or questions, hit me up in the comments. Enjoy!