Monday, September 4, 2017

Resque workers dying silently

I'm having issues with the workers that I have set up for a website. Here's some background on the app: it runs on Rails 3.0.2 with MongoDB and the redis gem 2.2.2. To keep the workers up and running, I set up the god gem and specified that 6 workers should run in the production environment. I'm pasting the resque.god.rb file below. The workers run on a dedicated Ubuntu server that hosts only the Resque workers and Elasticsearch, so it doesn't share resources with any other service.

My problem is that, for some reason, the workers keep dying and only log "*** Exiting..." in my log/resque-worker.log file, which is extremely annoying because I don't know what is going on. Nothing shows up in syslog or dmesg either.

This is a sample of what I get in the log (it doesn't help me much):

*** Starting worker workers:19166:*
*** Starting worker workers:19133:*
*** Running before_first_fork hook
*** Exiting...
/usr/local/www/tap-production/shared/bundle/ruby/1.9.1/gems/rake-0.8.7/lib/rake/alt_system.rb:32: Use RbConfig instead of obsolete and deprecated Config.
(in /data/www/tap-production/releases/20170904162514)
/usr/local/www/tap-production/shared/bundle/ruby/1.9.1/gems/activesupport-3.0.20/lib/active_support/dependencies.rb:242:in `block in require': iconv will be deprecated in the future, use String#encode instead.
*** Running before_first_fork hook
*** Exiting...
/usr/local/www/tap-production/shared/bundle/ruby/1.9.1/gems/rake-0.8.7/lib/rake/alt_system.rb:32: Use RbConfig instead of obsolete and deprecated Config.
(in /data/www/tap-production/releases/20170904162514)
/usr/local/www/tap-production/shared/bundle/ruby/1.9.1/gems/activesupport-3.0.20/lib/active_support/dependencies.rb:242:in `block in require': iconv will be deprecated in the future, use String#encode instead.
*** Starting worker workers:19251:*
*** Starting worker workers:19217:*
*** Running before_first_fork hook
*** Exiting...
/usr/local/www/tap-production/shared/bundle/ruby/1.9.1/gems/rake-0.8.7/lib/rake/alt_system.rb:32: Use RbConfig instead of obsolete and deprecated Config.
(in /data/www/tap-production/releases/20170904162514)
/usr/local/www/tap-production/shared/bundle/ruby/1.9.1/gems/activesupport-3.0.20/lib/active_support/dependencies.rb:242:in `block in require': iconv will be deprecated in the future, use String#encode instead.
*** Running before_first_fork hook
*** Exiting...
/usr/local/www/tap-production/shared/bundle/ruby/1.9.1/gems/rake-0.8.7/lib/rake/alt_system.rb:32: Use RbConfig instead of obsolete and deprecated Config.
(in /data/www/tap-production/releases/20170904162514)
/usr/local/www/tap-production/shared/bundle/ruby/1.9.1/gems/activesupport-3.0.20/lib/active_support/dependencies.rb:242:in `block in require': iconv will be deprecated in the future, use String#encode instead.
*** Starting worker workers:19330:*
*** Starting worker workers:19297:*
*** Running before_first_fork hook
*** Exiting...
/usr/local/www/tap-production/shared/bundle/ruby/1.9.1/gems/rake-0.8.7/lib/rake/alt_system.rb:32: Use RbConfig instead of obsolete and deprecated Config.
(in /data/www/tap-production/releases/20170904162514)
/usr/local/www/tap-production/shared/bundle/ruby/1.9.1/gems/activesupport-3.0.20/lib/active_support/dependencies.rb:242:in `block in require': iconv will be deprecated in the future, use String#encode instead.

Here's my resque.god.rb code:

require 'tlsmail'
rails_env = ENV['RAILS_ENV']
rails_root = ENV['RAILS_ROOT']
rake_root = ENV['RAKE_ROOT']
num_workers = rails_env == 'production' ? 6 : 1
# Change cache to my_killer_worker_job if you are testing in development. Remember to enable it in config/resque_schedule.yml - Fabian
queue = rails_env == 'production' ? '*' : 'my_killer_worker_job'

God::Contacts::Email.defaults do |d|
  Net::SMTP.enable_tls(OpenSSL::SSL::VERIFY_NONE)
  if rails_env == "production"
    # Change these settings for your own purposes
    d.from_name = "#{rails_env.upcase}: Process monitoring"
    d.delivery_method = :smtp
    d.server_host = 'smtp.gmail.com'
    d.server_port = 587
    d.server_auth = :login
    d.server_domain = 'gmail.com'
    d.server_user = 'XXXX@gmail.com'
    d.server_password = 'XXXX'
  end
end



God.contact(:email) do |c|
  c.name = 'engineering'
  c.group = 'developers'
  c.to_email = 'engineering@something.com'
end

num_workers.times do |num|
  God.watch do |w|
    w.name          = "resque-#{num}"
    w.group         = 'resque'
    w.interval      = 30.seconds
    w.env           = { 'RAILS_ENV' => rails_env, 'QUEUE' => queue, 'VERBOSE' => '1' }
    w.dir           = rails_root
    w.start         = "bundle exec #{rake_root}/rake resque:work"
    w.start_grace   = 10.seconds
    w.log           = File.join(rails_root, 'log', 'resque-worker.log')

    # restart if memory gets too high
    w.transition(:up, :restart) do |on|
      on.condition(:memory_usage) do |c|
        c.above = 200.megabytes
        c.times = 2
        # c.notify = 'engineering'
      end
    end

    # determine the state on startup
    w.transition(:init, { true => :up, false => :start }) do |on|
      on.condition(:process_running) do |c|
        c.running = true
        # c.notify = 'engineering'
      end
    end

    # determine when process has finished starting
    w.transition([:start, :restart], :up) do |on|
      on.condition(:process_running) do |c|
        c.running = true
        c.interval = 5.seconds
        # c.notify = 'engineering'
      end

      # failsafe
      on.condition(:tries) do |c|
        c.times = 5
        c.transition = :start
        c.interval = 5.seconds
        # c.notify = 'engineering'
      end
    end

    # start if process is not running
    w.transition(:up, :start) do |on|
      on.condition(:process_running) do |c|
        c.running = false
        c.notify = {:contacts => ['engineering'], :priority => 1, :category => "workers"}
      end
    end


  end
end
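
One way to get more detail than "*** Exiting..." might be an at_exit hook that records why the process is ending. The following is only a minimal sketch and is not part of the app above: the helper names exit_reason and install_exit_logger are hypothetical, and it assumes the hook can be loaded by the worker process (for example from a Rails initializer).

```ruby
require 'logger'

# Hypothetical helper (not part of Resque or god): describes the exception,
# if any, that is terminating the current process.
def exit_reason(error)
  case error
  when nil             then 'no pending exception'
  when SystemExit      then "SystemExit (status #{error.status})"
  when SignalException then "untrapped signal #{error.signm}"
  else "#{error.class}: #{error.message}"
  end
end

# Hypothetical helper: registers an at_exit hook that writes the exit reason
# to the given logger. $! holds the exception that is ending the process.
def install_exit_logger(logger = Logger.new($stderr))
  at_exit { logger.warn("pid #{Process.pid} exiting: #{exit_reason($!)}") }
end
```

Calling install_exit_logger once at boot would add one line to the worker log each time a process ends. As far as I know, Resque traps TERM, INT and QUIT itself, so a worker stopped by one of those signals would probably be reported here as "no pending exception" rather than as a signal. Even that would help separate a requested shutdown from an unhandled error.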

Please let me know your thoughts.
