Using Sidekiq
This post has not been updated in 9 years
1. What is Sidekiq
Sidekiq uses threads to handle many messages at the same time in the same process. It does not require Rails but will integrate tightly with Rails 3 to make background message processing dead simple.
Sidekiq is compatible with Resque. It uses the exact same message format as Resque so it can integrate into an existing Resque processing farm. You can have Sidekiq and Resque run side-by-side at the same time and use the Resque client to enqueue messages in Redis to be processed by Sidekiq.
At the same time, Sidekiq uses multithreading so it is much more memory efficient than Resque (which forks a new process for every job). You'll find that you might need 50 200MB resque processes to peg your CPU whereas one 300MB Sidekiq process will peg the same CPU and perform the same amount of work. Please see my blog post on Resque's memory efficiency and how I was able to shrink a Carbon Five client's resque processing farm from 9 machines to 1 machine.
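To make the threading model above concrete, here is a minimal sketch in plain Ruby, not Sidekiq's actual implementation: several threads inside one process drain a shared queue of messages. The helper name process_all and the squaring "job" are made up for illustration.

```ruby
require 'thread' # Queue and Mutex; a no-op on recent Rubies where they are core

# Hypothetical helper, not part of Sidekiq: `concurrency` threads in ONE
# process pull messages off a shared queue until it is empty.
def process_all(messages, concurrency)
  queue = Queue.new                  # thread-safe FIFO
  messages.each { |m| queue << m }
  results = []
  lock = Mutex.new                   # guards the shared results array

  threads = Array.new(concurrency) do
    Thread.new do
      loop do
        msg = begin
          queue.pop(true)            # non-blocking pop, raises ThreadError when empty
        rescue ThreadError
          break                      # nothing left to do, leave the loop
        end
        value = msg * msg            # stand-in for real work
        lock.synchronize { results << value }
      end
    end
  end
  threads.each(&:join)
  results
end

p process_all([1, 2, 3, 4], 2).sort
```

All threads share the same loaded code and heap, which is where the memory saving over one-process-per-job designs comes from.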
2. Requirements
I test on Ruby 1.9.3 and JRuby 1.6.x in 1.9 mode. Other versions/VMs are untested but I will do my best to support them. Ruby 1.8 is not supported. Redis 2.0 or greater is required.
3. Install
Install Redis (http://redis.io/topics/quickstart), then add gem 'sidekiq' to your Gemfile.
4. Features
4.1. Delayed extensions
- ActionMailer:
Use delay to deliver your emails asynchronously. Use delay_for(interval) or delay_until(time) to deliver the email at some point in the future.
UserMailer.delay.welcome_email(@user.id)
UserMailer.delay_for(5.days).find_more_friends_email(@user.id)
UserMailer.delay_until(5.days.from_now).find_more_friends_email(@user.id)
- ActiveRecord:
Use delay, delay_for(interval), or delay_until(time) to asynchronously execute arbitrary methods on your ActiveRecord instances or classes.
User.delay.delete_old_users('some', 'params')
@user.delay.update_orders(1, 2, 3)
User.delay_for(2.weeks).whatever
User.delay_until(2.weeks.from_now).whatever
- Class methods:
Any class method can be delayed via the same methods as above:
MyClass.delay.some_method(1, 'bob', true)
- Advanced options
You can tune the worker options used with a .delay call by passing in options:
MyClass.delay(:retry => false, :timeout => 10).some_method(1, 2, 3)
MyClass.delay_for(10.minutes, :retry => false, :timeout => 10).some_method(1, 2, 3)
- Troubleshooting
The delay extensions use YAML to dump the entire object instance to Redis. The psych engine is the only supported YAML engine; this is the default engine in Ruby 1.9. If you see a YAML dump/load stack trace, make sure syck is not listed.
Objects which contain Procs can be dumped but cannot be loaded by Psych. This can happen with ActiveRecord models that use the Paperclip gem. IMO this is a Ruby bug.
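require 'yaml'
# An object that holds a Proc in an instance variable
o = Object.new
o.instance_variable_set :@foo, Proc.new { 'hello world' }
yml = YAML.dump(o) # dumping works
YAML.load(yml)     # loading raises an error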
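As a rough mental model of what a .delay call does, the sketch below records a method call as YAML instead of running it, then replays it later. It is a simplification under stated assumptions: DelayProxy and run_next are hypothetical names, a plain Array stands in for Redis, and only the class name is stored (rather than the dumped object described above) so that the payload loads with any YAML loader.

```ruby
require 'yaml'

# Hypothetical stand-in for a delay proxy: captures the call instead of running it.
class DelayProxy
  def initialize(target, queue)
    @target = target
    @queue = queue
  end

  # Any unknown method becomes a serialized message on the queue.
  def method_missing(name, *args)
    @queue << YAML.dump([@target.name, name.to_s, args])
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

# Example target class with a class method, as in the text above.
class MyClass
  def self.some_method(a, b, c)
    "#{a}-#{b}-#{c}"
  end
end

# "Worker" side: load the next message and replay the recorded call.
def run_next(queue)
  class_name, method_name, args = YAML.load(queue.shift)
  Object.const_get(class_name).public_send(method_name, *args)
end

queue = [] # stands in for Redis
DelayProxy.new(MyClass, queue).some_method(1, 'bob', true)
puts run_next(queue)
```

Because everything that crosses the queue has to survive a dump and a load, passing simple values such as IDs is less fragile than passing whole objects.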
4.2. Error handling
Exception notification:
Sidekiq integrates with these notification services: Airbrake, Exceptional, ExceptionNotifier (now the exception_notification gem), and Honeybadger. When a worker raises an error, the error and message contents are sent to the service so that you can be notified and fix the bug.
Automatic failure retry:
Sidekiq will retry processing failures with an exponential backoff using the formula retry_count**4 + 15 (i.e. 15, 16, 31, 96, 271, ... seconds). It will perform 25 retries over approximately 20 days. Assuming you deploy a bug fix within that time, the message will get retried and successfully processed. After 25 retries, Sidekiq will drop the message, assuming that it will never be successfully processed. You can disable retry support for a particular worker:
class NonRetryableWorker
  include Sidekiq::Worker
  sidekiq_options :retry => false
  def perform(...)
  end
end
Alternatively, you can specify the number of retries for a particular worker:
class LessRetryableWorker
  include Sidekiq::Worker
  sidekiq_options :retry => 5 # Only five retries and then we're outta here!
  def perform(...)
  end
end
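The backoff figures quoted above can be checked with a few lines of plain Ruby. This is only arithmetic on the formula as stated in this post; retry_delay and retry_delays are hypothetical helper names, not Sidekiq methods.

```ruby
# Delay in seconds before retry number `retry_count` (counting from 0),
# using the formula quoted in the text.
def retry_delay(retry_count)
  retry_count**4 + 15
end

# Delays for all retries (25 by default, as in the text).
def retry_delays(max_retries = 25)
  (0...max_retries).map { |n| retry_delay(n) }
end

p retry_delays.first(5)                     # => [15, 16, 31, 96, 271]
puts (retry_delays.sum / 86_400.0).round(1) # total wait in days => 20.4
```

The total of 1,763,395 seconds is where the "approximately 20 days" figure comes from.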
4.3. Resque Compatibility
Sidekiq uses the exact same message format as Resque in redis so it can process the exact same messages.
Because of this, the Resque client API can push messages onto Redis for Sidekiq to process.
Sidekiq exposes the exact same metrics so the resque-web interface can be used to monitor Sidekiq. I do strongly recommend using sidekiq-web though since there are differences in how Sidekiq and resque treat failures.
- Sidekiq does require a slightly different worker API because your worker must be threadsafe. Asking for a class method to be threadsafe is a hilarious joke to most software developers:
resque:
class MyWorker
  def self.perform(name, count)
    # do something
  end
end
sidekiq:
class MyWorker
  include Sidekiq::Worker
  def perform(name, count)
    # do something
  end
end
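A sketch of why the instance-method API helps, assuming that a fresh worker object is created for each job: state kept in instance variables belongs to that one job, whereas anything stored at class level is shared by every thread. GreetingWorker and run_concurrently are hypothetical names, and the class deliberately does not include Sidekiq::Worker so that the example runs on its own.

```ruby
# Hypothetical plain-Ruby worker; each job gets its own instance.
class GreetingWorker
  def perform(name, count)
    @name = name  # instance state: visible only to this job
    sleep 0.01    # give other threads a chance to run in between
    "#{@name} x#{count}"
  end
end

# Run one job per name, each on its own thread with its own worker object.
def run_concurrently(names)
  names.map { |n| Thread.new { GreetingWorker.new.perform(n, 1) } }.map(&:value)
end

p run_concurrently(%w[bob mike])
```

Had @name been a class-level variable set inside a class method, two jobs running at once could overwrite each other's value between the assignment and the read.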
a. Client API
- Sidekiq has an identical API for pushing jobs onto the queue, so you can simply search and replace in your project:
resque:
Resque.enqueue(MyWorker, 'bob', 1)
sidekiq:
Sidekiq::Client.enqueue(MyWorker, 'bob', 1)
# equivalent to:
MyWorker.perform_async('bob', 1)
b. Performance
resque and delayed_job use the same inefficient, single-threaded process design. Sidekiq will use dramatically fewer resources, so your processing farm might need 1 machine rather than 10.
c. Monitoring
Sidekiq emits the same metrics as Resque for monitoring purposes. You can actually use resque-web to monitor Sidekiq, just make sure you use the exact same Redis configuration as Resque, including the namespace (which is resque).
You can set Sidekiq to use the Resque namespace via the configure_server and configure_client blocks:
Sidekiq.configure_server do |config|
  config.redis = { :namespace => 'resque' }
end
Sidekiq.configure_client do |config|
  config.redis = { :namespace => 'resque' }
end
d. Plugins
Sidekiq does not support Resque's callback-based plugins. It provides a middleware API for registering your own plugins to execute code before/after a message is processed.
e. Limitations
Jobs enqueued with the resque client API do not have parameters like retry in the payload. This means that jobs will not be automatically retried if they fail. Please migrate your code to use the Sidekiq API to automatically pick up this feature or other features that rely on such parameters.
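The limitation is easier to see by comparing payloads. The sketch below is illustrative only: it shows a Resque-style payload with just a class name and arguments next to one that also carries a retry key, and a hypothetical helper retryable? that, like any consumer, can only honour options that are actually present in the message. Other keys a real payload may carry are omitted.

```ruby
require 'json'

# Hypothetical helper: decides whether a raw JSON message asks for retries.
def retryable?(raw)
  JSON.parse(raw)['retry'] ? true : false
end

# Payload as pushed by the Resque client: class name and arguments only.
resque_payload = JSON.generate('class' => 'MyWorker', 'args' => ['bob', 1])

# Payload that also carries a retry option (remaining keys left out here).
sidekiq_payload = JSON.generate('class' => 'MyWorker', 'args' => ['bob', 1], 'retry' => true)

p retryable?(resque_payload)  # => false
p retryable?(sidekiq_payload) # => true
```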
4.4. Scheduled Jobs
Sidekiq allows you to schedule the time when a job will be executed. You use perform_in(interval, *args) or perform_at(timestamp, *args) rather than the standard perform_async(*args):
MyWorker.perform_in(3.hours, 'mike', 1)
MyWorker.perform_at(3.hours.from_now, 'mike', 1)
Checking for New Jobs:
Sidekiq checks for scheduled jobs every 15 seconds by default. You can adjust this interval:
Sidekiq.configure_server do |config|
  config.poll_interval = 15
end
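To show how scheduling and polling fit together, here is a simplified in-memory model. It is a sketch, not Sidekiq's code: the Schedule class and its pop_due method are hypothetical, and a sorted Array stands in for whatever store holds the pending jobs.

```ruby
# In-memory model of a schedule: entries are [run_at, job], sorted by run_at.
class Schedule
  def initialize
    @entries = []
  end

  # Like perform_at: run the job at an absolute time.
  def perform_at(run_at, job)
    @entries << [run_at.to_f, job]
    @entries.sort_by!(&:first)
  end

  # Like perform_in: an interval is simply "now + interval".
  def perform_in(interval, job, now = Time.now)
    perform_at(now.to_f + interval, job)
  end

  # What a poller does on each tick: remove and return every job that is due.
  def pop_due(now = Time.now)
    due, @entries = @entries.partition { |run_at, _| run_at <= now.to_f }
    due.map(&:last)
  end
end

schedule = Schedule.new
schedule.perform_in(10, 'mike', 100) # due at t = 110
p schedule.pop_due(105)              # => []
p schedule.pop_due(110)              # => ["mike"]
```

In this model a job can start up to one polling interval later than its scheduled time, which is the trade-off behind the poll_interval setting: a shorter interval gives finer timing at the cost of more frequent checks.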
5. Problems and Troubleshooting
Sidekiq is multithreaded so your Workers must be thread-safe.
Thread-safe libraries
Most popular Rubygems are thread-safe in my experience. A few exceptions to this rule:
- right_aws
- aws-sdk (According to a somewhat old post in the discussion group, this gem is thread-safe with the exception of its use of autoload. Explicitly calling AWS.eager_autoload! during initialization should allow it to be used with Sidekiq.)
- aws-s3 (For S3 and other AWS work, use Fog instead; it is substantially better and under active development, whereas aws-s3 is old and hasn't been updated in years. There is a guide to using S3 with Fog available, and it has been tested to be thread-safe.)
- basecamp
Some gems can be troublesome:
- pg (the postgres driver; make sure PG::Connection.isthreadsafe returns true)
- RMagick (see #338, try mini_magick instead)
- therubyracer (versions before 0.11 can cause Sidekiq to hang)
Writing thread-safe code
Well-factored code is typically thread-safe without any changes. Always prefer instance variables and methods to class variables and methods, for instance. Also remember that Ruby's require statement is not atomic, so you should require any files your threads will need before your worker starts up, as explained in this Stack Overflow answer.
Forking
When using Passenger or Unicorn, you should configure the Sidekiq client within a block that runs after the child process is forked. If you use custom connection arguments like namespace for the server you must adjust the below code snippets to fit your setup:
config/unicorn.rb
after_fork do |server, worker|
  Sidekiq.configure_client do |config|
    config.redis = { :size => 1 }
  end
end
config/initializers/sidekiq.rb
if defined?(PhusionPassenger)
  PhusionPassenger.on_event(:starting_worker_process) do |forked|
    Sidekiq.configure_client do |config|
      config.redis = { :size => 1 }
    end if forked
  end
end
ActiveRecord
Take care to avoid unsafe code related to the ActiveRecord database connection or connection pool. Calls like verify_active_connections! manipulate the ConnectionPool in a thread-unsafe way. Avoid these calls from inside of your jobs' perform method.
"Cannot find ModelName with ID=12345"
Sidekiq is so fast that it is quite easy to get transactional race conditions where a job will try to access a database record that has not committed yet. The clean solution is to use after_commit:
class User < ActiveRecord::Base
  after_commit :greet, :on => :create
  def greet
    UserMailer.delay.send_welcome_email(self.id)
  end
end
Note: after_commit will not be invoked in your tests if you have use_transactional_fixtures enabled. If you aren't using ActiveRecord models, use a scheduled perform to run after you can be sure the transaction has committed:
MyWorker.perform_in(5.seconds, 1, 2, 3)
Why does the Sidekiq Web UI look terrible / not render correctly in production but works fine in development?
Sidekiq Web wants to serve CSS/JS assets out of the gem. Your production web server is not forwarding CSS/JS requests to your app so that Sidekiq Web can serve them; instead it returns a 404 if they aren't found on the filesystem.
The workers are not starting
If you are migrating from Resque, make sure the Redis database does not contain any old tasks. You can clear all data with redis-cli FLUSHALL (note that this deletes every key in every database on that Redis server). Another common problem is that you might have defined a namespace in Sidekiq.configure_server but not in Sidekiq.configure_client, or named it something else. Make sure to configure both!
Too many connections to MongoDB
If you are using Mongoid you'll also want to use the kiqstand middleware to properly disconnect workers so your connections aren't overloaded.
6. Middleware
Sidekiq has a similar notion of middleware to Rack: these are small bits of code that can implement functionality. Sidekiq breaks middleware into client-side and server-side. Client-side middleware runs around the pushing of the message to Redis and allows you to modify/stop the message before it gets pushed.
Server-side middleware runs 'around' message processing. The error notification feature is implemented as a simple middleware. Writing your own middleware is easy; this is the server-side middleware which ensures that ActiveRecord connections are closed after each message is processed:
class Sidekiq::Middleware::Server::ActiveRecord
  def call(worker, msg, queue)
    yield
  ensure
    ::ActiveRecord::Base.clear_active_connections! if defined?(::ActiveRecord)
  end
end
Your middleware will be called with the worker instance which will process the message along with the full Hash which represents the message to process and the name of the queue it was pulled from. You then register your middleware as part of the chain:
class AcmeCo::MyMiddleware
  def initialize(options=nil)
    # options == { :foo => 1, :bar => 2 }
  end
  def call(worker, msg, queue)
    yield
  end
end
Sidekiq.configure_server do |config|
  config.server_middleware do |chain|
    chain.add AcmeCo::MyMiddleware, :foo => 1, :bar => 2
  end
end
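To see what "running around" a message means, here is a toy chain in plain Ruby. It is not Sidekiq's Sidekiq::Middleware::Chain, only a sketch of the same idea: ToyChain and Recorder are hypothetical names, and each middleware's yield hands control to the next one until the job itself runs.

```ruby
# Toy middleware chain (hypothetical, for illustration only).
class ToyChain
  def initialize
    @entries = []
  end

  # Register a middleware class plus the arguments for its constructor.
  def add(klass, *args)
    @entries << [klass, args]
  end

  # Drop every registration of the given middleware class.
  def remove(klass)
    @entries.reject! { |k, _| k == klass }
  end

  # Calls each middleware in order; the innermost yield runs the job block.
  def invoke(worker, msg, queue, &job)
    stack = @entries.map { |klass, args| klass.new(*args) }
    step = lambda do
      if stack.empty?
        job.call
      else
        stack.shift.call(worker, msg, queue, &step)
      end
    end
    step.call
  end
end

# Middleware that records when it runs, before and after the job.
class Recorder
  def initialize(label, log)
    @label = label
    @log = log
  end

  def call(worker, msg, queue)
    @log << "before #{@label}"
    yield
  ensure
    @log << "after #{@label}"
  end
end

log = []
chain = ToyChain.new
chain.add Recorder, 'outer', log
chain.add Recorder, 'inner', log
chain.invoke(nil, {}, 'default') { log << 'job' }
p log # => ["before outer", "before inner", "job", "after inner", "after outer"]
```

The ensure clause mirrors the ActiveRecord middleware shown earlier: cleanup code runs even if the job raises.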
Remember that the workers running in the Sidekiq server can themselves push new jobs to Sidekiq, thus acting as clients. You must configure your client middleware within the configure_server block also in that case:
Sidekiq.configure_client do |config|
  config.client_middleware do |chain|
    chain.add AcmeCo::MyClientMiddleware
  end
end
Sidekiq.configure_server do |config|
  config.client_middleware do |chain|
    chain.add AcmeCo::MyClientMiddleware
  end
  config.server_middleware do |chain|
    chain.add AcmeCo::MyMiddleware, :foo => 1, :bar => 2
  end
end
Default Middleware
def self.default_middleware
  Sidekiq::Middleware::Chain.new do |m|
    m.add Sidekiq::ExceptionHandler
    m.add Sidekiq::Middleware::Server::Logging
    m.add Sidekiq::Middleware::Server::RetryJobs
    m.add Sidekiq::Middleware::Server::ActiveRecord
    m.add Sidekiq::Middleware::Server::Timeout
  end
end
If you need to remove a middleware for some reason, you can do this in your configuration:
Sidekiq.configure_server do |config|
  config.server_middleware do |chain|
    chain.remove Sidekiq::Middleware::Server::RetryJobs
  end
end