Module: PluckInBatches::Extensions::RelationExtension
- Defined in:
- lib/pluck_in_batches/extensions.rb
Instance Method Summary
- #pluck_each(*columns, start: nil, finish: nil, of: 1000, batch_size: of, error_on_ignore: nil, order: :asc, cursor_column: primary_key, &block) ⇒ Object
  Yields each set of values corresponding to the specified columns that was found by the passed options.
- #pluck_in_batches(*columns, start: nil, finish: nil, of: 1000, batch_size: of, error_on_ignore: nil, cursor_column: primary_key, order: :asc, &block) ⇒ Object
  Yields each batch of values corresponding to the specified columns that was found by the passed options as an array.
Instance Method Details
#pluck_each(*columns, start: nil, finish: nil, of: 1000, batch_size: of, error_on_ignore: nil, order: :asc, cursor_column: primary_key, &block) ⇒ Object
Yields each set of values corresponding to the specified columns that was found by the passed options. If a single column is specified, the block receives that column's value; if several columns are specified, the block receives an array of values.
See #pluck_in_batches for all the details.
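The difference between the single-column and multi-column yield shapes can be sketched in plain Ruby. This is an illustration only: `ROWS` and `each_plucked` are hypothetical stand-ins that walk an in-memory array, not the gem's code, and no database is involved.

```ruby
# In-memory stand-in for table rows (hypothetical data, not ActiveRecord).
ROWS = [
  { id: 1, name: "Ann", email: "ann@example.com" },
  { id: 2, name: "Bob", email: "bob@example.com" },
  { id: 3, name: "Cy",  email: "cy@example.com" }
].freeze

# Hypothetical helper that mimics the documented yield shape:
# one column yields a bare value, several columns yield an array of values.
def each_plucked(rows, *columns)
  return enum_for(:each_plucked, rows, *columns) unless block_given?

  rows.each do |row|
    values = columns.map { |column| row.fetch(column) }
    yield(columns.size == 1 ? values.first : values)
  end
end

single = each_plucked(ROWS, :email).to_a         # bare values
pairs  = each_plucked(ROWS, :name, :email).to_a  # arrays of values
```

With the real method, the equivalent calls would presumably look like `User.pluck_each(:email) { |email| ... }` and `User.pluck_each(:name, :email) { |name, email| ... }`.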
# File 'lib/pluck_in_batches/extensions.rb', line 16
def pluck_each(*columns, start: nil, finish: nil, of: 1000, batch_size: of, error_on_ignore: nil, order: :asc, cursor_column: primary_key, &block)
  iterator = Iterator.new(self)
  iterator.each(*columns, start: start, finish: finish, batch_size: batch_size, error_on_ignore: error_on_ignore, cursor_column: cursor_column, order: order, &block)
end
#pluck_in_batches(*columns, start: nil, finish: nil, of: 1000, batch_size: of, error_on_ignore: nil, cursor_column: primary_key, order: :asc, &block) ⇒ Object
Yields each batch of values corresponding to the specified columns that was found by the passed options as an array.
User.where("age > 21").pluck_in_batches(:email) do |emails|
  jobs = emails.map { |email| PartyReminderJob.new(email) }
  ActiveJob.perform_all_later(jobs)
end
If you do not provide a block to #pluck_in_batches, it will return an Enumerator for chaining with other methods:
User.pluck_in_batches(:name, :email).with_index do |group, index|
  puts "Processing group ##{index}"
  jobs = group.map { |name, email| PartyReminderJob.new(name, email) }
  ActiveJob.perform_all_later(jobs)
end
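Returning an Enumerator when no block is given is a common Ruby idiom built on `enum_for`. The sketch below shows the idiom on a hypothetical `in_batches_of` helper that slices an in-memory array; it is not how the gem is implemented, only an illustration of why chaining with `with_index` works.

```ruby
# Hypothetical stand-in for a batching method; it slices an in-memory array
# instead of querying a database.
def in_batches_of(values, size)
  # Without a block, hand back an Enumerator so callers can chain.
  return enum_for(:in_batches_of, values, size) unless block_given?

  values.each_slice(size) { |batch| yield batch }
end

emails = %w[a@example.com b@example.com c@example.com]

# Chain the returned Enumerator with with_index, as in the example above.
labelled = in_batches_of(emails, 2).with_index.map do |batch, index|
  "group ##{index}: #{batch.join(', ')}"
end
```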
Options
- :batch_size - Specifies the size of the batch. Defaults to 1000.
- :of - Same as :batch_size.
- :start - Specifies the primary key value to start from, inclusive of the value.
- :finish - Specifies the primary key value to end at, inclusive of the value.
- :error_on_ignore - Overrides the application config to specify if an error should be raised when an order is present in the relation.
- :cursor_column - Specifies the column(s) on which the iteration should be done. This column(s) should be orderable (e.g. an integer or string). Defaults to primary key.
- :order - Specifies the cursor column(s) order (can be :asc or :desc or an array consisting of :asc or :desc). Defaults to :asc.

  class Book < ActiveRecord::Base
    self.primary_key = [:author_id, :version]
  end

  Book.pluck_in_batches(:title, order: [:asc, :desc])
In the above code, author_id is sorted in ascending order and version in descending order.
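The resulting iteration order for a two-column cursor with `order: [:asc, :desc]` can be pictured with an in-memory sort. `BookRow` and the sample data are hypothetical, and the negation trick assumes a numeric `version`; the gem itself orders rows in SQL.

```ruby
# Hypothetical in-memory rows standing in for the books table.
BookRow = Struct.new(:author_id, :version, :title)

rows = [
  BookRow.new(2, 1, "B v1"),
  BookRow.new(1, 1, "A v1"),
  BookRow.new(1, 2, "A v2"),
  BookRow.new(2, 2, "B v2")
]

# author_id ascending, version descending (negation works for numeric values).
ordered = rows.sort_by { |row| [row.author_id, -row.version] }
titles = ordered.map(&:title)

# Batches of two would then be visited in this order.
batches = titles.each_slice(2).to_a
```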
Limits are honored, and if present there is no requirement for the batch size: it can be less than, equal to, or greater than the limit.
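One way to read this is as simple arithmetic on batch sizes: batches are filled up to the batch size until the limit is used up. The helper below, `batch_sizes`, is a hypothetical model of that reading and not part of the gem.

```ruby
# Hypothetical model: how many values each batch would hold when the
# relation has a limit. Assumes batches are filled until the limit is reached.
def batch_sizes(total_rows, limit:, batch_size:)
  remaining = [total_rows, limit].min
  sizes = []
  while remaining.positive?
    take = [batch_size, remaining].min
    sizes << take
    remaining -= take
  end
  sizes
end

larger_limit  = batch_sizes(100, limit: 25, batch_size: 10) # limit > batch size
equal_limit   = batch_sizes(100, limit: 10, batch_size: 10) # limit == batch size
smaller_limit = batch_sizes(100, limit: 5, batch_size: 10)  # limit < batch size
```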
The :start and :finish options are especially useful if you want multiple workers dealing with the same processing queue. You can make worker 1 handle all the records between id 1 and 9999 and worker 2 handle from 10000 and beyond by setting the :start and :finish options on each worker.
# Let's process from record 10_000 on.
User.pluck_in_batches(:email, start: 10_000) do |emails|
  jobs = emails.map { |email| PartyReminderJob.new(email) }
  ActiveJob.perform_all_later(jobs)
end
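How inclusive `:start` and `:finish` bounds split work between two workers can be sketched with a small in-memory model of cursor-based batching. `batches_between` is a hypothetical helper written for this illustration; the gem performs the equivalent filtering in SQL, and its internals may differ.

```ruby
# Hypothetical in-memory model of cursor-based batching; not the gem's code.
# Both bounds are inclusive, and each batch resumes after the last id seen.
def batches_between(ids, start: nil, finish: nil, batch_size: 1000)
  cursor = nil
  sorted = ids.sort
  loop do
    batch = sorted.select do |id|
      (start.nil? || id >= start) &&
        (finish.nil? || id <= finish) &&
        (cursor.nil? || id > cursor)
    end.first(batch_size)
    break if batch.empty?

    yield batch
    cursor = batch.last
  end
end

ids = (1..25).to_a

# Worker 1 takes ids up to and including 10, worker 2 takes 11 and beyond.
worker_one = []
batches_between(ids, finish: 10, batch_size: 4) { |batch| worker_one << batch }

worker_two = []
batches_between(ids, start: 11, batch_size: 4) { |batch| worker_two << batch }
```

Because both bounds are inclusive, the second worker starts one past the first worker's `finish` so that no id is processed twice.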
NOTE: Order can be ascending (:asc) or descending (:desc). By default it is set to ascending on the primary key (“id ASC”). This also means that this method only works when the primary key is orderable (e.g. an integer or string).
NOTE: By its nature, batch processing is subject to race conditions if other processes are modifying the database.
# File 'lib/pluck_in_batches/extensions.rb', line 81
def pluck_in_batches(*columns, start: nil, finish: nil, of: 1000, batch_size: of, error_on_ignore: nil, cursor_column: primary_key, order: :asc, &block)
  iterator = Iterator.new(self)
  iterator.each_batch(*columns, start: start, finish: finish, batch_size: batch_size, error_on_ignore: error_on_ignore, cursor_column: cursor_column, order: order, &block)
end