Class: ModelIterator
- Inherits: Object
  - Object
    - ModelIterator
- Defined in: lib/model_iterator.rb
Overview
Iterates over large models, storing state in Redis.
Defined Under Namespace
Classes: MaxIterations
Constant Summary
- VERSION =
"1.0.2"
Class Attribute Summary
-
.redis ⇒ Object
Gets or sets a default Redis client object for iterators.
Instance Attribute Summary
-
#clause ⇒ Object
readonly
Gets a String SQL Where clause fragment.
-
#clause_args ⇒ Object
readonly
Gets an Array of values to be sql-escaped and joined with the clause.
-
#current_id(refresh = false) ⇒ Object
Public: Points to the latest record that was yielded, by database ID.
-
#id_clause ⇒ Object
readonly
Gets the String fully qualified ID field (with the table name).
-
#id_field ⇒ Object
readonly
Gets the String name of the ID field.
-
#job ⇒ Object
Gets or sets a Proc that is called with each model instance while iterating.
-
#joins ⇒ Object
readonly
Gets the :joins value for ActiveRecord::Base.find.
-
#klass ⇒ Object
readonly
Gets a reference to the ActiveRecord::Base class that is iterated.
-
#limit ⇒ Object
Gets or sets the number of records that are returned in each database query.
-
#max ⇒ Object
readonly
Gets a Fixnum value of the maximum iterations to run, or 0.
-
#prefix ⇒ Object
readonly
Gets a String used to prefix the redis keys used by this object.
-
#redis ⇒ Object
Gets or sets the Redis client object.
Instance Method Summary
-
#cleanup ⇒ Object
Public: Cleans up any redis keys.
-
#conditions ⇒ Object
Public: Gets an ActiveRecord :conditions value, ready for ActiveRecord::Base.all.
-
#each ⇒ Object
(also: #run)
Public: Iterates through the whole dataset, yielding individual records as they are received.
-
#each_set(&block) ⇒ Object
Public: Iterates through the whole dataset.
-
#find_options ⇒ Object
Public: Builds the ActiveRecord::Base.find options for a single query.
-
#initialize(klass, *args) ⇒ ModelIterator
constructor
Initializes a ModelIterator instance.
-
#key ⇒ Object
-
#records ⇒ Object
Public: Queries the database for the next page of records.
Constructor Details
#initialize(klass, *args) ⇒ ModelIterator
Initializes a ModelIterator instance.
klass   - ActiveRecord::Base class to iterate.
clause  - String SQL WHERE clause, with '?' placeholders for values.
*values - Optional array of values to be added to a custom SQL WHERE clause.
options - Optional Hash options.
  :redis      - A Redis object for storing the state.
  :order      - Symbol specifying the order to iterate. :asc or :desc. Default: :asc
  :id_field   - String name of the ID column. Default: "id"
  :id_clause  - String name of the fully qualified ID column. Prepends the model's table name to the front of the ID field. Default: "table_name.id"
  :start_id   - Fixnum to start iterating from. Default: 1
  :prefix     - Custom String prefix for redis keys.
  :select     - Optional String of the columns to retrieve.
  :joins      - Optional Symbol or Hash :joins option for ActiveRecord::Base.find.
  :max        - Optional Fixnum of the maximum number of iterations. Use max * limit to process a known number of records at a time.
  :limit      - Fixnum limit of objects to fetch from the db. Default: 100
  :conditions - Array of String SQL WHERE clause and optional values (Will override clause/values given in arguments.)
ModelIterator.new(Repository, :start_id => 5000)
ModelIterator.new(Repository, 'public=?', true, :start_id => 1000)
# File 'lib/model_iterator.rb', line 92
def initialize(klass, *args)
  @klass = klass
  @options = if args.last.respond_to?(:fetch)
    args.pop
  else
    {}
  end
  @redis = @options[:redis] || self.class.redis
  @id_field = @options[:id_field] || klass.primary_key
  @id_clause = @options[:id_clause] || "#{klass.table_name}.#{@id_field}"
  @order = @options[:order] == :desc ? :desc : :asc
  op = @order == :asc ? '>' : '<'
  @max = @options[:max].to_i
  @joins = @options[:joins]
  @clause = "#{@id_clause} #{op} ?"
  if @options[:conditions]
    conditions = Array(@options[:conditions])
    @clause += " AND (#{conditions.first})"
    @clause_args = conditions[1..-1]
  elsif !args.empty?
    @clause += " AND (#{args.shift})"
    @clause_args = args
  end
  @current_id = @options[:start_id]
  @limit = @options[:limit] || 100
  @job = @prefix = @key = nil
end
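The argument handling in the constructor can be tried in isolation. The following is a minimal sketch under stated assumptions, not the library's code path: `parse_iterator_args` is a hypothetical helper, and the table name and ID field are passed as plain Strings instead of being read from an ActiveRecord class. Unlike the constructor, it defaults the clause values to an empty Array.

```ruby
# Hypothetical helper mirroring how #initialize splits its arguments.
# table_name and id_field stand in for klass.table_name and klass.primary_key.
def parse_iterator_args(table_name, id_field, *args)
  # A trailing argument that responds to #fetch is treated as the options Hash.
  options = args.last.respond_to?(:fetch) ? args.pop : {}
  order = options[:order] == :desc ? :desc : :asc
  op = order == :asc ? '>' : '<'
  clause = "#{table_name}.#{id_field} #{op} ?"
  clause_args = [] # assumption: default to [] for easier inspection

  if options[:conditions]
    # :conditions takes precedence over a positional clause and values.
    conditions = Array(options[:conditions])
    clause += " AND (#{conditions.first})"
    clause_args = conditions[1..-1]
  elsif !args.empty?
    clause += " AND (#{args.shift})"
    clause_args = args
  end

  { :clause => clause, :clause_args => clause_args,
    :start_id => options[:start_id], :limit => options[:limit] || 100 }
end

parsed = parse_iterator_args('repositories', 'id', 'public=?', true, :start_id => 1000)
puts parsed[:clause]
puts parsed[:clause_args].inspect
```

Because the check is `respond_to?(:fetch)`, a trailing Array would also be taken as the options argument, so custom values are best passed as separate arguments.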
Class Attribute Details
.redis ⇒ Object
Gets or sets a default Redis client object for iterators.
# File 'lib/model_iterator.rb', line 15
def redis
  @redis
end
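The constructor falls back to this class-level client when no :redis option is given. A minimal sketch of that lookup, assuming a hypothetical `SketchWithDefaultRedis` class and Symbols as placeholders for real Redis client objects:

```ruby
# Hypothetical class showing only the default-client lookup.
class SketchWithDefaultRedis
  class << self
    # Class-level default, used by every instance that is not given :redis.
    attr_accessor :redis
  end

  # Per-instance client, readable and writable.
  attr_accessor :redis

  def initialize(options = {})
    @redis = options[:redis] || self.class.redis
  end
end

SketchWithDefaultRedis.redis = :shared_client # placeholder for a real client
a = SketchWithDefaultRedis.new
b = SketchWithDefaultRedis.new(:redis => :other_client)
puts a.redis.inspect
puts b.redis.inspect
```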
Instance Attribute Details
#clause ⇒ Object (readonly)
Gets a String SQL Where clause fragment. Use '?' for variable substitution.
Returns a String.
# File 'lib/model_iterator.rb', line 33
def clause
  @clause
end
#clause_args ⇒ Object (readonly)
Gets an Array of values to be sql-escaped and joined with the clause.
Returns an Array of unescaped sql values.
# File 'lib/model_iterator.rb', line 38
def clause_args
  @clause_args
end
#current_id(refresh = false) ⇒ Object
Public: Points to the latest record that was yielded, by database ID.
refresh - Boolean that determines if the instance variable cache should
be reset first. Default: false.
Returns a Fixnum.
# File 'lib/model_iterator.rb', line 126
def current_id(refresh = false)
  @current_id = nil if refresh
  @current_id ||= @redis.get(key).to_i
end
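The effect of the refresh argument is easiest to see with a stand-in store. This is a sketch only: `FakeRedis` and `Cursor` are hypothetical classes, and `FakeRedis` models just the #get and #set calls, returning Strings the way a Redis client would.

```ruby
# In-memory stand-in for a Redis client; only #get and #set are modeled.
class FakeRedis
  def initialize
    @data = {}
  end

  def get(key)
    @data[key]
  end

  def set(key, value)
    @data[key] = value.to_s
  end
end

# Hypothetical class reproducing only the #current_id caching logic.
class Cursor
  def initialize(redis, key, start_id = nil)
    @redis = redis
    @key = key
    @current_id = start_id
  end

  def current_id(refresh = false)
    @current_id = nil if refresh
    # nil.to_i is 0, so a missing key yields 0.
    @current_id ||= @redis.get(@key).to_i
  end
end

redis = FakeRedis.new
cursor = Cursor.new(redis, 'demo:current', 5000)
first = cursor.current_id           # cached start ID
redis.set('demo:current', 7200)
cached = cursor.current_id          # cache still wins
refreshed = cursor.current_id(true) # re-read from the store
puts [first, cached, refreshed].inspect
```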
#id_clause ⇒ Object (readonly)
Gets the String fully qualified ID field (with the table name).
# File 'lib/model_iterator.rb', line 50
def id_clause
  @id_clause
end
#id_field ⇒ Object (readonly)
Gets the String name of the ID field.
# File 'lib/model_iterator.rb', line 47
def id_field
  @id_field
end
#job ⇒ Object
Gets or sets a Proc that is called with each model instance while iterating. This is set automatically by #each.
# File 'lib/model_iterator.rb', line 57
def job
  @job
end
#joins ⇒ Object (readonly)
Gets the :joins value for ActiveRecord::Base.find.
# File 'lib/model_iterator.rb', line 53
def joins
  @joins
end
#klass ⇒ Object (readonly)
Gets a reference to the ActiveRecord::Base class that is iterated.
Returns a Class.
# File 'lib/model_iterator.rb', line 21
def klass
  @klass
end
#limit ⇒ Object
Gets or sets the number of records that are returned in each database query.
Returns a Fixnum.
# File 'lib/model_iterator.rb', line 27
def limit
  @limit
end
#max ⇒ Object (readonly)
Gets a Fixnum value of the maximum iterations to run, or 0.
# File 'lib/model_iterator.rb', line 44
def max
  @max
end
#prefix ⇒ Object (readonly)
Gets a String used to prefix the redis keys used by this object.
# File 'lib/model_iterator.rb', line 41
def prefix
  @prefix
end
#redis ⇒ Object
Gets or sets the Redis client object.
# File 'lib/model_iterator.rb', line 60
def redis
  @redis
end
Instance Method Details
#cleanup ⇒ Object
Public: Cleans up any redis keys.
Returns nothing.
# File 'lib/model_iterator.rb', line 181
def cleanup
  @redis.del(key)
  @current_id = nil
end
#conditions ⇒ Object
Public: Gets an ActiveRecord :conditions value, ready for ActiveRecord::Base.all.
Returns an Array with a String query clause, and unescaped db values.
# File 'lib/model_iterator.rb', line 199
def conditions
  [@clause, current_id, *@clause_args]
end
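The order of the returned Array matters: the first '?' in the clause is always the ID comparison, so the current ID comes before any custom values. A small sketch, with `build_conditions` as a hypothetical free-standing version of the method:

```ruby
# Hypothetical stand-alone version of #conditions.
# clause_args may be nil when no custom clause was given; *nil splats to nothing.
def build_conditions(clause, current_id, clause_args)
  [clause, current_id, *clause_args]
end

with_values = build_conditions('repositories.id > ? AND (public=?)', 1000, [true])
without_values = build_conditions('repositories.id > ?', 0, nil)
puts with_values.inspect
puts without_values.inspect
```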
#each ⇒ Object Also known as: run
Public: Iterates through the whole dataset, yielding individual records as they are received. This calls #records multiple times, setting the #current_id after each run. If an exception is raised, the ModelIterator instance can safely be restarted, since all state is stored in Redis.
&block - Block that gets called with each ActiveRecord::Base instance.
Returns nothing.
# File 'lib/model_iterator.rb', line 142
def each
  @job = block = (block_given? ? Proc.new : @job)
  each_set do |records|
    records.each do |record|
      block.call(record)
      @current_id = record.send(@id_field)
    end
  end
  cleanup
end
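The restart behavior described above can be modeled without a database. The following sketch makes several assumptions: `SketchIterator` is hypothetical, a sorted Array of Integers stands in for table rows, and a plain Hash stands in for Redis. It folds the per-page checkpoint of #each_set into #each for brevity.

```ruby
# Hypothetical in-memory model of the #each checkpointing.
class SketchIterator
  def initialize(ids, store, limit)
    @ids = ids     # sorted Integer IDs standing in for database rows
    @store = store # plain Hash standing in for Redis
    @limit = limit
    @current_id = nil
  end

  def current_id
    @current_id ||= @store['current'].to_i
  end

  # Next page of IDs after the current one, or nil when exhausted.
  def records
    page = @ids.select { |id| id > current_id }.first(@limit)
    page.empty? ? nil : page
  end

  def each
    while (page = records)
      begin
        page.each do |id|
          yield id
          @current_id = id # advance only after the block succeeded
        end
      ensure
        # Checkpoint the last good ID even when the block raised.
        @store['current'] = @current_id if @current_id
      end
    end
    @store.delete('current') # mirrors #cleanup
    @current_id = nil
  end
end

store = {}
seen = []
begin
  SketchIterator.new((1..5).to_a, store, 2).each do |id|
    raise 'boom' if id == 4
    seen << id
  end
rescue RuntimeError
  checkpoint = store['current']
end

# A fresh instance picks up after the last checkpointed ID.
SketchIterator.new((1..5).to_a, store, 2).each { |id| seen << id }
puts seen.inspect
```

The record that raised is retried on restart because the ID only advances after the block returns, so the job should tolerate being run twice on the same record.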
#each_set(&block) ⇒ Object
Public: Iterates through the whole dataset. This calls #records multiple times, but does not set the #current_id after each record.
&block - Block that gets called with each Array of ActiveRecord::Base instances.
Returns nothing.
# File 'lib/model_iterator.rb', line 159
def each_set(&block)
  loops = 0
  while records = self.records
    begin
      block.call(records)
      loops += 1
      if @max > 0 && loops >= @max
        raise MaxIterations, self
      end
    ensure
      @redis.set(key, @current_id) if @current_id
    end
  end
end
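The :max handling can be sketched on its own. The names here are assumptions: `SketchMaxIterations` stands in for ModelIterator::MaxIterations, and `each_page` walks a pre-built Array of pages instead of querying a database.

```ruby
# Hypothetical stand-in for ModelIterator::MaxIterations.
class SketchMaxIterations < StandardError; end

# Yields each page, raising once max pages have been processed.
# A max of 0 means no limit.
def each_page(pages, max)
  loops = 0
  pages.each do |page|
    yield page
    loops += 1
    if max > 0 && loops >= max
      raise SketchMaxIterations, "stopped after #{loops} pages"
    end
  end
end

processed = []
message = nil
begin
  each_page([[1, 2], [3, 4], [5, 6]], 2) { |page| processed.concat(page) }
rescue SketchMaxIterations => e
  message = e.message
end
puts processed.inspect
```

With a :limit of 100 and a :max of 2, for example, a single run processes at most 200 records before the error is raised.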
#find_options ⇒ Object
Public: Builds the ActiveRecord::Base.find options for a single query.
Returns a Hash.
# File 'lib/model_iterator.rb', line 215
def find_options
  opt = {:conditions => conditions, :limit => @limit, :order => "#{@id_clause} #{@order}"}
  if columns = @options[:select]
    opt[:select] = columns
  end
  opt[:joins] = @joins if @joins
  opt
end
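For reference, the shape of the resulting Hash can be reproduced with plain values. `build_find_options` is a hypothetical stand-alone version that takes its inputs as arguments instead of instance state:

```ruby
# Hypothetical stand-alone version of #find_options.
def build_find_options(conditions, limit, id_clause, order, select = nil, joins = nil)
  opt = { :conditions => conditions, :limit => limit, :order => "#{id_clause} #{order}" }
  # :select and :joins are only included when they were supplied.
  opt[:select] = select if select
  opt[:joins] = joins if joins
  opt
end

minimal = build_find_options(['repositories.id > ?', 1000], 100, 'repositories.id', :asc)
full = build_find_options(['repositories.id < ?', 1000], 50, 'repositories.id', :desc, 'id, name', :owner)
puts minimal.inspect
puts full.inspect
```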
#key ⇒ Object
# File 'lib/model_iterator.rb', line 191
def key
  @key ||= "#{prefix}:current"
end
#records ⇒ Object
Public: Queries the database for the next page of records.
Returns an Array of ActiveRecord::Base instances if any results are returned, or nil.
# File 'lib/model_iterator.rb', line 207
def records
  arr = @klass.all(find_options)
  arr.empty? ? nil : arr
end