Method: ActiveRecord::Base.bulk_import
- Defined in:
- lib/activerecord-import/import.rb
.bulk_import(*args) ⇒ Object
Imports a collection of values to the database.
This is more efficient than using ActiveRecord::Base#create or ActiveRecord::Base#save multiple times. This method works well if you want to create more than one record at a time and do not care about having ActiveRecord objects returned for each record inserted.
This can be used with or without validations. It does not utilize the ActiveRecord::Callbacks during creation/modification while performing the import.
Usage
Model.import array_of_models
Model.import column_names, array_of_models
Model.import array_of_hash_objects
Model.import column_names, array_of_hash_objects
Model.import column_names, array_of_values
Model.import column_names, array_of_values,
Model.import array_of_models
With this form you can call import passing in an array of model objects that you want updated.
Model.import column_names, array_of_values
The first parameter column_names
is an array of symbols or strings which specify the columns that you want to update.
The second parameter, array_of_values
, is an array of arrays. Each subarray is a single set of values for a new record. The order of values in each subarray should match up to the order of the column_names
.
Model.import column_names, array_of_values, options
The first two parameters are the same as the above form. The third parameter, options
, is a hash. This is optional. Please see below for what options
are available.
Options
-
validate
- true|false, tells import whether or not to use ActiveRecord validations. Validations are enforced by default. It skips the uniqueness validation for performance reasons. You can find more details here: github.com/zdennis/activerecord-import/issues/228 -
ignore
- true|false, an alias for on_duplicate_key_ignore. -
on_duplicate_key_ignore
- true|false, tells import to discard records that contain duplicate keys. For Postgres 9.5+ it adds ON CONFLICT DO NOTHING, for MySQL it uses INSERT IGNORE, and for SQLite it uses INSERT OR IGNORE. Cannot be enabled on a recursive import. For database adapters that normally support setting primary keys on imported objects, this option prevents that from occurring. -
on_duplicate_key_update
- :all, an Array, or Hash, tells import to use MySQL’s ON DUPLICATE KEY UPDATE or Postgres/SQLite ON CONFLICT DO UPDATE ability. See On Duplicate Key Update below. -
synchronize
- an array of ActiveRecord instances for the model that you are currently importing data into. This synchronizes existing model instances in memory with updates from the import. -
timestamps
- true|false, tells import to not add timestamps (if false) even if record timestamps is disabled in ActiveRecord::Base -
recursive
- true|false, tells import to import all has_many/has_one associations if the adapter supports setting the primary keys of the newly imported objects. PostgreSQL only. -
batch_size
- an integer value to specify the max number of records to include per insert. Defaults to the total number of records to import.
Examples
class BlogPost < ActiveRecord::Base ; end
# Example using array of model objects
posts = [ BlogPost.new author_name: 'Zach Dennis', title: 'AREXT',
BlogPost.new author_name: 'Zach Dennis', title: 'AREXT2',
BlogPost.new author_name: 'Zach Dennis', title: 'AREXT3' ]
BlogPost.import posts
# Example using array_of_hash_objects
# NOTE: column_names will be determined by using the keys of the first hash in the array. If later hashes in the
# array have different keys an exception will be raised. If you have hashes to import with different sets of keys
# we recommend grouping these into batches before importing.
values = [ {author_name: 'zdennis', title: 'test post'} ], [ {author_name: 'jdoe', title: 'another test post'} ] ]
BlogPost.import values
# Example using column_names and array_of_hash_objects
columns = [ :author_name, :title ]
values = [ {author_name: 'zdennis', title: 'test post'} ], [ {author_name: 'jdoe', title: 'another test post'} ] ]
BlogPost.import columns, values
# Example using column_names and array_of_values
columns = [ :author_name, :title ]
values = [ [ 'zdennis', 'test post' ], [ 'jdoe', 'another test post' ] ]
BlogPost.import columns, values
# Example using column_names, array_of_value and options
columns = [ :author_name, :title ]
values = [ [ 'zdennis', 'test post' ], [ 'jdoe', 'another test post' ] ]
BlogPost.import( columns, values, validate: false )
# Example synchronizing existing instances in memory
post = BlogPost.where(author_name: 'zdennis').first
puts post.author_name # => 'zdennis'
columns = [ :author_name, :title ]
values = [ [ 'yoda', 'test post' ] ]
BlogPost.import posts, synchronize: [ post ]
puts post.author_name # => 'yoda'
# Example synchronizing unsaved/new instances in memory by using a uniqued imported field
posts = [BlogPost.new(title: "Foo"), BlogPost.new(title: "Bar")]
BlogPost.import posts, synchronize: posts, synchronize_keys: [:title]
puts posts.first.persisted? # => true
On Duplicate Key Update (MySQL)
The :on_duplicate_key_update option can be either :all, an Array, or a Hash.
Using :all
The :on_duplicate_key_update option can be set to :all. All columns other than the primary key are updated. If a list of column names is supplied, only those columns will be updated. Below is an example:
BlogPost.import columns, values, on_duplicate_key_update: :all
Using an Array
The :on_duplicate_key_update option can be an array of column names. The column names are the only fields that are updated if a duplicate record is found. Below is an example:
BlogPost.import columns, values, on_duplicate_key_update: [ :date_modified, :content, :author ]
Using A Hash
The :on_duplicate_key_update option can be a hash of column names to model attribute name mappings. This gives you finer grained control over what fields are updated with what attributes on your model. Below is an example:
BlogPost.import columns, attributes, on_duplicate_key_update: { title: :title }
On Duplicate Key Update (Postgres 9.5+ and SQLite 3.24+)
The :on_duplicate_key_update option can be :all, an Array, or a Hash with up to three attributes, :conflict_target (and optionally :index_predicate) or :constraint_name (Postgres), and :columns.
Using :all
The :on_duplicate_key_update option can be set to :all. All columns other than the primary key are updated. If a list of column names is supplied, only those columns will be updated. Below is an example:
BlogPost.import columns, values, on_duplicate_key_update: :all
Using an Array
The :on_duplicate_key_update option can be an array of column names. This option only handles inserts that conflict with the primary key. If a table does not have a primary key, this will not work. The column names are the only fields that are updated if a duplicate record is found. Below is an example:
BlogPost.import columns, values, on_duplicate_key_update: [ :date_modified, :content, :author ]
Using a Hash
The :on_duplicate_key_update option can be a hash with up to three attributes, :conflict_target (and optionally :index_predicate) or :constraint_name, and :columns. Unlike MySQL, Postgres requires the conflicting constraint to be explicitly specified. Using this option allows you to specify a constraint other than the primary key.
:conflict_target
The :conflict_target attribute specifies the columns that make up the conflicting unique constraint and can be a single column or an array of column names. This attribute is ignored if :constraint_name is included, but it is the preferred method of identifying a constraint. It will default to the primary key. Below is an example:
BlogPost.import columns, values, on_duplicate_key_update: { conflict_target: [ :author_id, :slug ], columns: [ :date_modified ] }
:index_predicate
The :index_predicate attribute optionally specifies a WHERE condition on :conflict_target, which is required for matching against partial indexes. This attribute is ignored if :constraint_name is included. Below is an example:
BlogPost.import columns, values, on_duplicate_key_update: { conflict_target: [ :author_id, :slug ], index_predicate: 'status <> 0', columns: [ :date_modified ] }
:constraint_name
The :constraint_name attribute explicitly identifies the conflicting unique index by name. Postgres documentation discourages using this method of identifying an index unless absolutely necessary. Below is an example:
BlogPost.import columns, values, on_duplicate_key_update: { constraint_name: :blog_posts_pkey, columns: [ :date_modified ] }
:condition
The :condition attribute optionally specifies a WHERE condition on :conflict_action. Only rows for which this expression returns true will be updated. Note that it’s evaluated last, after a conflict has been identified as a candidate to update. Below is an example:
BlogPost.import columns, values, on_duplicate_key_update: { conflict_target: [ :author_id ], condition: "blog_posts.title NOT LIKE '%sample%'", columns: [ :author_name ] }
:columns
The :columns attribute can be either :all, an Array, or a Hash.
Using :all
The :columns attribute can be :all. All columns other than the primary key will be updated. If a list of column names is supplied, only those columns will be updated. Below is an example:
BlogPost.import columns, values, on_duplicate_key_update: { conflict_target: :slug, columns: :all }
Using an Array
The :columns attribute can be an array of column names. The column names are the only fields that are updated if a duplicate record is found. Below is an example:
BlogPost.import columns, values, on_duplicate_key_update: { conflict_target: :slug, columns: [ :date_modified, :content, :author ] }
Using a Hash
The :columns option can be a hash of column names to model attribute name mappings. This gives you finer grained control over what fields are updated with what attributes on your model. Below is an example:
BlogPost.import columns, attributes, on_duplicate_key_update: { conflict_target: :slug, columns: { title: :title } }
Returns
This returns an object which responds to failed_instances
and num_inserts
.
-
failed_instances - an array of objects that fails validation and were not committed to the database. An empty array if no validation is performed.
-
num_inserts - the number of insert statements it took to import the data
-
ids - the primary keys of the imported ids if the adapter supports it, otherwise an empty array.
-
results - import results if the adapter supports it, otherwise an empty array.
530 531 532 533 534 535 536 537 538 539 540 |
# File 'lib/activerecord-import/import.rb', line 530 def bulk_import(*args) if args.first.is_a?( Array ) && args.first.first.is_a?(ActiveRecord::Base) = {} .merge!( args.pop ) if args.last.is_a?(Hash) models = args.first import_helper(models, ) else import_helper(*args) end end |