Module: Chewy::Index::Import::ClassMethods

Defined in: lib/chewy/index/import.rb

Instance Method Summary

#bulk(**options) ⇒ Hash
Wraps the Elasticsearch bulk API method and adds features such as bulk_size and suffix.
#compose(object, crutches = nil, fields: []) ⇒ Hash
Composes a single document from the passed object.
#import(*collection, **options) ⇒ true, false
Imports objects into the index; one of the main methods for an index.
#import!(*collection, **options) ⇒ Object
Same as #import, but raises an exception in case of import errors.

Instance Method Details

#bulk(**options) ⇒ `Hash`

Wraps the Elasticsearch bulk API method and adds features such as bulk_size and suffix.

Parameters:

options (Hash{Symbol => Object}) —
in addition to the import-specific options, accepts any option suitable for the bulk API call, such as refresh or timeout

Options Hash (**options):

suffix (String) —
an index name suffix, used for zero-downtime reset mostly, no suffix by default
bulk_size (Integer) —
bulk API chunk size in bytes; if passed, the body is split into chunks and the request is performed once per chunk, empty by default
body (Array<Hash>) —
Elasticsearch bulk API request body

Returns:

(Hash) —
tricky transposed errors hash, empty if everything is fine
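
The effect of bulk_size can be pictured as splitting the bulk body into several requests that each stay under a byte limit. The following is a minimal sketch of that idea in plain Ruby, not Chewy's actual implementation: the helper name chunk_by_bytes, the sample body, and the way entry sizes are measured are all assumptions made for illustration.

```ruby
require 'json'

# Hypothetical helper (not part of Chewy): groups bulk entries so that the
# serialized size of each group stays within +bulk_size+ bytes. An entry
# larger than the limit still gets a group of its own.
def chunk_by_bytes(body, bulk_size)
  chunks = []
  current = []
  current_bytes = 0

  body.each do |entry|
    # Assumed size measure: JSON length plus one byte for a line separator.
    entry_bytes = JSON.generate(entry).bytesize + 1
    if current.any? && current_bytes + entry_bytes > bulk_size
      chunks << current
      current = []
      current_bytes = 0
    end
    current << entry
    current_bytes += entry_bytes
  end

  chunks << current if current.any?
  chunks
end

# Illustrative body; the exact entry layout is an assumption.
body = [
  { index: { _id: 1, data: { name: 'Alice' } } },
  { index: { _id: 2, data: { name: 'Bob' } } },
  { delete: { _id: 3 } }
]

chunks = chunk_by_bytes(body, 70)
chunks.each_with_index do |chunk, i|
  puts "request #{i + 1}: #{chunk.size} entries"
end
```

Each resulting group would correspond to one bulk request in this simplified model.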

#compose(object, crutches = nil, fields: []) ⇒ `Hash`

Composes a single document from the passed object. Uses either witchcraft or normal composing under the hood.

Parameters:

object (Object) —
a data source object
crutches (Object) (defaults to: nil) —
optional crutches object; if omitted, a crutches object for the single passed object is created as a fallback
fields (Array<Symbol>) (defaults to: []) —
an array of fields to restrict the generated document to

Returns:

(Hash) —
a JSON-ready hash

# File 'lib/chewy/index/import.rb', line 118

def compose(object, crutches = nil, fields: [])
  crutches ||= Chewy::Index::Crutch::Crutches.new self, [object]

  if witchcraft? && root.children.present?
    cauldron(fields: fields).brew(object, crutches)
  else
    root.compose(object, crutches, fields: fields)
  end
end
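
To illustrate what the fields argument does, here is a toy model in plain Ruby. It does not use Chewy at all: FIELD_DEFINITIONS, the User struct, and compose_document are hypothetical stand-ins, and the assumption that an empty fields array means "no restriction" is inferred from the default value rather than stated by the documentation.

```ruby
# Toy model only: each field name maps to a lambda that extracts its value.
FIELD_DEFINITIONS = {
  name:  ->(user) { user.name },
  email: ->(user) { user.email },
  age:   ->(user) { user.age }
}.freeze

User = Struct.new(:name, :email, :age)

# Hypothetical stand-in for compose: builds a JSON-ready hash and, when
# +fields+ is non-empty, keeps only the requested fields. An empty +fields+
# is assumed to mean that all fields are included.
def compose_document(object, fields: [])
  definitions = fields.empty? ? FIELD_DEFINITIONS : FIELD_DEFINITIONS.slice(*fields)
  definitions.to_h { |field, extractor| [field.to_s, extractor.call(object)] }
end

user = User.new('Alice', 'alice@example.com', 30)
full = compose_document(user)
partial = compose_document(user, fields: [:name])

puts full.inspect
puts partial.inspect
```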

#import(*collection, **options) ⇒ `true`, `false`

One of the main methods for an index. Imports any objects into the index and performs all the object handling routines. Documents are imported through the bulk API; bulk size and object batch size are controlled by the corresponding options.

It accepts ORM/ODM objects, POROs, hashes, and ids; ids are used to fetch objects from the source, depending on the adapter in use. It deletes passed objects from the index if they are not in the default scope or are marked for destruction.

It handles parent-child relationships with a join field, reindexing children when the parent is reindexed.

Performs journaling if enabled: it stores the ids of all imported objects in a specialized index. A particular import can be replayed later to restore data consistency.

Performs a partial index update using the update bulk action if any fields are specified. Note that if a document doesn't exist yet, ES raises an error, but import catches such errors and performs full indexing for the corresponding documents. This feature can be disabled by setting update_failover to false.
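
The failover flow described above can be sketched with an in-memory hash standing in for the index. This is a simplified model under stated assumptions, not Chewy's code: partial_update and import_with_failover are hypothetical helpers, and a missing key plays the role of the error ES returns for a document that does not exist yet.

```ruby
# Hypothetical partial update: fails for ids that are not stored yet,
# mimicking the error raised for a missing document.
def partial_update(store, id, changes)
  return false unless store.key?(id)

  store[id] = store[id].merge(changes)
  true
end

# Hypothetical import step: try partial updates first, then fully index
# whatever failed, unless failover is switched off. Returns the ids that
# are still failing.
def import_with_failover(store, documents, update_fields:, update_failover: true)
  failed = documents.reject do |id, doc|
    partial_update(store, id, doc.slice(*update_fields))
  end
  return failed.keys unless update_failover

  failed.each { |id, doc| store[id] = doc }
  []
end

# Toy in-memory "index": document id => document hash.
store = { 1 => { 'name' => 'Alice', 'age' => 30 } }

documents = {
  1 => { 'name' => 'Alice', 'age' => 31 },
  2 => { 'name' => 'Bob', 'age' => 25 }
}

errors = import_with_failover(store, documents, update_fields: ['age'])
puts store.inspect
puts errors.inspect
```

In this model, document 1 receives a partial update, while document 2 is missing and therefore gets indexed in full. Passing update_failover: false would leave document 2 out and report its id instead.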

Utilizes ActiveSupport::Notifications, so it is possible to get the imported objects later by subscribing to the import_objects.chewy event. The list of errors that occurred is also available from the payload if something went wrong.

Import can also be run in parallel using the Parallel gem functionality.

Examples:

UsersIndex.import(parallel: true) # imports everything in parallel with automatic workers number
UsersIndex.import(parallel: 3) # using 3 workers
UsersIndex.import(parallel: {in_threads: 10}) # in 10 threads

Parameters:

collection (Array<Object>) —
an array of objects or anything else to import
options (Hash{Symbol => Object}) —
in addition to the import-specific options, accepts any option suitable for the bulk API call, such as refresh or timeout

Options Hash (**options):

suffix (String) —
an index name suffix, used for zero-downtime reset mostly, no suffix by default
bulk_size (Integer) —
bulk API chunk size in bytes; if passed, the body is split into chunks and the request is performed once per chunk, empty by default
batch_size (Integer) —
passed to the adapter import method, used to split imported objects in chunks, 1000 by default
direct_import (Boolean) —
skips object reloading in the ORM adapter, false by default
journal (true, false) —
enables imported objects journaling, false by default
update_fields (Array<Symbol, String>) —
list of fields for the partial import, empty by default
update_failover (true, false) —
enables full objects reimport in cases of partial update errors, true by default
parallel (true, Integer, Hash) —
enables parallel import processing with the Parallel gem, accepts the number of workers or any options accepted by the Parallel gem

Returns:

(true, false) —
false in case of errors

#import!(*collection, **options) ⇒ `Object`

(see #import)

The only difference from #import is that it raises an exception in case of any import errors.

Raises:

(Chewy::ImportFailed) —
in case of errors

# File 'lib/chewy/index/import.rb', line 86

def import!(*args)
  errors = intercept_import_using_strategy(*args)

  raise Chewy::ImportFailed.new(self, errors) if errors.present?

  true
end
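
The contrast between the two methods can be modeled without Chewy. In the sketch below, ToyIndex and ToyImportFailed are hypothetical stand-ins for an index class and Chewy::ImportFailed, and the errors hash passed to the constructor is an assumed shape used only for demonstration.

```ruby
# Hypothetical error class standing in for Chewy::ImportFailed.
class ToyImportFailed < StandardError
  def initialize(errors)
    super("Import failed: #{errors.inspect}")
  end
end

# Toy importer: +errors+ is whatever the underlying import reported,
# with an empty hash meaning success.
class ToyIndex
  def initialize(errors)
    @errors = errors
  end

  # Mirrors the documented return value of #import: false in case of errors.
  def import
    @errors.empty?
  end

  # Mirrors #import!: raises instead of returning false.
  def import!
    raise ToyImportFailed.new(@errors) unless @errors.empty?

    true
  end
end

healthy = ToyIndex.new({})
failing = ToyIndex.new({ index: { 'mapping error' => [1, 2] } })

puts healthy.import!
puts failing.import

begin
  failing.import!
rescue ToyImportFailed => e
  puts e.message
end
```

The bang variant suits code paths where a failed import should stop execution, while the plain variant leaves the decision to the caller.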
