Class: Dynamoid::Criteria::Chain

Inherits: Object

Includes: Enumerable

Defined in: lib/dynamoid/criteria/chain.rb

Overview

The criteria chain is equivalent to an ActiveRecord relation (and realistically I should change the name from chain to relation). It is a chainable object that builds up a query and eventually executes it as a Query or a Scan.

Constant Summary

ALLOWED_FIELD_OPERATORS =

Set.new(
  %w[
    eq ne gt lt gte lte between begins_with in contains not_contains null not_null
  ]
).freeze

Instance Attribute Summary

#consistent_read ⇒ Object readonly

Returns the value of attribute consistent_read.
#key_fields_detector ⇒ Object readonly

Returns the value of attribute key_fields_detector.
#source ⇒ Object readonly

Returns the value of attribute source.

Instance Method Summary

#all ⇒ Enumerator::Lazy

Returns all the records matching the criteria.
#batch(batch_size) ⇒ Dynamoid::Criteria::Chain

Set the batch size.
#consistent ⇒ Dynamoid::Criteria::Chain

Turns on strongly consistent reads.
#count ⇒ Integer

Returns the actual number of items in a table matching the criteria.
#delete_all ⇒ Object (also: #destroy_all)

Deletes all the items matching the criteria.
#each(&block) ⇒ Object

Allows the results of a search to be used as an enumerable.
#find_by_pages(&block) ⇒ Enumerator::Lazy

Iterates over the pages returned by DynamoDB.
#first(*args) ⇒ Model|nil

Returns the first item matching the criteria.
#initialize(source) ⇒ Chain constructor

Create a new criteria chain.
#last ⇒ Model|nil

Returns the last item matching the criteria.
#pluck(*args) ⇒ Array

Select only specified fields.
#project(*fields) ⇒ Dynamoid::Criteria::Chain

Select only specified fields.
#record_limit(limit) ⇒ Dynamoid::Criteria::Chain

Set the record limit.
#scan_index_forward(scan_index_forward) ⇒ Dynamoid::Criteria::Chain

Reverse the sort order.
#scan_limit(limit) ⇒ Dynamoid::Criteria::Chain

Set the scan limit.
#start(start) ⇒ Dynamoid::Criteria::Chain

Set the start item.
#where(args) ⇒ Dynamoid::Criteria::Chain

Returns a chain which is a result of filtering current chain with the specified conditions.
#with_index(index_name) ⇒ Dynamoid::Criteria::Chain

Force the index name to use for queries.

Constructor Details

#initialize(source) ⇒ `Chain`

Create a new criteria chain.

Parameters:

source (Class) —

the class upon which the ultimate query will be performed.

# File 'lib/dynamoid/criteria/chain.rb', line 25

def initialize(source)
  @where_conditions = WhereConditions.new
  @source = source
  @consistent_read = false
  @scan_index_forward = true

  # we should re-initialize keys detector every time we change @where_conditions
  @key_fields_detector = KeyFieldsDetector.new(@where_conditions, @source)
end

Instance Attribute Details

#consistent_read ⇒ `Object` (readonly)

Returns the value of attribute consistent_read.



# File 'lib/dynamoid/criteria/chain.rb', line 12

def consistent_read
  @consistent_read
end

#key_fields_detector ⇒ `Object` (readonly)

Returns the value of attribute key_fields_detector.



# File 'lib/dynamoid/criteria/chain.rb', line 12

def key_fields_detector
  @key_fields_detector
end

#source ⇒ `Object` (readonly)

Returns the value of attribute source.



# File 'lib/dynamoid/criteria/chain.rb', line 12

def source
  @source
end

Instance Method Details

#all ⇒ `Enumerator::Lazy`

Returns all the records matching the criteria.

Since where and most of the other methods return a Chain, the only way to get the result as a collection is to call the all method. It returns a lazy Enumerator, which can be used directly or converted into an Array.

Post.all                            # => Enumerator
Post.where(links_count: 2).all      # => Enumerator
Post.where(links_count: 2).all.to_a # => Array

When the result set is too large, DynamoDB divides it into separate pages. As the enumerator iterates over the resulting models, each page is loaded lazily, so even a very large result set can be processed with a comparatively small memory footprint and throughput consumption.
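The lazy page loading described above can be illustrated without DynamoDB. The following is a minimal sketch in plain Ruby, not Dynamoid's implementation: `pages` and `fetch_page` are hypothetical stand-ins for a paginated response and the request that loads one page.

```ruby
# Stand-in for a paginated DynamoDB response: three pages of two items each.
# `pages` and `fetch_page` are hypothetical; they are not part of Dynamoid.
pages = [[1, 2], [3, 4], [5, 6]]
fetched_pages = [] # records which pages were actually requested

fetch_page = lambda do |index|
  fetched_pages << index
  pages[index]
end

# A lazy enumerator that requests a page only when the consumer
# needs another item from it.
records = Enumerator.new do |yielder|
  pages.each_index do |index|
    fetch_page.call(index).each { |item| yielder << item }
  end
end.lazy

# Taking three items touches only the first two pages.
first_three = records.first(3)
```

In this sketch the third page is never requested, which is the property that keeps memory and throughput consumption low for large result sets.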

Returns:

(Enumerator::Lazy)

Since:

0.2.0



# File 'lib/dynamoid/criteria/chain.rb', line 145

def all
  records
end

#batch(batch_size) ⇒ `Dynamoid::Criteria::Chain`

Set the batch size.

The batch size is the number of items in each lazily loaded batch. When the batch size is set, items are loaded batch by batch in the specified size instead of relying on the default paging mechanism of DynamoDB.

Post.where(links_count: 2).batch(1000).all.each do |post|
  # process a post
end

This is useful for limiting memory usage or throughput consumption.

Returns:

(Dynamoid::Criteria::Chain)

# File 'lib/dynamoid/criteria/chain.rb', line 306

def batch(batch_size)
  @batch_size = batch_size
  self
end

#consistent ⇒ `Dynamoid::Criteria::Chain`

Turns on strongly consistent reads.

By default reads are eventually consistent.

Post.where('size.gt' => 1000).consistent

Returns:

(Dynamoid::Criteria::Chain)

# File 'lib/dynamoid/criteria/chain.rb', line 121

def consistent
  @consistent_read = true
  self
end

#count ⇒ `Integer`

Returns the actual number of items in a table matching the criteria.

Post.where(links_count: 2).count

Internally it uses either DynamoDB's `Scan` or `Query` operation, so it costs as much as reading all the matching items from the table.

The only difference is that the items are read by DynamoDB but not actually loaded on the client side. DynamoDB returns only the count of items after filtering.

Returns:

(Integer)

# File 'lib/dynamoid/criteria/chain.rb', line 161

def count
  if @key_fields_detector.key_present?
    count_via_query
  else
    count_via_scan
  end
end

#delete_all ⇒ `Object` Also known as: destroy_all

Deletes all the items matching the criteria.

Post.where(links_count: 2).delete_all

If called without criteria then it deletes all the items in a table.

Post.delete_all

It loads all the items with either a Scan or a Query operation and deletes them in batches with the BatchWriteItem operation. BatchWriteItem is limited by request size and item count, so the deletion may well require several BatchWriteItem calls.
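To make the item-count limit concrete: DynamoDB accepts at most 25 put or delete requests in a single BatchWriteItem call. The sketch below only shows how a set of keys would split into request-sized groups; the `keys` array is hypothetical, and how the Dynamoid adapter actually groups its requests is not shown in this page.

```ruby
# BatchWriteItem accepts at most 25 put/delete requests per call.
batch_write_limit = 25

# Hypothetical primary keys collected by a Scan or Query.
keys = (1..60).map { |n| { id: n.to_s } }

# Split the keys into request-sized groups; each group would need
# its own BatchWriteItem call.
batches = keys.each_slice(batch_write_limit).to_a
batch_sizes = batches.map(&:size)
```

With 60 matching items this gives three groups, so at least three BatchWriteItem calls would be needed.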

# File 'lib/dynamoid/criteria/chain.rb', line 224

def delete_all
  ids = []
  ranges = []

  if @key_fields_detector.key_present?
    Dynamoid.adapter.query(source.table_name, query_key_conditions, query_non_key_conditions, query_options).flat_map { |i| i }.collect do |hash|
      ids << hash[source.hash_key.to_sym]
      ranges << hash[source.range_key.to_sym] if source.range_key
    end
  else
    Dynamoid.adapter.scan(source.table_name, scan_conditions, scan_options).flat_map { |i| i }.collect do |hash|
      ids << hash[source.hash_key.to_sym]
      ranges << hash[source.range_key.to_sym] if source.range_key
    end
  end

  Dynamoid.adapter.delete(source.table_name, ids, range_key: ranges.presence)
end

#each(&block) ⇒ `Object`

Allows the results of a search to be used as an enumerable.

Post.each do |post|
end

Post.all.each do |post|
end

Post.where(links_count: 2).each do |post|
end

It works similarly to the all method, so results are loaded lazily.

Since:

0.2.0



# File 'lib/dynamoid/criteria/chain.rb', line 402

def each(&block)
  records.each(&block)
end

#find_by_pages(&block) ⇒ `Enumerator::Lazy`

Iterates over the pages returned by DynamoDB.

DynamoDB has its own paging mechanism and divides a large result set into separate pages. The find_by_pages method provides access to these native DynamoDB pages.

The pages are loaded lazily.

Post.where('views_count.gt' => 1000).find_by_pages do |posts, options|
  # process posts
end

The block receives an Array of models and a Hash of options.

The options Hash contains a single option, :last_evaluated_key. The last evaluated key is a Hash with the key attributes of the last item processed by DynamoDB. It can be used to resume querying with the start method.

posts, options = Post.where('views_count.gt' => 1000).find_by_pages.first
last_key = options[:last_evaluated_key]

# ...

Post.where('views_count.gt' => 1000).start(last_key).find_by_pages do |posts, options|
end
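The resume pattern can be simulated in plain Ruby to show what the last evaluated key does. This is a sketch under assumptions, not Dynamoid code: `items`, `page_size`, and the `find_by_pages` lambda are hypothetical stand-ins for a table, DynamoDB's page size, and a paged read.

```ruby
# Hypothetical table contents, ordered by key.
items = (1..7).map { |n| { id: n } }
page_size = 3

# Simulates a paged read: returns [page, options] pairs, starting right
# after `start_key` (or from the beginning when it is nil).
find_by_pages = lambda do |start_key = nil|
  offset = 0
  offset = items.index { |item| item[:id] == start_key[:id] } + 1 if start_key

  items.drop(offset).each_slice(page_size).map do |page|
    [page, { last_evaluated_key: { id: page.last[:id] } }]
  end
end

# Read the first page and remember where it stopped.
first_page, options = find_by_pages.call.first
last_key = options[:last_evaluated_key]

# Resume after the remembered key; the item with that key is not repeated.
remaining = find_by_pages.call(last_key).flat_map { |page, _options| page }
```

The key holds only key attributes, so it is small enough to be passed between requests, for example as a pagination cursor.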

If it’s called without a block then it returns an Enumerator.

enum = Post.where('views_count.gt' => 1000).find_by_pages

enum.each do |posts, options|
  # process posts
end

Returns:

(Enumerator::Lazy)



# File 'lib/dynamoid/criteria/chain.rb', line 441

def find_by_pages(&block)
  pages.each(&block)
end

#first(*args) ⇒ `Model|nil`

Returns the first item matching the criteria.

Post.where(links_count: 2).first

Applies `record_limit(1)` to ensure only a single record is fetched when no non-key conditions are present, and `scan_limit(1)` when no conditions are present at all.

If used without criteria it just returns the first item in some arbitrary order.

Post.first

Returns:

(Model|nil)

# File 'lib/dynamoid/criteria/chain.rb', line 183

def first(*args)
  n = args.first || 1

  return dup.scan_limit(n).to_a.first(*args) if @where_conditions.empty?
  return super if @key_fields_detector.non_key_present?

  dup.record_limit(n).to_a.first(*args)
end

#last ⇒ `Model|nil`

Returns the last item matching the criteria.

Post.where(links_count: 2).last

DynamoDB supports ordering only by a sort key, not by an arbitrary attribute. So this method is mostly useful during development and testing.

If used without criteria it just returns the last item in some arbitrary order.

Post.last

It is inefficient because it reads and loads all the filtered items from DynamoDB.

Returns:

(Model|nil)



# File 'lib/dynamoid/criteria/chain.rb', line 208

def last
  all.to_a.last
end

#pluck(*args) ⇒ `Array`

Select only specified fields.

It takes one or more field names and returns an array of either values or arrays of values.

Post.pluck(:id)                   # => ['1', '2']
Post.pluck(:id, :title)           # => [['1', 'Title #1'], ['2', 'Title #2']]

Post.where('views_count.gt' => 1000).pluck(:title)

There are some differences between pluck and project. The pluck method:

doesn’t instantiate models
isn’t chainable and returns an Array instead of a Chain

It deserializes values if a field type isn’t supported by DynamoDB natively.

It can be used to avoid loading large field values and to reduce the memory footprint.
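The two return shapes (a flat array for one field, an array of arrays for several) can be sketched in plain Ruby. The `items` array and the `pluck` lambda below are hypothetical illustrations that skip projection and deserialization; they are not the library's method.

```ruby
# Hypothetical raw items as they might come back from DynamoDB.
items = [
  { id: '1', title: 'Title #1', body: 'long text' },
  { id: '2', title: 'Title #2', body: 'long text' }
]

# One field yields a flat array; several fields yield an array of arrays.
pluck = lambda do |*fields|
  if fields.size > 1
    items.map { |item| fields.map { |field| item[field] } }
  else
    items.map { |item| item[fields.first] }
  end
end

ids = pluck.call(:id)
pairs = pluck.call(:id, :title)
```

No model objects are built in either case, which is why the result is a plain Array and cannot be chained further.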

Returns:

(Array)

# File 'lib/dynamoid/criteria/chain.rb', line 483

def pluck(*args)
  fields = args.map(&:to_sym)

  # `project` has a side effect - it sets `@project` instance variable.
  # So use a duplicate to not pollute original chain.
  scope = dup
  scope.project(*fields)

  if fields.many?
    scope.items.map do |item|
      fields.map { |key| Undumping.undump_field(item[key], source.attributes[key]) }
    end.to_a
  else
    key = fields.first
    scope.items.map { |item| Undumping.undump_field(item[key], source.attributes[key]) }.to_a
  end
end

#project(*fields) ⇒ `Dynamoid::Criteria::Chain`

Select only specified fields.

It takes one or more field names and returns a collection of models with only these fields set.

Post.where('views_count.gt' => 1000).project(:title)
Post.where('views_count.gt' => 1000).project(:title, :created_at)
Post.project(:id)

It can be used to avoid loading large field values and to reduce the memory footprint.

Returns:

(Dynamoid::Criteria::Chain)

# File 'lib/dynamoid/criteria/chain.rb', line 458

def project(*fields)
  @project = fields.map(&:to_sym)
  self
end

#record_limit(limit) ⇒ `Dynamoid::Criteria::Chain`

Set the record limit.

The record limit is the limit of evaluated items returned by the Query or Scan. In other words it’s how many items should be returned in response.

Post.where(links_count: 2).record_limit(1000) # => 1000 models
Post.record_limit(1000)                       # => 1000 models

It can be very inefficient in terms of HTTP requests in pathological cases. DynamoDB doesn’t natively support limiting the number of items after filtering, so many HTTP requests may be needed to find items that match the criteria while skipping those that don't. This means the cost (read capacity units) is unpredictable.

Because of these performance and cost issues it’s mostly useful in development and testing.

When called without criteria it works like scan_limit.

Returns:

(Dynamoid::Criteria::Chain)

# File 'lib/dynamoid/criteria/chain.rb', line 265

def record_limit(limit)
  @record_limit = limit
  self
end

#scan_index_forward(scan_index_forward) ⇒ `Dynamoid::Criteria::Chain`

Reverse the sort order.

By default the sort order is ascending (by the sort key value). Pass false to reverse the order.

Post.where(id: id, 'views_count.gt' => 1000).scan_index_forward(false)

It works only for queries with a partition key condition, e.g. id: 'some-id', which internally perform a Query operation.

Returns:

(Dynamoid::Criteria::Chain)

# File 'lib/dynamoid/criteria/chain.rb', line 352

def scan_index_forward(scan_index_forward)
  @scan_index_forward = scan_index_forward
  self
end

#scan_limit(limit) ⇒ `Dynamoid::Criteria::Chain`

Set the scan limit.

The scan limit is the maximum number of records that DynamoDB will internally read with a Query or Scan. It differs from the record limit because, with filtering, DynamoDB may look at N scanned items but return 0 items if none passes the filter. So it can return fewer items than the specified limit.

Post.where(links_count: 2).scan_limit(1000)   # => 850 models
Post.scan_limit(1000)                         # => 1000 models

In contrast with record_limit, the cost (read capacity units) and performance are predictable.

When called without criteria it works like record_limit.
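The difference between the two limits can be simulated over an in-memory array. This is only a sketch of the semantics described above, with a hypothetical `table` and filter; it does not call DynamoDB or Dynamoid.

```ruby
# Hypothetical table of 10 items; every third one matches the filter.
table = (1..10).map { |n| { id: n, links_count: (n % 3).zero? ? 2 : 0 } }
matches = ->(item) { item[:links_count] == 2 }

# scan_limit semantics: cap the number of items read, then filter.
# The result may hold fewer items than the limit, or none at all.
scan_limited = table.first(4).select(&matches)

# record_limit semantics: keep reading until enough matching items
# are found, however many items that takes.
read_count = 0
record_limited = []
table.each do |item|
  break if record_limited.size == 2

  read_count += 1
  record_limited << item if matches.call(item)
end
```

Here a scan limit of 4 returns a single item, while a record limit of 2 has to read 6 items to return 2. The read count under a record limit depends on how selective the filter is, which is why its cost is hard to predict.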

Returns:

(Dynamoid::Criteria::Chain)

# File 'lib/dynamoid/criteria/chain.rb', line 287

def scan_limit(limit)
  @scan_limit = limit
  self
end

#start(start) ⇒ `Dynamoid::Criteria::Chain`

Set the start item.

When the start item is set the items will be loaded starting right after the specified item.

Post.where(links_count: 2).start(post)

It can be used to implement a custom pagination mechanism.

Post.where(author_id: author_id).start(last_post).scan_limit(50)

The specified start item is not included in the result set.

The start item doesn’t need all of its attributes: it may have only the primary key attributes (the partition key, and the sort key if one is declared).

Post.where(links_count: 2).start(Post.new(id: id))

It also supports a Hash argument with the key attributes: a partition key and a sort key (if one is declared).

Post.where(links_count: 2).start(id: id)

Returns:

(Dynamoid::Criteria::Chain)

# File 'lib/dynamoid/criteria/chain.rb', line 336

def start(start)
  @start = start
  self
end

#where(args) ⇒ `Dynamoid::Criteria::Chain`

Returns a chain that is the result of filtering the current chain with the specified conditions.

It accepts conditions in the form of a hash.

Post.where(links_count: 2)

A key can be either a string or a symbol.

To express conditions other than equality, predicates can be used. A predicate is appended to an attribute name to form a key, e.g. 'created_at.gt' => Date.yesterday

The following predicates are currently supported:

gt - greater than
gte - greater than or equal
lt - less than
lte - less than or equal
ne - not equal
between - an attribute value is greater than or equal to the first value and less than or equal to the second value
in - check an attribute in a list of values
begins_with - check for a prefix in string
contains - check substring or value in a set or array
not_contains - check for absence of substring or a value in set or array
null - attribute doesn’t exist in an item
not_null - attribute exists in an item

All the predicates match operators supported by DynamoDB’s ComparisonOperator

Post.where('size.gt' => 1000)
Post.where('size.gte' => 1000)
Post.where('size.lt' => 35000)
Post.where('size.lte' => 35000)
Post.where('author.ne' => 'John Doe')
Post.where('created_at.between' => [Time.now - 3600, Time.now])
Post.where('category.in' => ['tech', 'fashion'])
Post.where('title.begins_with' => 'How long')
Post.where('tags.contains' => 'Ruby')
Post.where('tags.not_contains' => 'Ruby on Rails')
Post.where('legacy_attribute.null' => true)
Post.where('optional_attribute.not_null' => true)
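The key format used in these examples, an attribute name followed by a dot and a predicate, can be taken apart with a few lines of Ruby. The sketch below is an illustration, not Dynamoid's parser: `parse_condition_key` is a hypothetical helper, and raising on an unknown predicate is a choice made for this sketch only. The operator list is copied from ALLOWED_FIELD_OPERATORS.

```ruby
require 'set'

# Operator list copied from ALLOWED_FIELD_OPERATORS above.
allowed_operators = Set.new(
  %w[eq ne gt lt gte lte between begins_with in contains not_contains null not_null]
)

# Split a condition key such as 'size.gt' into [attribute, operator].
# A key without a predicate is treated as an equality check.
parse_condition_key = lambda do |key|
  attribute, operator = key.to_s.split('.', 2)
  operator ||= 'eq'
  raise ArgumentError, "Unsupported operator: #{operator}" unless allowed_operators.include?(operator)

  [attribute.to_sym, operator.to_sym]
end

parsed_gt = parse_condition_key.call('size.gt')
parsed_eq = parse_condition_key.call(:links_count)
```

Both string and symbol keys go through `to_s` first, which mirrors the statement that a key can be either.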

There are some limitations for a sort key. Only the following predicates are supported: gt, gte, lt, lte, between, begins_with.

where without argument will return the current chain.

Multiple calls can be chained together and conditions will be merged:

Post.where('size.gt' => 1000).where('title' => 'some title')

It’s equivalent to:

Post.where('size.gt' => 1000, 'title' => 'some title')

But only one condition can be specified for a given attribute. The last specified condition overrides all the others. Only the condition 'size.lt' => 200 will be used in the following examples:

Post.where('size.gt' => 100, 'size.lt' => 200)
Post.where('size.gt' => 100).where('size.lt' => 200)
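One way to picture this override rule is a condition store keyed by attribute name, where a later entry replaces an earlier one. The sketch below models only the documented behavior; `merge_conditions` is a hypothetical helper and says nothing about how the library stores conditions internally.

```ruby
# Hypothetical condition store keyed by attribute name, so a later
# condition on the same attribute replaces the earlier one.
merge_conditions = lambda do |*condition_hashes|
  condition_hashes.each_with_object({}) do |conditions, merged|
    conditions.each do |key, value|
      attribute, operator = key.to_s.split('.', 2)
      merged[attribute.to_sym] = [(operator || 'eq').to_sym, value]
    end
  end
end

merged = merge_conditions.call(
  { 'size.gt' => 100 },
  { 'size.lt' => 200, 'title' => 'some title' }
)
```

After merging, only the `lt` condition remains for `size`, while the condition on `title` is kept because it targets a different attribute.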

Internally where performs either Scan or Query operation.

Returns:

(Dynamoid::Criteria::Chain)

Since:

0.2.0

# File 'lib/dynamoid/criteria/chain.rb', line 100

def where(args)
  detector = NonexistentFieldsDetector.new(args, @source)
  if detector.found?
    Dynamoid.logger.warn(detector.warning_message)
  end

  @where_conditions.update(args.symbolize_keys)

  # we should re-initialize keys detector every time we change @where_conditions
  @key_fields_detector = KeyFieldsDetector.new(@where_conditions, @source, forced_index_name: @forced_index_name)

  self
end

#with_index(index_name) ⇒ `Dynamoid::Criteria::Chain`

Force the index name to use for queries.

By default the library selects the most appropriate index. Sometimes more than one index can satisfy a query, in which case you may want to force a particular one. This happens when you search by hash key without specifying a range key.

class Comment
  include Dynamoid::Document

  table key: :post_id
  range_key :author_id

  field :post_date, :datetime

  global_secondary_index name: :time_sorted_comments, hash_key: :post_id, range_key: :post_date, projected_attributes: :all
end

Comment.where(post_id: id).with_index(:time_sorted_comments).scan_index_forward(false)

Returns:

(Dynamoid::Criteria::Chain)

Raises:

(Dynamoid::Errors::InvalidIndex)

# File 'lib/dynamoid/criteria/chain.rb', line 379

def with_index(index_name)
  raise Dynamoid::Errors::InvalidIndex, "Unknown index #{index_name}" unless @source.find_index_by_name(index_name)

  @forced_index_name = index_name
  @key_fields_detector = KeyFieldsDetector.new(@where_conditions, @source, forced_index_name: index_name)
  self
end
