Class: Dynamoid::Criteria::Chain
- Inherits: Object
- Includes: Enumerable
- Defined in: lib/dynamoid/criteria/chain.rb
Overview
The criteria chain is equivalent to an ActiveRecord relation (and realistically I should change the name from chain to relation). It is a chainable object that builds up a query and eventually executes it as a Query or Scan operation.
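The chaining idea can be illustrated with a minimal in-memory sketch. TinyChain and its limit method are hypothetical names used only for illustration; this is not how Dynamoid implements the chain, it only shows builder methods that return self and an each that defers evaluation until enumeration.

```ruby
# Hypothetical, in-memory stand-in for a criteria chain; not Dynamoid's code.
class TinyChain
  include Enumerable

  def initialize(rows)
    @rows = rows
    @conditions = {}
    @limit = nil
  end

  # Each builder method records state and returns self so calls can be chained.
  def where(conditions)
    @conditions.merge!(conditions)
    self
  end

  def limit(count)
    @limit = count
    self
  end

  # Nothing is evaluated until the chain is enumerated.
  def each(&block)
    matching = @rows.lazy.select do |row|
      @conditions.all? { |key, value| row[key] == value }
    end
    matching = matching.first(@limit) if @limit
    matching.each(&block)
  end
end

rows = [
  { id: 1, links_count: 2 },
  { id: 2, links_count: 5 },
  { id: 3, links_count: 2 }
]
ids = TinyChain.new(rows).where(links_count: 2).limit(1).map { |row| row[:id] }
ids # => [1]
```

Because Enumerable is included, methods such as map, select and to_a come for free once each is defined.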
Instance Method Summary
-
#all ⇒ Enumerator::Lazy
Returns all the records matching the criteria.
-
#batch(batch_size) ⇒ Dynamoid::Criteria::Chain
Set the batch size.
-
#consistent ⇒ Dynamoid::Criteria::Chain
Turns on strongly consistent reads.
-
#count ⇒ Integer
Returns the actual number of items in a table matching the criteria.
-
#delete_all ⇒ Object
(also: #destroy_all)
Deletes all the items matching the criteria.
-
#each(&block) ⇒ Object
Allows the results of a search to be used as an enumerable.
-
#find_by_pages(&block) ⇒ Enumerator::Lazy
Iterates over the pages returned by DynamoDB.
-
#first(*args) ⇒ Model|nil
Returns the first item matching the criteria.
-
#initialize(source) ⇒ Chain
constructor
Create a new criteria chain.
-
#last ⇒ Model|nil
Returns the last item matching the criteria.
-
#pluck(*args) ⇒ Array
Select only specified fields.
-
#project(*fields) ⇒ Dynamoid::Criteria::Chain
Select only specified fields.
-
#record_limit(limit) ⇒ Dynamoid::Criteria::Chain
Set the record limit.
-
#scan_index_forward(scan_index_forward) ⇒ Dynamoid::Criteria::Chain
Reverse the sort order.
-
#scan_limit(limit) ⇒ Dynamoid::Criteria::Chain
Set the scan limit.
-
#start(start) ⇒ Dynamoid::Criteria::Chain
Set the start item.
-
#where(conditions, placeholders = nil) ⇒ Dynamoid::Criteria::Chain
Returns a chain which is a result of filtering current chain with the specified conditions.
-
#with_index(index_name) ⇒ Dynamoid::Criteria::Chain
Force the index name to use for queries.
Constructor Details
#initialize(source) ⇒ Chain
Create a new criteria chain.
# File 'lib/dynamoid/criteria/chain.rb', line 27
def initialize(source)
  @where_conditions = WhereConditions.new
  @source = source
  @consistent_read = false
  @scan_index_forward = true

  # we should re-initialize keys detector every time we change @where_conditions
  @key_fields_detector = KeyFieldsDetector.new(@where_conditions, @source)
end
Instance Method Details
#all ⇒ Enumerator::Lazy
Returns all the records matching the criteria.
Since where and most of the other methods return a Chain,
the only way to get a result as a collection is to call the all
method. It returns an Enumerator which can be used directly or
transformed into an Array.
Post.all # => Enumerator
Post.where(links_count: 2).all # => Enumerator
Post.where(links_count: 2).all.to_a # => Array
When the result set is too large, DynamoDB divides it into separate pages. While an enumerator iterates over the resulting models, each page is loaded lazily. So even a very large result set can be loaded and processed with a considerably small memory footprint and throughput consumption.
# File 'lib/dynamoid/criteria/chain.rb', line 154
def all
  records
end
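The lazy page loading described above can be sketched in plain Ruby. FakePagedTable is a hypothetical stand-in for a paged data source (it is not part of Dynamoid); it records which pages were requested so the laziness is observable.

```ruby
# Hypothetical paged source that records which pages were requested.
class FakePagedTable
  attr_reader :loaded_pages

  def initialize(pages)
    @pages = pages
    @loaded_pages = []
  end

  # Returns a lazy enumerator of items; a page is fetched only when reached.
  def items
    @pages.each_index.lazy.flat_map do |index|
      @loaded_pages << index
      @pages[index]
    end
  end
end

table = FakePagedTable.new([[1, 2], [3, 4], [5, 6]])
first_three = table.items.first(3)
first_three        # => [1, 2, 3]
table.loaded_pages # => [0, 1], the third page was never requested
```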
#batch(batch_size) ⇒ Dynamoid::Criteria::Chain
Set the batch size.
The batch size is the number of items that will be lazily loaded at a time. When the batch size is set, items are loaded batch by batch in the specified size instead of relying on the default paging mechanism of DynamoDB.
Post.where(links_count: 2).batch(1000).all.each do |post|
# process a post
end
It's useful for limiting memory usage or throughput consumption.
# File 'lib/dynamoid/criteria/chain.rb', line 317
def batch(batch_size)
  @batch_size = batch_size
  self
end
#consistent ⇒ Dynamoid::Criteria::Chain
Turns on strongly consistent reads.
By default reads are eventually consistent.
Post.where('size.gt' => 1000).consistent
# File 'lib/dynamoid/criteria/chain.rb', line 130
def consistent
  @consistent_read = true
  self
end
#count ⇒ Integer
Returns the actual number of items in a table matching the criteria.
Post.where(links_count: 2).count
Internally it uses either the Scan or the Query operation of DynamoDB, so it
costs as much as reading all the matching items from a table.
The only difference is that items are read by DynamoDB but not actually loaded on the client side. DynamoDB returns only the count of items after filtering.
# File 'lib/dynamoid/criteria/chain.rb', line 170
def count
  if @key_fields_detector.key_present?
    count_via_query
  else
    count_via_scan
  end
end
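The choice between Query and Scan can be pictured with a simplified sketch. It relies on the DynamoDB rule that a Query needs an equality condition on the partition (hash) key. The operation_for helper is hypothetical and ignores secondary indexes, which the real key detection also takes into account.

```ruby
# Hypothetical helper: picks the DynamoDB operation for a set of conditions.
# A Query needs an equality condition on the partition (hash) key;
# anything else has to fall back to a Scan. Indexes are ignored here.
def operation_for(conditions, hash_key)
  conditions.key?(hash_key) ? :query : :scan
end

operation_for({ id: '1', 'views_count.gt' => 1000 }, :id) # => :query
operation_for({ links_count: 2 }, :id)                     # => :scan
```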
#delete_all ⇒ Object Also known as: destroy_all
Deletes all the items matching the criteria.
Post.where(links_count: 2).delete_all
If called without criteria then it deletes all the items in a table.
Post.delete_all
It loads all the items with either a Scan or a Query operation and
deletes them in batches with the BatchWriteItem operation. BatchWriteItem
is limited by request size and item count, so the
deletion may well require several BatchWriteItem calls.
# File 'lib/dynamoid/criteria/chain.rb', line 233
def delete_all
  ids = []
  ranges = []

  if @key_fields_detector.key_present?
    Dynamoid.adapter.query(source.table_name, query_key_conditions, query_non_key_conditions, ).flat_map { |i| i }.collect do |hash|
      ids << hash[source.hash_key.to_sym]
      ranges << hash[source.range_key.to_sym] if source.range_key
    end
  else
    Dynamoid.adapter.scan(source.table_name, scan_conditions, ).flat_map { |i| i }.collect do |hash|
      ids << hash[source.hash_key.to_sym]
      ranges << hash[source.range_key.to_sym] if source.range_key
    end
  end

  Dynamoid.adapter.delete(source.table_name, ids, range_key: ranges.presence)
end
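To see why one deletion may need several BatchWriteItem calls, here is a small sketch of the item count limit alone. DynamoDB accepts at most 25 put or delete requests per BatchWriteItem call. The delete_batches helper is hypothetical; it ignores the request size limit and is not how the adapter is implemented.

```ruby
# DynamoDB's BatchWriteItem accepts at most 25 put/delete requests per call.
BATCH_WRITE_LIMIT = 25

# Hypothetical helper: splits keys into groups small enough for one call each.
def delete_batches(keys)
  keys.each_slice(BATCH_WRITE_LIMIT).to_a
end

batches = delete_batches((1..60).to_a)
batches.map(&:size) # => [25, 25, 10], so three calls would be needed
```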
#each(&block) ⇒ Object
Allows the results of a search to be used as an enumerable.
Post.each do |post|
end
Post.all.each do |post|
end
Post.where(links_count: 2).each do |post|
end
It works similarly to the all method, so results are loaded lazily.
# File 'lib/dynamoid/criteria/chain.rb', line 416
def each(&block)
  records.each(&block)
end
#find_by_pages(&block) ⇒ Enumerator::Lazy
Iterates over the pages returned by DynamoDB.
DynamoDB has its own paging mechanism and divides a large result set
into separate pages. The find_by_pages method provides access to
these native DynamoDB pages.
The pages are loaded lazily.
Post.where('views_count.gt' => 1000).find_by_pages do |posts, options|
# process posts
end
It passes an Array of models and a Hash with options as block arguments.
The options Hash contains only one option, :last_evaluated_key. The last
evaluated key is a Hash with the key attributes of the last item processed by
DynamoDB. It can be used to resume querying using the start method.
posts, options = Post.where('views_count.gt' => 1000).find_by_pages.first
last_key = options[:last_evaluated_key]
# ...
Post.where('views_count.gt' => 1000).start(last_key).find_by_pages do |posts, options|
end
If it's called without a block then it returns an Enumerator.
enum = Post.where('views_count.gt' => 1000).find_by_pages
enum.each do |posts, options|
# process posts
end
# File 'lib/dynamoid/criteria/chain.rb', line 461
def find_by_pages(&block)
  pages.each(&block)
end
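Resuming from a last evaluated key can be sketched with an in-memory pager. FakePager is a hypothetical class, and for simplicity the key is the last item itself rather than a Hash of key attributes as in DynamoDB. Items are assumed to be unique.

```ruby
# Hypothetical in-memory pager that mimics pages with a last evaluated key.
class FakePager
  def initialize(items, page_size)
    @items = items
    @page_size = page_size
  end

  # Lazily yields [page_items, { last_evaluated_key: ... }] pairs,
  # starting right after start_key when one is given.
  def find_by_pages(start_key = nil)
    offset = start_key ? @items.index(start_key) + 1 : 0
    @items.drop(offset).each_slice(@page_size).lazy.map do |page|
      # The final page carries no key because nothing is left to read.
      last_key = page.last == @items.last ? nil : page.last
      [page, { last_evaluated_key: last_key }]
    end
  end
end

pager = FakePager.new(%w[a b c d e], 2)
page, options = pager.find_by_pages.first
page                         # => ["a", "b"]
options[:last_evaluated_key] # => "b"

# Resume right after the last evaluated key of the first page.
pager.find_by_pages(options[:last_evaluated_key]).map(&:first).to_a
# => [["c", "d"], ["e"]]
```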
#first(*args) ⇒ Model|nil
Returns the first item matching the criteria.
Post.where(links_count: 2).first
Applies record_limit(1) to ensure only a single record is fetched
when no non-key conditions are present and scan_limit(1) when no
conditions are present at all.
If used without criteria it just returns the first item in some arbitrary order.
Post.first
# File 'lib/dynamoid/criteria/chain.rb', line 192
def first(*args)
  n = args.first || 1

  return dup.scan_limit(n).to_a.first(*args) if @where_conditions.empty?
  return super if @key_fields_detector.non_key_present?

  dup.record_limit(n).to_a.first(*args)
end
#last ⇒ Model|nil
Returns the last item matching the criteria.
Post.where(links_count: 2).last
DynamoDB doesn't support ordering by an arbitrary attribute, only by a sort key. So this method is mostly useful during development and testing.
If used without criteria it just returns the last item in some arbitrary order.
Post.last
It isn't efficient in terms of performance because it reads and loads all the filtered items from DynamoDB.
# File 'lib/dynamoid/criteria/chain.rb', line 217
def last
  all.to_a.last
end
#pluck(*args) ⇒ Array
Select only specified fields.
It takes one or more field names and returns an array of either values or arrays of values.
Post.pluck(:id) # => ['1', '2']
Post.pluck(:id, :title) # => [['1', 'Title #1'], ['2', 'Title #2']]
Post.where('views_count.gt' => 1000).pluck(:title)
There are some differences between pluck and project. pluck
- doesn't instantiate models
- isn't chainable and returns an Array instead of a Chain
It deserializes values if a field type isn't supported by DynamoDB natively.
It can be used to avoid loading large field values and to decrease a memory footprint.
# File 'lib/dynamoid/criteria/chain.rb', line 503
def pluck(*args)
  fields = args.map(&:to_sym)

  # `project` has a side effect - it sets `@project` instance variable.
  # So use a duplicate to not pollute original chain.
  scope = dup
  scope.project(*fields)

  if fields.many?
    scope.items.map do |item|
      fields.map { |key| Undumping.undump_field(item[key], source.attributes[key]) }
    end.to_a
  else
    key = fields.first
    scope.items.map { |item| Undumping.undump_field(item[key], source.attributes[key]) }.to_a
  end
end
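The shape of the result (flat values for one field, arrays of values for several) can be reproduced with plain hashes. This is a simplified sketch: pluck_from and ITEMS are hypothetical, and the undumping of values is left out.

```ruby
# Hypothetical raw items, as plain hashes rather than models.
ITEMS = [
  { id: '1', title: 'Title #1' },
  { id: '2', title: 'Title #2' }
].freeze

# Returns flat values for one field and arrays of values for several fields.
def pluck_from(items, *fields)
  if fields.size > 1
    items.map { |item| fields.map { |field| item[field] } }
  else
    items.map { |item| item[fields.first] }
  end
end

pluck_from(ITEMS, :id)         # => ["1", "2"]
pluck_from(ITEMS, :id, :title) # => [["1", "Title #1"], ["2", "Title #2"]]
```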
#project(*fields) ⇒ Dynamoid::Criteria::Chain
Select only specified fields.
It takes one or more field names and returns a collection of models with only these fields set.
Post.where('views_count.gt' => 1000).project(:title)
Post.where('views_count.gt' => 1000).project(:title, :created_at)
Post.project(:id)
It can be used to avoid loading large field values and to decrease a memory footprint.
# File 'lib/dynamoid/criteria/chain.rb', line 478
def project(*fields)
  @project = fields.map(&:to_sym)
  self
end
#record_limit(limit) ⇒ Dynamoid::Criteria::Chain
Set the record limit.
The record limit is the limit of evaluated items returned by the
Query or Scan. In other words it's how many items should be
returned in response.
Post.where(links_count: 2).record_limit(1000) # => 1000 models
Post.record_limit(1000) # => 1000 models
It can be very inefficient in terms of HTTP requests in pathological cases. DynamoDB doesn't natively support limiting the number of items returned after filtering, so it may take many HTTP requests to find the items that match the criteria while skipping those that don't. This means that the cost (read capacity units) is unpredictable.
Because of these performance and cost issues it's mostly useful in development and testing.
When called without criteria it works like scan_limit.
# File 'lib/dynamoid/criteria/chain.rb', line 274
def record_limit(limit)
  @record_limit = limit
  self
end
#scan_index_forward(scan_index_forward) ⇒ Dynamoid::Criteria::Chain
Reverse the sort order.
By default the sort order is ascending (by the sort key value). Set a
false value to reverse the order.
Post.where(id: id, 'views_count.gt' => 1000).scan_index_forward(false)
It works only for queries with a partition key condition, e.g. id: 'some-id', which internally perform a Query operation.
# File 'lib/dynamoid/criteria/chain.rb', line 363
def scan_index_forward(scan_index_forward)
  @scan_index_forward = scan_index_forward
  self
end
#scan_limit(limit) ⇒ Dynamoid::Criteria::Chain
Set the scan limit.
The scan limit is the maximum number of records that DynamoDB will internally
read with Query or Scan. It's different from the record limit because,
with filtering, DynamoDB may look at N scanned items but return 0
items if none passes the filter. So it can return fewer items than
specified with the limit.
Post.where(links_count: 2).scan_limit(1000) # => 850 models
Post.scan_limit(1000) # => 1000 models
In contrast to record_limit, the cost (read capacity units) and
performance are predictable.
When called without criteria it works like record_limit.
# File 'lib/dynamoid/criteria/chain.rb', line 296
def scan_limit(limit)
  @scan_limit = limit
  self
end
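The difference between a scan limit and a record limit can be shown on an in-memory list. The helpers with_scan_limit and with_record_limit and the ROWS data are hypothetical; they only model "limit before filtering" versus "limit after filtering" and say nothing about how requests are issued.

```ruby
# Hypothetical rows standing in for items in a table.
ROWS = (1..10).map { |n| { id: n, links_count: n.even? ? 2 : 1 } }.freeze

# Scan limit: cap the number of items examined, then filter.
def with_scan_limit(rows, limit)
  rows.first(limit).select { |row| yield(row) }
end

# Record limit: keep examining items until enough of them pass the filter.
def with_record_limit(rows, limit)
  rows.lazy.select { |row| yield(row) }.first(limit)
end

scanned = with_scan_limit(ROWS, 4) { |row| row[:links_count] == 2 }
recorded = with_record_limit(ROWS, 4) { |row| row[:links_count] == 2 }

scanned.map { |row| row[:id] }  # => [2, 4]
recorded.map { |row| row[:id] } # => [2, 4, 6, 8]
```

With the scan limit only four items are ever examined, so the amount of work is fixed. With the record limit eight items had to be examined to find four matches, and that number depends on the data.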
#start(start) ⇒ Dynamoid::Criteria::Chain
Set the start item.
When the start item is set the items will be loaded starting right after the specified item.
Post.where(links_count: 2).start(post)
It can be used to implement a custom pagination mechanism.
Post.where(author_id: author_id).start(last_post).scan_limit(50)
The specified start item will not be returned back in a result set.
Actually it doesn't need all the item attributes to start - an item may have only the primary key attributes (partition and sort key if it's declared).
Post.where(links_count: 2).start(Post.new(id: id))
It also supports a Hash argument with the key attributes: a
partition key and a sort key (if it's declared).
Post.where(links_count: 2).start(id: id)
# File 'lib/dynamoid/criteria/chain.rb', line 347
def start(start)
  @start = start
  self
end
#where(conditions, placeholders = nil) ⇒ Dynamoid::Criteria::Chain
Returns a chain which is a result of filtering current chain with the specified conditions.
It accepts conditions in the form of a hash.
Post.where(links_count: 2)
A key can be either a string or a symbol.
To express conditions other than equality, predicates can be used.
A predicate is appended to an attribute name to form a key, e.g. 'created_at.gt' => Date.yesterday
The following predicates are currently supported:
- gt - greater than
- gte - greater than or equal
- lt - less than
- lte - less than or equal
- ne - not equal
- between - an attribute value is greater than or equal to the first value and less than or equal to the second value
- in - check an attribute in a list of values
- begins_with - check for a prefix in a string
- contains - check for a substring or a value in a set or array
- not_contains - check for the absence of a substring or a value in a set or array
- null - attribute doesn't exist in an item
- not_null - attribute exists in an item
All the predicates match operators supported by DynamoDB's ComparisonOperator.
Post.where('size.gt' => 1000)
Post.where('size.gte' => 1000)
Post.where('size.lt' => 35000)
Post.where('size.lte' => 35000)
Post.where('author.ne' => 'John Doe')
Post.where('created_at.between' => [Time.now - 3600, Time.now])
Post.where('category.in' => ['tech', 'fashion'])
Post.where('title.begins_with' => 'How long')
Post.where('tags.contains' => 'Ruby')
Post.where('tags.not_contains' => 'Ruby on Rails')
Post.where('legacy_attribute.null' => true)
Post.where('optional_attribute.not_null' => true)
There are some limitations for a sort key. Only the following predicates
are supported: gt, gte, lt, lte, between, begins_with.
where without an argument returns the current chain.
Multiple calls can be chained together and conditions will be merged:
Post.where('size.gt' => 1000).where('title' => 'some title')
It's equivalent to:
Post.where('size.gt' => 1000, 'title' => 'some title')
But only one condition can be specified for a given attribute. The last specified condition overrides all the others. Only the condition 'size.lt' => 200 will be used in the following examples:
Post.where('size.gt' => 100, 'size.lt' => 200)
Post.where('size.gt' => 100).where('size.lt' => 200)
Internally where performs either Scan or Query operation.
Conditions can be specified as an expression as well:
Post.where('links_count = :v', v: 2)
This way complex expressions can be constructed (e.g. with the AND, OR, and NOT keywords):
Address.where('city = :c AND (post_code = :pc1 OR post_code = :pc2)', c: 'A', pc1: '001', pc2: '002')
See the DynamoDB documentation for the syntax of condition expressions and examples.
# File 'lib/dynamoid/criteria/chain.rb', line 115
def where(conditions, placeholders = nil)
  if conditions.is_a?(Hash)
    where_with_hash(conditions)
  else
    where_with_string(conditions, placeholders)
  end
end
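How predicate keys and the "last condition wins" rule fit together can be sketched as follows. The merge_conditions helper and the :eq name for plain equality are assumptions made for illustration; this is not the library's internal representation.

```ruby
# Hypothetical parser: splits 'attribute.predicate' keys and keeps one
# condition per attribute, so a later condition replaces an earlier one.
def merge_conditions(*hashes)
  hashes.each_with_object({}) do |conditions, merged|
    conditions.each do |key, value|
      attribute, predicate = key.to_s.split('.', 2)
      merged[attribute.to_sym] = [(predicate || 'eq').to_sym, value]
    end
  end
end

merge_conditions({ 'size.gt' => 100 }, { 'size.lt' => 200, title: 'some title' })
# => { size: [:lt, 200], title: [:eq, "some title"] }
```

Keying the merged hash by attribute name is what makes 'size.lt' replace 'size.gt' in this sketch.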
#with_index(index_name) ⇒ Dynamoid::Criteria::Chain
Force the index name to use for queries.
By default the library selects the most appropriate index. Sometimes more than one index can fulfill your query's needs. When this occurs you may want to force an order by naming the index explicitly. This typically happens when you are searching by hash key without specifying a range key.
class Comment
include Dynamoid::Document
table key: :post_id
range_key :author_id
field :post_date, :datetime
global_secondary_index name: :time_sorted_comments, hash_key: :post_id, range_key: :post_date, projected_attributes: :all
end
Comment.where(post_id: id).with_index(:time_sorted_comments).scan_index_forward(false)
# File 'lib/dynamoid/criteria/chain.rb', line 391
def with_index(index_name)
  raise Dynamoid::Errors::InvalidIndex, "Unknown index #{index_name}" unless @source.find_index_by_name(index_name)

  @forced_index_name = index_name
  @key_fields_detector = KeyFieldsDetector.new(@where_conditions, @source, forced_index_name: index_name)
  self
end