Module: DatastaxRails::SearchMethods
- Included in:
- Relation
- Defined in:
- lib/datastax_rails/relation/search_methods.rb
Defined Under Namespace
Classes: WhereProxy
Instance Method Summary
- #allow_filtering ⇒ DatastaxRails::Relation
  By default, Cassandra will throw an error if you try to set a where condition on either a column with no index or on more than one column that isn’t part of the primary key.
- #compute_stats(*fields) ⇒ DatastaxRails::Relation
  Have SOLR compute stats for a given numeric field.
- #consistency(level) ⇒ DatastaxRails::Relation
  The default consistency level for DSR is QUORUM when searching by ID.
- #dont_escape ⇒ DatastaxRails::Relation
  Normally special characters (other than wild cards) are escaped before the search is submitted.
- #extending(*modules) ⇒ DatastaxRails::Relation
  Used to extend a scope with additional methods, either through a module or a block provided.
- #fulltext(query, opts = {}) ⇒ DatastaxRails::Relation
  Specifies a full text search string to be processed by SOLR.
- #greater_than(_value) ⇒ Object
- #group(attribute) ⇒ DatastaxRails::Relation
  Group results by a given attribute only returning the top results for each group.
- #highlight(*args) ⇒ Object
  Enables highlighting on specific fields when used with full text searching.
- #less_than(_value) ⇒ Object
- #limit(value) ⇒ DatastaxRails::Relation (also: #per_page)
  Limit a single page to value records.
- #order(attribute) ⇒ DatastaxRails::Relation
  Orders the result set by a particular attribute.
- #page(value) ⇒ DatastaxRails::Relation
  Sets the page number to retrieve.
- #paginate(options = {}) ⇒ DatastaxRails::Relation
  WillPaginate compatible method for paginating.
- #query_parser(parser) ⇒ DatastaxRails::Relation
  By default, DatastaxRails uses the LuceneQueryParser.
- #reverse_order ⇒ DatastaxRails::Relation
  Reverses the order of the results.
- #select(*fields) ⇒ Object
  Works in two unique ways.
- #slow_order(attribute) ⇒ DatastaxRails::Relation
  Orders the result set in memory after all matching records have been retrieved.
- #solr_format(attribute, value) ⇒ Object
  Formats a value for solr (assuming this is a solr query).
- #where(attribute) ⇒ DatastaxRails::Relation
  Specifies restrictions (scoping) on the result set.
- #where_not(attribute) ⇒ DatastaxRails::Relation, DatastaxRails::SearchMethods::WhereProxy
  Specifies restrictions (scoping) that should not match the result set.
- #with_cassandra ⇒ DatastaxRails::Relation
  By default, DatastaxRails will try to pick the right method of performing a search.
- #with_solr ⇒ DatastaxRails::Relation
  By default, DatastaxRails will try to pick the right method of performing a search.
Instance Method Details
#allow_filtering ⇒ DatastaxRails::Relation
By default, Cassandra will throw an error if you try to set a where condition on either a column with no index or on more than one column that isn’t part of the primary key. If you are confident that the number of records that need to be searched is low, then you can instruct it to ignore the warning. Generally you only want to do this when either the number of records in the table is very small or when one of the other where conditions that has an index will reduce the number of records to a small number.
Model.where(:name => 'johndoe', :active => true).allow_filtering
NOTE that this only applies when doing a search via a cassandra index.
# File 'lib/datastax_rails/relation/search_methods.rb', line 16
def allow_filtering
  clone.tap do |r|
    r.allow_filtering_value = true
  end
end
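The clone.tap idiom in this listing recurs throughout the module: each call returns a modified copy and leaves the receiver untouched, which is what makes chaining safe. A minimal sketch of the pattern in plain Ruby follows; SketchRelation is a hypothetical stand-in and not the real DatastaxRails::Relation.

```ruby
# Hypothetical stand-in for a relation; not the real DatastaxRails::Relation.
class SketchRelation
  attr_accessor :allow_filtering_value

  def initialize
    @allow_filtering_value = false
  end

  # Same shape as the listing above: copy the receiver, set the flag on the copy.
  def allow_filtering
    clone.tap do |r|
      r.allow_filtering_value = true
    end
  end
end

base = SketchRelation.new
filtered = base.allow_filtering

puts base.allow_filtering_value     # flag on the original object
puts filtered.allow_filtering_value # flag on the returned copy
```

Because the original object is never mutated, a base scope can be reused for several different chains.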
#compute_stats(*fields) ⇒ DatastaxRails::Relation
Have SOLR compute stats for a given numeric field. Stats computed include:
- min
- max
- sum
- sum of squares
- mean
- standard deviation
Model.compute_stats(:price)
Model.compute_stats(:price, :quantity)
NOTE: This is only compatible with solr queries. It will be ignored when a CQL query is made.
# File 'lib/datastax_rails/relation/search_methods.rb', line 315
def compute_stats(*fields)
  return self if fields.empty?

  clone.tap do |r|
    r.stats_values += Array.wrap(fields)
  end
end
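To make the listed statistics concrete, here is a rough sketch that computes the same kinds of values locally in plain Ruby. The helper name sketch_stats is hypothetical, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption; check the Solr documentation for the exact definition your version uses.

```ruby
# Hypothetical helper: computes the listed statistics for an array of numbers.
def sketch_stats(values)
  count = values.size
  sum = values.sum.to_f
  sum_of_squares = values.sum { |v| v.to_f**2 }
  mean = sum / count
  # Assumption: sample standard deviation (n - 1 in the denominator).
  variance = count > 1 ? (sum_of_squares - count * mean**2) / (count - 1) : 0.0
  {
    min: values.min,
    max: values.max,
    sum: sum,
    sum_of_squares: sum_of_squares,
    mean: mean,
    # Clamp at zero so rounding error can never produce a negative variance.
    stddev: Math.sqrt([variance, 0.0].max)
  }
end

stats = sketch_stats([2, 4, 4, 4, 5, 5, 7, 9])
puts stats.inspect
```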
#consistency(level) ⇒ DatastaxRails::Relation
The default consistency level for DSR is QUORUM when searching by ID. For all searches using SOLR, the default consistency is ONE. Use this to override it in either case.
Model.consistency(:local_quorum).find("12345")
Note that Solr searches don’t allow you to specify the consistency level. DSR sort of gets around this by taking the search results and then going to Cassandra to retrieve the objects by ID using the consistency you specified. However, it is possible that you might not get all of the records you are expecting if the SOLR node you were talking to hasn’t been updated yet with the results. In practice, this should not happen for records that were created over your connection, but it is possible for other connections to create records that you can’t see yet.
Valid consistency levels are:
- :any
- :one
- :quorum
- :local_quorum (if using Network Topology)
- :each_quorum (if using Network Topology)
- :all
# File 'lib/datastax_rails/relation/search_methods.rb', line 47
def consistency(level)
  level = level.to_s.upcase

  unless self.valid_consistency?(level)
    fail ArgumentError, "'#{level}' is not a valid Cassandra consistency level"
  end

  clone.tap do |r|
    r.consistency_value = level
  end
end
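The validation step can be sketched on its own. The snippet below assumes the accepted values are exactly the six levels listed above; the real valid_consistency? helper is not shown in this section, so SKETCH_LEVELS and sketch_normalize_consistency are hypothetical names.

```ruby
# Levels taken from the list above; the real valid_consistency? is not shown here.
SKETCH_LEVELS = %w[ANY ONE QUORUM LOCAL_QUORUM EACH_QUORUM ALL].freeze

# Upcases the symbol or string and rejects anything outside the known levels.
def sketch_normalize_consistency(level)
  level = level.to_s.upcase
  unless SKETCH_LEVELS.include?(level)
    raise ArgumentError, "'#{level}' is not a valid Cassandra consistency level"
  end
  level
end

puts sketch_normalize_consistency(:local_quorum)
```

Passing an unknown level such as :bogus raises ArgumentError before any query is built.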
#dont_escape ⇒ DatastaxRails::Relation
Normally special characters (other than wild cards) are escaped before the search is submitted. If you want to handle escaping yourself because you need to use those special characters, then just include this in your chain.
Model.dont_escape.where(:name => "(some stuff I don\'t want escaped)")
Note that fulltext searches are NEVER escaped. Use Relation.solr_escape if you want that done.
# File 'lib/datastax_rails/relation/search_methods.rb', line 68
def dont_escape
  clone.tap do |r|
    r.escape_value = false
  end
end
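For a sense of what escaping does when it is left enabled, here is a rough sketch of a Lucene-style escaper that leaves the wildcards * and ? alone. The character set and the helper name sketch_escape are assumptions for illustration; the library's own escaping routine is not shown in this section and may differ.

```ruby
# Hypothetical escaper: backslash-escapes common Lucene query syntax characters
# but leaves the wildcards * and ? untouched.
SKETCH_SPECIALS = /[+\-!(){}\[\]^"~:\\\/]/

def sketch_escape(str)
  str.gsub(SKETCH_SPECIALS) { |m| "\\#{m}" }
end

puts sketch_escape('(rails OR ruby)*')
puts sketch_escape('jo*n?')
```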
#extending(*modules) ⇒ DatastaxRails::Relation
Used to extend a scope with additional methods, either through a module or a block provided
The object returned is a relation which can be further extended
# File 'lib/datastax_rails/relation/search_methods.rb', line 81
def extending(*modules)
  modules << Module.new(&Proc.new) if block_given?

  return self if modules.empty?

  clone.tap do |r|
    r.send(:apply_modules, modules.flatten)
  end
end
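A minimal sketch of how extending behaves, in plain Ruby. SketchScope is a hypothetical stand-in; it calls Object#extend directly because apply_modules is not shown here. The sketch captures the block explicitly with &block, since Proc.new without a block raises an error on Ruby 3.0 and later.

```ruby
# Hypothetical stand-in; the real method delegates to apply_modules.
class SketchScope
  def extending(*modules, &block)
    # Wrap the block in an anonymous module so both forms are handled alike.
    modules << Module.new(&block) if block
    return self if modules.empty?

    clone.tap do |r|
      modules.flatten.each { |m| r.extend(m) }
    end
  end
end

module Recent
  def recent?
    true
  end
end

scope = SketchScope.new
with_module = scope.extending(Recent)
with_block = scope.extending do
  def shout
    'HELLO'
  end
end

puts with_module.recent?
puts with_block.shout
```

Only the returned copies gain the extra methods; the original scope is unchanged.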
#fulltext(query, opts = {}) ⇒ DatastaxRails::Relation
Specifies a full text search string to be processed by SOLR
Model.fulltext("john smith")
You can also pass in an options hash with the following options:
- :fields => list of fields to search instead of the default of all fields
Model.fulltext("john smith", fields: [:title])
# File 'lib/datastax_rails/relation/search_methods.rb', line 463
def fulltext(query, opts = {})
  return self if query.blank?

  opts[:query] = downcase_query(query)

  clone.tap do |r|
    r.fulltext_values << opts
  end
end
#greater_than(_value) ⇒ Object
# File 'lib/datastax_rails/relation/search_methods.rb', line 530
def greater_than(_value)
  fail(ArgumentError, '#greater_than can only be called after an appropriate where call. ' \
                      'e.g. where(:created_at).greater_than(1.day.ago)')
end
#group(attribute) ⇒ DatastaxRails::Relation
Group results by a given attribute only returning the top results for each group. In Lucene, this is often referred to as Field Collapsing.
This modifies the behavior of pagination. When using a group, per_page will specify the number of results returned *for each group*. In addition, page will move all groups forward by one page, possibly resulting in some groups showing up empty if they have fewer matching entries than others.
When grouping is being used, the sort values will be used to sort results within a given group. Any sorting of the groups themselves will need to be handled after the fact, as the groups are returned as a hash of Collection objects.
Because SOLR is doing the grouping work, we can only group on single-valued fields (i.e., not text or collections). In the future, SOLR may support grouping on multi-valued fields.
NOTE: Group names will be lower-cased
Model.group(:program_id)
The object the hash entries point to will be a DatastaxRails::Collection
# File 'lib/datastax_rails/relation/search_methods.rb', line 168
def group(attribute)
  return self if attribute.blank?

  clone.tap do |r|
    r.group_value = attribute
  end
end
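The effect of grouping on pagination is easier to see with a small simulation. The sketch below uses plain arrays in place of Collection objects; the data and the helper sketch_group_page are hypothetical and only illustrate the per-group paging described above.

```ruby
# Hypothetical data: matching records already bucketed by a single-valued field.
groups = {
  'alpha' => %w[a1 a2 a3 a4 a5],
  'beta'  => %w[b1 b2]
}

# per_page applies to every group; page advances all groups together.
def sketch_group_page(groups, page:, per_page:)
  offset = (page - 1) * per_page
  groups.transform_values { |records| records[offset, per_page] || [] }
end

page1 = sketch_group_page(groups, page: 1, per_page: 2)
page2 = sketch_group_page(groups, page: 2, per_page: 2)

puts page1.inspect
puts page2.inspect
```

On the second page the smaller group has run out of entries and comes back empty, while the larger group still returns results.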
#highlight(*args, opts) ⇒ DatastaxRails::Relation
#highlight(*args) ⇒ DatastaxRails::Relation
Enables highlighting on specific fields when used with full text searching. In order for highlighting to work, the highlighted field(s) must be :stored
Model.fulltext("ruby on rails").highlight(:tags, :body)
Model.fulltext("pizza").highlight(:description, snippets: 3, fragsize: 150)
In addition to the array of field names to highlight, you can pass in an options hash with the following options:
- :snippets => number of highlight snippets to return
- :fragsize => number of characters for each snippet length
- :pre_tag => text which appears before a highlighted term
- :post_tag => text which appears after a highlighted term
- :merge_contiguous => collapse contiguous fragments into a single fragment
- :use_fast_vector => enables the Solr FastVectorHighlighter
Note: When enabling :use_fast_vector, the highlighted fields must also have :term_vectors, :term_positions, and :term_offsets enabled. For more information about these options, refer to Solr’s wiki on HighlightingParameters.
# File 'lib/datastax_rails/relation/search_methods.rb', line 511
def highlight(*args)
  return self if args.blank?

  opts = args.last.is_a?(Hash) ? args.pop : {}

  clone.tap do |r|
    opts[:fields] = r.[:fields] || []
    opts[:fields] |= args # Union unique field names
    r..merge! opts
  end
end
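The argument handling can be sketched in isolation: a trailing hash is treated as options, and the field names are unioned with any fields from earlier calls. The helper sketch_highlight_options and its existing parameter are hypothetical; they stand in for whatever state the relation keeps between calls.

```ruby
# Hypothetical helper mirroring the argument handling in the listing above.
# `existing` stands in for the options accumulated by earlier highlight calls.
def sketch_highlight_options(existing, *args)
  opts = args.last.is_a?(Hash) ? args.pop : {}
  opts[:fields] = existing[:fields] || []
  opts[:fields] |= args # union keeps field names unique
  existing.merge(opts)
end

first = sketch_highlight_options({}, :tags, :body)
second = sketch_highlight_options(first, :body, :description, snippets: 3)

puts first.inspect
puts second.inspect
```

Repeating a field name in a later call does not duplicate it, and new options are merged alongside the accumulated field list.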
#less_than(_value) ⇒ Object
# File 'lib/datastax_rails/relation/search_methods.rb', line 524
def less_than(_value)
  fail(ArgumentError, '#less_than can only be called after an appropriate where call. ' \
                      'e.g. where(:created_at).less_than(1.day.ago)')
end
#limit(value) ⇒ DatastaxRails::Relation Also known as: per_page
Limit a single page to value records.
Model.limit(1)
Model.per_page(50)
Normally DatastaxRails searches are paginated at a really high number so as to effectively disable pagination. However, you can cause all requests to be paginated on a per-model basis by overriding the default_page_size class method in your model:
class Model < DatastaxRails::Base
  def self.default_page_size
    30
  end
end
# File 'lib/datastax_rails/relation/search_methods.rb', line 109
def limit(value)
  clone.tap do |r|
    r.per_page_value = value.to_i
  end
end
#order(attribute) ⇒ DatastaxRails::Relation
Orders the result set by a particular attribute. Note that text fields may not be used for ordering as they are tokenized. Valid candidates are fields of type string, integer, long, float, double, and time. In addition, the symbol :score can be used to sort on the relevance rating returned by Solr. The default direction is ascending but may be reversed by passing a hash where the field is the key and the value is :desc.
Model.order(:name)
Model.order(name: :desc)
WARNING: If this call is combined with #with_cassandra, you can only order on the cluster_by column. If this doesn’t mean anything to you, then you probably don’t want to use these together.
# File 'lib/datastax_rails/relation/search_methods.rb', line 193
def order(attribute)
  return self if attribute.blank?

  clone.tap do |r|
    r.order_values << (attribute.is_a?(Hash) ? attribute : { attribute.to_sym => :asc })
  end
end
#page(value) ⇒ DatastaxRails::Relation
Sets the page number to retrieve
Model.page(2)
# File 'lib/datastax_rails/relation/search_methods.rb', line 122
def page(value)
  clone.tap do |r|
    r.page_value = value.to_i
  end
end
#paginate(options = {}) ⇒ DatastaxRails::Relation
WillPaginate compatible method for paginating
Model.paginate(page: 2, per_page: 10)
# File 'lib/datastax_rails/relation/search_methods.rb', line 136
def paginate(options = {})
  options = options.reverse_merge(page: 1, per_page: 30)
  clone.tap do |r|
    r.page_value = options[:page]
    r.per_page_value = options[:per_page]
  end
end
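The defaults of page 1 and 30 records per page come from reverse_merge, which is an ActiveSupport extension. A small sketch of the same defaulting in plain Ruby, using Hash#merge with the defaults as the receiver; sketch_paginate is a hypothetical name.

```ruby
# reverse_merge comes from ActiveSupport; defaults.merge(options) is the
# plain-Ruby equivalent used in this sketch.
def sketch_paginate(options = {})
  defaults = { page: 1, per_page: 30 }
  defaults.merge(options)
end

puts sketch_paginate.inspect
puts sketch_paginate(page: 2).inspect
puts sketch_paginate(page: 2, per_page: 10).inspect
```

Any key the caller supplies wins over the default; missing keys fall back to page 1 and 30 per page.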
#query_parser(parser) ⇒ DatastaxRails::Relation
By default, DatastaxRails uses the LuceneQueryParser. disMax is also supported. eDisMax probably works as well.
*This only applies to fulltext queries*
Model.query_parser('disMax').fulltext("john smith")
# File 'lib/datastax_rails/relation/search_methods.rb', line 291
def query_parser(parser)
  return self if parser.blank?

  clone.tap do |r|
    r.query_parser_value = parser
  end
end
#reverse_order ⇒ DatastaxRails::Relation
Reverses the order of the results. The following are equivalent:
Model.order(:name).reverse_order
Model.order(name: :desc)
Model.order(:name).reverse_order.reverse_order
Model.order(name: :asc)
# File 'lib/datastax_rails/relation/search_methods.rb', line 276
def reverse_order
  clone.tap do |r|
    r.reverse_order_value = !r.reverse_order_value
  end
end
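The equivalences listed for this method amount to toggling a boolean on a copy of the relation, so reversing twice restores the original direction. A minimal sketch, with SketchOrdering as a hypothetical stand-in that applies the flag to an already ascending list:

```ruby
# Hypothetical stand-in that only tracks the reverse flag.
class SketchOrdering
  attr_accessor :reverse_order_value

  def initialize
    @reverse_order_value = false
  end

  # Toggle the flag on a copy; note the assignment (=), not a comparison (==).
  def reverse_order
    clone.tap do |r|
      r.reverse_order_value = !r.reverse_order_value
    end
  end

  # Applies the flag to a list that is already sorted ascending.
  def apply(sorted)
    reverse_order_value ? sorted.reverse : sorted
  end
end

base = SketchOrdering.new
once = base.reverse_order
twice = once.reverse_order

puts once.apply(%w[a b c]).inspect
puts twice.apply(%w[a b c]).inspect
```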
#select(*fields) ⇒ Object
Works in two unique ways.
First: takes a block so it can be used just like Array#select.
Model.scoped.select { |m| m.field == value }
This will build an array of objects from the database for the scope, converting them into an array and iterating through them using Array#select.
Second: Modifies the query so that only certain fields are retrieved:
>> Model.select(:field)
=> [#<Model field:value>]
Although in the above example it looks as though this method returns an array, it actually returns a relation object and can have other query methods appended to it, such as the other methods in DatastaxRails::SearchMethods.
This method will also take multiple parameters:
>> Model.select(:field, :other_field, :and_one_more)
=> [#<Model field: "value", other_field: "value", and_one_more: "value">]
Any attributes that do not have fields retrieved by a select will return nil when the getter method for that attribute is used:
>> Model.select(:field).first.other_field
=> nil
The exception to this rule is when an attribute is lazy-loaded (e.g., binary). In that case, it is never retrieved until you call the getter method.
# File 'lib/datastax_rails/relation/search_methods.rb', line 256
def select(*fields)
  if block_given?
    to_a.select { |*block_args| yield(*block_args) }
  else
    raise ArgumentError, 'Call this with at least one field' if fields.empty?
    clone.tap do |r|
      r.select_values += fields
    end
  end
end
#slow_order(attribute) ⇒ DatastaxRails::Relation
Orders the result set in memory after all matching records have been retrieved.
This means that limit is ignored until the end. ALL matching records WILL be retrieved and sorted before taking #limit records and returning them to the caller.
Why would you do this? If you are retrieving records from a cassandra index but don’t have the appropriate clustering order you can use this, but you should only do so if you are confident that the number of records returned will be low.
A warning will be printed to the log if this results in a very inefficient operation.
USE WITH CARE!!!!!!
# File 'lib/datastax_rails/relation/search_methods.rb', line 218
def slow_order(attribute)
  return self if attribute.blank?
  clone.tap do |r|
    r.slow_order_values << (attribute.is_a?(Hash) ? attribute : { attribute.to_sym => :asc })
  end
end
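The cost described above comes from the ordering of operations: every matching record is sorted first, and the limit is applied only afterwards. A rough sketch with plain hashes in place of model objects; the records and the helper sketch_slow_order are hypothetical.

```ruby
# Hypothetical records; in the real library these would come back from Cassandra.
records = [
  { name: 'carol', age: 41 },
  { name: 'alice', age: 29 },
  { name: 'bob',   age: 35 }
]

# Sort everything in memory first, then apply the limit.
def sketch_slow_order(records, attribute, limit)
  # Same normalization as the listing above: a bare attribute means ascending.
  spec = attribute.is_a?(Hash) ? attribute : { attribute.to_sym => :asc }
  field, direction = spec.first
  sorted = records.sort_by { |r| r[field] }
  sorted.reverse! if direction == :desc
  sorted.first(limit)
end

youngest_two = sketch_slow_order(records, :age, 2)
last_by_name = sketch_slow_order(records, { name: :desc }, 1)

puts youngest_two.inspect
puts last_by_name.inspect
```

The sort touches all three records even though at most two are returned, which is why this approach only suits small result sets.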
#solr_format(attribute, value) ⇒ Object
Formats a value for solr (assuming this is a solr query).
# File 'lib/datastax_rails/relation/search_methods.rb', line 536
def solr_format(attribute, value) # rubocop:disable Style/CyclomaticComplexity
  return value unless use_solr_value
  column = attribute.is_a?(DatastaxRails::Column) ? attribute : klass.column_for_attribute(attribute)
  # value = column.type_cast_for_solr(value)
  case
  when value.is_a?(Time) || value.is_a?(DateTime) || value.is_a?(Date)
    column.type_cast_for_solr(value)
  when value.is_a?(Array) || value.is_a?(Set)
    value = value.to_a.compact if column.primary
    value.map { |v| column.type_cast_for_solr(v, column.[:holds]).to_s.gsub(/ /, '\\ ') }.join(' OR ')
  when value.is_a?(Fixnum)
    value < 0 ? "\\#{value}" : value
  when value.is_a?(Range)
    "[#{solr_format(attribute, value.first)} TO #{solr_format(attribute, value.last)}]"
  when value.is_a?(String)
    solr_escape(downcase_query(value.gsub(/ /, '\\ ')))
  when value.is_a?(FalseClass), value.is_a?(TrueClass)
    value.to_s
  # when value.is_a?(::Cql::Uuid)
  #   value.to_s
  else
    value
  end
end
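A simplified sketch of the main formatting cases may help when reading the listing. It covers ranges, arrays, negative integers and strings only, and deliberately omits the column type casting, lower-casing and escaping that the real method performs; sketch_solr_format is a hypothetical name.

```ruby
# Simplified sketch: no column type casting, lower-casing or solr_escape.
def sketch_solr_format(value)
  case value
  when Range
    # Ranges become inclusive Solr range queries.
    "[#{sketch_solr_format(value.first)} TO #{sketch_solr_format(value.last)}]"
  when Array
    # Arrays become an OR of their formatted members.
    value.compact.map { |v| sketch_solr_format(v).to_s }.join(' OR ')
  when Integer
    # A leading minus sign would read as Solr's exclusion operator, so escape it.
    value < 0 ? "\\#{value}" : value
  when String
    # Escape spaces so a multi-word value stays a single term.
    value.gsub(' ') { '\\ ' }
  else
    value
  end
end

puts sketch_solr_format(18..65)
puts sketch_solr_format(['Bob Smith', 'Tom'])
puts sketch_solr_format(-5)
```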
#where(attribute) ⇒ DatastaxRails::Relation
Specifies restrictions (scoping) on the result set. Expects a hash in the form attribute: value for equality comparisons.
Model.where(group_id: '1234', active: true)
The value of the comparison does not need to be a scalar. For example:
Model.where(name: ["Bob", "Tom", "Sally"]) # Finds where name is any of the three names
Model.where(age: 18..65) # Finds where age is anywhere in the range
Inequality comparisons such as greater_than and less_than are specified via chaining:
Model.where(:created_at).greater_than(1.day.ago)
Model.where(:age).less_than(65)
There is an alternate form of specifying greater than/less than queries that can be done with a single call. This is useful for remote APIs and such.
Model.where(:created_at => {greater_than: 1.day.ago})
Model.where(:age => {less_than: 65})
NOTE: Due to the way SOLR handles range queries, all greater/less than queries are actually greater/less than or equal to queries. There is no way to perform a strictly greater/less than query.
# File 'lib/datastax_rails/relation/search_methods.rb', line 382
def where(attribute)
  return self if attribute.blank?
  if attribute.is_a?(Symbol) || attribute.is_a?(String)
    WhereProxy.new(self, attribute)
  else
    clone.tap do |r|
      attributes = attribute.dup
      attributes.each do |k, v|
        if v.is_a?(Hash)
          comp, value = v.first
          if (comp.to_s == 'greater_than')
            r.greater_than_values << { k => value }
          elsif (comp.to_s == 'less_than')
            r.less_than_values << { k => value }
          else
            r.where_values << { k => value }
          end
          attributes.delete(k)
        end
      end
      r.where_values << attributes unless attributes.empty?
    end
  end
end
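The hash form is split into three buckets: equality conditions, greater-than conditions and less-than conditions. The sketch below reproduces that partitioning on a plain hash; sketch_partition_where and the bucket names are hypothetical and simply mirror where_values, greater_than_values and less_than_values.

```ruby
# Hypothetical helper that sorts conditions into the three buckets used above.
def sketch_partition_where(attributes)
  buckets = { where: [], greater_than: [], less_than: [] }
  plain = {}
  attributes.each do |key, value|
    if value.is_a?(Hash)
      # Nested hash form, e.g. { age: { less_than: 65 } }
      comp, operand = value.first
      bucket = %w[greater_than less_than].include?(comp.to_s) ? comp.to_sym : :where
      buckets[bucket] << { key => operand }
    else
      plain[key] = value
    end
  end
  # All simple equality conditions travel together as one hash.
  buckets[:where] << plain unless plain.empty?
  buckets
end

result = sketch_partition_where(age: { less_than: 65 }, active: true)
puts result.inspect
```

Because string keys such as 'less_than' are compared via to_s, the single-call form works equally well with parameters arriving from a remote API.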
#where_not(attribute) ⇒ DatastaxRails::Relation, DatastaxRails::SearchMethods::WhereProxy
Specifies restrictions (scoping) that should not match the result set. Expects a hash in the form attribute: value.
Model.where_not(group_id: '1234', active: false)
Passing an array will search for records where none of the array entries are present
Model.where_not(group_id: ['1234', '5678'])
The above would find all models where group id is neither 1234 nor 5678.
# File 'lib/datastax_rails/relation/search_methods.rb', line 423
def where_not(attribute)
  return self if attribute.blank?

  if attribute.is_a?(Symbol)
    WhereProxy.new(self, attribute, true)
  else
    clone.tap do |r|
      attributes = attribute.dup
      attributes.each do |k, v|
        if v.is_a?(Hash)
          comp, value = v.first
          if (comp.to_s == 'greater_than')
            r.less_than_values << { k => value }
          elsif (comp.to_s == 'less_than')
            r.greater_than_values << { k => value }
          else
            r.where_not_values << { k => value }
          end
          attributes.delete(k)
        end
      end
      r.where_not_values << attributes unless attributes.empty?
    end
  end
end
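For inequality conditions, the listing negates a comparison by swapping it for its opposite: a negated greater_than is stored as a less_than and vice versa. A small sketch of that swap follows; SKETCH_INVERSE and sketch_negate are hypothetical names.

```ruby
# Hypothetical lookup table for the swap performed in the listing above.
SKETCH_INVERSE = { 'greater_than' => :less_than, 'less_than' => :greater_than }.freeze

# Turns { greater_than: 18 } into { less_than: 18 } and vice versa.
def sketch_negate(comparison)
  comp, operand = comparison.first
  { SKETCH_INVERSE.fetch(comp.to_s) => operand }
end

puts sketch_negate(greater_than: 18).inspect
puts sketch_negate(less_than: 65).inspect
```

Since the documentation for #where states that these comparisons are inclusive, it seems likely that the boundary value itself can match both a condition and its negation; treat that as an inference from the listing rather than documented behavior.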
#with_cassandra ⇒ DatastaxRails::Relation
By default, DatastaxRails will try to pick the right method of performing a search. You can use this method to force it to make the query via cassandra.
NOTE that this method assumes that you have all the proper secondary indexes in place before you attempt to use it. If not, you will get an error.
# File 'lib/datastax_rails/relation/search_methods.rb', line 346
def with_cassandra
  clone.tap do |r|
    r.use_solr_value = false
  end
end
#with_solr ⇒ DatastaxRails::Relation
By default, DatastaxRails will try to pick the right method of performing a search. You can use this method to force it to make the query via SOLR.
NOTE that the time between when a record is placed into Cassandra and when it becomes available in SOLR is not guaranteed to be insignificant. It’s very possible to insert a new record and not find it when immediately doing a SOLR search for it.
# File 'lib/datastax_rails/relation/search_methods.rb', line 332
def with_solr
  clone.tap do |r|
    r.use_solr_value = true
  end
end