Module: ElasticsearchRecord::Relation::CalculationMethods
- Defined in:
- lib/elasticsearch_record/relation/calculation_methods.rb
Instance Method Summary
-
#average(column_name) ⇒ Float?
Calculates the average value on a given column.
-
#boxplot(column_name) ⇒ Hash?
A boxplot metrics aggregation that computes a box plot of numeric values extracted from the aggregated documents.
-
#calculate_aggregation(metric, *columns, opts: {}, node: nil) ⇒ Object
(also: #calculate)
Creates an aggregation with the provided metric (e.g. :sum) and columns.
-
#cardinality(column_name) ⇒ Integer?
Calculates the cardinality on a given column.
-
#count(column_name = nil) ⇒ Object
Count the records.
-
#matrix_stats(*column_names) ⇒ Hash?
A numeric aggregation that computes statistics (count, mean, variance, skewness, kurtosis, covariance and correlation) over a set of document fields.
-
#maximum(column_name) ⇒ Float?
Calculates the maximum value on a given column.
-
#median_absolute_deviation(column_name) ⇒ Float?
This single-value aggregation approximates the median absolute deviation of its search results.
-
#minimum(column_name) ⇒ Float?
Calculates the minimum value on a given column.
-
#percentile_ranks(column_name, values) ⇒ Hash?
A multi-value metrics aggregation that calculates one or more percentile ranks over numeric values extracted from the aggregated documents.
-
#percentiles(column_name) ⇒ Hash?
A multi-value metrics aggregation that calculates one or more percentiles over numeric values extracted from the aggregated documents.
-
#stats(column_name) ⇒ Hash?
A multi-value metrics aggregation that computes stats over numeric values extracted from the aggregated documents.
-
#string_stats(column_name) ⇒ Hash?
A multi-value metrics aggregation that computes statistics over string values extracted from the aggregated documents.
-
#sum(column_name) ⇒ Float?
Calculates the sum of values on a given column.
Instance Method Details
#average(column_name) ⇒ Float?
returns nil on a NullRelation
Calculates the average value on a given column. Returns nil if there are no rows. See #calculate for examples with options.
Person.all.average(:age) # => 35.8
# File 'lib/elasticsearch_record/relation/calculation_methods.rb', line 217
def average(column_name)
  calculate_aggregation(:avg, column_name, node: :value)
end
#boxplot(column_name) ⇒ Hash?
returns nil on a NullRelation
A boxplot metrics aggregation that computes a box plot of numeric values extracted from the aggregated documents. These values can be generated from specific numeric or histogram fields in the documents.
The boxplot aggregation returns essential information for making a box plot: minimum, maximum, median, first quartile (25th percentile) and third quartile (75th percentile) values.
Person.all.boxplot(:age)
{ "min": 0.0, "max": 990.0, "q1": 167.5, "q2": 445.0, "q3": 722.5, "lower": 0.0, "upper": 990.0 }
# File 'lib/elasticsearch_record/relation/calculation_methods.rb', line 71
def boxplot(column_name)
  calculate_aggregation(:boxplot, column_name)
end
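To make the returned keys more concrete, here is a minimal sketch that computes the five core box plot values exactly for a small in-memory array. It is not how Elasticsearch computes them (the server-side aggregation works on the indexed documents and may approximate). The helpers `percentile` and `exact_boxplot` are hypothetical names, the quartiles use linear interpolation between ranks as an assumption, the whisker values `lower` and `upper` are left out, and the input is assumed to be non-empty.

```ruby
# Hypothetical helper: exact percentile of an already sorted array,
# using linear interpolation between neighbouring ranks (an assumption).
def percentile(sorted, fraction)
  position = fraction * (sorted.size - 1)
  lower = sorted[position.floor]
  upper = sorted[position.ceil]
  lower + (upper - lower) * (position - position.floor)
end

# Hypothetical helper: the five core values needed to draw a box plot.
def exact_boxplot(values)
  sorted = values.sort
  {
    min: sorted.first.to_f,
    max: sorted.last.to_f,
    q1: percentile(sorted, 0.25).to_f, # first quartile (25th percentile)
    q2: percentile(sorted, 0.5).to_f,  # median
    q3: percentile(sorted, 0.75).to_f  # third quartile (75th percentile)
  }
end

box = exact_boxplot([5, 1, 4, 2, 3])
```

For the input above, `box[:q2]` is the median of the five values, and `box[:q1]` and `box[:q3]` bracket the middle half of the data.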
#calculate_aggregation(metric, *columns, opts: {}, node: nil) ⇒ Object Also known as: calculate
returns nil on a NullRelation
Creates an aggregation with the provided metric (e.g. :sum) and columns. Returns the requested node (e.g. :value) from the aggregation result, or the whole result for that metric when no node is given.
# File 'lib/elasticsearch_record/relation/calculation_methods.rb', line 298
def calculate_aggregation(metric, *columns, opts: {}, node: nil)
  # prevent execution on a *NullRelation*
  return if null_relation?

  metric_key = "calculate_#{metric}"

  # spawn a new aggregation and return the aggs
  response = if columns.size == 1
               aggregate(metric_key, { metric => { field: columns[0] }.merge(opts) }).aggregations
             else
               aggregate(metric_key, { metric => { fields: columns }.merge(opts) }).aggregations
             end

  if node.present?
    response[metric_key][node]
  else
    response[metric_key]
  end
end
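The two halves of this method (building the aggregation body and picking a node from the result) can be sketched without a running cluster. `build_aggregation` and `extract_node` below are hypothetical helpers written only for illustration, and the response hash is made up. String keys are assumed for the response; the real response object may also allow symbol access.

```ruby
# Hypothetical helper: mirrors how the aggregation body is assembled.
# One column    -> { metric => { field: column } }
# Many columns  -> { metric => { fields: [columns] } }
# Extra options are merged into the inner hash.
def build_aggregation(metric, *columns, opts: {})
  target = columns.size == 1 ? { field: columns[0] } : { fields: columns }
  { metric => target.merge(opts) }
end

# Hypothetical helper: picks the requested node from a response hash,
# or returns the whole metric result when no node is given.
def extract_node(aggregations, metric, node = nil)
  result = aggregations.fetch("calculate_#{metric}")
  node ? result.fetch(node.to_s) : result
end

single = build_aggregation(:avg, :age)
multi  = build_aggregation(:matrix_stats, :age, :income)
ranked = build_aggregation(:percentile_ranks, :year, opts: { values: [500, 600] })

# Made-up response, shaped like a single-value metric result.
fake_response = { "calculate_avg" => { "value" => 35.8 } }
average = extract_node(fake_response, :avg, :value)
whole   = extract_node(fake_response, :avg)
```

Single-value metrics such as #average pass `node: :value`, while multi-value metrics such as #stats pass no node and receive the whole hash.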
#cardinality(column_name) ⇒ Integer?
returns nil on a NullRelation
Calculates the cardinality on a given column. Returns 0 if there are no rows.
Person.all.cardinality(:age)
12
# File 'lib/elasticsearch_record/relation/calculation_methods.rb', line 203
def cardinality(column_name)
  calculate_aggregation(:cardinality, column_name, node: :value)
end
#count(column_name = nil) ⇒ Object
Count the records.
Person.all.count # => the total count of all people
Person.all.count(:age) # => the total count of all people whose age is present in the database
# File 'lib/elasticsearch_record/relation/calculation_methods.rb', line 11
def count(column_name = nil)
  # fallback to default
  return super() if block_given?

  # check for already failed query
  return 0 if null_relation?

  # reset column_name, if +:all+ was provided ...
  column_name = nil if column_name == :all

  # check for combined cases
  if self.distinct_value && column_name
    self.cardinality(column_name)
  elsif column_name
    where(:filter, { exists: { field: column_name } }).count
  elsif self.group_values.any?
    self.composite(*self.group_values)
  elsif self.select_values.any?
    self.composite(*self.select_values)
  elsif limit_value == 0
    # Shortcut when limit is zero.
    return 0
  elsif limit_value
    # since total will be limited to 10000 results, we need to resolve the real values by a custom query.
    # This query is called through +#select_count+.
    #
    # HINT: +:__query__+ directly interacts with the query-object and sets the 'terminate_after' argument
    # see @ ElasticsearchRecord::Query#arguments & Arel::Collectors::ElasticsearchQuery#assign
    arel = spawn.unscope!(:offset, :limit, :order, :configure, :aggs).configure!(:__query__, argument: { terminate_after: limit_value }).arel
    klass.connection.select_count(arel, "#{klass.name} Count")
  else
    # since total will be limited to 10000 results, we need to resolve the real values by a custom query.
    # This query is called through +#select_count+.
    arel = spawn.unscope!(:offset, :limit, :order, :configure, :aggs)
    klass.connection.select_count(arel, "#{klass.name} Count")
  end
end
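Because the order of the branches decides which query is sent, it may help to see the dispatch on its own. The sketch below models only the decision order of the method above; `count_strategy` and the symbols it returns are hypothetical names, no query is executed, and the block-given fallback to `super` is left out.

```ruby
# Hypothetical helper: names the branch #count would take for a given
# relation state. It models the decision order only, not the queries.
def count_strategy(column_name: nil, distinct: false, group_values: [],
                   select_values: [], limit_value: nil, null_relation: false)
  return :zero if null_relation

  # :all is treated the same as "no column"
  column_name = nil if column_name == :all

  if distinct && column_name
    :cardinality
  elsif column_name
    :exists_filter
  elsif group_values.any?
    :composite_by_group
  elsif select_values.any?
    :composite_by_select
  elsif limit_value == 0
    :zero
  elsif limit_value
    :select_count_with_terminate_after
  else
    :select_count
  end
end

plain    = count_strategy
by_age   = count_strategy(column_name: :age)
distinct = count_strategy(column_name: :age, distinct: true)
limited  = count_strategy(limit_value: 50)
```

Note that a column name takes precedence over grouping, selecting and limiting, so `count(:age)` on a limited relation still resolves through the exists filter.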
#matrix_stats(*column_names) ⇒ Hash?
returns nil on a NullRelation
The matrix_stats aggregation is a numeric aggregation that computes the following statistics over a set of document fields:
- count: Number of per-field samples included in the calculation.
- mean: The average value for each field.
- variance: Per-field measurement of how spread out the samples are from the mean.
- skewness: Per-field measurement quantifying the asymmetric distribution around the mean.
- kurtosis: Per-field measurement quantifying the shape of the distribution.
- covariance: A matrix that quantitatively describes how changes in one field are associated with another.
- correlation: The covariance matrix scaled to a range of -1 to 1, inclusive. Describes the relationship between field distributions.
# File 'lib/elasticsearch_record/relation/calculation_methods.rb', line 134
def matrix_stats(*column_names)
  calculate_aggregation(:matrix_stats, *column_names)
end
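As a rough illustration of how covariance and correlation relate, the following sketch computes both for two small in-memory arrays. The helper names are hypothetical, and the sample formula (dividing by n - 1) is an assumption made for this sketch; it is not taken from the library or from the server-side implementation.

```ruby
# Hypothetical helper: arithmetic mean as a Float.
def mean(values)
  values.sum.to_f / values.size
end

# Hypothetical helper: sample covariance (divides by n - 1, an assumption).
# covariance(xs, xs) is the sample variance of xs.
def covariance(xs, ys)
  mean_x = mean(xs)
  mean_y = mean(ys)
  xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) } / (xs.size - 1)
end

# Hypothetical helper: covariance scaled into the range -1..1.
def correlation(xs, ys)
  covariance(xs, ys) / Math.sqrt(covariance(xs, xs) * covariance(ys, ys))
end

ages    = [1, 2, 3]
incomes = [2, 4, 6]

age_income_covariance  = covariance(ages, incomes)
age_income_correlation = correlation(ages, incomes)
```

Since `incomes` is exactly twice `ages`, the two fields move together perfectly and the correlation reaches the upper end of its range.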
#maximum(column_name) ⇒ Float?
returns nil on a NullRelation
Calculates the maximum value on a given column. The value is returned with the same data type as the column, or nil if there are no rows. See #calculate for examples with options.
Person.all.maximum(:age) # => 93
# File 'lib/elasticsearch_record/relation/calculation_methods.rb', line 249
def maximum(column_name)
  calculate_aggregation(:max, column_name, node: :value)
end
#median_absolute_deviation(column_name) ⇒ Float?
returns nil on a NullRelation
This single-value aggregation approximates the median absolute deviation of its search results. Median absolute deviation is a measure of variability. It is a robust statistic, meaning that it is useful for describing data that may have outliers, or may not be normally distributed. For such data it can be more descriptive than standard deviation.
It is calculated as the median of each data point’s deviation from the median of the entire sample. That is, for a random variable X, the median absolute deviation is median(|median(X) - Xi|).
Person.all.median_absolute_deviation(:age) # => 91
# File 'lib/elasticsearch_record/relation/calculation_methods.rb', line 269
def median_absolute_deviation(column_name)
  calculate_aggregation(:median_absolute_deviation, column_name)
end
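The formula median(|median(X) - Xi|) can be worked through exactly on a small array. This is only a sketch of the definition; the aggregation itself approximates the result on the server. `median` and `exact_median_absolute_deviation` are hypothetical helper names.

```ruby
# Hypothetical helper: exact median of a non-empty array, as a Float.
def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid].to_f : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Hypothetical helper: median of each value's distance from the median.
def exact_median_absolute_deviation(values)
  center = median(values)
  median(values.map { |value| (center - value).abs })
end

# The outlier 100 barely affects the result, which is why the statistic
# is called robust.
mad = exact_median_absolute_deviation([1, 2, 3, 4, 100])
```

Here the median is 3, the absolute deviations are 2, 1, 0, 1 and 97, and the median of those deviations is 1.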
#minimum(column_name) ⇒ Float?
returns nil on a NullRelation
Calculates the minimum value on a given column. The value is returned with the same data type as the column, or nil if there are no rows.
Person.all.minimum(:age)
7
# File 'lib/elasticsearch_record/relation/calculation_methods.rb', line 233
def minimum(column_name)
  calculate_aggregation(:min, column_name, node: :value)
end
#percentile_ranks(column_name, values) ⇒ Hash?
returns nil on a NullRelation
A multi-value metrics aggregation that calculates one or more percentile ranks over numeric values extracted from the aggregated documents.
Percentile ranks show the percentage of observed values which are below a certain value. For example, if a value is greater than or equal to 95% of the observed values, it is said to be at the 95th percentile rank.
Person.all.percentile_ranks(:year, [500,600])
{ "1.0" => 2016.0, "5.0" => 2016.0, "25.0" => 2016.0, "50.0" => 2017.0, "75.0" => 2017.0, "95.0" => 2021.0, "99.0" => 2022.0 }
# File 'lib/elasticsearch_record/relation/calculation_methods.rb', line 188
def percentile_ranks(column_name, values)
  calculate_aggregation(:percentile_ranks, column_name, opts: { values: values }, node: :values)
end
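To illustrate what a percentile rank means, this sketch computes exact ranks for a small in-memory array. It assumes the "greater than or equal to" reading from the example above, and the string key format (such as "5.0") is an assumption modelled on the keys shown for #percentiles. `exact_percentile_ranks` is a hypothetical helper; the aggregation itself runs on the server and may approximate.

```ruby
# Hypothetical helper: for each requested value, the percentage of
# observed items that are less than or equal to it.
def exact_percentile_ranks(observed, values)
  values.to_h do |value|
    at_or_below = observed.count { |item| item <= value }
    [value.to_f.to_s, 100.0 * at_or_below / observed.size]
  end
end

ranks = exact_percentile_ranks((1..10).to_a, [5, 10])
```

With the observed values 1 through 10, half of them are at or below 5 and all of them are at or below 10.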
#percentiles(column_name) ⇒ Hash?
returns nil on a NullRelation
A multi-value metrics aggregation that calculates one or more percentiles over numeric values extracted from the aggregated documents. Returns a hash with empty values (the keys still exist) if there are no rows.
Person.all.percentiles(:year)
{ "1.0" => 2016.0, "5.0" => 2016.0, "25.0" => 2016.0, "50.0" => 2017.0, "75.0" => 2017.0, "95.0" => 2021.0, "99.0" => 2022.0 }
# File 'lib/elasticsearch_record/relation/calculation_methods.rb', line 159
def percentiles(column_name)
  calculate_aggregation(:percentiles, column_name, node: :values)
end
#stats(column_name) ⇒ Hash?
returns nil on a NullRelation
A multi-value metrics aggregation that computes stats over numeric values extracted from the aggregated documents. The stats that are returned consist of: min, max, sum, count and avg.
Person.all.stats(:age)
{ "count": 10, "min": 0.0, "max": 990.0, "sum": 16859, "avg": 75.5 }
# File 'lib/elasticsearch_record/relation/calculation_methods.rb', line 93
def stats(column_name)
  calculate_aggregation(:stats, column_name)
end
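The five returned values are related in a simple way (avg is sum divided by count), which the following sketch shows for an in-memory array. `exact_stats` is a hypothetical helper, symbol keys are used for brevity, and the input is assumed to be non-empty.

```ruby
# Hypothetical helper: the same five values, computed locally.
def exact_stats(values)
  total = values.sum.to_f
  {
    count: values.size,
    min: values.min.to_f,
    max: values.max.to_f,
    sum: total,
    avg: total / values.size # avg is always sum / count
  }
end

age_stats = exact_stats([10, 20, 60])
```

This relationship is a quick sanity check when reading a stats result: multiplying avg by count should give sum.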
#string_stats(column_name) ⇒ Hash?
returns nil on a NullRelation
A multi-value metrics aggregation that computes statistics over string values extracted from the aggregated documents. These values can be retrieved from specific keyword fields.
Person.all.string_stats(:name)
{ "count": 5, "min_length": 24, "max_length": 30, "avg_length": 28.8, "entropy": 3.94617750050791 }
# File 'lib/elasticsearch_record/relation/calculation_methods.rb', line 115
def string_stats(column_name)
  calculate_aggregation(:string_stats, column_name)
end
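The length values are straightforward, but the entropy value is less obvious. The sketch below computes all five values locally, assuming that entropy means the Shannon entropy (in bits) of the character distribution across all strings; that reading is an assumption for this sketch and is not confirmed by the source. `exact_string_stats` is a hypothetical helper and expects a non-empty array of non-empty strings.

```ruby
# Hypothetical helper: string statistics for an in-memory array.
def exact_string_stats(strings)
  lengths = strings.map(&:length)
  chars = strings.flat_map(&:chars)

  # Shannon entropy over character frequencies: -sum(p * log2(p))
  entropy = chars.tally.values.sum do |occurrences|
    probability = occurrences.to_f / chars.size
    -probability * Math.log2(probability)
  end

  {
    count: strings.size,
    min_length: lengths.min,
    max_length: lengths.max,
    avg_length: lengths.sum.to_f / lengths.size,
    entropy: entropy
  }
end

name_stats = exact_string_stats(%w[ab ab])
```

Two equally frequent characters carry one bit of information each, so the entropy for this input is 1.0.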
#sum(column_name) ⇒ Float?
returns nil on a NullRelation
Calculates the sum of values on a given column. The value is returned with the same data type as the column, or 0 if there are no rows. See #calculate for examples with options.
Person.all.sum(:age) # => 4562
# File 'lib/elasticsearch_record/relation/calculation_methods.rb', line 285
def sum(column_name)
  calculate_aggregation(:sum, column_name, node: :value)
end