Module: ElasticGraph::SchemaDefinition::Indexing::ListCountsMapping

Defined in:
lib/elastic_graph/schema_definition/indexing/list_counts_mapping.rb

Overview

To support filtering on the `count` of a list field, we need to index the counts as we ingest events. This module is responsible for defining the mapping for the special `__counts` field in which we store the list counts.

Class Method Details

.merged_into(mapping_hash, for_type:) ⇒ Object

Builds the `__counts` field mapping for the given `for_type`. Returns a new `mapping_hash` with the extra `__counts` field merged into it. If `for_type` has no list fields that need count indexing, `mapping_hash` is returned unchanged.



# File 'lib/elastic_graph/schema_definition/indexing/list_counts_mapping.rb', line 23

def self.merged_into(mapping_hash, for_type:)
  counts_properties = for_type.indexing_fields_by_name_in_index.values.flat_map do |field|
    field.paths_to_lists_for_count_indexing.map do |path|
      # We chose the `integer` type here because:
      #
      # - While we expect datasets with more documents than the max integer value (~2B), we don't expect
      #   individual documents to have any list fields with more elements than can fit in an integer.
      # - Using `long` would allow for much larger counts, but we don't want to take up double the
      #   storage space for this.
      #
      # Note that `new_list_filter_input_type` (in `schema_definition/factory.rb`) relies on this, and
      # has chosen to use `IntFilterInput` (rather than `JsonSafeLongFilterInput`) for filtering these count values.
      # If we change the mapping type here, we should re-evaluate the filter used there.
      [path, {"type" => "integer"}]
    end
  end.to_h

  return mapping_hash if counts_properties.empty?

  Support::HashUtil.deep_merge(mapping_hash, {
    "properties" => {
      LIST_COUNTS_FIELD => {
        "properties" => counts_properties
      }
    }
  })
end
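
To make the shape of the result concrete, here is a minimal, self-contained sketch of the same merge logic. It is not the library's code: `FakeField`, `FakeType`, the local `deep_merge` helper, and the sample `name`/`tags` fields are hypothetical stand-ins for the real ElasticGraph objects and for `Support::HashUtil.deep_merge`. The `"tags"` path is likewise an illustrative key only, and `LIST_COUNTS_FIELD` is assumed to be `"__counts"` based on the overview above.

```ruby
# Assumed value, taken from the overview's description of the `__counts` field.
LIST_COUNTS_FIELD = "__counts"

# Hypothetical stand-ins exposing only the two methods `merged_into` calls.
FakeField = Struct.new(:paths_to_lists_for_count_indexing)
FakeType = Struct.new(:indexing_fields_by_name_in_index)

# Simplified recursive merge standing in for `Support::HashUtil.deep_merge`.
# It returns a new hash and does not mutate either argument.
def deep_merge(left, right)
  left.merge(right) do |_key, left_value, right_value|
    if left_value.is_a?(Hash) && right_value.is_a?(Hash)
      deep_merge(left_value, right_value)
    else
      right_value
    end
  end
end

# Same logic as the method above, minus the explanatory comments.
def merged_into(mapping_hash, for_type:)
  counts_properties = for_type.indexing_fields_by_name_in_index.values.flat_map do |field|
    field.paths_to_lists_for_count_indexing.map { |path| [path, {"type" => "integer"}] }
  end.to_h

  return mapping_hash if counts_properties.empty?

  deep_merge(mapping_hash, {
    "properties" => {LIST_COUNTS_FIELD => {"properties" => counts_properties}}
  })
end

# A type with one scalar field (no list paths) and one list field.
widget_type = FakeType.new({
  "name" => FakeField.new([]),
  "tags" => FakeField.new(["tags"])
})

base_mapping = {
  "properties" => {
    "name" => {"type" => "keyword"},
    "tags" => {"type" => "keyword"}
  }
}

merged = merged_into(base_mapping, for_type: widget_type)
puts merged["properties"][LIST_COUNTS_FIELD].inspect
```

In this sketch the existing `name` and `tags` mappings are preserved, and a sibling `__counts` object is added under `properties` with one `integer` sub-field per list path. A type whose fields report no list paths gets back the very same `mapping_hash` object, because of the early return.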