Class: UnnestedInFilters::Rewriter

Inherits:
Object
  • Object
show all
Includes:
Gitlab::Utils::StrongMemoize
Defined in:
lib/unnested_in_filters/rewriter.rb

Defined Under Namespace

Classes: IndexCoverage, ValueTable

Instance Method Summary collapse

Constructor Details

#initialize(relation) ⇒ Rewriter

Returns a new instance of Rewriter.



197
198
199
# File 'lib/unnested_in_filters/rewriter.rb', line 197

def initialize(relation)
  @relation = relation
end

Instance Method Details

#rewriteObject

Rewrites the given ActiveRecord::Relation object to utilize the DB indices efficiently.

Currently Postgres will produce inefficient query plans which use a filter_predicate instead of a access_predicate to filter by IN clause contents. This behaviour does a table read of the data for filtering, disregarding the structure of the index and losing any benefit from any sorting applied to the index as it will have to resort the table read data.

Rewriting the query using the unnest command induces Postgres into using the appropriate index search behaviour for each column in the index by generating a cartesian product between the individual items of the IN filter items and queried table. This means each read column will maintain the sort order provided by the index, avoiding a memory sort node in the final query plan.

This will not work if queried columns are not all present in the index, or if unqueried columns exist in the index that are not at the end, as this makes that part of the index useless to Postgres and will result in a table scan anyways from that point.

Example usage;

relation = Vulnerabilities::Read.where(state: [1, 4])
relation = relation.order(severity: :desc, vulnerability_id: :desc)

rewriter = UnnestedInFilters::Rewriter.new(relation)
optimized_relation = rewriter.rewrite

In the above example. the relation object would produce the following SQL query;

SELECT
  "vulnerability_reads".*
FROM
  "vulnerability_reads"
WHERE
  "vulnerability_reads"."state" IN (1, 4)
ORDER BY
  "vulnerability_reads"."severity" DESC,
  "vulnerability_reads"."vulnerability_id" DESC
LIMIT 20;

And the optimized_relation object would would produce the following query to utilize the index on (state, severity, vulnerability_id);

SELECT
  "vulnerability_reads".*
FROM
  unnest('{1, 4}'::smallint[]) AS "states" ("state"),
  LATERAL (
    SELECT
      "vulnerability_reads".*
    FROM
      "vulnerability_reads"
    WHERE
      (vulnerability_reads."state" = "states"."state")
    ORDER BY
      "vulnerability_reads"."severity" DESC,
      "vulnerability_reads"."vulnerability_id" DESC
    LIMIT 20) AS vulnerability_reads
ORDER BY
  "vulnerability_reads"."severity" DESC,
  "vulnerability_reads"."vulnerability_id" DESC
LIMIT 20

If one of the columns being used for filtering or ordering is the primary key, then the query will be further optimized to use an index-only scan for initial filtering before selecting all columns using the primary key.

Using the prior query as an example, where vulnerability_id is the primary key, This will be rewritten to:

SELECT
  "vulnerability_reads".*
FROM
  "vulnerability_reads"
WHERE
  "vulnerability_reads"."vulnerability_id"
IN (
  SELECT
    "vulnerability_reads"."vulnerability_id"
  FROM
    unnest('{1, 4}'::smallint[]) AS "states" ("state"),
    LATERAL (
      SELECT
        "vulnerability_reads"."vulnerability_id"
      FROM
        "vulnerability_reads"
      WHERE
        (vulnerability_reads."state" = "states"."state")
      ORDER BY
        "vulnerability_reads"."severity" DESC,
        "vulnerability_reads"."vulnerability_id" DESC
        LIMIT 20
      ) AS vulnerability_reads
   )
ORDER BY
  "vulnerability_reads"."severity" DESC,
  "vulnerability_reads"."vulnerability_id" DESC
LIMIT 20


299
300
301
302
303
304
305
# File 'lib/unnested_in_filters/rewriter.rb', line 299

def rewrite
  log_rewrite

  return filter_query unless primary_key_present?

  index_only_filter_query
end

#rewrite?Boolean

Returns:

  • (Boolean)


307
308
309
310
311
# File 'lib/unnested_in_filters/rewriter.rb', line 307

def rewrite?
  strong_memoize(:rewrite) do
    in_filters.present? && has_index_coverage?
  end
end