Module: OnlineMigrations::BackgroundMigrations::MigrationHelpers

Included in:
SchemaStatements
Defined in:
lib/online_migrations/background_migrations/migration_helpers.rb

Instance Method Details

#backfill_column_for_type_change_in_background(table_name, column_name, model_name: nil, type_cast_function: nil, **options) ⇒ OnlineMigrations::BackgroundMigrations::Migration

Note:

This method is better suited for large tables (tens or hundreds of millions of records). For smaller tables it is probably better and easier to use the more flexible `backfill_column_for_type_change`.

Backfills data from the old column to the new column using background migrations.

Examples:

backfill_column_for_type_change_in_background(:files, :size)

With type casting

backfill_column_for_type_change_in_background(:users, :settings, type_cast_function: "jsonb")

Additional background migration options

backfill_column_for_type_change_in_background(:files, :size, batch_size: 10_000)

Parameters:

  • table_name (String, Symbol)
  • column_name (String, Symbol)
  • model_name (String) (defaults to: nil)

    If Active Record multiple databases feature is used, the class name of the model to get connection from.

  • type_cast_function (String, Symbol) (defaults to: nil)

    Some type changes require casting data to a new type, for example when changing from `text` to `jsonb`. In this case, use the `type_cast_function` option. Make sure there is no bad data and that the cast will always succeed.

  • options (Hash)

    Used to control the behavior of the background migration. See `#enqueue_background_data_migration`.

Returns:

  • (OnlineMigrations::BackgroundMigrations::Migration)



# File 'lib/online_migrations/background_migrations/migration_helpers.rb', line 86

def backfill_column_for_type_change_in_background(table_name, column_name, model_name: nil,
                                                  type_cast_function: nil, **options)
  backfill_columns_for_type_change_in_background(
    table_name,
    column_name,
    model_name: model_name,
    type_cast_functions: { column_name => type_cast_function },
    **options
  )
end

#backfill_column_in_background(table_name, column_name, value, model_name: nil, **options) ⇒ OnlineMigrations::BackgroundMigrations::Migration

Note:

This method is better suited for large tables (tens or hundreds of millions of records). For smaller tables it is probably better and easier to use the more flexible `update_column_in_batches`.

Note:

Consider `backfill_columns_in_background` when backfilling multiple columns to avoid rewriting the table multiple times.

Backfills column data using background migrations.

Examples:

backfill_column_in_background(:users, :admin, false)

Additional background migration options

backfill_column_in_background(:users, :admin, false, batch_size: 10_000)

Parameters:

  • table_name (String, Symbol)
  • column_name (String, Symbol)
  • value
  • model_name (String) (defaults to: nil)

    If Active Record multiple databases feature is used, the class name of the model to get connection from.

  • options (Hash)

    Used to control the behavior of the background migration. See `#enqueue_background_data_migration`.

Returns:

  • (OnlineMigrations::BackgroundMigrations::Migration)



# File 'lib/online_migrations/background_migrations/migration_helpers.rb', line 30

def backfill_column_in_background(table_name, column_name, value, model_name: nil, **options)
  backfill_columns_in_background(table_name, { column_name => value },
                                 model_name: model_name, **options)
end

#backfill_columns_for_type_change_in_background(table_name, *column_names, model_name: nil, type_cast_functions: {}, **options) ⇒ Object

Same as `backfill_column_for_type_change_in_background` but for multiple columns.

Parameters:

  • type_cast_functions (Hash) (defaults to: {})

    If not empty, a hash whose keys are column names and whose values are the corresponding type cast functions.

# File 'lib/online_migrations/background_migrations/migration_helpers.rb', line 104

def backfill_columns_for_type_change_in_background(table_name, *column_names, model_name: nil,
                                                   type_cast_functions: {}, **options)
  if model_name.nil? && Utils.multiple_databases?
    raise ArgumentError, "You must pass a :model_name when using multiple databases."
  end

  tmp_columns = column_names.map { |column_name| "#{column_name}_for_type_change" }
  model_name = model_name.name if model_name.is_a?(Class)

  enqueue_background_data_migration(
    "CopyColumn",
    table_name,
    column_names,
    tmp_columns,
    model_name,
    type_cast_functions,
    **options
  )
end
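
The source above derives each temporary column name by appending `_for_type_change` to the original column name, then passes both lists to the `CopyColumn` job. A minimal plain-Ruby sketch of that argument-building step follows. Nothing is enqueued and no database is involved; `copy_column_arguments` is a hypothetical helper used only for illustration and is not part of the gem.

```ruby
# Hypothetical helper mirroring how the arguments for "CopyColumn" are built
# in the source above. It only builds data; nothing is enqueued.
def copy_column_arguments(table_name, *column_names, model_name: nil, type_cast_functions: {})
  # Each temporary column is named after its source column.
  tmp_columns = column_names.map { |column_name| "#{column_name}_for_type_change" }
  [table_name, column_names, tmp_columns, model_name, type_cast_functions]
end

p copy_column_arguments(:users, :settings, :id, type_cast_functions: { settings: "jsonb" })
```

This naming convention means the temporary columns are expected to exist already under exactly those names before the backfill is enqueued.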

#backfill_columns_in_background(table_name, updates, model_name: nil, **options) ⇒ Object

Same as `backfill_column_in_background` but for multiple columns.

Examples:

backfill_columns_in_background(:users, { admin: false, status: "active" })

Parameters:

  • updates (Hash)

    A hash whose keys are column names and whose values are the values to backfill.

# File 'lib/online_migrations/background_migrations/migration_helpers.rb', line 44

def backfill_columns_in_background(table_name, updates, model_name: nil, **options)
  if model_name.nil? && Utils.multiple_databases?
    raise ArgumentError, "You must pass a :model_name when using multiple databases."
  end

  model_name = model_name.name if model_name.is_a?(Class)

  enqueue_background_data_migration(
    "BackfillColumn",
    table_name,
    updates,
    model_name,
    **options
  )
end
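
As the source above shows, `model_name` may be given as a class or as a string, and it becomes mandatory when multiple databases are in use. The sketch below isolates that guard and normalization in plain Ruby. `normalize_model_name` and the `multiple_databases:` flag are hypothetical stand-ins (the flag replaces the call to `Utils.multiple_databases?`), and `User` is a placeholder class.

```ruby
# Hypothetical stand-in for the guard and normalization shown in the source above.
# `multiple_databases` replaces the call to Utils.multiple_databases?.
def normalize_model_name(model_name, multiple_databases:)
  if model_name.nil? && multiple_databases
    raise ArgumentError, "You must pass a :model_name when using multiple databases."
  end

  # Classes are converted to their name; strings (and nil) pass through unchanged.
  model_name.is_a?(Class) ? model_name.name : model_name
end

class User; end # placeholder model class for the example

puts normalize_model_name(User, multiple_databases: true)
puts normalize_model_name("User", multiple_databases: false)
```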

#copy_column_in_background(table_name, copy_from, copy_to, model_name: nil, type_cast_function: nil, **options) ⇒ OnlineMigrations::BackgroundMigrations::Migration

Note:

This method is better suited for large tables (tens or hundreds of millions of records). For smaller tables it is probably better and easier to use the more flexible `update_column_in_batches`.

Copies data from the old column to the new column using background migrations.

Examples:

copy_column_in_background(:users, :id, :id_for_type_change)

Parameters:

  • table_name (String, Symbol)
  • copy_from (String, Symbol)

    source column name

  • copy_to (String, Symbol)

    destination column name

  • model_name (String) (defaults to: nil)

    If Active Record multiple databases feature is used, the class name of the model to get connection from.

  • type_cast_function (String, Symbol) (defaults to: nil)

    Some type changes require casting data to a new type, for example when changing from `text` to `jsonb`. In this case, use the `type_cast_function` option. Make sure there is no bad data and that the cast will always succeed.

  • options (Hash)

    Used to control the behavior of the background migration. See `#enqueue_background_data_migration`.

Returns:

  • (OnlineMigrations::BackgroundMigrations::Migration)



# File 'lib/online_migrations/background_migrations/migration_helpers.rb', line 145

def copy_column_in_background(table_name, copy_from, copy_to, model_name: nil, type_cast_function: nil, **options)
  copy_columns_in_background(
    table_name,
    [copy_from],
    [copy_to],
    model_name: model_name,
    type_cast_functions: { copy_from => type_cast_function },
    **options
  )
end

#copy_columns_in_background(table_name, copy_from, copy_to, model_name: nil, type_cast_functions: {}, **options) ⇒ Object

Same as `copy_column_in_background` but for multiple columns.

Parameters:

  • type_cast_functions (Hash) (defaults to: {})

    If not empty, a hash whose keys are column names and whose values are the corresponding type cast functions.

# File 'lib/online_migrations/background_migrations/migration_helpers.rb', line 163

def copy_columns_in_background(table_name, copy_from, copy_to, model_name: nil, type_cast_functions: {}, **options)
  if model_name.nil? && Utils.multiple_databases?
    raise ArgumentError, "You must pass a :model_name when using multiple databases."
  end

  model_name = model_name.name if model_name.is_a?(Class)

  enqueue_background_data_migration(
    "CopyColumn",
    table_name,
    copy_from,
    copy_to,
    model_name,
    type_cast_functions,
    **options
  )
end
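
In the signature above, `copy_from` and `copy_to` are parallel arrays, and `type_cast_functions` is keyed by source column. The sketch below pairs them up and prints SET-style assignments to make that relationship concrete. It assumes a cast named `jsonb` is written as a SQL function call; the SQL the gem actually generates may differ. `copy_assignments` is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper: pairs source and destination columns and applies an
# optional cast. Assumes a cast is written as a SQL function call.
def copy_assignments(copy_from, copy_to, type_cast_functions = {})
  raise ArgumentError, "copy_from and copy_to must have the same size" if copy_from.size != copy_to.size

  copy_from.zip(copy_to).map do |from, to|
    cast = type_cast_functions[from]
    source = cast ? "#{cast}(\"#{from}\")" : "\"#{from}\""
    "\"#{to}\" = #{source}"
  end
end

puts copy_assignments([:id, :settings], [:id_new, :settings_new], { settings: "jsonb" })
```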

#create_background_data_migration(migration_name, *arguments, **options) ⇒ Object



# File 'lib/online_migrations/background_migrations/migration_helpers.rb', line 437

def create_background_data_migration(migration_name, *arguments, **options)
  options.assert_valid_keys(:batch_column_name, :min_value, :max_value, :batch_size, :sub_batch_size,
      :batch_pause, :sub_batch_pause_ms, :batch_max_attempts)

  migration_name = migration_name.name if migration_name.is_a?(Class)

  # Can't use `find_or_create_by` or hash syntax here, because it does not correctly work with json `arguments`.
  existing_migration = Migration.find_by("migration_name = ? AND arguments = ? AND shard IS NULL", migration_name, arguments.to_json)
  return existing_migration if existing_migration

  Migration.create!(**options, migration_name: migration_name, arguments: arguments, shard: nil) do |migration|
    shards = Utils.shard_names(migration.migration_model)
    if shards.size > 1
      migration.children = shards.map do |shard|
        child = migration.dup
        child.shard = shard
        child
      end

      migration.composite = true
    end
  end
end
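
The source above validates the option keys and then returns an existing migration when one with the same name and the same JSON-encoded arguments is already stored, so calling it twice does not create duplicates. A minimal in-memory sketch of that find-or-create behavior follows. `MigrationRegistry` and its `Record` struct are hypothetical stand-ins for the database-backed `Migration` model, and sharding is left out.

```ruby
require "json"

# In-memory stand-in for the migrations table; not the gem's Migration model.
class MigrationRegistry
  VALID_OPTIONS = %i[batch_column_name min_value max_value batch_size sub_batch_size
                     batch_pause sub_batch_pause_ms batch_max_attempts].freeze

  Record = Struct.new(:migration_name, :arguments_json, :options)

  def initialize
    @records = []
  end

  # Returns the existing record when the name and JSON-encoded arguments match,
  # otherwise stores and returns a new one.
  def create(migration_name, *arguments, **options)
    unknown = options.keys - VALID_OPTIONS
    raise ArgumentError, "Unknown key(s): #{unknown.join(', ')}" unless unknown.empty?

    json = arguments.to_json
    existing = @records.find { |r| r.migration_name == migration_name && r.arguments_json == json }
    return existing if existing

    record = Record.new(migration_name, json, options)
    @records << record
    record
  end
end

registry = MigrationRegistry.new
first  = registry.create("CopyColumn", "users", ["id"], ["id_for_type_change"], batch_size: 10_000)
second = registry.create("CopyColumn", "users", ["id"], ["id_for_type_change"])
puts first.equal?(second)
```

Note that the lookup ignores the options: a second call with different options still returns the first record, which matches the early `return existing_migration` in the source.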

#delete_associated_records_in_background(model_name, record_id, association, **options) ⇒ OnlineMigrations::BackgroundMigrations::Migration

Note:

This method is better suited for large tables (tens or hundreds of millions of records). For smaller tables it is probably better and easier to directly delete associated records.

Deletes associated records for a specific parent record using background migrations. This is useful when you are planning to remove a parent object (user, account, etc.) and need to remove many of its associated objects.

Examples:

delete_associated_records_in_background("Link", 1, :clicks)

Parameters:

  • model_name (String)
  • record_id (Integer, String)

    parent record primary key’s value

  • association (String, Symbol)

    association name for which records will be removed

  • options (Hash)

    Used to control the behavior of the background migration. See `#enqueue_background_data_migration`.

Returns:

  • (OnlineMigrations::BackgroundMigrations::Migration)



# File 'lib/online_migrations/background_migrations/migration_helpers.rb', line 266

def delete_associated_records_in_background(model_name, record_id, association, **options)
  model_name = model_name.name if model_name.is_a?(Class)

  enqueue_background_data_migration(
    "DeleteAssociatedRecords",
    model_name,
    record_id,
    association,
    **options
  )
end

#delete_orphaned_records_in_background(model_name, *associations, **options) ⇒ OnlineMigrations::BackgroundMigrations::Migration

Note:

This method is better suited for large tables (tens or hundreds of millions of records). For smaller tables it is probably better and easier to directly find and delete orphaned records.

Deletes records with one or more missing relations using background migrations. This is useful when some referential integrity in the database is broken and you want to delete orphaned records.

Examples:

delete_orphaned_records_in_background("Post", :author)

Parameters:

  • model_name (String)
  • associations (Array)
  • options (Hash)

    Used to control the behavior of the background migration. See `#enqueue_background_data_migration`.

Returns:

  • (OnlineMigrations::BackgroundMigrations::Migration)



# File 'lib/online_migrations/background_migrations/migration_helpers.rb', line 237

def delete_orphaned_records_in_background(model_name, *associations, **options)
  model_name = model_name.name if model_name.is_a?(Class)

  enqueue_background_data_migration(
    "DeleteOrphanedRecords",
    model_name,
    associations,
    **options
  )
end
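
To make "orphaned" concrete: a record is orphaned when its foreign key points at a parent row that no longer exists. The plain-Ruby sketch below finds such records in made-up in-memory data. It only illustrates the concept and says nothing about the query the background job runs; `orphaned_posts`, `Author`, and `Post` are hypothetical names.

```ruby
# Plain-Ruby illustration of "orphaned": a child row whose foreign key points
# at a parent row that no longer exists. Data and names are made up.
Author = Struct.new(:id)
Post   = Struct.new(:id, :author_id)

def orphaned_posts(posts, authors)
  author_ids = authors.map(&:id)
  posts.reject { |post| author_ids.include?(post.author_id) }
end

authors = [Author.new(1), Author.new(2)]
posts   = [Post.new(10, 1), Post.new(11, 3), Post.new(12, 2)]

# Post 11 references author 3, which does not exist.
p orphaned_posts(posts, authors).map(&:id)
```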

#enqueue_background_data_migration(migration_name, *arguments, **options) ⇒ OnlineMigrations::BackgroundMigrations::Migration Also known as: enqueue_background_migration

Note:

For convenience, the enqueued background migration is run inline in development and test environments.

Creates a background migration for the given job class name.

A background migration runs one job at a time, computing the bounds of the next batch based on the current migration settings and the previous batch bounds. Each job’s execution status is tracked in the database as the migration runs.

Examples:

enqueue_background_data_migration("BackfillProjectIssuesCount",
    batch_size: 10_000, batch_max_attempts: 10)

# Given the background migration exists:

class BackfillProjectIssuesCount < OnlineMigrations::BackgroundMigration
  def relation
    Project.all
  end

  def process_batch(projects)
    projects.update_all(
      "issues_count = (SELECT COUNT(*) FROM issues WHERE issues.project_id = projects.id)"
    )
  end

  # To be able to track progress, you need to define this method
  def count
    Project.maximum(:id)
  end
end

Parameters:

  • migration_name (String, Class)

    Background migration job class name

  • arguments (Array)

    Extra arguments to pass to the job instance when the migration runs

  • options (Hash)

    a customizable set of options

Options Hash (**options):

  • :batch_column_name (Symbol, String) — default: primary key

    Column name the migration will batch over

  • :min_value (Integer)

    Value in the column the batching will begin at, defaults to `SELECT MIN(batch_column_name)`

  • :max_value (Integer)

    Value in the column the batching will end at, defaults to `SELECT MAX(batch_column_name)`

  • :batch_size (Integer) — default: 20_000

    Number of rows to process in a single background migration run

  • :sub_batch_size (Integer) — default: 1000

    Size of the smaller sub-batches that each batch is divided into

  • :batch_pause (Integer) — default: 0

    Pause interval between each background migration job’s execution (in seconds)

  • :sub_batch_pause_ms (Integer) — default: 100

    Number of milliseconds to sleep between each sub_batch execution

  • :batch_max_attempts (Integer) — default: 5

    Maximum number of batch run attempts

Returns:

  • (OnlineMigrations::BackgroundMigrations::Migration)



# File 'lib/online_migrations/background_migrations/migration_helpers.rb', line 372

def enqueue_background_data_migration(migration_name, *arguments, **options)
  migration = create_background_data_migration(migration_name, *arguments, **options)

  if Utils.run_background_migrations_inline?
    runner = MigrationRunner.new(migration)
    runner.run_all_migration_jobs
  end

  migration
end
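
The description above says each job computes the bounds of the next batch from the migration settings and the previous batch bounds. The sketch below illustrates that idea in plain Ruby under a simplifying assumption: the batch column is a dense integer sequence, so a batch of `batch_size` rows spans exactly `batch_size` consecutive values. The gem works against real rows, so its bounds can differ. `next_batch_bounds` and `sub_batches` are hypothetical helpers.

```ruby
# Hypothetical helper: given the previous batch's upper bound, return the
# [min, max] bounds of the next batch, or nil when the migration is finished.
# Assumes a dense integer batch column.
def next_batch_bounds(previous_max, min_value:, max_value:, batch_size:)
  start = previous_max ? previous_max + 1 : min_value
  return nil if start > max_value

  [start, [start + batch_size - 1, max_value].min]
end

# Splits one batch into smaller sub-batches, as `sub_batch_size` does conceptually.
def sub_batches(bounds, sub_batch_size)
  (bounds[0]..bounds[1]).step(sub_batch_size).map do |sub_start|
    [sub_start, [sub_start + sub_batch_size - 1, bounds[1]].min]
  end
end

# Walk over ids 1..45_000 with the default sizes from the options list above.
previous_max = nil
while (bounds = next_batch_bounds(previous_max, min_value: 1, max_value: 45_000, batch_size: 20_000))
  puts "batch #{bounds.inspect}: #{sub_batches(bounds, 1000).size} sub-batches"
  previous_max = bounds[1]
end
```

Because each step depends only on the previous upper bound, a migration that stops partway can resume from the last recorded batch rather than starting over.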

#ensure_background_data_migration_succeeded(migration_name, arguments: nil) ⇒ Object Also known as: ensure_background_migration_succeeded

Ensures that the background data migration with the provided configuration succeeded.

If the enqueued migration is not found in development (typically after resetting a dev environment and then running `db:migrate`), a warning is logged. If it is not found in production, an error is raised. If it is found but has not succeeded, an error is raised.

Examples:

Without arguments

ensure_background_data_migration_succeeded("BackfillProjectIssuesCount")

With arguments

ensure_background_data_migration_succeeded("CopyColumn", arguments: ["users", "id", "id_for_type_change"])

Parameters:

  • migration_name (String, Class)

    Background migration job class name

  • arguments (Array, nil) (defaults to: nil)

    Arguments with which background migration was enqueued



# File 'lib/online_migrations/background_migrations/migration_helpers.rb', line 414

def ensure_background_data_migration_succeeded(migration_name, arguments: nil)
  migration_name = migration_name.name if migration_name.is_a?(Class)

  configuration = { migration_name: migration_name }

  if arguments
    arguments = Array(arguments)
    migration = Migration.parents.for_configuration(migration_name, arguments).first
    configuration[:arguments] = arguments.to_json
  else
    migration = Migration.parents.for_migration_name(migration_name).first
  end

  if migration.nil?
    Utils.raise_in_prod_or_say_in_dev("Could not find background data migration for the given configuration: #{configuration}")
  elsif !migration.succeeded?
    raise "Expected background data migration for the given configuration to be marked as 'succeeded', " \
          "but it is '#{migration.status}': #{configuration}"
  end
end
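
The three outcomes described above (missing in development, missing in production, found but not succeeded) can be sketched as a small decision function. This is plain Ruby with hypothetical names: `MigrationRow` stands in for a migration record, and the `production:` flag replaces the environment check done by `Utils.raise_in_prod_or_say_in_dev`.

```ruby
# Hypothetical stand-in for a migration row; only the status matters here.
MigrationRow = Struct.new(:status) do
  def succeeded?
    status == "succeeded"
  end
end

# Mirrors the three outcomes described above. `production` replaces the
# environment check done by Utils.raise_in_prod_or_say_in_dev.
def check_migration(migration, production:)
  if migration.nil?
    raise "Could not find background data migration" if production

    return :warned # in development only a warning is logged
  end

  raise "Expected 'succeeded', but it is '#{migration.status}'" unless migration.succeeded?

  :ok
end

puts check_migration(MigrationRow.new("succeeded"), production: true)
puts check_migration(nil, production: false)
```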

#perform_action_on_relation_in_background(model_name, conditions, action, updates: nil, **options) ⇒ OnlineMigrations::BackgroundMigrations::Migration

Note:

This method is better suited for large tables (tens or hundreds of millions of records). For smaller tables it is probably better and easier to directly perform the action on the records.

Performs a specific action on a relation or on individual records. This is useful when you want to delete, destroy, update, or otherwise process records that match some conditions.

Examples:

Delete records

perform_action_on_relation_in_background("User", { banned: true }, :delete_all)

Destroy records

perform_action_on_relation_in_background("User", { banned: true }, :destroy_all)

Update records

perform_action_on_relation_in_background("User", { banned: nil }, :update_all, updates: { banned: false })

Perform a custom method on individual records

class User < ApplicationRecord
  def generate_invite_token
    self.invite_token = # some complex logic
  end
end

perform_action_on_relation_in_background("User", { invite_token: nil }, :generate_invite_token)

Parameters:

  • model_name (String)
  • conditions (Array, Hash, String)

    conditions to filter the relation

  • action (String, Symbol)

    Action to perform on the relation or on individual records. Relation-wide available actions: `:delete_all`, `:destroy_all`, and `:update_all`.

  • updates (Hash) (defaults to: nil)

    Updates to perform when `action` is set to `:update_all`

  • options (Hash)

    Used to control the behavior of the background migration. See `#enqueue_background_data_migration`.

Returns:

  • (OnlineMigrations::BackgroundMigrations::Migration)



# File 'lib/online_migrations/background_migrations/migration_helpers.rb', line 312

def perform_action_on_relation_in_background(model_name, conditions, action, updates: nil, **options)
  model_name = model_name.name if model_name.is_a?(Class)

  enqueue_background_data_migration(
    "PerformActionOnRelation",
    model_name,
    conditions,
    action,
    { updates: updates },
    **options
  )
end
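
The parameter description above distinguishes relation-wide actions (`:delete_all`, `:destroy_all`, `:update_all`) from methods called on individual records. The sketch below shows one way such a dispatch could look. It is an assumption about the general shape, not the gem's implementation; `FakeRelation` and `perform_action` are hypothetical, and the "records" are plain hashes or strings.

```ruby
RELATION_ACTIONS = %i[delete_all destroy_all update_all].freeze

# Tiny in-memory stand-in for an Active Record relation; illustration only.
class FakeRelation
  attr_reader :records

  def initialize(records)
    @records = records
  end

  def update_all(updates)
    @records.each { |record| record.merge!(updates) }
    @records.size
  end

  def delete_all
    count = @records.size
    @records.clear
    count
  end
  alias destroy_all delete_all

  def each(&block)
    @records.each(&block)
  end
end

# Assumed dispatch: relation-wide actions are sent to the relation itself,
# anything else is treated as a method to call on each record.
def perform_action(relation, action, updates: nil)
  action = action.to_sym
  if action == :update_all
    relation.public_send(action, updates)
  elsif RELATION_ACTIONS.include?(action)
    relation.public_send(action)
  else
    relation.each { |record| record.public_send(action) }
  end
end

users = FakeRelation.new([{ banned: nil }, { banned: nil }])
perform_action(users, :update_all, updates: { banned: false })
p users.records
```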

#remove_background_data_migration(migration_name, *arguments) ⇒ Object Also known as: remove_background_migration

Removes the background migration for the given class name and arguments, if it exists.

Examples:

remove_background_data_migration("BackfillProjectIssuesCount")

Parameters:

  • migration_name (String, Class)

    Background migration job class name

  • arguments (Array)

    Extra arguments the migration was originally created with



# File 'lib/online_migrations/background_migrations/migration_helpers.rb', line 392

def remove_background_data_migration(migration_name, *arguments)
  migration_name = migration_name.name if migration_name.is_a?(Class)
  Migration.for_configuration(migration_name, arguments).delete_all
end

#reset_counters_in_background(model_name, *counters, touch: nil, **options) ⇒ OnlineMigrations::BackgroundMigrations::Migration

Note:

This method is better suited for large tables (tens or hundreds of millions of records). For smaller tables it is probably better and easier to use `reset_counters` from Active Record.

Resets one or more counter caches to their correct value using background migrations. This is useful when adding new counter caches, or if the counter has been corrupted or modified directly by SQL.

Examples:

reset_counters_in_background("User", :projects, :friends, touch: true)

Touch specific column

reset_counters_in_background("User", :projects, touch: :touched_at)

Touch with specific time value

reset_counters_in_background("User", :projects, touch: [time: 2.days.ago])

Parameters:

  • model_name (String)
  • counters (Array)
  • touch (Boolean, Symbol, Array) (defaults to: nil)

    touch timestamp columns when updating.

    • when `true` - will touch `updated_at` and/or `updated_on`

    • when `Symbol` or `Array` - will touch specific column(s)

  • options (Hash)

    Used to control the behavior of the background migration. See `#enqueue_background_data_migration`.

Returns:

  • (OnlineMigrations::BackgroundMigrations::Migration)

# File 'lib/online_migrations/background_migrations/migration_helpers.rb', line 208

def reset_counters_in_background(model_name, *counters, touch: nil, **options)
  model_name = model_name.name if model_name.is_a?(Class)

  enqueue_background_data_migration(
    "ResetCounters",
    model_name,
    counters,
    { touch: touch },
    **options
  )
end
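
Resetting a counter cache means recomputing the stored count from the actual child rows, which is what the description above refers to when a counter "has been corrupted or modified directly by SQL". The plain-Ruby sketch below shows the idea on made-up in-memory data. `reset_counter` is a hypothetical helper, and the `touch` option is not modeled.

```ruby
# Plain-Ruby illustration of what "resetting a counter cache" means:
# recompute the stored count from the actual child rows. Data is made up.
def reset_counter(parents, children, foreign_key:, counter:)
  counts = children.group_by { |child| child[foreign_key] }.transform_values(&:size)
  parents.each { |parent| parent[counter] = counts.fetch(parent[:id], 0) }
end

users    = [{ id: 1, projects_count: 5 }, { id: 2, projects_count: 0 }]
projects = [{ id: 10, user_id: 1 }, { id: 11, user_id: 2 }, { id: 12, user_id: 2 }]

# Both stored counts are wrong before the reset: user 1 has one project, user 2 has two.
reset_counter(users, projects, foreign_key: :user_id, counter: :projects_count)
p users.map { |user| user[:projects_count] }
```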