Module: OnlineMigrations::ChangeColumnTypeHelpers

Included in:
SchemaStatements
Defined in:
lib/online_migrations/change_column_type_helpers.rb

Overview

To safely change the type of the column, we need to perform the following steps:

1. create a new column based on the old one (covered by `initialize_column_type_change`)
2. ensure data stays in sync (via triggers) (covered by `initialize_column_type_change`)
3. backfill data from the old column (`backfill_column_for_type_change`)
4. copy indexes, foreign keys, check constraints, NOT NULL constraint,
  make new column a Primary Key if we change type of the primary key column,
  swap new column in place (`finalize_column_type_change`)
5. remove copy trigger and old column (`cleanup_column_type_change`)

For example, suppose we need to change ‘files`.`size` column’s type from ‘integer` to `bigint`:

  1. Create a new column and keep data in sync

```
  class InitializeFilesSizeTypeChangeToBigint < ActiveRecord::Migration
    def change
      initialize_column_type_change(:files, :size, :bigint)
    end
  end
```
  1. Backfill data

```
  class BackfillFilesSizeTypeChangeToBigint < ActiveRecord::Migration
    def up
      backfill_column_for_type_change(:files, :size, progress: true)
    end

    def down
      # no op
    end
  end
```
  1. Copy indexes, foreign keys, check constraints, NOT NULL constraint, swap new column in place

```
  class FinalizeFilesSizeTypeChangeToBigint < ActiveRecord::Migration
    def change
      finalize_column_type_change(:files, :size)
    end
  end
```
  1. Finally, if everything is working as expected, remove copy trigger and old column

```
  class CleanupFilesSizeTypeChangeToBigint < ActiveRecord::Migration
    def up
      cleanup_column_type_change(:files, :size)
    end

    def down
      initialize_column_type_change(:files, :size, :integer)
    end
  end
```

Instance Method Summary collapse

Instance Method Details

#backfill_column_for_type_change(table_name, column_name, type_cast_function: nil, **options) ⇒ void

Note:

This method should not be run within a transaction

Note:

For large tables (10/100s of millions of records) it is recommended to use ‘backfill_column_for_type_change_in_background`.

This method returns an undefined value.

Backfills data from the old column to the new column.

Examples:

backfill_column_for_type_change(:files, :size)

With type casting

backfill_column_for_type_change(:users, :settings, type_cast_function: "jsonb")
backfill_column_for_type_change(:users, :company_id, type_cast_function: Arel.sql("company_id::integer"))

Additional batch options

backfill_column_for_type_change(:files, :size, batch_size: 10_000)

Parameters:

  • table_name (String, Symbol)
  • column_name (String, Symbol)
  • type_cast_function (String, Symbol) (defaults to: nil)

    Some type changes require casting data to a new type. For example when changing from ‘text` to `jsonb`. In this case, use the `type_cast_function` option. You need to make sure there is no bad data and the cast will always succeed

  • options (Hash)

    used to control the behavior of ‘update_column_in_batches`



195
196
197
198
# File 'lib/online_migrations/change_column_type_helpers.rb', line 195

def backfill_column_for_type_change(table_name, column_name, type_cast_function: nil, **options)
  backfill_columns_for_type_change(table_name, column_name,
      type_cast_functions: { column_name => type_cast_function }, **options)
end

#backfill_columns_for_type_change(table_name, *column_names, type_cast_functions: {}, **options) ⇒ Object

Same as ‘backfill_column_for_type_change` but for multiple columns.

Parameters:

  • type_cast_functions (Hash) (defaults to: {})

    if not empty, keys - column names, values - corresponding type cast functions

See Also:



207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
# File 'lib/online_migrations/change_column_type_helpers.rb', line 207

def backfill_columns_for_type_change(table_name, *column_names, type_cast_functions: {}, **options)
  conversions = column_names.map do |column_name|
    tmp_column = __change_type_column(column_name)

    old_value = Arel::Table.new(table_name)[column_name]
    if (type_cast_function = type_cast_functions.with_indifferent_access[column_name])
      old_value =
        case type_cast_function
        when Arel::Nodes::SqlLiteral
          type_cast_function
        else
          Arel::Nodes::NamedFunction.new(type_cast_function.to_s, [old_value])
        end
    end

    [tmp_column, old_value]
  end

  update_columns_in_batches(table_name, conversions, **options)
end

#cleanup_column_type_change(table_name, column_name) ⇒ void

Note:

This method is not reversible by default in migrations. You need to use ‘initialize_column_type_change` in `down` method with the original column type to be able to revert.

This method returns an undefined value.

Finishes the process of column type change

This helper removes copy triggers and old column.

Examples:

cleanup_column_type_change(:files, :size)

Parameters:

  • table_name (String, Symbol)
  • column_name (String, Symbol)


358
359
360
# File 'lib/online_migrations/change_column_type_helpers.rb', line 358

def cleanup_column_type_change(table_name, column_name)
  cleanup_columns_type_change(table_name, column_name)
end

#cleanup_columns_type_change(table_name, *column_names) ⇒ Object

Same as ‘cleanup_column_type_change` but for multiple columns



365
366
367
368
369
370
371
372
373
374
# File 'lib/online_migrations/change_column_type_helpers.rb', line 365

def cleanup_columns_type_change(table_name, *column_names)
  conversions = column_names.index_with do |column_name|
    __change_type_column(column_name)
  end

  transaction do
    __remove_copy_triggers(table_name, conversions.keys, conversions.values)
    remove_columns(table_name, *conversions.values)
  end
end

#finalize_column_type_change(table_name, column_name) ⇒ Object

Note:

This method should not be run within a transaction

Copies ‘NOT NULL` constraint, indexes, foreign key, and check constraints from the old column to the new column

Note: If a column contains one or more indexes that don’t contain the name of the original column, this procedure will fail. In that case, you’ll first need to rename these indexes.

Examples:

finalize_column_type_change(:files, :size)


239
240
241
# File 'lib/online_migrations/change_column_type_helpers.rb', line 239

def finalize_column_type_change(table_name, column_name)
  finalize_columns_type_change(table_name, column_name)
end

#finalize_columns_type_change(table_name, *column_names) ⇒ Object

Same as ‘finalize_column_type_change` but for multiple columns



246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
# File 'lib/online_migrations/change_column_type_helpers.rb', line 246

def finalize_columns_type_change(table_name, *column_names)
  __ensure_not_in_transaction!

  conversions = column_names.to_h do |column_name|
    [column_name.to_s, __change_type_column(column_name)]
  end

  primary_key = primary_key(table_name)

  conversions.each do |column_name, tmp_column_name|
    old_column = column_for(table_name, column_name)
    column = column_for(table_name, tmp_column_name)

    # We already set default and NOT NULL for to-be-PK columns
    # for PG >= 11, so can skip this case
    if !old_column.null && column.null
      add_not_null_constraint(table_name, tmp_column_name, validate: false)
      validate_not_null_constraint(table_name, tmp_column_name)

      # At this point we are sure there are no NULLs in this column
      transaction do
        __set_not_null(table_name, tmp_column_name)
        remove_not_null_constraint(table_name, tmp_column_name)
      end
    end

    __copy_indexes(table_name, column_name, tmp_column_name)
    __copy_foreign_keys(table_name, column_name, tmp_column_name)
    __copy_check_constraints(table_name, column_name, tmp_column_name)

    # Exclusion constraints were added in https://github.com/rails/rails/pull/40224.
    if Utils.ar_version >= 7.1
      __copy_exclusion_constraints(table_name, column_name, tmp_column_name)
    end

    if column_name == primary_key
      __finalize_primary_key_type_change(table_name, column_name, column_names)
    end
  end

  # Swap all non-PK columns at once, because otherwise when this helper possibly
  # will have a need to be rerun, it will be impossible to know which columns
  # already were swapped and which were not.
  transaction do
    conversions
      .reject { |column_name, _tmp_column_name| column_name == primary_key }
      .each do |column_name, tmp_column_name|
        swap_column_names(table_name, column_name, tmp_column_name)
      end

    __reset_trigger_function(table_name, column_names)
  end
end

#initialize_column_type_change(table_name, column_name, new_type, **options) ⇒ void

This method returns an undefined value.

Initialize the process of changing column type. Creates a new column from the old one and ensures that data stays in sync.

Examples:

initialize_column_type_change(:files, :size, :bigint)

With additional column options

initialize_column_type_change(:users, :name, :string, limit: 64)

With type casting

initialize_column_type_change(:users, :settings, :jsonb, type_cast_function: "jsonb")
initialize_column_type_change(:users, :company_id, :integer, type_cast_function: Arel.sql("company_id::integer"))

Parameters:

  • table_name (String, Symbol)
  • column_name (String, Symbol)
  • new_type (String, Symbol)
  • options (Hash)

    additional options that apply to a new type, ‘:limit` for example



79
80
81
# File 'lib/online_migrations/change_column_type_helpers.rb', line 79

def initialize_column_type_change(table_name, column_name, new_type, **options)
  initialize_columns_type_change(table_name, [[column_name, new_type]], column_name => options)
end

#initialize_columns_type_change(table_name, columns_and_types, **options) ⇒ Object

Same as ‘initialize_column_type_change` but for multiple columns at once

This is useful to avoid multiple costly disk rewrites of large tables when changing type of each column separately.

Parameters:

  • table_name (String, Symbol)
  • columns_and_types (Array<Array<(Symbol, Symbol)>>)

    columns and new types, represented as nested arrays. Example: ‘[[:id, :bigint], [:name, :string]]`

  • options (Hash)

    keys - column names, values - options for specific columns (additional options that apply to a new type, ‘:limit` for example)

See Also:



96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
# File 'lib/online_migrations/change_column_type_helpers.rb', line 96

def initialize_columns_type_change(table_name, columns_and_types, **options)
  if !columns_and_types.is_a?(Array) || !columns_and_types.all?(Array)
    raise ArgumentError, "columns_and_types must be an array of arrays"
  end

  conversions = columns_and_types.to_h do |(column_name, _new_type)|
    [column_name, __change_type_column(column_name)]
  end

  if (extra_keys = (options.keys - conversions.keys)).any?
    raise ArgumentError, "Options has unknown keys: #{extra_keys.map(&:inspect).join(', ')}. " \
                         "Can contain only column names: #{conversions.keys.map(&:inspect).join(', ')}."
  end

  transaction do
    type_cast_functions = {}.with_indifferent_access

    columns_and_types.each do |(column_name, new_type)|
      old_col = column_for(table_name, column_name)
      old_col_options = __options_from_column(old_col, [:collation, :comment])
      column_options = options[column_name] || {}
      type_cast_function = column_options.delete(:type_cast_function)
      type_cast_functions[column_name] = type_cast_function if type_cast_function
      tmp_column_name = conversions[column_name]

      if database_version >= 11_00_00
        if primary_key(table_name) == column_name.to_s && old_col.type == :integer
          # For PG < 11 and Primary Key conversions, setting a column as the PK
          # converts even check constraints to NOT NULL column constraints
          # and forces an inline re-verification of the whole table.
          # To avoid this, we instead set it to `NOT NULL DEFAULT 0` and we'll
          # copy the correct values when backfilling.
          add_column(table_name, tmp_column_name, new_type,
            **old_col_options, **column_options, default: old_col.default || 0, null: false)
        else
          if !old_col.default.nil?
            old_col_options = old_col_options.merge(default: old_col.default, null: old_col.null)
          end
          add_column(table_name, tmp_column_name, new_type, **old_col_options, **column_options)
        end
      else
        add_column(table_name, tmp_column_name, new_type, **old_col_options, **column_options)
        change_column_default(table_name, tmp_column_name, old_col.default) if !old_col.default.nil?
      end
    end

    __create_copy_triggers(table_name, conversions.keys, conversions.values, type_cast_functions: type_cast_functions)
  end
end

#revert_finalize_column_type_change(table_name, column_name) ⇒ void

This method returns an undefined value.

Reverts operations performed by ‘finalize_column_type_change`

Examples:

revert_finalize_column_type_change(:files, :size)

Parameters:

  • table_name (String, Symbol)
  • column_name (String, Symbol)


309
310
311
# File 'lib/online_migrations/change_column_type_helpers.rb', line 309

def revert_finalize_column_type_change(table_name, column_name)
  revert_finalize_columns_type_change(table_name, column_name)
end

#revert_finalize_columns_type_change(table_name, *column_names) ⇒ Object

Same as ‘revert_finalize_column_type_change` but for multiple columns



316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
# File 'lib/online_migrations/change_column_type_helpers.rb', line 316

def revert_finalize_columns_type_change(table_name, *column_names)
  __ensure_not_in_transaction!

  conversions = column_names.to_h do |column_name|
    [column_name.to_s, __change_type_column(column_name)]
  end

  primary_key = primary_key(table_name)
  primary_key_conversion = conversions.delete(primary_key)

  # No need to remove indexes, foreign keys etc, because it  can take a significant amount
  # of time and will be automatically removed if decided to remove the column itself.
  if conversions.any?
    transaction do
      conversions.each do |column_name, tmp_column_name|
        swap_column_names(table_name, column_name, tmp_column_name)
      end

      __reset_trigger_function(table_name, column_names)
    end
  end

  if primary_key_conversion
    __finalize_primary_key_type_change(table_name, primary_key, column_names)
  end
end

#revert_initialize_column_type_change(table_name, column_name, _new_type = nil, **_options) ⇒ void

This method returns an undefined value.

Reverts operations performed by initialize_column_type_change

Examples:

revert_initialize_column_type_change(:files, :size)

Parameters:

  • table_name (String, Symbol)
  • column_name (String, Symbol)
  • _new_type (String, Symbol) (defaults to: nil)

    Passing this argument will make this change reversible in migration

  • _options (Hash)

    additional options that apply to a new type. Passing this argument will make this change reversible in migration



159
160
161
# File 'lib/online_migrations/change_column_type_helpers.rb', line 159

def revert_initialize_column_type_change(table_name, column_name, _new_type = nil, **_options)
  cleanup_column_type_change(table_name, column_name)
end

#revert_initialize_columns_type_change(table_name, columns_and_types, **_options) ⇒ Object

Same as ‘revert_initialize_column_type_change` but for multiple columns.



166
167
168
169
# File 'lib/online_migrations/change_column_type_helpers.rb', line 166

def revert_initialize_columns_type_change(table_name, columns_and_types, **_options)
  column_names = columns_and_types.map(&:first)
  cleanup_columns_type_change(table_name, *column_names)
end