Class: Gitlab::Database::BackgroundMigration::BatchOptimizer

Inherits:
Object
  • Object
Defined in:
lib/gitlab/database/background_migration/batch_optimizer.rb

Overview

This is an optimizer for the throughput of batched migration jobs.

The underlying mechanic is based on the concept of time efficiency:

time efficiency = job duration / interval

Ideally, this is close to but lower than 1, so we’re using time efficiently.

We aim to land in the 90%-95% range (see TARGET_EFFICIENCY), which gives the database a little breathing room in between.

The optimizer is based on calculating the exponential moving average of time efficiencies for the last N jobs. If we’re outside the range, we increase the batch size by 10% or decrease it by 20%.
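The mechanic described above can be sketched in plain Ruby. This is a minimal illustration, not the class's implementation: the helper names (`time_efficiency`, `smoothed_efficiency`, `adjust_batch_size`), the oldest-first ordering of samples, and seeding the average with the first sample are all assumptions made for the example. The target band mirrors `TARGET_EFFICIENCY`, and the 10% and 20% steps follow the description in this overview.

```ruby
# Hypothetical helpers for illustration only; not part of BatchOptimizer.

TARGET = (0.9..0.95) # mirrors TARGET_EFFICIENCY

# Time efficiency of one job: the share of the interval the job used.
def time_efficiency(duration_seconds, interval_seconds)
  duration_seconds.to_f / interval_seconds
end

# Exponential moving average over efficiencies ordered oldest to newest.
# Seeding with the first sample is an assumption of this sketch.
def smoothed_efficiency(efficiencies, alpha)
  return nil if efficiencies.empty?

  efficiencies.drop(1).reduce(efficiencies.first) do |ema, value|
    alpha * value + (1 - alpha) * ema
  end
end

# Grow by 10% when jobs finish too early, shrink by 20% when they run long.
def adjust_batch_size(batch_size, efficiency)
  return batch_size if efficiency.nil? || TARGET.cover?(efficiency)

  factor = efficiency < TARGET.begin ? 1.1 : 0.8
  (batch_size * factor).to_i
end

# Three jobs against a 120-second interval, oldest first.
samples = [60, 72, 90].map { |duration| time_efficiency(duration, 120) }
ema = smoothed_efficiency(samples, 0.4)
puts adjust_batch_size(10_000, ema)
```

In this example the jobs use well under 90% of the interval, so the sketch grows the batch size; a smoothed efficiency inside the band would leave it unchanged.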

Constant Summary

TARGET_EFFICIENCY =

Target time efficiency for a job. Time efficiency is defined as: job duration / interval.

(0.9..0.95)
MIN_BATCH_SIZE =

Lower and upper bound for the batch size

1_000
MAX_BATCH_SIZE =
2_000_000
MAX_MULTIPLIER =

Limit for the multiplier of the batch size

1.2
NUMBER_OF_JOBS =

When smoothing time efficiency, use this many jobs

20
EMA_ALPHA =

Smoothing factor for exponential moving average

0.4
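
To get a feel for how `EMA_ALPHA` and `NUMBER_OF_JOBS` interact, the sketch below computes the weight each job would carry in a textbook exponential moving average, where the k-th most recent sample is weighted `alpha * (1 - alpha)**k`. Whether the optimizer uses exactly this weighting is an assumption, and `ema_weights` is a hypothetical helper, not part of the class.

```ruby
# Hypothetical helper: weights of a textbook exponential moving average,
# newest sample first. Not taken from the GitLab source.
def ema_weights(alpha, count)
  Array.new(count) { |k| alpha * (1 - alpha)**k }
end

# Values below mirror EMA_ALPHA and NUMBER_OF_JOBS.
weights = ema_weights(0.4, 20)

# Share of the total weight carried by the five most recent jobs.
recent_share = weights.first(5).sum / weights.sum
puts format('%.3f', recent_share)
```

Under this weighting, the five most recent jobs account for roughly 92% of the total weight, so the older jobs in a 20-job window contribute little to the smoothed value.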

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(migration, number_of_jobs: NUMBER_OF_JOBS, ema_alpha: EMA_ALPHA) ⇒ BatchOptimizer

Returns a new instance of BatchOptimizer.



# File 'lib/gitlab/database/background_migration/batch_optimizer.rb', line 37

def initialize(migration, number_of_jobs: NUMBER_OF_JOBS, ema_alpha: EMA_ALPHA)
  @migration = migration
  @number_of_jobs = number_of_jobs
  @ema_alpha = ema_alpha
end

Instance Attribute Details

#ema_alpha ⇒ Object (readonly)

Returns the value of attribute ema_alpha.



# File 'lib/gitlab/database/background_migration/batch_optimizer.rb', line 35

def ema_alpha
  @ema_alpha
end

#migration ⇒ Object (readonly)

Returns the value of attribute migration.



# File 'lib/gitlab/database/background_migration/batch_optimizer.rb', line 35

def migration
  @migration
end

#number_of_jobs ⇒ Object (readonly)

Returns the value of attribute number_of_jobs.



# File 'lib/gitlab/database/background_migration/batch_optimizer.rb', line 35

def number_of_jobs
  @number_of_jobs
end

Instance Method Details

#optimize! ⇒ Object



# File 'lib/gitlab/database/background_migration/batch_optimizer.rb', line 43

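def optimize!
  # Do nothing unless the ops feature flag is enabled.
  return unless Feature.enabled?(:optimize_batched_migrations, type: :ops)

  multiplier = batch_size_multiplier
  return if multiplier.nil?

  # Fall back to the global cap when the migration sets no maximum of its own.
  max_batch = migration.max_batch_size || MAX_BATCH_SIZE
  # Keep the lower bound at or below the upper bound so clamp never raises.
  min_batch = [max_batch, MIN_BATCH_SIZE].min

  migration.batch_size = (migration.batch_size * multiplier).to_i.clamp(min_batch, max_batch)
  migration.save!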
end
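
The clamping arithmetic in `#optimize!` can be tried in isolation. The sketch below is a standalone approximation under stated assumptions: `FakeMigration` is a hypothetical stand-in for the migration record, `next_batch_size` is an illustrative helper, and the feature flag check and `save!` call are left out.

```ruby
# Hypothetical stand-in for the migration record; only the two fields
# that the clamping logic reads are modelled.
FakeMigration = Struct.new(:batch_size, :max_batch_size)

MIN_BATCH_SIZE = 1_000
MAX_BATCH_SIZE = 2_000_000

# Mirrors the arithmetic in #optimize! without the feature flag or save!.
def next_batch_size(migration, multiplier)
  max_batch = migration.max_batch_size || MAX_BATCH_SIZE
  # Keep the lower bound at or below the upper bound so clamp never raises.
  min_batch = [max_batch, MIN_BATCH_SIZE].min

  (migration.batch_size * multiplier).to_i.clamp(min_batch, max_batch)
end

puts next_batch_size(FakeMigration.new(1_900_000, nil), 1.2) # capped at MAX_BATCH_SIZE
puts next_batch_size(FakeMigration.new(1_100, nil), 0.8)     # raised to MIN_BATCH_SIZE
puts next_batch_size(FakeMigration.new(400, 500), 1.2)       # per-migration cap below MIN_BATCH_SIZE
```

The third call shows why `min_batch` is derived from `max_batch`: when a migration sets a maximum below `MIN_BATCH_SIZE`, both bounds collapse to that maximum instead of passing an inverted range to `clamp`.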