Class: Gitlab::Database::Batch::Optimizer

Inherits:
Object
Defined in:
lib/gitlab/database/batch/optimizer.rb

Overview

This is an optimizer for the throughput of batched jobs.

The underlying mechanic is based on the concept of time efficiency:

time efficiency = job duration / interval

Ideally, this is close to but lower than 1, so that we’re using time efficiently.

We aim to land in the 90%-95% range (see TARGET_EFFICIENCY), which gives the database a little breathing room in between jobs.
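
As a rough illustration of the ratio defined above, the following sketch computes the time efficiency of a single job. The helper name `time_efficiency_for` and the sample durations are hypothetical; they are not part of the documented class.

```ruby
# Hypothetical helper: ratio of job duration to the scheduling interval.
def time_efficiency_for(duration_seconds, interval_seconds)
  raise ArgumentError, 'interval must be positive' unless interval_seconds.positive?

  duration_seconds.to_f / interval_seconds
end

# A job that runs for 110s of a 120s interval uses roughly 92% of it.
puts time_efficiency_for(110, 120).round(2)
```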

The optimizer calculates the exponential moving average of the time efficiencies of the last N jobs. If we’re outside the range, we increase the batch size by 10% or decrease it by 20%.
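
The averaging and adjustment steps described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the class's implementation: the smoothing factor `alpha`, both method names, and the sample measurements are hypothetical, and the direction of the adjustment (shrink when efficiency is above the range, grow when it is below) is one reading of the overview.

```ruby
# Assumption: alpha is an arbitrary smoothing factor chosen for illustration.
def exponential_moving_average(values, alpha: 0.5)
  return nil if values.empty?

  # Seed with the oldest value, then fold newer values in with weight alpha.
  values.drop(1).reduce(values.first.to_f) do |ema, value|
    alpha * value + (1 - alpha) * ema
  end
end

# Assumption: jobs that run too long (above the range) shrink the batch,
# jobs that finish too quickly (below the range) grow it.
def adjust_batch_size(batch_size, efficiency, target: (0.9..0.95))
  return batch_size if target.cover?(efficiency)

  if efficiency > target.end
    batch_size * 8 / 10   # decrease by 20%
  else
    batch_size * 11 / 10  # increase by 10%
  end
end

efficiencies = [0.4, 0.6, 0.5] # oldest first, hypothetical measurements
average = exponential_moving_average(efficiencies)
puts adjust_batch_size(10_000, average)
```

Integer arithmetic is used for the percentage steps so that the result does not depend on floating-point rounding.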

Constant Summary

TARGET_EFFICIENCY =

Target time efficiency for a job. Time efficiency is defined as: job duration / interval

(0.9..0.95)
MIN_BATCH_SIZE =

Lower and upper bound for the batch size

1_000
MAX_BATCH_SIZE =
2_000_000
MAX_MULTIPLIER =

Limit for the multiplier of the batch size

1.2

Constructor Details

#initialize(current_batch_size:, max_batch_size: nil, time_efficiency: nil) ⇒ Optimizer

Returns a new instance of Optimizer.

# File 'lib/gitlab/database/batch/optimizer.rb', line 31

def initialize(current_batch_size:, max_batch_size: nil, time_efficiency: nil)
  @current_batch_size = current_batch_size
  @max_batch_size = max_batch_size
  @time_efficiency = time_efficiency
end

Instance Attribute Details

#current_batch_size ⇒ Object (readonly)

Returns the value of attribute current_batch_size.

# File 'lib/gitlab/database/batch/optimizer.rb', line 29

def current_batch_size
  @current_batch_size
end

#max_batch_size ⇒ Object (readonly)

Returns the value of attribute max_batch_size.

# File 'lib/gitlab/database/batch/optimizer.rb', line 29

def max_batch_size
  @max_batch_size
end

#time_efficiency ⇒ Object (readonly)

Returns the value of attribute time_efficiency.

# File 'lib/gitlab/database/batch/optimizer.rb', line 29

def time_efficiency
  @time_efficiency
end

Instance Method Details

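#optimized_batch_size ⇒ Object

Returns the current batch size unchanged when the time efficiency is invalid; otherwise scales it by a calculated multiplier and applies the batch size limits.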

# File 'lib/gitlab/database/batch/optimizer.rb', line 37

def optimized_batch_size
  return current_batch_size if invalid_time_efficiency?

  multiplier = calculate_multiplier
  new_size = (current_batch_size * multiplier).to_i

  apply_limits(new_size)
end
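
The listing above calls three private helpers that this page does not show. The following standalone sketch fills them in so the flow can be run end to end. Everything under `private` is an assumption made for illustration, and `SketchOptimizer` is a hypothetical stand-in for the documented class, not its actual source.

```ruby
# Standalone sketch; the three private helpers are assumptions.
class SketchOptimizer
  TARGET_EFFICIENCY = (0.9..0.95)
  MIN_BATCH_SIZE = 1_000
  MAX_BATCH_SIZE = 2_000_000
  MAX_MULTIPLIER = 1.2

  attr_reader :current_batch_size, :max_batch_size, :time_efficiency

  def initialize(current_batch_size:, max_batch_size: nil, time_efficiency: nil)
    @current_batch_size = current_batch_size
    @max_batch_size = max_batch_size
    @time_efficiency = time_efficiency
  end

  # Same body as the documented method.
  def optimized_batch_size
    return current_batch_size if invalid_time_efficiency?

    multiplier = calculate_multiplier
    new_size = (current_batch_size * multiplier).to_i

    apply_limits(new_size)
  end

  private

  # Assumption: a missing or non-positive efficiency cannot be used.
  def invalid_time_efficiency?
    time_efficiency.nil? || time_efficiency <= 0
  end

  # Assumption: scale toward the top of the target range,
  # never growing by more than MAX_MULTIPLIER in one step.
  def calculate_multiplier
    [TARGET_EFFICIENCY.max / time_efficiency, MAX_MULTIPLIER].min
  end

  # Assumption: keep the size between MIN_BATCH_SIZE and the smaller of
  # the caller-supplied maximum and MAX_BATCH_SIZE.
  def apply_limits(size)
    upper = [max_batch_size, MAX_BATCH_SIZE].compact.min
    upper = [upper, MIN_BATCH_SIZE].max # guard against an inverted range
    size.clamp(MIN_BATCH_SIZE, upper)
  end
end

optimizer = SketchOptimizer.new(current_batch_size: 10_000, time_efficiency: 0.5)
puts optimizer.optimized_batch_size
```

With these assumptions, a job that uses only half of its interval grows the batch by the capped factor, while a missing efficiency leaves the batch size untouched.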

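#should_optimize? ⇒ Boolean

Returns false when the time efficiency is invalid; otherwise returns true when the time efficiency lies outside TARGET_EFFICIENCY.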

Returns:

  • (Boolean)

# File 'lib/gitlab/database/batch/optimizer.rb', line 46

def should_optimize?
  return false if invalid_time_efficiency?

  TARGET_EFFICIENCY.exclude?(time_efficiency)
end
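
Note that `exclude?` is not a core Ruby method on Range; it comes from Active Support's Enumerable extensions, where it is the negation of `include?`. A plain-Ruby sketch of the same range check, using the hypothetical name `outside_target?` and a default range that mirrors TARGET_EFFICIENCY, might look like this:

```ruby
# Plain-Ruby stand-in for TARGET_EFFICIENCY.exclude?(value).
# The default range mirrors the documented TARGET_EFFICIENCY constant.
def outside_target?(time_efficiency, target: (0.9..0.95))
  !target.include?(time_efficiency)
end

# Hypothetical sample values: below, inside, and above the target range.
[0.5, 0.92, 1.1].each do |value|
  puts "#{value}: #{outside_target?(value)}"
end
```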