Class: Lhm::SqlRetry

Inherits:
Object
Defined in:
lib/lhm/sql_retry.rb

Overview

SqlRetry standardizes the interface for retry behavior in components such as Entangler, AtomicSwitcher, and ChunkerInsert.

By default, if an error message includes “Lock wait timeout exceeded” or “Deadlock found when trying to get lock”, SqlRetry retries the block one second after the MySQL client returns control to the caller. It retries up to 10 times in total and logs a description of each retry, including the error information, retry count, and elapsed time.

This behavior can be modified by passing `options`, which are documented at github.com/kamui/retriable. Additionally, a “log_prefix” option, which is unique to SqlRetry, can be used to prefix log output.
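The default policy described above can be illustrated with a minimal, self-contained sketch. This is not LHM's or Retriable's actual code: `retry_on_lock_errors`, `RETRYABLE_MESSAGES`, and the `sleeper` and `logger` parameters are hypothetical names introduced here, and the sketch assumes the limit of 10 counts total attempts.

```ruby
# Hypothetical stand-in for the default retry policy; not LHM's implementation.
RETRYABLE_MESSAGES = [
  /Lock wait timeout exceeded/,
  /Deadlock found when trying to get lock/,
].freeze

# Runs the block, retrying only when the error message matches a known
# lock-related pattern. `sleeper` is injectable so the wait can be skipped
# in tests; `logger` is any callable that accepts a String.
def retry_on_lock_errors(tries: 10, interval: 1, sleeper: ->(s) { sleep(s) }, logger: nil)
  attempt = 0
  started = Time.now
  begin
    attempt += 1
    yield attempt
  rescue StandardError => e
    retryable = RETRYABLE_MESSAGES.any? { |pattern| pattern.match?(e.message) }
    # Re-raise anything that is not retryable, or once the attempts run out.
    raise unless retryable && attempt < tries

    elapsed = (Time.now - started).round(2)
    logger&.call("#{e.class}: '#{e.message}' - #{attempt} tries in #{elapsed} seconds")
    sleeper.call(interval)
    retry
  end
end

# Usage: the block fails twice with a retryable message, then succeeds.
calls = 0
retry_on_lock_errors(sleeper: ->(_s) {}, logger: ->(line) { puts line }) do
  calls += 1
  raise "Lock wait timeout exceeded; try restarting transaction" if calls < 3
  :done
end
```

Errors whose message matches neither pattern are re-raised on the first attempt, so unrelated failures are not retried.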

Constant Summary

RECONNECT_SUCCESSFUL_MESSAGE =
"LHM successfully reconnected to initial host:"
CLOUDSQL_VERSION_COMMENT =
"(Google)"
RECONNECT_RETRY_MAX_ITERATION =

Will retry for 120 seconds (approximately, since connecting takes time).

120
RECONNECT_RETRY_INTERVAL =
1
RECONNECTION_MAXIMUM =

Will abort the LHM if it had to reconnect more than 25 times in a single run (indicator that there might be something wrong with the network and would be better to run the LHM at a later time).

25
MYSQL_VAR_NAMES =
{
  hostname: "@@global.hostname",
  server_id: "@@global.server_id",
  version_comment: "@@version_comment",
}
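To show how these constants might fit together, here is a rough sketch of a reconnect loop bounded by RECONNECT_RETRY_MAX_ITERATION and paced by RECONNECT_RETRY_INTERVAL, plus a helper that turns a MYSQL_VAR_NAMES entry into a query. The `ReconnectSketch` module, `select_for`, `reconnect`, and the `connect` and `sleeper` callables are all hypothetical; the source does not show how LHM uses these constants internally.

```ruby
# Illustrative only: constant values are copied from the summary above,
# everything else is an assumption made for this sketch.
module ReconnectSketch
  RECONNECT_RETRY_MAX_ITERATION = 120
  RECONNECT_RETRY_INTERVAL = 1
  MYSQL_VAR_NAMES = {
    hostname: "@@global.hostname",
    server_id: "@@global.server_id",
    version_comment: "@@version_comment",
  }.freeze

  # Builds a query that reads one server variable, e.g. to compare the
  # host before and after a reconnect.
  def self.select_for(name)
    "SELECT #{MYSQL_VAR_NAMES.fetch(name)}"
  end

  # Calls `connect` until it returns true, waiting RECONNECT_RETRY_INTERVAL
  # seconds after each failure. Returns the number of attempts used, or nil
  # if all RECONNECT_RETRY_MAX_ITERATION attempts failed.
  def self.reconnect(connect:, sleeper: ->(seconds) { sleep(seconds) })
    RECONNECT_RETRY_MAX_ITERATION.times do |i|
      return i + 1 if connect.call

      sleeper.call(RECONNECT_RETRY_INTERVAL)
    end
    nil
  end
end
```

With 120 iterations and a one-second interval, the loop waits about 120 seconds in total before giving up, which matches the "approximately 120 seconds" note on RECONNECT_RETRY_MAX_ITERATION; time spent inside each connection attempt comes on top of that.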

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(connection, retry_options: {}, reconnect_with_consistent_host: false) ⇒ SqlRetry

Returns a new instance of SqlRetry.



# File 'lib/lhm/sql_retry.rb', line 34

def initialize(connection, retry_options: {}, reconnect_with_consistent_host: false)
  @connection = connection
  self.retry_config = retry_options
  self.reconnect_with_consistent_host = reconnect_with_consistent_host
end

Instance Attribute Details

#connection ⇒ Object

Returns the value of attribute connection.



# File 'lib/lhm/sql_retry.rb', line 78

def connection
  @connection
end

#reconnect_with_consistent_host ⇒ Object

Returns the value of attribute reconnect_with_consistent_host. This attribute and retry_config both have explicitly defined setters.



# File 'lib/lhm/sql_retry.rb', line 77

def reconnect_with_consistent_host
  @reconnect_with_consistent_host
end

#retry_config ⇒ Object

Returns the value of attribute retry_config. This attribute and reconnect_with_consistent_host both have explicitly defined setters.



# File 'lib/lhm/sql_retry.rb', line 77

def retry_config
  @retry_config
end

Instance Method Details

#with_retries(log_prefix: nil) ⇒ Object

Complete explanation of algorithm: github.com/Shopify/lhm/pull/112



# File 'lib/lhm/sql_retry.rb', line 41

def with_retries(log_prefix: nil)
  @log_prefix = log_prefix || "" # No prefix. Just logs

  # Number of times LHM had to reconnect. Aborts if more than RECONNECTION_MAXIMUM
  reconnection_counter = 0

  Retriable.retriable(@retry_config) do
    # Using begin -> rescue -> end for Ruby 2.4 compatibility
    begin
      if @reconnect_with_consistent_host
        raise Lhm::Error.new("MySQL host has changed since the start of the LHM. Aborting to avoid data-loss") unless same_host_as_initial?
      end

      yield(@connection)
    rescue StandardError => e
      # Not all errors should trigger a reconnect. Some errors should be raised and abort the LHM (such as reconnecting to the wrong host).
      # The error is re-raised if the connection is still active (i.e. no need to reconnect) or if the connection is
      # dead (i.e. not active) and @reconnect_with_consistent_host is false (i.e. instructed not to reconnect)
      raise e if @connection.active? || (!@connection.active? && !@reconnect_with_consistent_host)

      # Lhm could be stuck in a weird state where it loses the connection, reconnects, and loses the connection again
      # immediately afterwards, creating an infinite loop (because of the usage of `retry`). Hence, abort after 25 reconnections
      raise Lhm::Error.new("LHM reached host reconnection max of #{RECONNECTION_MAXIMUM} times. " \
        "Please try again later.") if reconnection_counter > RECONNECTION_MAXIMUM

      reconnection_counter += 1
      if reconnect_with_host_check!
        retry
      else
        raise Lhm::Error.new("LHM tried the reconnection procedure but failed. Aborting")
      end
    end
  end
end
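The branching inside the rescue clause can be easier to follow as a small decision function. The sketch below mirrors the three outcomes in the listing (re-raise, abort, reconnect) without touching a real connection. `rescue_action` and `reconnections_before_abort` are hypothetical helpers written for illustration and are not part of Lhm::SqlRetry.

```ruby
# Returns what the rescue clause in the listing would do for a given state.
#   active:            result of @connection.active?
#   reconnect_enabled: value of @reconnect_with_consistent_host
#   reconnections:     current value of reconnection_counter
#   maximum:           stands in for RECONNECTION_MAXIMUM (25)
def rescue_action(active:, reconnect_enabled:, reconnections:, maximum: 25)
  # A live connection, or reconnects being disabled, means the original
  # error is re-raised (and Retriable decides whether to retry it).
  return :reraise if active || !reconnect_enabled
  # The counter is compared with a strict `>` before it is incremented.
  return :abort if reconnections > maximum

  :reconnect
end

# Counts how many reconnect attempts are allowed before :abort is returned,
# assuming the connection is dead every time and reconnects are enabled.
def reconnections_before_abort(maximum: 25)
  counter = 0
  while rescue_action(active: false, reconnect_enabled: true,
                      reconnections: counter, maximum: maximum) == :reconnect
    counter += 1
  end
  counter
end
```

Because the comparison is a strict `>` that runs before the increment, this sketch, like the listing it mirrors, permits 26 reconnection attempts within a single `with_retries` call before aborting.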