Module: NewRelic::Agent::AgentHelpers::StartWorkerThread

Included in:
NewRelic::Agent::Agent
Defined in:
lib/new_relic/agent/agent_helpers/start_worker_thread.rb

Constant Summary

LOG_ONCE_KEYS_RESET_PERIOD = 60.0
TRANSACTION_EVENT_DATA = 'transaction_event_data'.freeze
CUSTOM_EVENT_DATA = 'custom_event_data'.freeze
ERROR_EVENT_DATA = 'error_event_data'.freeze
SPAN_EVENT_DATA = 'span_event_data'.freeze
LOG_EVENT_DATA = 'log_event_data'.freeze

Instance Method Details

#catch_errors ⇒ Object

A wrapper method that handles all errors that can occur in the connection and worker thread system. It guarantees that no exception escapes from the background thread.

# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 109

def catch_errors
  yield
rescue NewRelic::Agent::ForceRestartException => e
  handle_force_restart(e)
  retry
rescue NewRelic::Agent::ForceDisconnectException => e
  handle_force_disconnect(e)
rescue => e
  handle_other_error(e)
end
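
The rescue-and-retry shape of `catch_errors` can be illustrated in isolation. The following is a minimal sketch, not the agent's code: `ForceRestart`, `ForceDisconnect`, and `WorkerSketch` are hypothetical stand-ins, and the handlers only append to a log. It shows that `retry` inside a method-level `rescue` re-runs the method body, so the block is yielded to again.

```ruby
# Hypothetical stand-ins for the agent's exception classes.
class ForceRestart < StandardError; end
class ForceDisconnect < StandardError; end

class WorkerSketch
  attr_reader :log

  def initialize
    @log = []
  end

  # Same shape as catch_errors: a restart retries the block, while the
  # other two branches record the error and return without raising.
  def guarded
    yield
  rescue ForceRestart => e
    @log << "restart: #{e.message}"
    retry
  rescue ForceDisconnect => e
    @log << "disconnect: #{e.message}"
  rescue => e
    @log << "other: #{e.message}"
  end
end

worker = WorkerSketch.new
attempts = 0
worker.guarded do
  attempts += 1
  # The first pass asks for a restart; the retried pass asks for a disconnect.
  raise ForceRestart, 'server asked for restart' if attempts == 1
  raise ForceDisconnect, 'server asked for disconnect'
end
```

After this runs, the block has executed twice and `worker.log` holds one restart entry followed by one disconnect entry. Nothing propagates to the caller.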

#create_and_run_event_loop ⇒ Object

# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 62

def create_and_run_event_loop
  @event_loop = create_event_loop
  data_harvest = :"#{Agent.config[:data_report_period]}_second_harvest"
  @event_loop.on(data_harvest) do
    transmit_data
  end
  establish_interval_transmissions
  establish_fire_everies(data_harvest)

  @event_loop.run
end
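
To make the handler registration above easier to follow, here is a toy sketch of the pattern. `ToyEventLoop` is a hypothetical stand-in that only maps event names to handlers; it is not the agent's `EventLoop`, and the report period of 60 is an assumed value for `:data_report_period`.

```ruby
# Toy stand-in, NOT the agent's EventLoop: it only maps event names to handlers.
class ToyEventLoop
  def initialize
    @handlers = Hash.new { |hash, key| hash[key] = [] }
  end

  # Register a handler for a named event.
  def on(event, &handler)
    @handlers[event] << handler
  end

  # Invoke every handler registered for the event, if any.
  def fire(event)
    @handlers[event].each(&:call)
  end
end

report_period = 60 # assumed value for :data_report_period
data_harvest = :"#{report_period}_second_harvest"

sent = []
event_loop = ToyEventLoop.new
event_loop.on(data_harvest) { sent << :transmit_data }

event_loop.fire(data_harvest)        # runs the registered handler
event_loop.fire(:"5_second_harvest") # no handler registered, nothing happens
```

Only the first `fire` call reaches the handler, so `sent` ends up with a single `:transmit_data` entry.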

#create_event_loop ⇒ Object

# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 33

def create_event_loop
  EventLoop.new
end

#deferred_work!(connection_options) ⇒ Object

This method runs in a new thread so that harvesting and sending data happen in the background during normal operation of the agent.

Takes connection options that determine how to connect to the server, then loops endlessly. Typically this method never returns unless the agent is shutting down.

# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 128

def deferred_work!(connection_options)
  catch_errors do
    NewRelic::Agent.disable_all_tracing do
      connect(connection_options)
      if NewRelic::Agent.instance.connected?
        create_and_run_event_loop
        # never reaches here unless there is a problem or
        # the agent is exiting
      else
        ::NewRelic::Agent.logger.debug('No connection.  Worker thread ending.')
      end
    end
  end
end

#handle_force_disconnect(error) ⇒ Object

When a disconnect is requested, stop the current thread, which is the worker thread that gathers data and talks to the server.

# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 88

def handle_force_disconnect(error)
  NewRelic::Agent.agent.health_check.update_status(NewRelic::Agent::HealthCheck::FORCED_DISCONNECT)
  ::NewRelic::Agent.logger.warn('Agent received a ForceDisconnectException from the server, disconnecting. ' \
    "(#{error.message})")
  disconnect
end

#handle_force_restart(error) ⇒ Object

Handles the case where the server tells us to restart: this drops buffered data, resets the connection state to pending, and sleeps for 30 seconds before reconnecting.

# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 77

def handle_force_restart(error)
  ::NewRelic::Agent.logger.debug(error.message)
  drop_buffered_data
  @service&.force_restart
  @connect_state = :pending
  sleep(30)
end

#handle_other_error(error) ⇒ Object

Handles an unknown error in the worker thread by logging it and disconnecting the agent, since we are now in an unknown state.

# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 98

def handle_other_error(error)
  ::NewRelic::Agent.logger.error('Unhandled error in worker thread, disconnecting.')
  # These errors are fatal (that is, they will prevent the agent from
  # reporting entirely), so we really want backtraces when they happen
  ::NewRelic::Agent.logger.log_exception(:error, error)
  disconnect
end

#interval_for(event_type) ⇒ Object

Certain event types may need to be reported on the same interval as metrics, so this method checks the config assigned in EventHarvestConfig to determine the interval on which to report them.

# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 57

def interval_for(event_type)
  interval = Agent.config[:"event_report_period.#{event_type}"]
  :"#{interval}_second_harvest"
end
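
As a rough illustration of the lookup above, the sketch below replaces `Agent.config` with a plain hash. The `CONFIG` constant, the extra `config` parameter, and the period of 5 are assumptions made for the example, not values taken from the agent.

```ruby
# Hypothetical config hash standing in for Agent.config.
CONFIG = {
  :'event_report_period.span_event_data' => 5
}.freeze

# Same logic as interval_for, with the config passed in for illustration.
def interval_for(event_type, config = CONFIG)
  interval = config[:"event_report_period.#{event_type}"]
  :"#{interval}_second_harvest"
end

span_interval = interval_for('span_event_data') # => :"5_second_harvest"

# A missing key interpolates nil as an empty string.
missing_interval = interval_for('log_event_data') # => :"_second_harvest"
```

The second call shows why the period needs to be present in the config: without it the resulting event name carries no interval at all.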

#start_worker_thread(connection_options = {}) ⇒ Object

Try to launch the worker thread and connect to the server.

See #connect for a description of connection_options.

# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 20

def start_worker_thread(connection_options = {})
  if disable = NewRelic::Agent.config[:disable_harvest_thread]
    NewRelic::Agent.logger.info('Not starting Ruby agent worker thread because :disable_harvest_thread is ' \
      "#{disable}")
    return
  end

  ::NewRelic::Agent.logger.debug('Creating Ruby agent worker thread.')
  @worker_thread = Threading::AgentThread.create('Worker Loop') do
    deferred_work!(connection_options)
  end
end

#stop_event_loop ⇒ Object

If the @worker_thread encounters an error while attempting to connect to the collector, the connect attempts enter an exponential backoff retry loop. To avoid potential race conditions between shutting down and attempting to reconnect, we join the @worker_thread with a 3-second timeout. This gives the thread a chance to flush pending data to the server without waiting indefinitely for a reconnect to succeed. This typically arises in cronjob-scheduled rake tasks that run while there are also network stability or latency issues.

# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 44

def stop_event_loop
  @event_loop&.stop
  # Wait for the event loop thread to end.
  if @worker_thread
    unless @worker_thread.join(3)
      ::NewRelic::Agent.logger.debug('Event loop thread did not stop within 3 seconds')
    end
  end
end
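
The `unless @worker_thread.join(3)` check relies on how `Thread#join` behaves with a time limit: it returns `nil` if the limit expires and the thread itself otherwise. A minimal sketch with shortened, made-up durations:

```ruby
slow = Thread.new { sleep(0.5) } # still running when the limit expires
fast = Thread.new { :done }      # finishes almost immediately

slow_result = slow.join(0.05) # nil, because the limit expired first
fast_result = fast.join(1)    # the thread object, because it finished in time

slow.kill # clean up the thread that is still sleeping
```

Because `nil` is falsy, the `unless` branch in `stop_event_loop` runs only when the worker thread fails to finish within the limit.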