Module: NewRelic::Agent::Agent::InstanceMethods::StartWorkerThread

Included in:
NewRelic::Agent::Agent::InstanceMethods
Defined in:
lib/new_relic/agent/agent.rb

Overview

Everything in this module used to live inside the start_worker_thread method. The module is an artifact of refactoring, and its methods can be moved or renamed at will.

Instance Method Details

#catch_errors ⇒ Object

A wrapper method that handles all the errors that can occur in the connection and worker thread system. This guarantees that no exception is raised out of the background thread.



# File 'lib/new_relic/agent/agent.rb', line 626

def catch_errors
  yield
rescue NewRelic::Agent::ForceRestartException => e
  handle_force_restart(e)
  retry
rescue NewRelic::Agent::ForceDisconnectException => e
  handle_force_disconnect(e)
rescue NewRelic::Agent::ServerConnectionException => e
  handle_server_connection_problem(e)
rescue => e
  handle_other_error(e)
end
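
The control flow above depends on `retry` inside a method-level `rescue`, which re-runs the method body and therefore yields to the block again. The following is a minimal sketch of that shape only. The exception classes, the `guarded` method, and the `events` array are hypothetical stand-ins for illustration, not part of the agent.

```ruby
# Hypothetical stand-ins for the agent's exception classes.
class SketchForceRestart < StandardError; end
class SketchForceDisconnect < StandardError; end

# Same rescue-and-retry shape as catch_errors. `events` records which
# handler ran so the flow can be inspected afterwards.
def guarded(events)
  yield
rescue SketchForceRestart => e
  events << [:restart, e.message]
  retry # re-runs the method body, so the block is yielded to again
rescue SketchForceDisconnect => e
  events << [:disconnect, e.message]
rescue => e
  events << [:other, e.message]
end

events = []
attempts = 0

# First attempt asks for a restart, second attempt asks for a disconnect.
guarded(events) do
  attempts += 1
  raise SketchForceRestart, "restart requested" if attempts == 1
  raise SketchForceDisconnect, "disconnect requested" if attempts == 2
end

# Any other StandardError is swallowed instead of escaping the caller.
guarded(events) { raise ArgumentError, "boom" }
```

In this sketch the block runs twice in the first call: once before the restart and once after the `retry`. No exception reaches the caller in either call.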

#create_and_run_worker_loop ⇒ Object

Creates the worker loop and gives it the block to run (transmit_data) once per reporting period. The period, in seconds, is read from Agent.config[:data_report_period].



# File 'lib/new_relic/agent/agent.rb', line 579

def create_and_run_worker_loop
  @worker_loop = WorkerLoop.new
  @worker_loop.run(Agent.config[:data_report_period]) do
    transmit_data
  end
end
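
To illustrate the idea of a loop that runs a block once per period, here is a simplified sketch. `SketchWorkerLoop` and its `max_iterations` option are hypothetical. The agent's real WorkerLoop is not shown on this page and may work differently.

```ruby
# Hypothetical, simplified stand-in for the agent's WorkerLoop.
class SketchWorkerLoop
  def initialize(max_iterations: nil)
    @max_iterations = max_iterations # nil means "run until stopped"
    @should_run = true
  end

  def stop
    @should_run = false
  end

  # Sleeps for `period` seconds, then calls the block, until stopped.
  def run(period)
    iterations = 0
    while @should_run
      sleep period
      yield
      iterations += 1
      stop if @max_iterations && iterations >= @max_iterations
    end
  end
end

reports = []
worker_loop = SketchWorkerLoop.new(max_iterations: 3)

# A short period keeps the example fast; the block stands in for transmit_data.
worker_loop.run(0.01) { reports << :transmit_data }
```

The iteration cap exists only so the example terminates. Without it, `run` blocks the calling thread until `stop` is called from elsewhere.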

#deferred_work!(connection_options) ⇒ Object

This is the method that is run in a new thread in order to background the harvesting and sending of data during the normal operation of the agent.

Takes connection options that determine how we connect to the server, then loops endlessly. Typically this method returns only when there is a problem or the agent is shutting down.



# File 'lib/new_relic/agent/agent.rb', line 647

def deferred_work!(connection_options)
  catch_errors do
    NewRelic::Agent.disable_all_tracing do
      # We try to connect.  If this returns false that means
      # the server rejected us for a licensing reason and we should
      # just exit the thread.  If it returns nil
      # that means it didn't try to connect because we're in the master.
      connect(connection_options)
      if connected?
        log_worker_loop_start
        create_and_run_worker_loop
        # never reaches here unless there is a problem or
        # the agent is exiting
      else
        ::NewRelic::Agent.logger.debug "No connection.  Worker thread ending."
      end
    end
  end
end
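
The branch on the connection result can be sketched in isolation. In the example below, `sketch_deferred_work` is a hypothetical stand-in that takes the result of a connection attempt directly; it does not call the agent's connect or connected? methods. Each call runs on its own thread to mirror how the real method is run in the background.

```ruby
# Hypothetical stand-in: decides what the worker thread does based on
# the outcome of a connection attempt (true, false, or nil).
def sketch_deferred_work(connect_result)
  if connect_result
    # The real method would block here inside the worker loop.
    :worker_loop_started
  else
    # false (rejected) and nil (did not try) both end the thread.
    :no_connection
  end
end

# Thread#value joins the thread and returns the block's result.
outcomes = [true, false, nil].map do |result|
  Thread.new { sketch_deferred_work(result) }.value
end
```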

#handle_force_disconnect(error) ⇒ Object

When the server requests a disconnect, logs a warning and disconnects, stopping the current thread, which is the worker thread that gathers data and talks to the server.



# File 'lib/new_relic/agent/agent.rb', line 600

def handle_force_disconnect(error)
  ::NewRelic::Agent.logger.warn "New Relic forced this agent to disconnect (#{error.message})"
  disconnect
end

#handle_force_restart(error) ⇒ Object

Handles the case where the server tells us to restart: resets the collected stats, clears the metric ID cache, marks the connection state as pending, and sleeps 30 seconds before reconnecting.



# File 'lib/new_relic/agent/agent.rb', line 589

def handle_force_restart(error)
  ::NewRelic::Agent.logger.debug error.message
  reset_stats
  @service.reset_metric_id_cache if @service
  @connect_state = :pending
  sleep 30
end

#handle_other_error(error) ⇒ Object

Handles an unknown error in the worker thread by logging it and disconnecting the agent, since we are now in an unknown state.



# File 'lib/new_relic/agent/agent.rb', line 615

def handle_other_error(error)
  ::NewRelic::Agent.logger.error "Unhandled error in worker thread, disconnecting this agent process:"
  # These errors are fatal (that is, they will prevent the agent from
  # reporting entirely), so we really want backtraces when they happen
  ::NewRelic::Agent.logger.log_exception(:error, error)
  disconnect
end

#handle_server_connection_problem(error) ⇒ Object

Handles a problem connecting to the server: logs the error, stops trying to connect, and shuts down the agent.



# File 'lib/new_relic/agent/agent.rb', line 607

def handle_server_connection_problem(error)
  ::NewRelic::Agent.logger.error "Unable to establish connection with the server.", error
  disconnect
end

#log_worker_loop_start ⇒ Object

Logs info about the worker loop at debug level so users can see when the agent actually begins running in the background.



# File 'lib/new_relic/agent/agent.rb', line 554

def log_worker_loop_start
  ::NewRelic::Agent.logger.debug "Reporting performance data every #{Agent.config[:data_report_period]} seconds."
  ::NewRelic::Agent.logger.debug "Running worker loop"
end

#reset_harvest_locks ⇒ Object

Some forking cases (like Resque) end up with the harvest lock from the parent process orphaned in the child. Release it before we proceed.



# File 'lib/new_relic/agent/agent.rb', line 571

def reset_harvest_locks
  return if harvest_lock.nil?

  harvest_lock.unlock if harvest_lock.locked?
end
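
The guard-then-unlock pattern can be tried with a plain Mutex. This sketch assumes the harvest lock behaves like Ruby's Mutex; `release_if_locked` is a hypothetical helper that takes the lock as an argument instead of reading it from the agent.

```ruby
# Hypothetical helper mirroring reset_harvest_locks, with the lock passed in.
def release_if_locked(lock)
  return if lock.nil?

  lock.unlock if lock.locked?
end

lock = Mutex.new
lock.lock                       # simulate a lock left held
release_if_locked(lock)
locked_after_release = lock.locked?

release_if_locked(lock)         # already unlocked: does nothing
release_if_locked(nil)          # no lock at all: does nothing
```

Ruby's Mutex#unlock raises ThreadError when the calling thread does not hold the lock, so this sketch locks and unlocks on the same thread.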

#synchronize_with_harvest ⇒ Object

Synchronizes with the harvest loop. If the harvest thread holds a lock (DNS lookups, backticks, agent-owned locks, etc.) and we fork while it is held, child processes can deadlock. For more details, see github.com/resque/resque/issues/1101.



# File 'lib/new_relic/agent/agent.rb', line 563

def synchronize_with_harvest
  harvest_lock.synchronize do
    yield
  end
end
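
The effect of synchronizing on the harvest lock can be sketched with two threads and a plain Mutex. Everything here is illustrative: `harvest_lock`, `order`, and `ready` are local stand-ins, and the `:fork_point` marker stands in for the place where a real caller would fork.

```ruby
harvest_lock = Mutex.new
order = []
ready = Queue.new # signals that the harvester holds the lock

# Stand-in for the harvest thread doing work while holding the lock.
harvester = Thread.new do
  harvest_lock.synchronize do
    order << :harvest_start
    ready << true
    sleep 0.05
    order << :harvest_end
  end
end

ready.pop # wait until the harvester has taken the lock

# Blocks until the harvester releases the lock, so the "fork" can never
# happen in the middle of a harvest.
harvest_lock.synchronize { order << :fork_point }

harvester.join
```

Because the main thread waits for the lock, `:fork_point` is always recorded after `:harvest_end`.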