Class: Dor::IndexingService
- Inherits:
- Object
- Defined in:
- lib/dor/services/indexing_service.rb
Defined Under Namespace
Classes: ReindexError
Constant Summary
- @@loggers =
Memoizes the loggers we create in a hash, initialized with a nil default logger.
{ default: nil }
Class Method Summary
- .default_index_logger ⇒ Object
- .generate_index_logger { ... } ⇒ Object
Returns a Logger instance for recording info about indexing attempts.
- .reindex_object(obj, options = {}) ⇒ Object
Takes a Dor object and indexes it to Solr.
- .reindex_pid(pid, *args) ⇒ Object
Retrieves a single Dor object by pid, indexes the object to Solr, and logs the attempt (using a default logger if one is not provided).
- .reindex_pid_list(pid_list, should_commit = false) ⇒ Object
Given a list of pids, retrieves those objects from Fedora, indexes each to Solr, and optionally commits.
- .reindex_pid_remotely(pid) ⇒ Object
Uses the dor-indexing-app service to reindex a pid.
Class Method Details
.default_index_logger ⇒ Object
# File 'lib/dor/services/indexing_service.rb', line 30
def self.default_index_logger
  @@loggers[:default] ||= generate_index_logger
end
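The `||=` memoization used by `.default_index_logger` can be illustrated in isolation. The following is a minimal sketch, not the library's code: `LoggerCache`, `build_logger`, and `build_count` are hypothetical names, the logger writes to an in-memory buffer, and a module-level instance variable stands in for the `@@loggers` class variable.

```ruby
require 'logger'
require 'stringio'

# Hypothetical stand-in for the class-level logger cache.
module LoggerCache
  @loggers = { default: nil }
  @build_count = 0

  class << self
    attr_reader :build_count

    # Build a fresh logger (writes to an in-memory buffer in this sketch).
    def build_logger
      @build_count += 1
      Logger.new(StringIO.new)
    end

    # Create the default logger on first use, then reuse the cached one.
    def default_logger
      @loggers[:default] ||= build_logger
    end
  end
end

first = LoggerCache.default_logger
second = LoggerCache.default_logger
```

Both calls return the same object, so the comparatively expensive logger construction happens only once per process.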
.generate_index_logger { ... } ⇒ Object
Returns a Logger instance for recording info about indexing attempts
# File 'lib/dor/services/indexing_service.rb', line 13
def self.generate_index_logger(&entry_id_block)
  index_logger = Logger.new(Config.indexing_svc.log, Config.indexing_svc.log_rotation_interval)
  index_logger.formatter = proc do |_severity, datetime, _progname, msg|
    date_format_str = Config.indexing_svc.log_date_format_str
    entry_id = begin
                 entry_id_block.call
               rescue StandardError
                 '---'
               end
    "[#{entry_id}] [#{datetime.utc.strftime(date_format_str)}] #{msg}\n"
  end
  index_logger
end
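To see what the formatter produces, here is a self-contained sketch of the same pattern using only the standard library. `build_index_logger`, the `io` argument, and the default date format are assumptions that stand in for the `Config.indexing_svc` settings; they are not part of dor-services.

```ruby
require 'logger'
require 'stringio'

# Build a logger whose lines are prefixed with an entry id taken from the block.
# `io` and `date_format` are assumptions standing in for the Config values.
def build_index_logger(io, date_format = '%Y-%m-%d %H:%M:%S.%L', &entry_id_block)
  logger = Logger.new(io)
  logger.formatter = proc do |_severity, datetime, _progname, msg|
    entry_id = begin
      entry_id_block.call
    rescue StandardError
      '---' # no block was given, or the block raised
    end
    "[#{entry_id}] [#{datetime.utc.strftime(date_format)}] #{msg}\n"
  end
  logger
end

with_id = StringIO.new
build_index_logger(with_id) { 'req-42' }.info('indexed druid:ab123cd4567')

without_id = StringIO.new
build_index_logger(without_id).info('indexed druid:ab123cd4567')
```

When no block is supplied, calling the missing block raises a `NoMethodError`, which the `rescue StandardError` clause turns into the `---` placeholder.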
.reindex_object(obj, options = {}) ⇒ Object
Takes a Dor object and indexes it to Solr. Doesn’t commit automatically.
# File 'lib/dor/services/indexing_service.rb', line 35
def self.reindex_object(obj, options = {})
  solr_doc = obj.to_solr
  Dor::SearchService.solr.add(solr_doc, options)
  solr_doc
end
.reindex_pid(pid, index_logger, options = {}) ⇒ Object
.reindex_pid(pid, index_logger, should_raise_errors, options = {}) ⇒ Object
.reindex_pid(pid, options = {}) ⇒ Object
Retrieves a single Dor object by pid, indexes the object to Solr, and logs the attempt (using a default logger if one is not provided). Doesn’t commit automatically.
WARNING/TODO: the tests indicate that the “rescue Exception” block at the end gets skipped, so the thrown exception (e.g. SystemStackError) is not logged. Since that is the only consequence, and the exception bubbles up as we would want anyway, it doesn’t seem worth blocking refactoring. See github.com/sul-dlss/dor-services/issues/156. Extra logging in this case would be nice, but centralized indexing that’s otherwise fully functional is nicer.
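The intended division of labour between the two rescue clauses can be sketched in isolation. This example only shows how ordinary Ruby routes a `StandardError` versus a more severe `Exception`; it does not reproduce the skipped-clause behaviour reported by the tests. `FatalIndexingProblem` and `attempt` are hypothetical names.

```ruby
# Hypothetical error type that sits outside StandardError, as SystemStackError does.
class FatalIndexingProblem < Exception; end

# Run the block, logging failures by severity.
def attempt(log, raise_errors: true)
  yield
rescue StandardError => e
  log << "warn: #{e.message}"
  raise if raise_errors # the caller decides whether ordinary errors propagate
rescue Exception => e
  log << "error: #{e.message}"
  raise # never swallow anything worse than StandardError
end

log = []
attempt(log, raise_errors: false) { raise ArgumentError, 'object not found' }

reraised = begin
  attempt(log, raise_errors: false) { raise FatalIndexingProblem, 'stack too deep' }
  false
rescue FatalIndexingProblem
  true
end
```

The ordinary error is logged and suppressed because `raise_errors` is false, while the more severe one is logged and re-raised regardless of that flag.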
# File 'lib/dor/services/indexing_service.rb', line 74
def self.reindex_pid(pid, *args)
  options = {}
  options = args.pop if args.last.is_a? Hash

  if args.length > 0
    warn 'Dor::IndexingService.reindex_pid with primitive arguments is deprecated; pass e.g. { logger: logger, raise_errors: bool } instead'
    index_logger, should_raise_errors = args
    index_logger ||= default_index_logger
    should_raise_errors = true if should_raise_errors.nil?
  else
    index_logger = options.fetch(:logger, default_index_logger)
    should_raise_errors = options.fetch(:raise_errors, true)
  end

  obj = nil
  solr_doc = nil

  # benchmark how long it takes to load the object
  load_stats = Benchmark.measure('load_instance') do
    obj = Dor.load_instance pid
  end.format('%n realtime %rs total CPU %ts').gsub(/[\(\)]/, '')

  # benchmark how long it takes to convert the object to a Solr document
  to_solr_stats = Benchmark.measure('to_solr') do
    solr_doc = reindex_object obj, options
  end.format('%n realtime %rs total CPU %ts').gsub(/[\(\)]/, '')

  index_logger.info "successfully updated index for #{pid} (metrics: #{load_stats}; #{to_solr_stats})"
  solr_doc
rescue StandardError => se
  if se.is_a? ActiveFedora::ObjectNotFoundError
    index_logger.warn "failed to update index for #{pid}, object not found in Fedora"
  else
    index_logger.warn "failed to update index for #{pid}, unexpected StandardError, see main app log: #{se.backtrace}"
  end
  raise se if should_raise_errors
rescue Exception => ex
  index_logger.error "failed to update index for #{pid}, unexpected Exception, see main app log: #{ex.backtrace}"
  raise ex # don't swallow anything worse than StandardError
end
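The three call forms listed for `.reindex_pid` all reduce to the same pair of settings. The sketch below isolates that argument handling so it can be run without Fedora or Solr; `normalize_reindex_args` is a hypothetical helper, and symbols stand in for real logger objects.

```ruby
# Hypothetical helper mirroring how the trailing options hash and the
# deprecated positional arguments are reconciled; not part of dor-services.
def normalize_reindex_args(default_logger, *args)
  options = args.last.is_a?(Hash) ? args.pop : {}

  if args.empty?
    # Preferred form: everything arrives through the options hash.
    logger = options.fetch(:logger, default_logger)
    raise_errors = options.fetch(:raise_errors, true)
  else
    # Deprecated positional form: (logger, should_raise_errors)
    logger, raise_errors = args
    logger ||= default_logger
    raise_errors = true if raise_errors.nil?
  end

  { logger: logger, raise_errors: raise_errors, options: options }
end

preferred  = normalize_reindex_args(:default, logger: :custom, raise_errors: false)
positional = normalize_reindex_args(:default, :custom, false)
defaults   = normalize_reindex_args(:default)
```

In every form, errors are raised unless the caller explicitly opts out, and the default logger is used when none is supplied.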
.reindex_pid_list(pid_list, should_commit = false) ⇒ Object
Given a list of pids, retrieves those objects from Fedora, indexes each to Solr, and optionally commits.
# File 'lib/dor/services/indexing_service.rb', line 117
def self.reindex_pid_list(pid_list, should_commit = false)
  pid_list.each { |pid| reindex_pid pid, raise_errors: false } # use the default logger, don't let individual errors nuke the rest of the batch
  ActiveFedora.solr.conn.commit if should_commit
end
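The batch behaviour, where one failing pid does not stop the rest and the commit happens once at the end, can be sketched with stand-ins. `reindex_one`, `reindex_list`, and the `commit` callable are hypothetical; the real method talks to Fedora and Solr, and the pids here are made up.

```ruby
# Hypothetical stand-in for reindex_pid: fails for pids containing "bad".
def reindex_one(pid, raise_errors: true)
  raise ArgumentError, "cannot index #{pid}" if pid.include?('bad')
  pid
rescue StandardError
  raise if raise_errors
  nil # swallow the failure so a batch can continue
end

# Index every pid, tolerating individual failures, then commit once if asked.
# `commit` is any callable that flushes the index (an assumption of this sketch).
def reindex_list(pids, should_commit = false, commit: -> {})
  results = pids.map { |pid| reindex_one(pid, raise_errors: false) }
  commit.call if should_commit
  results
end

commits = 0
results = reindex_list(%w[druid:aa druid:bad druid:cc], true, commit: -> { commits += 1 })
```

Committing once per batch rather than once per object is usually the cheaper choice, since each commit makes the index flush its pending changes.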
.reindex_pid_remotely(pid) ⇒ Object
Use the dor-indexing-app service to reindex a pid
# File 'lib/dor/services/indexing_service.rb', line 44
def self.reindex_pid_remotely(pid)
  pid = "druid:#{pid}" unless pid =~ /^druid:/
  realtime = Benchmark.realtime do
    with_retries(max_tries: 3, rescue: [RestClient::Exception, Errno::ECONNREFUSED]) do
      RestClient.post("#{Config.dor_indexing_app.url}/reindex/#{pid}", '')
    end
  end
  default_index_logger.info "successfully updated index for #{pid} in #{'%.3f' % realtime}s"
rescue RestClient::Exception, Errno::ECONNREFUSED => e
  msg = "failed to reindex #{pid}: #{e}"
  default_index_logger.error msg
  raise ReindexError.new(msg)
rescue StandardError => e
  default_index_logger.error "failed to reindex #{pid}: #{e}"
  raise
end
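Two ideas in this method, prefix normalization and bounded retries, can be tried without a running dor-indexing-app. The sketch below is an approximation under stated assumptions: `normalize_pid` and `try_up_to` are hypothetical helpers, `try_up_to` is a hand-written loop rather than the `with_retries` call used above (and has no delay between attempts), and the failing block simulates a refused connection.

```ruby
# Add the "druid:" prefix only when it is missing.
def normalize_pid(pid)
  pid =~ /^druid:/ ? pid : "druid:#{pid}"
end

# Hypothetical retry helper: re-run the block when one of the listed errors
# is raised, up to max_tries attempts in total, then let the error propagate.
def try_up_to(max_tries, rescuing:)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue *rescuing
    retry if attempts < max_tries
    raise
  end
end

calls = 0
result = try_up_to(3, rescuing: [Errno::ECONNREFUSED]) do |attempt|
  calls += 1
  raise Errno::ECONNREFUSED if attempt < 3 # fail twice, then succeed
  "reindexed #{normalize_pid('ab123cd4567')}"
end
```

If all attempts fail, the last error propagates to the caller, which is the point at which the method above wraps it in a `ReindexError`.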