Class: Krikri::Harvesters::OAIHarvester

Inherits:
Object
Includes:
Krikri::Harvester
Defined in:
lib/krikri/harvesters/oai_harvester.rb

Overview

A harvester implementation for OAI-PMH

Constant Summary

Constants included from Krikri::Harvester

Krikri::Harvester::Registry

Constants included from SoftwareAgent

SoftwareAgent::Logger

Instance Attribute Summary

Attributes included from Krikri::Harvester

#name, #uri

Class Method Summary

Instance Method Summary

Methods included from Krikri::Harvester

#run

Methods included from SoftwareAgent

#agent_name, #log, #run

Constructor Details

#initialize(opts = {}) ⇒ OAIHarvester

Returns a new instance of OAIHarvester

Parameters:

  • opts (Hash) (defaults to: {})

    Options to pass through to client requests, read from the :oai key of opts. Allowable options are specified in OAI::Const::Verbs; currently these are :from, :until, :set, and :metadata_prefix.

See Also:

  • OAI::Client
  • #expected_opts

# File 'lib/krikri/harvesters/oai_harvester.rb', line 14

def initialize(opts = {})
  super
  @opts = opts.fetch(:oai, {})

  http_conn = Faraday.new do |conn|
    conn.request :retry, :max => 3
    conn.response :follow_redirects, :limit => 5
    conn.adapter :net_http
  end

  @client = OAI::Client.new(uri, :http => http_conn)
end
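
The constructor keeps only the hash stored under the :oai key and passes it to later client requests. The following is a minimal sketch of that lookup in plain Ruby, not a call into Krikri itself. The endpoint URI, the set name, and the assumption that the endpoint is supplied under a :uri key are illustrative.

```ruby
# Hypothetical options hash; the URI and set name are made up, and the
# :uri key is an assumption about how the endpoint is supplied.
opts = {
  uri: 'http://example.org/oai',
  oai: { metadata_prefix: 'oai_dc', set: 'photos' }
}

# Mirrors the `opts.fetch(:oai, {})` line in #initialize.
oai_opts = opts.fetch(:oai, {})

# With no :oai key, the lookup falls back to an empty hash, so later
# requests would carry no extra OAI arguments.
fallback = { uri: 'http://example.org/oai' }.fetch(:oai, {})
```

Here `oai_opts` holds only the nested hash, and `fallback` is empty.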

Instance Attribute Details

#client ⇒ Object

Returns the value of attribute client


# File 'lib/krikri/harvesters/oai_harvester.rb', line 6

def client
  @client
end

Class Method Details

.expected_opts ⇒ Object

See Also:

  • Krikri::Harvester::expected_opts

# File 'lib/krikri/harvesters/oai_harvester.rb', line 74

def self.expected_opts
  {
    key: :oai,
    opts: {
      set: {type: :string, required: false, multiple_ok: true},
      metadata_prefix: {type: :string, required: true}
    }
  }
end
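
One way a caller might use this structure is to check an options hash before building a harvester. The sketch below is only an illustration: `option_errors` is a hypothetical helper, not a Krikri method, and it assumes that `multiple_ok` means the value may be given as an array.

```ruby
# Copied from OAIHarvester.expected_opts so the sketch runs standalone.
expected = {
  key: :oai,
  opts: {
    set: { type: :string, required: false, multiple_ok: true },
    metadata_prefix: { type: :string, required: true }
  }
}

# Hypothetical helper (not part of Krikri): lists the problems found in
# `opts` when checked against a spec shaped like `expected`.
def option_errors(spec, opts)
  given = opts.fetch(spec[:key], {})
  errors = []
  spec[:opts].each do |name, rule|
    if !given.key?(name)
      errors << "#{name} is required" if rule[:required]
    elsif given[name].is_a?(Array) && !rule[:multiple_ok]
      # Assumption: multiple_ok controls whether an array is acceptable.
      errors << "#{name} does not accept multiple values"
    end
  end
  errors
end

valid   = option_errors(expected, { oai: { metadata_prefix: 'oai_dc', set: %w[a b] } })
missing = option_errors(expected, { oai: { set: 'a' } })
```

With these inputs, `valid` is empty and `missing` contains a single message about `metadata_prefix`.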

Instance Method Details

#count ⇒ Object

Not implemented. Calling count on record_ids would request all ids and load them into memory. TODO: an efficient implementation of count for OAI.

Raises:

  • (NotImplementedError)

# File 'lib/krikri/harvesters/oai_harvester.rb', line 42

def count
  raise NotImplementedError
end
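
Because #count always raises, callers that want a count have to handle the error themselves. A minimal sketch follows, using a stand-in object rather than a real harvester; `count_or_nil` is a hypothetical helper.

```ruby
# Stand-in for a harvester whose #count is unimplemented, as documented.
stub = Object.new
def stub.count
  raise NotImplementedError
end

# Hypothetical helper: returns the count, or nil when it is unavailable.
def count_or_nil(harvester)
  harvester.count
rescue NotImplementedError
  # NotImplementedError descends from ScriptError, not StandardError,
  # so a bare `rescue` would not catch it; it has to be named.
  nil
end

result = count_or_nil(stub)
```

Here `result` is nil, signalling that the caller should fall back to another strategy.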

#get_record(identifier, opts = {}) ⇒ Object

TODO: normalize records; there will be differences in XML for different requests


# File 'lib/krikri/harvesters/oai_harvester.rb', line 65

def get_record(identifier, opts = {})
  opts[:identifier] = identifier
  opts = opts.merge(@opts)
  @record_class.build(mint_id(identifier),
                      record_xml(client.get_record(opts).record))
end
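
Two details of the option handling above are easy to miss: the caller's hash gains an :identifier key, and `opts.merge(@opts)` lets the options given at construction win over per-call options with the same key. The sketch below shows the merge precedence with plain hashes; all values are made up.

```ruby
# Options captured at construction time (the :oai hash); values are made up.
harvester_opts = { metadata_prefix: 'oai_dc', set: 'photos' }

# Per-call options as they look after `opts[:identifier] = identifier`.
call_opts = { identifier: 'oai:example.org:123', metadata_prefix: 'mods' }

# Hash#merge prefers the argument's values on key collisions, so the
# harvester-level :metadata_prefix replaces the per-call one.
merged = call_opts.merge(harvester_opts)

prefix     = merged[:metadata_prefix]
identifier = merged[:identifier]
```

Here `prefix` is "oai_dc" rather than "mods", while `identifier` is carried through unchanged.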

#record_ids(opts = {}) ⇒ Object

Sends ListIdentifiers requests lazily.

The following will only send requests to the endpoint until it has 1000 record ids:

record_ids.take(1000)

# File 'lib/krikri/harvesters/oai_harvester.rb', line 35

def record_ids(opts = {})
  opts = opts.merge(@opts)
  client.list_identifiers(opts).full.lazy.flat_map(&:identifier)
end
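
The lazy behaviour can be seen without a live endpoint. The sketch below replaces the OAI client with a stand-in enumerator that counts simulated requests; the page size of 100 and the identifier format are assumptions made for illustration. Note that `take` on a lazy enumerator is itself lazy, so nothing is fetched until the result is forced, for example with `to_a` or `first`.

```ruby
requests = 0
header = Struct.new(:identifier)

# Stand-in for a paginated endpoint: each simulated request produces a
# page of 100 fake headers. Page size and id format are illustrative.
pages = Enumerator.new do |y|
  loop do
    requests += 1
    first = (requests - 1) * 100
    100.times { |i| y << header.new("oai:example.org:#{first + i}") }
  end
end

# Same shape as the chain in #record_ids.
ids = pages.lazy.flat_map(&:identifier)

# Lazy#take returns another lazy enumerator; no request has been made.
pending_ids = ids.take(250)
requests_before = requests

# Forcing the enumerator pulls only as many pages as needed.
result = pending_ids.to_a
requests_after = requests
```

In this sketch `requests_before` is 0 and `requests_after` is 3, since 250 ids span three pages of 100.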

#records(opts = {}) ⇒ Object

Sends ListRecords requests lazily.

The following will only send requests to the endpoint until it has 1000 records:

records.take(1000)

# File 'lib/krikri/harvesters/oai_harvester.rb', line 54

def records(opts = {})
  opts = opts.merge(@opts)
  client.list_records(opts).full.lazy.flat_map do |rec|
    @record_class.build(mint_id(rec.header.identifier),
                        record_xml(rec))
  end
end