Class: OAI::Client
- Inherits:
- Object
- Object
- OAI::Client
- Defined in:
- lib/oai/client.rb
Overview
An OAI::Client provides a client API for issuing OAI-PMH verbs against an OAI-PMH server. The six OAI-PMH verbs translate directly to methods you can call on an OAI::Client object. Verb arguments are passed as a hash:
```ruby
client = OAI::Client.new 'http://www.pubmedcentral.gov/oai/oai.cgi'
record = client.get_record :identifier => 'oai:pubmedcentral.gov:13901'
for identifier in client.list_identifiers
  puts identifier
end
```
Note that the API uses method and parameter names with underscores rather than studly caps. So list_identifiers and metadata_prefix are used instead of the listIdentifiers and metadataPrefix names found in the OAI-PMH specification.
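The naming convention can be illustrated with a small helper. This is a minimal sketch and not the library's own code: `camelize_key` and `to_oai_params` are hypothetical names, and the sketch only shows how underscored keys could map onto the parameter names in the OAI-PMH specification.

```ruby
# Hypothetical helper: convert an underscored Ruby key into the
# camelCase parameter name used by the OAI-PMH specification.
def camelize_key(key)
  head, *tail = key.to_s.split('_')
  head + tail.map(&:capitalize).join
end

# Hypothetical helper: rename every key in an options hash,
# leaving the values untouched.
def to_oai_params(opts)
  opts.each_with_object({}) do |(key, value), params|
    params[camelize_key(key)] = value
  end
end

params = to_oai_params(:metadata_prefix => 'oai_dc', :resumption_token => 'abc')
puts params.inspect
```

Single-word keys such as `:from` or `:set` pass through unchanged, since there is nothing to capitalize.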
Also, the from and until arguments, which specify dates, should be passed in as Date or DateTime objects, depending on the granularity supported by the server.
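To see why the object type matters, recall that OAI-PMH datestamps come in two granularities: day (`YYYY-MM-DD`) and seconds (`YYYY-MM-DDThh:mm:ssZ`, in UTC). The sketch below is an assumption about how such values could be rendered, not a description of what the library does internally; `oai_datestamp` is a hypothetical helper.

```ruby
require 'date'

# Hypothetical helper: render a Date or DateTime as an OAI-PMH datestamp.
# DateTime is checked first because it is a subclass of Date.
def oai_datestamp(value)
  case value
  when DateTime
    # Seconds granularity, normalised to UTC.
    value.new_offset(0).strftime('%Y-%m-%dT%H:%M:%SZ')
  when Date
    # Day granularity.
    value.strftime('%Y-%m-%d')
  else
    raise ArgumentError, "expected Date or DateTime, got #{value.class}"
  end
end

puts oai_datestamp(Date.new(2024, 1, 31))                # 2024-01-31
puts oai_datestamp(DateTime.new(2024, 1, 31, 12, 30, 0)) # 2024-01-31T12:30:00Z
```

A server that only supports day granularity would be expected to reject the second form, which is why the choice between Date and DateTime depends on the server.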
For detailed information on the arguments that can be used, please consult the OAI-PMH docs at <http://www.openarchives.org/OAI/openarchivesprotocol.html>.
Instance Method Summary
- #get_record(opts = {}) ⇒ Object
  Equivalent to a GetRecord request.
- #identify ⇒ Object
  Equivalent to an Identify request.
- #initialize(base_url, options = {}) ⇒ Client (constructor)
  The constructor, which must be passed a valid base URL for an OAI service.
- #list_identifiers(opts = {}) ⇒ Object
  Equivalent to a ListIdentifiers request.
- #list_metadata_formats(opts = {}) ⇒ Object
  Equivalent to a ListMetadataFormats request.
- #list_records(opts = {}) ⇒ Object
  Equivalent to the ListRecords request.
- #list_sets(opts = {}) ⇒ Object
  Equivalent to the ListSets request.
Constructor Details
#initialize(base_url, options = {}) ⇒ Client
The constructor, which must be passed a valid base URL for an OAI service:

    client = OAI::Client.new 'http://www.pubmedcentral.gov/oai/oai.cgi'

If you want to see debugging messages on STDERR, use:

    client = OAI::Client.new 'http://example.com', :debug => true

By default, OAI verbs called on the client return REXML::Element objects for metadata records. If you prefer, use the :parser option to select libxml instead and get back XML::Node objects:

    client = OAI::Client.new 'http://example.com', :parser => 'libxml'

You can configure the Faraday HTTP client by providing an alternate Faraday instance:
```ruby
client = OAI::Client.new 'http://example.com', :http => Faraday.new {|c|}
```
### HIGH PERFORMANCE
If you want to supercharge this API, install `libxml-ruby >= 0.3.8` and use the :parser option when you construct your OAI::Client.
    # File 'lib/oai/client.rb', line 86
    def initialize(base_url, options={})
      @base = URI.parse base_url
      @debug = options.fetch(:debug, false)
      @parser = options.fetch(:parser, 'rexml')
      @headers = options.fetch(:headers, {})

      @http_client = options.fetch(:http) do
        Faraday.new(:url => @base.clone) do |builder|
          follow_redirects = options.fetch(:redirects, true)
          follow_redirects = 5 if follow_redirects == true

          if follow_redirects
            require 'faraday_middleware'
            builder.response :follow_redirects, :limit => follow_redirects.to_i
          end
          builder.adapter :net_http
        end
      end

      # load appropriate parser
      case @parser
      when 'libxml'
        begin
          require 'rubygems'
          require 'xml/libxml'
        rescue
          raise OAI::Exception.new("xml/libxml not available")
        end
      when 'rexml'
        require 'rexml/document'
        require 'rexml/xpath'
      else
        raise OAI::Exception.new("unknown parser: #{@parser}")
      end
    end
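The constructor source reads a :redirects option that the prose above does not describe. The following sketch isolates that logic so its behaviour is easier to see; `redirect_limit` is a hypothetical helper that mirrors the listing and is not part of the library.

```ruby
# Hypothetical helper mirroring how the constructor reads :redirects.
# true (the default) means "follow up to 5 redirects", a number sets
# the limit, and false or nil disables redirect following.
def redirect_limit(options = {})
  follow_redirects = options.fetch(:redirects, true)
  follow_redirects = 5 if follow_redirects == true
  follow_redirects ? follow_redirects.to_i : nil
end

p redirect_limit                      # 5
p redirect_limit(:redirects => 2)     # 2
p redirect_limit(:redirects => false) # nil
```

Judging by the listing, none of this applies when you pass your own Faraday instance via :http, because the default builder block is then never run.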
Instance Method Details
#get_record(opts = {}) ⇒ Object
Equivalent to a GetRecord request. You must supply an :identifier argument. You should get back an OAI::GetRecordResponse object, from which you can extract an OAI::Record object.
    # File 'lib/oai/client.rb', line 154
    def get_record(opts={})
      OAI::GetRecordResponse.new(do_request('GetRecord', opts))
    end
#identify ⇒ Object
Equivalent to an Identify request. You’ll get back an OAI::IdentifyResponse object, which is essentially just a wrapper around a REXML::Document for the response. If you created your client using the libxml parser, you will get an XML::Node object instead.
    # File 'lib/oai/client.rb', line 127
    def identify
      OAI::IdentifyResponse.new(do_request('Identify'))
    end
#list_identifiers(opts = {}) ⇒ Object
Equivalent to a ListIdentifiers request. Pass the :from and :until arguments as Date or DateTime objects, as appropriate for the granularity supported by the server.
You can use seamless resumption with this verb, which allows you to mitigate (to some extent) the lack of a Count verb:
    client.list_identifiers.full.count # Don't try this on PubMed though!
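Conceptually, counting this way means following resumption tokens until the server stops returning one. The sketch below shows that loop against an in-memory stand-in for a server. `PAGES`, `fetch_page` and `count_identifiers` are hypothetical and purely illustrative; the library's own implementation may differ.

```ruby
# Stand-in for a server: each page carries some identifiers and,
# except for the last page, a resumption token naming the next page.
PAGES = {
  nil      => { :identifiers => %w[oai:example:1 oai:example:2], :token => 'page-2' },
  'page-2' => { :identifiers => %w[oai:example:3 oai:example:4], :token => 'page-3' },
  'page-3' => { :identifiers => %w[oai:example:5],               :token => nil }
}.freeze

# Hypothetical request: nil asks for the first page.
def fetch_page(token)
  PAGES.fetch(token)
end

# Follow resumption tokens until none is returned,
# counting identifiers along the way.
def count_identifiers
  total = 0
  token = nil
  loop do
    page = fetch_page(token)
    total += page[:identifiers].size
    token = page[:token]
    break if token.nil?
  end
  total
end

puts count_identifiers # 5
```

Every page costs one request, which is why counting a very large repository this way is slow.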
    # File 'lib/oai/client.rb', line 147
    def list_identifiers(opts={})
      do_resumable(OAI::ListIdentifiersResponse, 'ListIdentifiers', opts)
    end
#list_metadata_formats(opts = {}) ⇒ Object
Equivalent to a ListMetadataFormats request. A ListMetadataFormatsResponse object is returned to you.
    # File 'lib/oai/client.rb', line 134
    def list_metadata_formats(opts={})
      OAI::ListMetadataFormatsResponse.new(do_request('ListMetadataFormats', opts))
    end
#list_records(opts = {}) ⇒ Object
Equivalent to the ListRecords request. A ListRecordsResponse will be returned, which you can use to iterate through records:

    response = client.list_records
    response.each do |record|
      puts record.metadata
    end

Alternatively, you can use seamless resumption to avoid handling resumption tokens:

    client.list_records.full.each do |record|
      puts record.metadata
    end
### Memory Use
:full will avoid storing more than one page of records in memory, but you can use it in ways that override that behaviour. Be careful to avoid using client.list_records.full.entries unless you really want to hold all the records in the feed in memory!
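The difference between streaming and materialising a feed can be sketched with a plain Ruby class. `PagedFeed` is hypothetical and merely stands in for a resumable response. It records which pages were requested so the effect of each call is visible.

```ruby
# Hypothetical paged feed: pages are only touched as records are consumed.
class PagedFeed
  include Enumerable

  attr_reader :fetched

  def initialize(pages)
    @pages = pages
    @fetched = [] # indexes of the pages requested so far
  end

  def each
    return enum_for(:each) unless block_given?

    @pages.each_with_index do |page, index|
      @fetched << index
      page.each { |record| yield record }
    end
  end
end

pages = [%w[r1 r2], %w[r3 r4], %w[r5 r6]]

# Taking three records stops after the second page.
partial = PagedFeed.new(pages)
partial.first(3)
p partial.fetched  # [0, 1]

# entries walks every page and keeps every record in one array.
whole = PagedFeed.new(pages)
all_records = whole.entries
p all_records.size # 6
```

Methods such as `entries`, `to_a` or `sort` have to build a complete array, so they give up the one-page-at-a-time behaviour regardless of how the feed is fetched.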
    # File 'lib/oai/client.rb', line 178
    def list_records(opts={})
      do_resumable(OAI::ListRecordsResponse, 'ListRecords', opts)
    end
#list_sets(opts = {}) ⇒ Object
Equivalent to the ListSets request. A ListSetsResponse object will be returned, which you can use to iterate through the OAI::Set objects:

    for set in client.list_sets
      puts set
    end

A large number of sets is not unusual for some OAI-PMH feeds, so using seamless resumption may be preferable:

    client.list_sets.full.each do |set|
      puts set
    end
    # File 'lib/oai/client.rb', line 196
    def list_sets(opts={})
      do_resumable(OAI::ListSetsResponse, 'ListSets', opts)
    end