Class: OAI::Client
- Inherits:
-
Object
- Object
- OAI::Client
- Defined in:
- lib/oai/client.rb
Overview
An `OAI::Client` provides a client API for issuing OAI-PMH verbs against an OAI-PMH server. The six OAI-PMH verbs translate directly to methods you can call on an `OAI::Client` object. Verb arguments are passed as a hash:
```ruby
client = OAI::Client.new 'http://www.pubmedcentral.gov/oai/oai.cgi'
record = client.get_record :identifier => 'oai:pubmedcentral.gov:13901'
for identifier in client.list_identifiers
  puts identifier
end
```
Note that the API uses method and parameter names with underscores rather than studly caps. So `list_identifiers` and `metadata_prefix` are used instead of the `listIdentifiers` and `metadataPrefix` found in the OAI-PMH specification.
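The naming convention can be illustrated with a small sketch. The helper `oai_param_name` below is hypothetical and not part of `OAI::Client`; it only shows how an underscored Ruby key corresponds to the camel-cased name used in the OAI-PMH specification.

```ruby
# Hypothetical helper (not part of the library): maps an underscored
# Ruby option key to the camel-cased OAI-PMH argument name.
def oai_param_name(key)
  head, *tail = key.to_s.split('_')
  head + tail.map(&:capitalize).join
end

puts oai_param_name(:metadata_prefix)   # metadataPrefix
puts oai_param_name(:resumption_token)  # resumptionToken
puts oai_param_name(:identifier)        # identifier
```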
Also, the `from` and `until` arguments, which specify dates, should be passed in as `Date` or `DateTime` objects depending on the granularity supported by the server.
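As a rough illustration of the two granularities, the OAI-PMH specification defines a day form (`YYYY-MM-DD`) and a seconds form (`YYYY-MM-DDThh:mm:ssZ`, in UTC). The helper `oai_datestamp` below is a hypothetical sketch of that mapping; this page does not show how the library itself serialises these values.

```ruby
require 'date'

# Hypothetical helper (not part of the library): renders a from/until
# value in one of the two datestamp forms defined by OAI-PMH.
def oai_datestamp(value)
  case value
  when DateTime
    # DateTime is a subclass of Date, so it must be checked first.
    # Seconds granularity, normalised to UTC.
    value.new_offset(0).strftime('%Y-%m-%dT%H:%M:%SZ')
  when Date
    # Day granularity.
    value.strftime('%Y-%m-%d')
  else
    raise ArgumentError, "expected Date or DateTime, got #{value.class}"
  end
end

puts oai_datestamp(Date.new(2020, 1, 31))
puts oai_datestamp(DateTime.new(2020, 1, 31, 12, 30, 0))
```

Pass a `Date` when the server only supports day granularity, and a `DateTime` when it supports seconds granularity.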
For detailed information on the arguments that can be used please consult the OAI-PMH docs at <www.openarchives.org/OAI/openarchivesprotocol.html>.
Instance Method Summary
-
#get_record(opts = {}) ⇒ Object
Equivalent to a `GetRecord` request.
-
#identify ⇒ Object
Equivalent to an `Identify` request.
-
#initialize(base_url, options = {}) ⇒ Client
constructor
The constructor, which must be passed a valid base URL for an OAI service.
-
#list_identifiers(opts = {}) ⇒ Object
Equivalent to a `ListIdentifiers` request.
-
#list_metadata_formats(opts = {}) ⇒ Object
Equivalent to a `ListMetadataFormats` request.
-
#list_records(opts = {}) ⇒ Object
Equivalent to the `ListRecords` request.
-
#list_sets(opts = {}) ⇒ Object
Equivalent to the `ListSets` request.
Constructor Details
#initialize(base_url, options = {}) ⇒ Client
The constructor, which must be passed a valid base URL for an OAI service:
    client = OAI::Client.new 'http://www.pubmedcentral.gov/oai/oai.cgi'
If you want to see debugging messages on `STDERR` use:
    client = OAI::Client.new 'http://example.com', :debug => true
By default, OAI verbs called on the client return `REXML::Element` objects for metadata records. If you prefer, use the `:parser` option to select `libxml` instead and get back `XML::Node` objects:
    client = OAI::Client.new 'http://example.com', :parser => 'libxml'
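To give a feel for working with a `REXML::Element`, here is a minimal sketch that does not call the client at all. The XML payload is made up for illustration and merely stands in for a record's metadata element; the element names follow the common `oai_dc` layout, which is an assumption about what a given server returns.

```ruby
require 'rexml/document'

# Made-up sample payload standing in for a record's metadata element.
xml = <<~XML
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>An example title</dc:title>
      <dc:creator>Doe, Jane</dc:creator>
    </oai_dc:dc>
  </metadata>
XML

metadata = REXML::Document.new(xml).root   # a REXML::Element

# Look up the title with a namespace-aware XPath query.
title = REXML::XPath.first(metadata, './/dc:title',
                           'dc' => 'http://purl.org/dc/elements/1.1/')
puts title.text
```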
You can configure the Faraday HTTP client by providing an alternate Faraday instance:
```ruby
client = OAI::Client.new 'example.com', :http => Faraday.new {|c|}
```
### HIGH PERFORMANCE
If you want to supercharge this API, install `libxml-ruby >= 0.3.8` and use the `:parser` option when you construct your `OAI::Client`.
    # File 'lib/oai/client.rb', line 86
    def initialize(base_url, options={})
      @base = URI.parse base_url
      @debug = options.fetch(:debug, false)
      @parser = options.fetch(:parser, 'rexml')
      @headers = options.fetch(:headers, {})

      @http_client = options.fetch(:http) do
        Faraday.new(:url => @base.clone) do |builder|
          follow_redirects = options.fetch(:redirects, true)
          follow_redirects = 5 if follow_redirects == true

          if follow_redirects
            require 'faraday_middleware'
            builder.response :follow_redirects, :limit => follow_redirects.to_i
          end
          builder.adapter :net_http
        end
      end

      # load appropriate parser
      case @parser
      when 'libxml'
        begin
          require 'rubygems'
          require 'xml/libxml'
        rescue
          raise OAI::Exception.new("xml/libxml not available")
        end
      when 'rexml'
        require 'rexml/document'
        require 'rexml/xpath'
      else
        raise OAI::Exception.new("unknown parser: #{@parser}")
      end
    end
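The `:redirects` handling in the listing above is easy to misread, so the sketch below isolates it. `redirect_limit` is a hypothetical helper written only for illustration; it mirrors the three cases in the source (`true` or absent becomes a limit of 5, a number is used as given, and a falsy value disables redirect following).

```ruby
# Hypothetical helper mirroring the :redirects logic of the constructor.
# Returns the redirect limit, or nil when redirects are not followed.
def redirect_limit(options = {})
  follow_redirects = options.fetch(:redirects, true)
  follow_redirects = 5 if follow_redirects == true
  follow_redirects ? follow_redirects.to_i : nil
end

p redirect_limit                        # 5
p redirect_limit(:redirects => 2)       # 2
p redirect_limit(:redirects => false)   # nil
```

Note that in the listing this logic only runs when no `:http` option is given, since the block passed to `options.fetch(:http)` is evaluated only as a default.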
Instance Method Details
#get_record(opts = {}) ⇒ Object
Equivalent to a `GetRecord` request. You must supply an `:identifier` argument. You should get back an `OAI::GetRecordResponse` object, from which you can extract an `OAI::Record` object.
    # File 'lib/oai/client.rb', line 154
    def get_record(opts={})
      OAI::GetRecordResponse.new(do_request('GetRecord', opts))
    end
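For orientation, an OAI-PMH request is an HTTP request whose query string carries a `verb` plus the verb's arguments. The sketch below builds such a URL with the standard library. `oai_request_url` is a hypothetical helper, and since `do_request` is not shown on this page, this is not a description of the library's implementation.

```ruby
require 'uri'

# Hypothetical helper: builds an OAI-PMH style GET URL from a base URL,
# a verb, and a hash of already camel-cased arguments.
def oai_request_url(base_url, verb, args = {})
  uri = URI.parse(base_url)
  uri.query = URI.encode_www_form({ 'verb' => verb }.merge(args))
  uri.to_s
end

puts oai_request_url('http://www.pubmedcentral.gov/oai/oai.cgi', 'GetRecord',
                     'identifier' => 'oai:pubmedcentral.gov:13901',
                     'metadataPrefix' => 'oai_dc')
```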
#identify ⇒ Object
Equivalent to an `Identify` request. You'll get back an `OAI::IdentifyResponse` object, which is essentially just a wrapper around a `REXML::Document` for the response. If you created your client using the `libxml` parser then you will get an `XML::Node` object instead.
    # File 'lib/oai/client.rb', line 127
    def identify
      OAI::IdentifyResponse.new(do_request('Identify'))
    end
#list_identifiers(opts = {}) ⇒ Object
Equivalent to a `ListIdentifiers` request. Pass in `:from`, `:until` arguments as `Date` or `DateTime` objects as appropriate depending on the granularity supported by the server.
You can use seamless resumption with this verb, which allows you to mitigate (to some extent) the lack of a `Count` verb:
    client.list_identifiers.full.count # Don't try this on PubMed though!
    # File 'lib/oai/client.rb', line 147
    def list_identifiers(opts={})
      do_resumable(OAI::ListIdentifiersResponse, 'ListIdentifiers', opts)
    end
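The idea behind seamless resumption can be sketched without a server: keep requesting pages, passing along each resumption token, until a page arrives without one. Everything below (the `PagedIdentifiers` class, the `PAGES` table, the identifiers and tokens) is invented for illustration; `do_resumable` is not shown on this page and may work differently.

```ruby
# Fake paged source standing in for an OAI-PMH server. Each page holds
# some identifiers and the token of the next page (nil on the last page).
PAGES = {
  nil  => { :ids => %w[oai:ex:1 oai:ex:2], :token => 't1' },
  't1' => { :ids => %w[oai:ex:3 oai:ex:4], :token => 't2' },
  't2' => { :ids => %w[oai:ex:5],          :token => nil }
}

# Hypothetical enumerable that follows resumption tokens transparently.
class PagedIdentifiers
  include Enumerable

  def initialize(pages)
    @pages = pages
  end

  def each
    return enum_for(:each) unless block_given?
    token = nil
    loop do
      page = @pages.fetch(token)          # stands in for one HTTP request
      page[:ids].each { |id| yield id }
      token = page[:token]
      break if token.nil?                 # no token means the list is complete
    end
  end
end

feed = PagedIdentifiers.new(PAGES)
puts feed.count   # walks every page to count the identifiers
```

Counting this way still costs one request per page, which is why it is a poor idea against a very large repository.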
#list_metadata_formats(opts = {}) ⇒ Object
Equivalent to a `ListMetadataFormats` request. A `ListMetadataFormatsResponse` object is returned to you.
    # File 'lib/oai/client.rb', line 134
    def list_metadata_formats(opts={})
      OAI::ListMetadataFormatsResponse.new(do_request('ListMetadataFormats', opts))
    end
#list_records(opts = {}) ⇒ Object
Equivalent to the `ListRecords` request. A `ListRecordsResponse` will be returned, which you can use to iterate through records:
    response = client.list_records
    response.each do |record|
      puts record.metadata
    end
Alternatively, you can use seamless resumption to avoid handling resumption tokens:
    client.list_records.full.each do |record|
      puts record.metadata
    end
### Memory Use
`:full` will avoid storing more than one page of records in memory, but you can use it in ways that override that behaviour. Be careful to avoid using `client.list_records.full.entries` unless you really want to hold all the records in the feed in memory!
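The difference between streaming and materialising can be seen with a plain `Enumerator`. The feed below is entirely made up (4 pages of 3 records each), and the counter simply records how many pages were requested; it is a sketch of the general behaviour of Ruby enumerators, not of the library's internals.

```ruby
fetched_pages = 0

# Hypothetical feed: 4 pages of 3 records, produced one page at a time.
records = Enumerator.new do |out|
  4.times do |page|
    fetched_pages += 1                    # stands in for one HTTP request
    3.times { |i| out << "record-#{page * 3 + i + 1}" }
  end
end

# Streaming: iteration stops early, so only the first page is requested.
first_two = records.first(2)
puts "#{first_two.inspect} after #{fetched_pages} page(s)"

# Materialising: entries walks the whole feed and keeps every record.
fetched_pages = 0
everything = records.entries
puts "#{everything.size} records held in memory after #{fetched_pages} pages"
```

Iterating with `each` and handling one record at a time keeps the working set small, whereas `entries`, `to_a`, or `sort` necessarily collect the whole feed.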
    # File 'lib/oai/client.rb', line 178
    def list_records(opts={})
      do_resumable(OAI::ListRecordsResponse, 'ListRecords', opts)
    end
#list_sets(opts = {}) ⇒ Object
Equivalent to the `ListSets` request. A `ListSetsResponse` object will be returned, which you can use for iterating through the `OAI::Set` objects:
    for set in client.list_sets
      puts set
    end
A large number of sets is not unusual for some OAI-PMH feeds, so using seamless resumption may be preferable:
    client.list_sets.full.each do |set|
      puts set
    end
    # File 'lib/oai/client.rb', line 196
    def list_sets(opts={})
      do_resumable(OAI::ListSetsResponse, 'ListSets', opts)
    end