Class: Krikri::Harvesters::CouchdbHarvester

Inherits:
Object
Includes:
Krikri::Harvester
Defined in:
lib/krikri/harvesters/couchdb_harvester.rb

Overview

A harvester implementation for CouchDB

Constant Summary

Constants included from Krikri::Harvester

Krikri::Harvester::Registry

Constants included from SoftwareAgent

SoftwareAgent::Logger

Instance Attribute Summary

Attributes included from Krikri::Harvester

#name, #uri

Class Method Summary

Instance Method Summary

Methods included from Krikri::Harvester

#run

Methods included from SoftwareAgent

#agent_name, #log, #run

Constructor Details

#initialize(opts = {}) ⇒ CouchdbHarvester

Returns a new instance of CouchdbHarvester

Parameters:

  • opts (Hash) (defaults to: {})

    options to pass through to client requests, read from the `:couchdb` key. If `:view` is not specified there, the harvester defaults to the CouchDB `_all_docs` view.


# File 'lib/krikri/harvesters/couchdb_harvester.rb', line 19

def initialize(opts = {})
  super
  @opts = opts.fetch(:couchdb, view: '_all_docs')
  @opts[:view] ||= '_all_docs'
  @client = Analysand::Database.new(uri)
end
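
The option handling in the constructor can be traced in isolation. The following is a minimal sketch, assuming a hypothetical helper `resolve_couchdb_opts` that only mirrors the two `@opts` lines above; it is not part of Krikri, and the view name `items/by_id` is made up for illustration.

```ruby
# Hypothetical helper (not part of Krikri) that reproduces the option
# handling from #initialize without needing Krikri or Analysand.
def resolve_couchdb_opts(opts = {})
  # Use the hash under :couchdb, or a default hash when the key is absent.
  couchdb_opts = opts.fetch(:couchdb, view: '_all_docs')
  # Fill in the view when :couchdb was given without one.
  couchdb_opts[:view] ||= '_all_docs'
  couchdb_opts
end

no_key      = resolve_couchdb_opts({})
empty_key   = resolve_couchdb_opts({ couchdb: {} })
custom_view = resolve_couchdb_opts({ couchdb: { view: 'items/by_id' } })
```

In this sketch all three calls end up with a `:view`: the first two fall back to `_all_docs`, and the third keeps the caller's value. Note that the `||=` line writes into the hash the caller passed under `:couchdb`.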

Instance Attribute Details

#client ⇒ Object

Returns the value of attribute client


# File 'lib/krikri/harvesters/couchdb_harvester.rb', line 8

def client
  @client
end

Class Method Details

.expected_opts ⇒ Object

See Also:

  • Krikri::Harvester::expected_opts

# File 'lib/krikri/harvesters/couchdb_harvester.rb', line 82

def self.expected_opts
  {
    key: :couchdb,
    opts: {
      view: { type: :string, required: false }
    }
  }
end

Instance Method Details

#count(opts = {}) ⇒ Object

Returns the total number of documents reported by a CouchDB view.


# File 'lib/krikri/harvesters/couchdb_harvester.rb', line 43

def count(opts = {})
  view = opts[:view] || @opts[:view]
  client.view(view,
              limit: 0,
              include_docs: false,
              stream: true).total_rows
end
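
Because the request sets `limit: 0`, CouchDB returns no rows, but for a map view such as `_all_docs` the response still carries `total_rows`. The following is a minimal sketch, assuming an illustrative response body (the numbers are made up) and plain JSON parsing in place of Analysand.

```ruby
require 'json'

# Illustrative body of the shape CouchDB returns for a view queried with
# limit=0. The values here are invented for the example.
body = '{"total_rows": 1200, "offset": 0, "rows": []}'

response = JSON.parse(body)
total    = response.fetch('total_rows') # the figure #count reports
rows     = response.fetch('rows')       # empty, since limit was 0
```

This is why `#count` is cheap in this sketch: no documents are transferred, only the count.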

#get_record(identifier) ⇒ Object

Retrieves a specific document from CouchDB.

Uses Analysand::Database#get!, which raises an exception if the document cannot be found.

See Also:

  • Analysand::Database#get!

# File 'lib/krikri/harvesters/couchdb_harvester.rb', line 75

def get_record(identifier)
  doc = client.get!(CGI.escape(identifier)).body.to_json
  @record_class.build(mint_id(identifier), doc, 'application/json')
end
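
The identifier is passed through `CGI.escape` before the request is made. The following is a minimal sketch of what that escaping does, assuming a hypothetical identifier that contains characters with special meaning in a URL.

```ruby
require 'cgi'

# Hypothetical identifier; real identifiers depend on the CouchDB instance.
identifier = 'oai:example.org/item 1'

escaped = CGI.escape(identifier)
# => "oai%3Aexample.org%2Fitem+1"

# Unescaping recovers the original identifier.
round_trip = CGI.unescape(escaped)
```

Without escaping, the `/` in such an identifier would be read as a path separator rather than as part of the document id.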

#record_ids(opts = {}) ⇒ Object

Streams a response from a CouchDB view to yield identifiers.

The following will only send requests to the endpoint until it has 1000 record ids:

record_ids.take(1000)

See Also:

  • Analysand::Viewing
  • Analysand::StreamingViewResponse

# File 'lib/krikri/harvesters/couchdb_harvester.rb', line 36

def record_ids(opts = {})
  view = opts[:view] || @opts[:view]
  client.view(view, include_docs: false, stream: true).keys.lazy
end
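
The effect of the trailing `.lazy` can be seen without a CouchDB server. The following is a minimal sketch, assuming a hypothetical `fake_keys` enumerator that stands in for the streamed view keys and counts how many rows were actually pulled.

```ruby
# Hypothetical stand-in for a streamed view response: an Enumerator that
# records how many rows have been pulled from the "endpoint".
rows_fetched = 0
fake_keys = Enumerator.new do |yielder|
  1.upto(10_000) do |i|
    rows_fetched += 1
    yielder << "record-#{i}"
  end
end

# take on a lazy enumerator returns another lazy enumerator; nothing is
# pulled until the result is forced, here with to_a.
ids = fake_keys.lazy.take(1000).to_a
```

After this runs, `ids` holds 1000 identifiers and `rows_fetched` is far below 10,000, because enumeration stops once enough ids have been taken.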

#records(opts = {}) ⇒ Object

Streams a response from a CouchDB view to yield documents.

The following will only send requests to the endpoint until it has 1000 records:

records.take(1000)

See Also:

  • Analysand::Viewing
  • Analysand::StreamingViewResponse

# File 'lib/krikri/harvesters/couchdb_harvester.rb', line 61

def records(opts = {})
  view = opts[:view] || @opts[:view]
  client.view(view, include_docs: true, stream: true).docs.lazy.map do |r|
    @record_class.build(mint_id(r['_id']), r.to_json, 'application/json')
  end
end
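
Because the mapping block is attached to a lazy enumerator, records are built only for the documents that are actually consumed. The following is a minimal sketch, assuming a hypothetical `FakeRecord` struct in place of `@record_class` and using the raw `_id` where the real method calls `mint_id`.

```ruby
require 'json'

# Hypothetical stand-in for @record_class; Krikri's real record class differs.
FakeRecord = Struct.new(:id, :content, :content_type)

built = 0
docs = [{ '_id' => 'a' }, { '_id' => 'b' }, { '_id' => 'c' }]

records = docs.lazy.map do |doc|
  built += 1 # counts how many records were actually constructed
  FakeRecord.new(doc['_id'], doc.to_json, 'application/json')
end

# Only the first two documents are turned into records.
first_two = records.take(2).to_a
```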