Class: HathiTrust

Inherits:
Service
Includes:
MetadataHelper
Defined in:
lib/service_adaptors/hathi_trust.rb

Overview

Service that searches HathiTrust from the University of Michigan.

Supports full text links, and search inside.

We link to HathiTrust using a direct babel.hathitrust.org URL instead of the handle.net redirection, for two reasons:

1) The handle.net redirection can't be used for the "direct link to search results for user-entered query" feature.

2) Some may want to force a Shibboleth login on HT links, which the handle.net redirection can't do either. If you do want to do that, possibly in concert with an EZProxy-mediated WAYFless login, set direct_link_base in your services.yml to:
"https://babel.hathitrust.org/shcgi/"
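The effect of direct_link_base on the generated links can be sketched as follows. This is a minimal illustration, assuming the "pt?id=" path that direct_url_to uses in the source listing on this page; the helper name ht_page_url and the volume id are made up for the example.

```ruby
require "cgi"

# Hypothetical helper: base URL + "pt?id=" + escaped HathiTrust volume id.
def ht_page_url(direct_link_base, htid)
  "#{direct_link_base}pt?id=#{CGI.escape(htid)}"
end

DEFAULT_BASE    = "http://babel.hathitrust.org/cgi/"    # default from #initialize
SHIBBOLETH_BASE = "https://babel.hathitrust.org/shcgi/" # forces Shibboleth login

sample_htid = "mdp.39015012345678" # made-up id, for illustration only

puts ht_page_url(DEFAULT_BASE, sample_htid)
puts ht_page_url(SHIBBOLETH_BASE, sample_htid)
```

Only the base changes between the two links; the volume id and the path after the base stay the same.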

Many (but not all) HT books will also be in Google Books (and vice versa). However, HT was more generous than GBS in deciding which books are public domain. The main expected use case is therefore to use this service alongside Google Books, with HT at a lower priority, using the preempted_by config.

Some may prefer the HT search inside interface to Google's, so search inside is not suppressed when Google is present. You can turn off HT search inside entirely if you like.

For HT records representing one volume of several, an :excerpts type response will be added if full text is available for some volumes, or a :highlighted_link if only search inside is available. Set config show_multi_volume=false to prevent this and ignore partial volumes.

Two values are possible for sdr rights: “full” or “searchonly”. A third possibility is that sdr will be null.

An ISBN with search-only: 0195101464

Constant Summary

Constants inherited from Service

Service::LinkOutFilterTask, Service::StandardTask

Instance Attribute Summary

Attributes inherited from Service

#name, #priority, #request, #service_id, #session_id, #status, #task

Instance Method Summary

Methods included from MetadataHelper

#get_doi, #get_gpo_item_nums, #get_identifier, #get_isbn, #get_issn, #get_lccn, #get_oclcnum, #get_pmid, #get_search_creator, #get_search_terms, #get_search_title, #get_sudoc, #get_top_level_creator, #get_year, #normalize_lccn, #normalize_title, #raw_search_title, #title_is_serial?

Methods included from MarcHelper

#add_856_links, #edition_statement, #get_title, #get_years, #gmd_values, #service_type_for_856, #should_skip_856_link?, #strip_gmd

Methods inherited from Service

#credits, #handle_wrapper, #link_out_filter, #preempted_by, required_config_params, #response_to_view_data, #view_data_from_service_type

Constructor Details

#initialize(config) ⇒ HathiTrust

Returns a new instance of HathiTrust.



# File 'lib/service_adaptors/hathi_trust.rb', line 50

def initialize(config)
  @api_url = 'http://catalog.hathitrust.org/api/volumes'
  # Set to 'https://babel.hathitrust.org/shcgi/' to force
  # Shibboleth login, possibly in concert with EZProxy providing
  # WAYFLess login. 
  @direct_link_base = 'http://babel.hathitrust.org/cgi/'
  @display_name = 'HathiTrust'
  @num_full_views = 1 # max num full view links to include
  @note =  '' #'Fulltext books from the University of Michigan'
  @show_search_inside = true
  @show_multi_volume = true
  
  @credits = {
    "HathiTrust" => "http://www.hathitrust.org"
  }
  
  super(config)
end

Instance Attribute Details

#display_name ⇒ Object (readonly)

Returns the value of attribute display_name.



# File 'lib/service_adaptors/hathi_trust.rb', line 41

def display_name
  @display_name
end

#note ⇒ Object (readonly)

Returns the value of attribute note.



# File 'lib/service_adaptors/hathi_trust.rb', line 41

def note
  @note
end

#url ⇒ Object (readonly)

Returns the value of attribute url.



# File 'lib/service_adaptors/hathi_trust.rb', line 41

def url
  @url
end

Instance Method Details

#create_fulltext_service_response(request, items) ⇒ Object



# File 'lib/service_adaptors/hathi_trust.rb', line 149

def create_fulltext_service_response(request, items)
  return nil if items.empty?
  
  display_name = @display_name
  count = 0
  
  items.each do |item|         
    next if is_serial_part?(item)
    
    
    next unless full_view?(item)
    
    request.add_service_response(
        :service=>self, 
        :display_text=>display_name,           
        :url=> direct_url_to(item), 
        :notes=> note_for(item), 
        :service_type_value => :fulltext 
    )
    count += 1
    break if count == @num_full_views
  end   
  return count
end
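The selection logic in this method (skip serial parts, keep only full view items, stop at the configured maximum) can be sketched without the Umlaut request object. This is a stand-alone illustration, not the adaptor's code: the helper name pick_full_view_items and the sample items are made up, only the keys "enumcron" and "usRightsString" come from the source above, and it assumes a maximum of at least 1.

```ruby
# Hypothetical stand-alone version of the selection loop above.
def pick_full_view_items(items, max_links)
  items
    .reject { |item| item["enumcron"] }                       # skip serial parts
    .select { |item| item["usRightsString"] == "Full view" }  # keep full text only
    .first(max_links)                                         # cap the number of links
end

# Made-up items; any rights string other than "Full view" is treated as not full text.
items = [
  { "htid" => "aaa.1", "enumcron" => "v.1", "usRightsString" => "Full view" },
  { "htid" => "aaa.2", "enumcron" => false, "usRightsString" => "Limited (search-only)" },
  { "htid" => "aaa.3", "enumcron" => false, "usRightsString" => "Full view" },
  { "htid" => "aaa.4", "enumcron" => false, "usRightsString" => "Full view" }
]

puts pick_full_view_items(items, 1).map { |item| item["htid"] }.inspect
```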

#create_partial_volume_responses(request, ht_json) ⇒ Object

If HT has partial serial volumes, include a link to them. The complete HT JSON response must be passed in.



# File 'lib/service_adaptors/hathi_trust.rb', line 177

def create_partial_volume_responses(request, ht_json)
  items =  ht_json.values.first["items"]
  full_ids = items.collect do |i| 
    i["fromRecord"] if (is_serial_part?(i) && full_view?(i))
  end.compact.uniq
  
  full_ids.each do |recordId|
    record = ht_json.values.first["records"][recordId]
    next unless record && record["recordURL"]
      
    request.add_service_response(
        :service=>self, 
        :display_text=>@display_name,           
        :url=> record["recordURL"],
        :notes => excerpt_note_for(record),
        :service_type_value => :excerpts
    )
  end
  
  if full_ids.empty?
    search_ids = items.collect do |i|
      i["fromRecord"] if (is_serial_part?(i) )
    end.compact.uniq
    
    search_ids.each do |recordId|
      record = ht_json.values.first["records"][recordId]
      next unless record && record["recordURL"]
      
      request.add_service_response(
          :service=>self, 
          :display_text=>"Search inside some volumes",           
          :url=> record["recordURL"],
          :service_type_value => :highlighted_link             
      )   

    end
    
  end
  
  
end
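The two-tier fallback in this method can be sketched on plain hashes: record ids with at least one full view serial part produce :excerpts, and only if there are none do records with any serial part produce :highlighted_link. The names serial_part?, partial_volume_plan and the sample data are assumptions for illustration; the keys are the ones the source above reads.

```ruby
# Stand-alone predicates mirroring is_serial_part? and full_view? from this class.
def serial_part?(item)
  item["enumcron"] ? true : false
end

def full_view?(item)
  item["usRightsString"] == "Full view"
end

# Returns [service_type, record_ids]: :excerpts when any serial part has
# full text, otherwise :highlighted_link for records with any serial part.
def partial_volume_plan(items)
  parts = items.select { |item| serial_part?(item) }
  full_ids = parts.select { |item| full_view?(item) }
                  .map { |item| item["fromRecord"] }.compact.uniq
  return [:excerpts, full_ids] unless full_ids.empty?

  [:highlighted_link, parts.map { |item| item["fromRecord"] }.compact.uniq]
end

# Made-up items for illustration.
items = [
  { "fromRecord" => "rec1", "enumcron" => "v.1", "usRightsString" => "Full view" },
  { "fromRecord" => "rec1", "enumcron" => "v.2", "usRightsString" => "Limited (search-only)" },
  { "fromRecord" => "rec2", "enumcron" => "v.1", "usRightsString" => "Limited (search-only)" }
]

puts partial_volume_plan(items).inspect          # full text exists for rec1
puts partial_volume_plan(items.drop(1)).inspect  # search inside only
```

As in the method above, a record that only offers search inside gets no link when another record already has full text for some volumes.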

#create_search_inside(request, items) ⇒ Object



# File 'lib/service_adaptors/hathi_trust.rb', line 219

def create_search_inside(request, items)
  return if items.empty?

  # Can only include search from the first one  
  # There's search inside for _any_ HT item. We think. 
  item = items.first
  
  # if this is a serial, we don't want to search inside just part of it, forget it
  return if is_serial_part?(item) 
  
  direct_url = search_url_to(item)
  return unless direct_url

  request.add_service_response( 
      :service => self,
      :display_text=>@display_name,
      :url=> direct_url,
      :service_type_value => :search_inside
     )
end

#direct_url_to(item_json) ⇒ Object



# File 'lib/service_adaptors/hathi_trust.rb', line 240

def direct_url_to(item_json)
  if @direct_link_base
    # we're constructing our own link because we need our EZProxy
    # to recognize it for WAYFLess login, which it won't if we use
    # the handle.net url, sorry. 
    # We also need direct link for direct link to search results.
    @direct_link_base + "pt?id=" + CGI.escape(item_json['htid'])
  else
    item_json['itemURL']
  end
end

#do_query(params) ⇒ Object

Conducts the query and parses the JSON.



# File 'lib/service_adaptors/hathi_trust.rb', line 143

def do_query(params)        
  link = @api_url + "/brief/json/" + params
  return MultiJson.load( open(link).read )
end
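The request URL and the way the response is unpacked can be sketched without touching the network. This illustration substitutes the standard library's JSON for MultiJson and parses a canned string instead of fetching; the helper names brief_json_url and first_group_items, and the sample body, are assumptions made for the example.

```ruby
require "json"

API_URL = "http://catalog.hathitrust.org/api/volumes"

# Same URL shape as do_query: <api_url>/brief/json/<params>
def brief_json_url(params)
  "#{API_URL}/brief/json/#{params}"
end

# Parses a response body and returns the "items" list of the first
# result group, or an empty array when there is no group.
def first_group_items(body)
  group = JSON.parse(body).values.first
  group ? Array(group["items"]) : []
end

# Made-up response body containing only the keys used on this page.
sample_body = '{"isbn:0195101464":{"records":{},"items":[{"htid":"xxx.1"}]}}'

puts brief_json_url("isbn:0195101464")
puts first_group_items(sample_body).inspect
```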

#excerpt_note_for(record) ⇒ Object



# File 'lib/service_adaptors/hathi_trust.rb', line 260

def excerpt_note_for(record)
  return nil unless record["titles"].kind_of?(Array)
  "Some volumes of: #{record["titles"].first}"
end

#full_view?(item) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/service_adaptors/hathi_trust.rb', line 272

def full_view?(item)
  item["usRightsString"] == "Full view"
end

#get_bibkey_parameters(rft) {|isbn, lccn, oclcnum| ... } ⇒ Object

Takes a referent and a block for parameter creation. The block receives isbn, lccn, and oclcnum, and is responsible for formatting the parameters for the particular service. FIXME: consider moving this into metadata_helper.

Yields:

  • (isbn, lccn, oclcnum)


# File 'lib/service_adaptors/hathi_trust.rb', line 127

def get_bibkey_parameters(rft)
  # filter out special chars that ought not to be in there anyway,
  # and that HathiTrust barfs on. 
  isbn = get_identifier(:urn, "isbn", rft)
  isbn = isbn.gsub(/[\-\[\]]/, '') unless isbn.blank?
  
  oclcnum = get_identifier(:info, "oclcnum", rft)
  oclcnum = oclcnum.gsub(/[\-\[\]]/, '') unless oclcnum.blank?
  
  lccn = get_lccn(rft)
  lccn = lccn.gsub(/[\-\[\]]/, '') unless lccn.blank?
      
  yield(isbn, lccn, oclcnum)    
end
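The cleanup applied to each identifier is the same regular expression, which removes hyphens and square brackets. A minimal sketch, using a hypothetical helper name strip_bibkey_noise and plain nil/empty checks in place of Rails' blank?:

```ruby
# Removes "-", "[" and "]" from an identifier; nil and empty values
# are returned unchanged, as in the method above.
def strip_bibkey_noise(value)
  return value if value.nil? || value.empty?

  value.gsub(/[\-\[\]]/, "")
end

puts strip_bibkey_noise("0-19-510146-4") # hyphenated form of the ISBN from the overview
puts strip_bibkey_noise("[12345678]")    # made-up bracketed number
```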

#get_parameters(rft) ⇒ Object

Just a wrapper around get_bibkey_parameters.



# File 'lib/service_adaptors/hathi_trust.rb', line 103

def get_parameters(rft)
  # API supports oclcnum, isbn, or lccn, and can provide more than one of each. 
  get_bibkey_parameters(rft) do |isbn, lccn, oclcnum|         
    keys = Array.new
                
    keys << "oclc:" + CGI.escape(oclcnum) unless oclcnum.blank?    
    keys <<  "lccn:" + CGI.escape(lccn) unless lccn.blank?
    # Only include ISBN if we have it and we do NOT have oclc or lccn,
    # Bill Dueber's advice for best matching. HT api will only match
    # if ALL the id's we supply match. 
    keys << "isbn:" + CGI.escape(isbn) unless (isbn.blank? || keys.length > 0)

    if keys.length > 0        
      return keys.join(";")
    else
      return nil
    end
  end
end
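The key-building rule (prefer oclc and lccn, fall back to isbn only when neither is present) can be tried in isolation. This sketch assumes a hypothetical helper ht_bibkeys with keyword arguments and a small blank_value? check standing in for Rails' blank?; the identifier values other than the ISBN are made up.

```ruby
require "cgi"

# Stand-in for Rails' blank?: true for nil or whitespace-only strings.
def blank_value?(value)
  value.nil? || value.strip.empty?
end

# Builds the ";"-joined key string, or nil when no identifier is usable.
def ht_bibkeys(isbn: nil, lccn: nil, oclcnum: nil)
  keys = []
  keys << "oclc:#{CGI.escape(oclcnum)}" unless blank_value?(oclcnum)
  keys << "lccn:#{CGI.escape(lccn)}" unless blank_value?(lccn)
  # ISBN is only a fallback, since the API matches only if ALL supplied ids match.
  keys << "isbn:#{CGI.escape(isbn)}" if keys.empty? && !blank_value?(isbn)
  keys.empty? ? nil : keys.join(";")
end

puts ht_bibkeys(isbn: "0195101464").inspect
puts ht_bibkeys(isbn: "0195101464", oclcnum: "12345").inspect
puts ht_bibkeys(oclcnum: "12345", lccn: "98012345").inspect
```

Dropping the ISBN whenever another identifier is available keeps a mismatched ISBN from turning a good oclc or lccn match into no match at all.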

#handle(request) ⇒ Object



# File 'lib/service_adaptors/hathi_trust.rb', line 69

def handle(request)
  params = get_parameters(request.referent)
  return request.dispatched(self, true) if params.blank?
  
  ht_json = do_query(params)
  return request.dispatched(self, true) if ht_json.nil?
  
  #extract the "items" list from the first result group from
  #response.
  first_group = ht_json.values.first    
  items = first_group["items"]
  
  
  
  # Only add fulltext if we're not skipping due to GBS
  if ( preempted_by(request, "fulltext"))
    Rails.logger.debug("#{self.class}: Skipping due to pre-emption")
  else
    full_views_shown = create_fulltext_service_response(request, items)
  end
  
  if @show_multi_volume
    #possibly partial volumes
    create_partial_volume_responses(request, ht_json)
  end

  

  create_search_inside(request, items)
      
  return request.dispatched(self, true)
end

#is_serial_part?(item) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/service_adaptors/hathi_trust.rb', line 265

def is_serial_part?(item)
  # if it's got enumCron, then it's just part of a serial,
  # we don't want to say the serial title as a whole has full text
  # or can be searched, skip it. 
  return item['enumcron']
end

#note_for(item) ⇒ Object



# File 'lib/service_adaptors/hathi_trust.rb', line 252

def note_for(item)
  if item['orig']
    "Digitized from #{item['orig']}"
  else
    nil
  end
end

#response_url(service_response, submitted_params) ⇒ Object

Handles search_inside responses by appending the user-entered query to the stored URL.



# File 'lib/service_adaptors/hathi_trust.rb', line 288

def response_url(service_response, submitted_params)
  if ( ! (service_response.service_type_value.name == "search_inside" ))
    return super(service_response, submitted_params)
  else
    base = service_response[:url]      
    query = CGI.escape(submitted_params["query"] || "")
    url = base + "&q1=#{query}"

    return url
  end
end
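The search inside link is the stored "ptsearch?id=..." URL with the user's query appended as q1. A minimal sketch of that step, assuming a hypothetical helper search_inside_url, a plain hash for the submitted parameters, and a made-up volume id:

```ruby
require "cgi"

# Appends the escaped user query as the q1 parameter; a missing
# query becomes an empty q1, as in the method above.
def search_inside_url(base_url, submitted_params)
  query = CGI.escape(submitted_params["query"] || "")
  "#{base_url}&q1=#{query}"
end

# Made-up base URL in the shape produced by search_url_to.
base = "http://babel.hathitrust.org/cgi/ptsearch?id=mdp.39015012345678"

puts search_inside_url(base, "query" => "natural selection")
puts search_inside_url(base, {})
```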

#search_url_to(item_json) ⇒ Object



# File 'lib/service_adaptors/hathi_trust.rb', line 276

def search_url_to(item_json)
  if @direct_link_base
    @direct_link_base + "ptsearch?id=" + CGI.escape(item_json['htid'])
  else
    return nil
  end
end

#service_types_generated ⇒ Object



# File 'lib/service_adaptors/hathi_trust.rb', line 43

def service_types_generated    
  types = [ ServiceTypeValue[:fulltext] ]
  types.concat([ServiceTypeValue[:excerpts], ServiceTypeValue[:highlighted_link]]) if @show_multi_volume
  types << ServiceTypeValue[:search_inside] if @show_search_inside
  return types
end