Class: HathiTrust
- Includes:
- MetadataHelper
- Defined in:
- app/service_adaptors/hathi_trust.rb
Overview
Service that searches HathiTrust, from the University of Michigan.
Supports full-text links and search inside.
We link to HathiTrust using a direct babel.hathitrust.org URL instead of the handle.net redirection, for two reasons:
1) The handle.net redirection can't be used for the "direct link to search results for user-entered query" feature.
2) Some may want to force a Shibboleth login on HT links, which the handle.net redirection can't do either. If you do want to do that, possibly in concert with an EZProxy-mediated WAYFless login, set direct_link_base in your services.yml to:
"https://babel.hathitrust.org/shcgi/"
Many (but not all) HT books will also be in Google Books (and vice versa). However, HT was more generous than GBS in deciding which books are public domain. The main expected use case is therefore to run this service alongside Google Books, with HT at a lower priority, using the preempted_by config.
Some may prefer the HT search inside interface to Google's, so search inside is not suppressed by the presence of Google. You can turn off HT search inside entirely if you like.
For HT records representing one volume of several, an :excerpts type response will be added if full text is available for some volumes, or a :highlighted_link if only search inside is available for some. Set config show_multi_volume=false to prevent this and ignore partial volumes.
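The multi-volume rule above can be sketched roughly as follows. This is an illustration only, not the adaptor's code: the helper name `multi_volume_response_type` is hypothetical, and the item hashes are made-up sample data that reuse the `enumcron` and `usRightsString` keys read by `#is_serial_part?` and `#full_view?`.

```ruby
# Hypothetical helper: which multi-volume response type would apply
# to a list of HT item hashes, following the rule described above.
def multi_volume_response_type(items, show_multi_volume: true)
  return nil unless show_multi_volume

  # Items carrying an enumcron value are treated as parts of a larger work.
  parts = items.select { |item| item["enumcron"] }
  return nil if parts.empty?

  if parts.any? { |item| item["usRightsString"] == "Full view" }
    :excerpts          # full text is available for at least one volume
  else
    :highlighted_link  # only search inside is available
  end
end

# Made-up sample items, for illustration only.
sample_items = [
  { "htid" => "example.001", "enumcron" => "v.1", "usRightsString" => "Full view" },
  { "htid" => "example.002", "enumcron" => "v.2", "usRightsString" => "Limited (search-only)" }
]
puts multi_volume_response_type(sample_items).inspect
```

Unlike this sketch, the real `#create_partial_volume_responses` adds one response per distinct record id rather than returning a single symbol.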
The sdr rights value has two possible settings, "full" or "searchonly". A third possibility is that sdr is null.
An ISBN with search-only: 0195101464
Constant Summary
Constants inherited from Service
Service::LinkOutFilterTask, Service::StandardTask
Instance Attribute Summary
- #display_name ⇒ Object (readonly)
  Returns the value of attribute display_name.
- #note ⇒ Object (readonly)
  Returns the value of attribute note.
- #url ⇒ Object (readonly)
  Returns the value of attribute url.
Attributes inherited from Service
#group, #name, #priority, #request, #service_id, #status, #task
Instance Method Summary
- #create_fulltext_service_response(request, items) ⇒ Object
- #create_partial_volume_responses(request, ht_json) ⇒ Object
  If HT has partial serial volumes, include a link to them.
- #create_search_inside(request, items) ⇒ Object
- #direct_url_to(item_json) ⇒ Object
- #do_query(params) ⇒ Object
  Conducts the query and parses the JSON.
- #full_view?(item) ⇒ Boolean
- #get_bibkey_parameters(rft) {|isbn, lccn, oclcnum| ... } ⇒ Object
  Takes a referent and a block for parameter creation. The block receives isbn, lccn, and oclcnum, and is responsible for formatting the parameters for the particular service. FIXME: consider moving this into metadata_helper.
- #get_parameters(rft) ⇒ Object
  Just a wrapper around get_bibkey_parameters.
- #handle(request) ⇒ Object
- #initialize(config) ⇒ HathiTrust (constructor)
  A new instance of HathiTrust.
- #is_serial_part?(item) ⇒ Boolean
- #response_url(service_response, submitted_params) ⇒ Object
  Handle search_inside.
- #search_url_to(item_json) ⇒ Object
- #service_types_generated ⇒ Object
- #transform_view_data(hash) ⇒ Object
Methods included from MetadataHelper
#get_doi, #get_epage, #get_gpo_item_nums, #get_identifier, #get_isbn, #get_issn, #get_lccn, #get_month, #get_oclcnum, #get_pmid, #get_search_creator, #get_search_terms, #get_search_title, #get_spage, #get_sudoc, #get_top_level_creator, #get_year, #normalize_lccn, #normalize_title, #raw_search_title, title_is_serial?
Methods included from MarcHelper
#add_856_links, #edition_statement, #get_title, #get_years, #gmd_values, #service_type_for_856, #should_skip_856_link?, #strip_gmd
Methods inherited from Service
#credits, #handle_wrapper, #link_out_filter, #preempted_by, required_config_params, #translate
Constructor Details
#initialize(config) ⇒ HathiTrust
Returns a new instance of HathiTrust.
# File 'app/service_adaptors/hathi_trust.rb', line 50

def initialize(config)
  @api_url = 'http://catalog.hathitrust.org/api/volumes'
  # Set to 'https://babel.hathitrust.org/shcgi/' to force
  # Shibboleth login, possibly in concert with EZProxy providing
  # WAYFLess login.
  @direct_link_base = 'http://babel.hathitrust.org/cgi/'
  @display_name = 'HathiTrust'
  @num_full_views = 1 # max num full view links to include
  @note = '' #'Fulltext books from the University of Michigan'
  @show_search_inside = true
  @show_multi_volume = true
  @credits = {
    "HathiTrust" => "http://www.hathitrust.org"
  }

  super(config)
end
Instance Attribute Details
#display_name ⇒ Object (readonly)
Returns the value of attribute display_name.
# File 'app/service_adaptors/hathi_trust.rb', line 41

def display_name
  @display_name
end
#note ⇒ Object (readonly)
Returns the value of attribute note.
# File 'app/service_adaptors/hathi_trust.rb', line 41

def note
  @note
end
#url ⇒ Object (readonly)
Returns the value of attribute url.
# File 'app/service_adaptors/hathi_trust.rb', line 41

def url
  @url
end
Instance Method Details
#create_fulltext_service_response(request, items) ⇒ Object
# File 'app/service_adaptors/hathi_trust.rb', line 148

def create_fulltext_service_response(request, items)
  return nil if items.empty?

  count = 0
  items.each do |item|
    next if is_serial_part?(item)
    next unless full_view?(item)

    request.add_service_response(
      :service => self,
      :display_text => @display_name,
      :display_text_i18n => "display_name",
      :url => direct_url_to(item),
      :add_i18n_notes => "single_volume", # signal for transform_view_data
      :source_for_i18n => item['orig'],
      :service_type_value => :fulltext
    )
    count += 1
    break if count == @num_full_views
  end
  return count
end
#create_partial_volume_responses(request, ht_json) ⇒ Object
If HT has partial serial volumes, include a link to them. The complete HT JSON response must be passed in.

# File 'app/service_adaptors/hathi_trust.rb', line 177

def create_partial_volume_responses(request, ht_json)
  items = ht_json.values.first["items"]

  full_ids = items.collect do |i|
    i["fromRecord"] if (is_serial_part?(i) && full_view?(i))
  end.compact.uniq

  full_ids.each do |recordId|
    record = ht_json.values.first["records"][recordId]
    next unless record && record["recordURL"]

    record_title = record["titles"].first if record["titles"].kind_of?(Array)

    request.add_service_response(
      :service => self,
      :display_text => @display_name,
      :display_text_i18n => "display_name",
      :url => record["recordURL"],
      :add_i18n_notes => "partial_volume", # signal for transform_view_data
      :title_for_i18n => record_title,
      :service_type_value => :excerpts
    )
  end

  if full_ids.empty?
    search_ids = items.collect do |i|
      i["fromRecord"] if (is_serial_part?(i))
    end.compact.uniq

    search_ids.each do |recordId|
      record = ht_json.values.first["records"][recordId]
      next unless record && record["recordURL"]

      request.add_service_response(
        :service => self,
        :display_text => "Search inside some volumes",
        :display_text_i18n => "search_inside_some_vols",
        :url => record["recordURL"],
        :service_type_value => :highlighted_link
      )
    end
  end
end
#create_search_inside(request, items) ⇒ Object
# File 'app/service_adaptors/hathi_trust.rb', line 224

def create_search_inside(request, items)
  return if items.empty?

  # Can only include search from the first one
  # There's search inside for _any_ HT item. We think.
  item = items.first

  # if this is a serial, we don't want to search inside just part of it, forget it
  return if is_serial_part?(item)

  direct_url = search_url_to(item)
  return unless direct_url

  request.add_service_response(
    :service => self,
    :display_text => @display_name,
    :display_text_i18n => "display_name",
    :url => direct_url,
    :service_type_value => :search_inside
  )
end
#direct_url_to(item_json) ⇒ Object
# File 'app/service_adaptors/hathi_trust.rb', line 246

def direct_url_to(item_json)
  if @direct_link_base
    # we're constructing our own link because we need our EZProxy
    # to recognize it for WAYFLess login, which it won't if we use
    # the handle.net url, sorry.
    # We also need direct link for direct link to search results.
    @direct_link_base + "pt?id=" + CGI.escape(item_json['htid'])
  else
    # The listing referenced an undefined local `item` here; the
    # method parameter is item_json.
    item_json['itemURL']
  end
end
#do_query(params) ⇒ Object
conducts query and parses the JSON
# File 'app/service_adaptors/hathi_trust.rb', line 142

def do_query(params)
  link = @api_url + "/brief/json/" + params
  return MultiJson.load( open(link).read )
end
#full_view?(item) ⇒ Boolean
# File 'app/service_adaptors/hathi_trust.rb', line 276

def full_view?(item)
  item["usRightsString"] == "Full view"
end
#get_bibkey_parameters(rft) {|isbn, lccn, oclcnum| ... } ⇒ Object
Takes a referent and a block for parameter creation. The block receives isbn, lccn, and oclcnum, and is responsible for formatting the parameters for the particular service. FIXME: consider moving this into metadata_helper.

# File 'app/service_adaptors/hathi_trust.rb', line 127

def get_bibkey_parameters(rft)
  # filter out special chars that ought not to be in there anyway,
  # and that HathiTrust barfs on.
  isbn = get_isbn(rft)

  oclcnum = get_identifier(:info, "oclcnum", rft)
  oclcnum = oclcnum.gsub(/[\-\[\]]/, '') unless oclcnum.blank?

  lccn = get_lccn(rft)
  lccn = lccn.gsub(/[\-\[\]]/, '') unless lccn.blank?

  yield(isbn, lccn, oclcnum)
end
#get_parameters(rft) ⇒ Object
just a wrapper around get_bibkey_parameters
# File 'app/service_adaptors/hathi_trust.rb', line 103

def get_parameters(rft)
  # API supports oclcnum, isbn, or lccn, and can provide more than one of each.
  get_bibkey_parameters(rft) do |isbn, lccn, oclcnum|
    keys = Array.new
    keys << "oclc:" + CGI.escape(oclcnum) unless oclcnum.blank?
    keys << "lccn:" + CGI.escape(lccn) unless lccn.blank?

    # Only include ISBN if we have it and we do NOT have oclc or lccn,
    # Bill Dueber's advice for best matching. HT api will only match
    # if ALL the id's we supply match.
    keys << "isbn:" + CGI.escape(isbn) unless (isbn.blank? || keys.length > 0)

    if keys.length > 0
      return keys.join(";")
    else
      return nil
    end
  end
end
#handle(request) ⇒ Object
# File 'app/service_adaptors/hathi_trust.rb', line 69

def handle(request)
  params = get_parameters(request.referent)
  return request.dispatched(self, true) if params.blank?

  ht_json = do_query(params)
  return request.dispatched(self, true) if ht_json.nil?

  #extract the "items" list from the first result group from
  #response.
  first_group = ht_json.values.first
  items = first_group["items"]

  # Only add fulltext if we're not skipping due to GBS
  if ( preempted_by(request, "fulltext"))
    Rails.logger.debug("#{self.class}: Skipping due to pre-emption")
  else
    full_views_shown = create_fulltext_service_response(request, items)
  end

  if @show_multi_volume
    #possibly partial volumes
    create_partial_volume_responses(request, ht_json)
  end

  create_search_inside(request, items)

  return request.dispatched(self, true)
end
#is_serial_part?(item) ⇒ Boolean
# File 'app/service_adaptors/hathi_trust.rb', line 269

def is_serial_part?(item)
  # if it's got enumCron, then it's just part of a serial,
  # we don't want to say the serial title as a whole has full text
  # or can be searched, skip it.
  return item['enumcron']
end
#response_url(service_response, submitted_params) ⇒ Object
Handle search_inside
# File 'app/service_adaptors/hathi_trust.rb', line 292

def response_url(service_response, submitted_params)
  if ( ! (service_response.service_type_value.name == "search_inside" ))
    return super(service_response, submitted_params)
  else
    base = service_response[:url]
    query = CGI.escape(submitted_params["query"] || "")
    url = base + "&q1=#{query}"

    return url
  end
end
#search_url_to(item_json) ⇒ Object
# File 'app/service_adaptors/hathi_trust.rb', line 280

def search_url_to(item_json)
  if @direct_link_base
    @direct_link_base + "ptsearch?id=" + CGI.escape(item_json['htid'])
  else
    return nil
  end
end
#service_types_generated ⇒ Object
# File 'app/service_adaptors/hathi_trust.rb', line 43

def service_types_generated
  types = [ ServiceTypeValue[:fulltext] ]
  types.concat([ServiceTypeValue[:excerpts], ServiceTypeValue[:highlighted_link]]) if @show_multi_volume
  types << ServiceTypeValue[:search_inside] if @show_search_inside
  return types
end
#transform_view_data(hash) ⇒ Object
# File 'app/service_adaptors/hathi_trust.rb', line 258

def transform_view_data(hash)
  if hash[:add_i18n_notes] == "single_volume"
    hash[:notes] = translate("note_for_single_vol", :source => (hash[:source_for_i18n] || ""))
  elsif hash[:add_i18n_notes] == "partial_volume"
    hash[:notes] = translate("note_for_multi_vol", :title => (hash[:title_for_i18n] || ""))
  end

  return hash
end