Module: MarcHelper
- Included in:
- Blacklight, Hip3Service, HipHoldingSearch
- Defined in:
- lib/marc_helper.rb
Instance Method Summary
-
- (Object) add_856_links(request, marc_records, options = {})
Takes an array of ruby MARC objects, adds ServiceResponses for the 856 links contained.
-
- (Object) edition_statement(marc, options = {})
From a marc record, get a string useful to display for identifying which edition/version of a work this represents.
-
- (Object) get_title(marc)
Take the title out of a marc record.
-
- (Object) get_years(marc)
A MARC record has two dates in it, date1 and date2.
-
- (Object) gmd_values
AACR2 "General Material Designation".
-
- (Object) service_type_for_856(field, options)
Take a ruby MARC Field object representing an 856 field, decide what Umlaut service type value to map it to.
-
- (Boolean) should_skip_856_link?(request, marc_record, url)
Used by #add_856_links.
-
- (Object) strip_gmd(arg_string, options = {})
Removes something that looks like an AACR2 GMD in square brackets from the string.
Instance Method Details
- (Object) add_856_links(request, marc_records, options = {})
Takes an array of ruby MARC objects, adds ServiceResponses for the 856 links contained. Returns a hash of arrays of ServiceResponse objects added, keyed by service type value string.
    # File 'lib/marc_helper.rb', line 9

    def add_856_links(request, marc_records, options = {})
      options[:default_service_type] ||= "fulltext"
      options[:match_reliability] ||= ServiceResponse::MatchExact

      responses_added = Hash.new

      # Keep track of urls to avoid putting the exact same url in twice
      urls_seen = Array.new

      marc_records.each do |marc_xml|
        marc_xml.find_all {|f| '856' === f.tag}.each do |field|
          # Might have more than one $u, in which case we want to
          # possibly add each of them. Might have 0 $u in which case
          # we skip.
          field.subfields.find_all {|sf| sf.code == 'u'}.each do |sf|
            url = sf.value

            # Already got it from another catalog record?
            next if urls_seen.include?(url)

            # Trying to avoid duplicates with SFX/link resolver.
            next if should_skip_856_link?(request, marc_xml, url)

            urls_seen.push(url)

            display_name = nil
            if field['y']
              display_name = field['y']
            else
              # okay let's try taking just the domain from the url
              begin
                u_obj = URI::parse( url )
                display_name = u_obj.host
              rescue URI::InvalidURIError
                # Unparseable url, fall through to the whole-url case below.
              end
              # Okay, can't parse out a domain, whole url then.
              display_name = url if display_name.nil?
            end
            # But if we've got a $3, the closest MARC comes to a field
            # that explains what this actually IS, use that too please.
            display_name = field['3'] + ' from ' + display_name if field['3']

            # Build the response.
            response_params = {:service=>self, :display_text=>display_name, :url=>url}

            # get all those $z subfields and put em in notes.
            response_params[:notes] =
              field.subfields.collect {|f| f.value if (f.code == 'z') }.compact.join('; ')

            # subfield 3 is being used for OCA records loaded in our catalog.
            # subfield 3 is in fact some kind of coverage note, usually
            is_journal = (marc_xml.leader[7,1] == 's')
            unless ( field['3'] || ! is_journal )
              response_params[:notes] += "; " unless response_params[:notes].blank?
              response_params[:notes] += "Dates of coverage unknown."
            end

            unless ( options[:match_reliability] == ServiceResponse::MatchExact )
              response_params[:match_reliability] = options[:match_reliability]
              response_params[:edition_str] = edition_statement(marc_xml)
            end

            # Figure out the right service type value for this, fulltext, ToC,
            # whatever.
            service_type_value = service_type_for_856( field, options )

            # fulltext urls from MARC are always marked as specially stupid.
            response_params[:coverage_checked] = false
            response_params[:can_link_to_article] = false

            # Some debugging info, add the 001 bibID if we have one.
            response_params[:debug_info] = "BibID: #{marc_xml['001'].value}" if marc_xml['001']

            # Add the response
            response = request.add_service_response(response_params,
                                                    [ service_type_value ])

            responses_added[service_type_value] ||= Array.new
            responses_added[service_type_value].push(response)
          end
        end
      end
      return responses_added
    end
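The link label logic in add_856_links can be looked at in isolation. The following is a minimal sketch, not the module's own code: the method name display_name_for is hypothetical, and a plain Hash of subfield code to value stands in for a real MARC 856 field. It prefers $y, falls back to the URL's host, then to the whole URL, and prefixes $3 when present.

```ruby
require 'uri'

# Hypothetical helper: `field` is a Hash stand-in for an 856 field,
# mapping subfield code (String) to its value.
def display_name_for(field, url)
  name = field['y']
  if name.nil?
    begin
      # Try taking just the host from the URL.
      name = URI.parse(url).host
    rescue URI::InvalidURIError
      name = nil
    end
    # No host could be extracted, so use the whole URL.
    name = url if name.nil?
  end
  # A $3 says what the link actually is, so put it in front.
  name = "#{field['3']} from #{name}" if field['3']
  name
end
```

For example, a field with only a $3 of "Table of contents" and a loc.gov URL would be labelled with the $3 text followed by the host name.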
- (Object) edition_statement(marc, options = {})
From a marc record, get a string useful to display for identifying which edition/version of a work this represents.
    # File 'lib/marc_helper.rb', line 188

    def edition_statement(marc, options = {})
      # Default to true only when the caller did not pass the key at all;
      # `||= true` would also overwrite an explicit false.
      options[:include_repro_info] = true unless options.key?(:include_repro_info)
      options[:exclude_533_fields] = ['7','f','b', 'e']

      parts = Array.new

      return "" unless marc

      #245$h GMD
      unless ( marc['245'].blank? || marc['245']['h'].blank? )
        parts.push('(' + marc['245']['h'].gsub(/[^\w\s]/, '').strip.titlecase + ')')
      end

      #250
      if ( marc['250'])
        parts.push( marc['250']['a'] ) unless marc['250']['a'].blank?
        parts.push( marc['250']['b'] ) unless marc['250']['b'].blank?
      end

      # 260
      if ( marc['260'])
        if (marc['260']['b'] =~ /s\.n\./)
          parts.push(marc['260']['a']) unless marc['260']['a'].blank?
        else
          parts.push(marc['260']['b']) unless marc['260']['b'].blank?
        end
        parts.push( marc['260']['c'] ) unless marc['260']['c'].blank?
      end

      # 533
      if options[:include_repro_info] && marc['533']
        marc['533'].subfields.each do |s|
          if ( s.code == 'a' )
            parts.push('<em>' + s.value.gsub(/[^\w\s]/, '') + '</em>:' )
          elsif (! options[:exclude_533_fields].include?( s.code ))
            parts.push(s.value)
          end
        end
      end

      return nil if parts.length == 0
      return parts.join(' ')
    end
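A boolean option such as :include_repro_info that should default to true needs some care, because `||=` treats an explicit false the same as a missing key. The sketch below contrasts the two approaches; both method names are hypothetical and exist only for illustration.

```ruby
# Hypothetical: defaulting with ||= silently turns an explicit false into true.
def with_or_equals(options = {})
  options[:include_repro_info] ||= true
  options[:include_repro_info]
end

# Hypothetical: merging over a defaults hash respects an explicit false.
def with_defaults_merge(options = {})
  options = { include_repro_info: true }.merge(options)
  options[:include_repro_info]
end
```

Merging onto a defaults hash also avoids mutating the hash the caller passed in, which may or may not matter to a given caller.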
- (Object) get_title(marc)
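Take the title out of a MARC record.
    # File 'lib/marc_helper.rb', line 181

    def get_title(marc)
      # Join 245 $a, $b and $k, then drop trailing ISBD punctuation.
      # Subfields expose #value, and String#sub needs a replacement argument.
      marc['245'].find_all {|sf| sf.code == "a" || sf.code == "b" || sf.code == "k"}.
        collect {|sf| sf.value}.join(" ").sub(/\s*[;:\/.,]\s*$/, '')
    end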
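To see what the trailing-punctuation cleanup in get_title does, here is a small sketch that runs without a MARC library. DemoSubfield and title_from_245 are hypothetical stand-ins, assuming only that a subfield has a code and a value.

```ruby
# Hypothetical stand-in for a MARC subfield.
DemoSubfield = Struct.new(:code, :value)

# Join $a, $b and $k, then strip one trailing ISBD punctuation mark.
def title_from_245(subfields)
  subfields.select { |sf| %w[a b k].include?(sf.code) }
           .map(&:value)
           .join(' ')
           .sub(/\s*[;:\/.,]\s*$/, '')
end
```

Only punctuation at the very end of the joined string is removed, so a colon between $a and $b is kept.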
- (Object) get_years(marc)
A MARC record has two dates in it, date1 and date2. Exactly what they represent is something of an esoteric mystery. But this will return them both, in an array.
    # File 'lib/marc_helper.rb', line 163

    def get_years(marc)
      array = []

      # no marc 008? Weird, but okay.
      return array unless marc['008']

      date1 = marc['008'].value[7,4]
      date1.strip! if date1
      array.push(date1) unless date1.blank?

      date2 = marc['008'].value[11,4]
      date2.strip! if date2
      array.push(date2) unless date2.blank?

      return array
    end
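The two dates live at fixed character positions in the 008 control field: date1 at positions 07-10 and date2 at positions 11-14. A minimal sketch of the same slicing on a bare string, with a hypothetical method name and made-up 008 values:

```ruby
# Hypothetical helper: takes the raw 008 value as a String and returns
# whichever of date1 / date2 are present and non-blank.
def years_from_008(value)
  [value[7, 4], value[11, 4]]
    .map { |date| date.to_s.strip } # nil (short string) becomes ""
    .reject(&:empty?)
end
```

A value such as '850423m19401945nyu' carries both dates, while '850423s1940    nyu' has a blank date2 and yields only the first.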
- (Object) gmd_values
AACR2 "General Material Designation". While these are (I think?) controlled, it's actually really hard to find the list. Maybe they're only semi-controlled. ONE list can be found here: www.oclc.org/bibformats/en/onlinecataloging/default.shtm#BCGFECEG
    # File 'lib/marc_helper.rb', line 237

    def gmd_values
      # 'computer file' is an old one that may still be found in data.
      return ['activity card', 'art original', 'art reproduction', 'braille',
              'chart', 'diorama', 'electronic resource', 'computer file',
              'filmstrip', 'flash card', 'game', 'globe', 'kit', 'manuscript',
              'map', 'microform', 'microscope slides', 'model', 'motion picture',
              'music', 'picture', 'realia', 'slide', 'sound recording',
              'technical drawing', 'text', 'toy', 'transparency', 'videorecording']
    end
- (Object) service_type_for_856(field, options)
Take a ruby MARC Field object representing an 856 field, decide what Umlaut service type value to map it to. Fulltext, ToC, etc. This is necessarily a heuristic guess; MARC doesn't have enough granularity to really let us know for sure.
    # File 'lib/marc_helper.rb', line 137

    def service_type_for_856(field, options)
      options[:default_service_type] ||= "fulltext_title_level"

      # LC records here at hopkins have "Table of contents only" in the 856$3
      # Think that's a convention from LC?
      if (field['3'] && field['3'].downcase =~ /table of contents( only)?/)
        return "table_of_contents"
      elsif (field['3'] && field['3'].downcase =~ /description/)
        # If it contains the word 'description', it's probably an abstract.
        # That's the best we can do, sadly.
        return "abstract"
      elsif (field['3'] && field['3'].downcase == 'sample text')
        # LC records often include these links.
        return "excerpts"
      elsif ( field['u'] =~ /www\.loc\.gov/ )
        # Any other loc.gov link, we know it's not full text, don't put
        # it in full text field, put it as "see also".
        return "highlighted_link"
      else
        return options[:default_service_type]
      end
    end
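The heuristic is easiest to follow with a few concrete inputs. This sketch mirrors the branching order above but is not the module's code: guess_service_type is a hypothetical name, and a Hash of subfield code to value stands in for the MARC field.

```ruby
# Hypothetical helper: `field` is a Hash stand-in for an 856 field.
def guess_service_type(field, default = 'fulltext_title_level')
  label = field['3'].to_s.downcase
  if label.match?(/table of contents( only)?/)
    'table_of_contents'
  elsif label.match?(/description/)
    'abstract'          # "description" most likely means an abstract
  elsif label == 'sample text'
    'excerpts'
  elsif field['u'].to_s.match?(/www\.loc\.gov/)
    'highlighted_link'  # other loc.gov links are treated as "see also"
  else
    default
  end
end
```

Note that the $3 checks come first, so a loc.gov URL with a $3 of "Table of contents only" maps to table_of_contents rather than highlighted_link.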
- (Boolean) should_skip_856_link?(request, marc_record, url)
Used by #add_856_links. Complicated logic to try and avoid presenting a URL from the catalog that duplicates what SFX does, but present a URL from the catalog when it's really needed.
One reason not to include Catalog links for an article-level citation, even if SFX provided no targets, is maybe SFX provided no targets because SFX knew that the _particular date_ requested is not available. The catalog doesn't know that, but we don't want to show a link from the catalog that SFX really already knew wasn't going to be available.
So:
If this is a journal, skip the URL if it matches in our SfxUrl finder, because that means we think it's an SFX-controlled URL. But if it's not a journal, use it anyway, because it's probably an e-book that is not in SFX, even if it's from a vendor who is in SFX. We use MARC leader byte 7 to tell if it's a journal. Confusing enough? Not yet! Even if it is a journal, if this isn't an article-level cite and no other full text has already been provided, we still include it.
    # File 'lib/marc_helper.rb', line 122

    def should_skip_856_link?(request, marc_record, url)
      is_journal = (marc_record.leader[7,1] == 's')
      return ( is_journal && SfxUrl.sfx_controls_url?(url) &&
               !( request.title_level_citation? &&
                  request.get_service_type("fulltext").length == 0 ) )
    end
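Since the rule combines four conditions, it can help to restate it as a pure function and walk through the cases. The sketch below assumes the four inputs have already been computed; skip_catalog_link? and its keyword names are hypothetical and only mirror the boolean expression above.

```ruby
# Hypothetical restatement of the skip rule as a pure function.
def skip_catalog_link?(is_journal:, sfx_controls_url:, title_level:, fulltext_count:)
  is_journal && sfx_controls_url && !(title_level && fulltext_count.zero?)
end

# Each row: the four inputs, then the resulting decision.
cases = [
  { is_journal: false, sfx_controls_url: true,  title_level: false, fulltext_count: 0 }, # e-book: keep
  { is_journal: true,  sfx_controls_url: false, title_level: false, fulltext_count: 0 }, # not an SFX url: keep
  { is_journal: true,  sfx_controls_url: true,  title_level: false, fulltext_count: 0 }, # article-level cite: skip
  { is_journal: true,  sfx_controls_url: true,  title_level: true,  fulltext_count: 0 }, # title-level, no full text yet: keep
  { is_journal: true,  sfx_controls_url: true,  title_level: true,  fulltext_count: 2 }  # title-level, full text exists: skip
]
cases.each { |c| puts "#{c.inspect} => skip? #{skip_catalog_link?(**c)}" }
```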
- (Object) strip_gmd(arg_string, options = {})
Removes something that looks like an AACR2 GMD in square brackets from the string. Pretty kludgey.
    # File 'lib/marc_helper.rb', line 245

    def strip_gmd(arg_string, options = {})
      options[:replacement] ||= ':'

      gmd_values.each do |gmd_val|
        # Optional qualifier after the GMD, e.g. "[text (large print)]".
        arg_string = arg_string.sub(/\[#{gmd_val}( \((tactile|braille|large print)\))?\]/, options[:replacement])
      end
      return arg_string
    end
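A short sketch of the same substitution on sample strings. GMD_SAMPLE is a hypothetical three-item subset of the gmd_values list, strip_gmd_sketch is a hypothetical name, and the replacement is passed positionally here rather than through an options hash.

```ruby
# Hypothetical subset of the GMD list, enough for the examples.
GMD_SAMPLE = ['electronic resource', 'sound recording', 'text'].freeze

# Replace a bracketed GMD (with optional qualifier) by `replacement`.
def strip_gmd_sketch(title, replacement = ':')
  GMD_SAMPLE.reduce(title) do |str, gmd|
    str.sub(/\[#{Regexp.escape(gmd)}( \((tactile|braille|large print)\))?\]/, replacement)
  end
end
```

The default replacement of ':' reflects that in a 245 field the GMD typically sits where the punctuation between title and subtitle would otherwise go; passing an empty string removes it outright.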