Module: MarcHelper
- Included in:
- Blacklight, Hip3Service, HipHoldingSearch, MetadataHelper
- Defined in:
- app/mixin_logic/marc_helper.rb
Instance Method Summary
-
#add_856_links(request, marc_records, options = {}) ⇒ Object
Takes an array of Ruby MARC objects and adds ServiceResponses for the 856 links they contain.
-
#edition_statement(marc, options = {}) ⇒ Object
From a MARC record, get a string useful to display for identifying which edition/version of a work this represents.
-
#get_title(marc) ⇒ Object
Take the title out of a MARC record.
-
#get_years(marc) ⇒ Object
A MARC record has two dates in it, date1 and date2.
-
#gmd_values ⇒ Object
AACR2 “General Material Designation”.
-
#service_type_for_856(field, options) ⇒ Object
Take a Ruby MARC field object representing an 856 field, decide what Umlaut service type value to map it to.
-
#should_skip_856_link?(request, marc_record, url) ⇒ Boolean
Used by #add_856_links.
-
#strip_gmd(arg_string, options = {}) ⇒ Object
Removes something that looks like an AACR2 GMD in square brackets from the string.
Instance Method Details
#add_856_links(request, marc_records, options = {}) ⇒ Object
Takes an array of Ruby MARC objects and adds ServiceResponses for the 856 links they contain. Returns a hash of arrays of the ServiceResponse objects added, keyed by service type value string.
# File 'app/mixin_logic/marc_helper.rb', line 9

def add_856_links(request, marc_records, options = {})
  options[:default_service_type] ||= "fulltext"
  options[:match_reliability] ||= ServiceResponse::MatchExact

  responses_added = Hash.new

  # Keep track of urls to avoid putting the exact same url in twice
  urls_seen = Array.new

  marc_records.each do |marc_xml|
    marc_xml.find_all {|f| '856' === f.tag}.each do |field|
      # Might have more than one $u, in which case we want to
      # possibly add each of them. Might have 0 $u in which case
      # we skip.
      field.subfields.find_all {|sf| sf.code == 'u'}.each do |sf|
        url = sf.value

        # Already got it from another catalog record?
        next if urls_seen.include?(url)

        # Trying to avoid duplicates with SFX/link resolver.
        skip = should_skip_856_link?(request, marc_xml, url)
        next if skip

        urls_seen.push(url)

        display_name = nil
        if field['y']
          display_name = field['y']
        else
          # okay let's try taking just the domain from the url
          begin
            u_obj = URI::parse( url )
            display_name = u_obj.host
          rescue StandardError
            # Unparseable URL; fall through to the whole-url fallback.
          end
          # Okay, can't parse out a domain, whole url then.
          display_name = url if display_name.nil?
        end
        # But if we've got a $3, the closest MARC comes to a field
        # that explains what this actually IS, use that too please.
        display_name = field['3'] + ' from ' + display_name if field['3']

        # Build the response.
        response_params = {:service=>self, :display_text=>display_name, :url=>url}

        # get all those $z subfields and put em in notes.
        response_params[:url] = url

        # subfield 3 is being used for OCA records loaded in our catalog.
        response_params[:notes] =
          field.subfields.collect {|f| f.value if (f.code == 'z') }.compact.join('; ')

        is_journal = (marc_xml.leader[7,1] == 's')
        unless ( field['3'] || ! is_journal ) # subfield 3 is in fact some kind of coverage note, usually
          response_params[:notes] += "; " unless response_params[:notes].blank?
          response_params[:notes] += "Dates of coverage unknown."
        end

        unless ( options[:match_reliability] == ServiceResponse::MatchExact )
          response_params[:match_reliability] = options[:match_reliability]
          response_params[:edition_str] = edition_statement(marc_xml)
        end

        # Figure out the right service type value for this, fulltext, ToC,
        # whatever.
        response_params[:service_type_value] = service_type_for_856( field, options )

        # fulltext urls from MARC are always marked as specially stupid.
        response_params[:coverage_checked] = false
        response_params[:can_link_to_article] = false

        # Some debugging info, add the 001 bibID if we have one.
        response_params[:debug_info] = "BibID: #{marc_xml['001'].value}" if marc_xml['001']

        # Add the response
        response = request.add_service_response(response_params)
        responses_added[response_params[:service_type_value]] ||= Array.new
        responses_added[response_params[:service_type_value]].push(response)
      end
    end
  end
  return responses_added
end
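The label fallback used for each link (subfield $y, then the host of the URL, then the raw URL, with $3 prefixed when present) can be tried in isolation. The following is only a minimal sketch of that order, not the module's code: link_display_name and its keyword arguments are hypothetical names, and plain strings stand in for MARC subfield values.

```ruby
require 'uri'

# Hypothetical stand-alone helper mirroring the label fallback in
# add_856_links. Plain strings stand in for MARC subfield values.
def link_display_name(url, link_text: nil, materials: nil)
  name = link_text
  if name.nil?
    begin
      # Fall back to the host part of the URL.
      name = URI.parse(url).host
    rescue URI::InvalidURIError
      name = nil
    end
    # No parsable host: use the whole URL.
    name = url if name.nil?
  end
  # $3 is the closest MARC comes to saying what the link actually is.
  name = "#{materials} from #{name}" if materials
  name
end

puts link_display_name('http://www.example.org/book/1')
puts link_display_name('http://www.example.org/toc', materials: 'Table of contents')
puts link_display_name('not a url')
```

Keeping the fallback in one small function like this makes it easy to check the three cases (explicit link text, parsable URL, unparsable URL) without building a request or a MARC record.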
#edition_statement(marc, options = {}) ⇒ Object
From a MARC record, get a string useful to display for identifying which edition/version of a work this represents.
# File 'app/mixin_logic/marc_helper.rb', line 196

def edition_statement(marc, options = {})
  # Default to true, but respect an explicit false from the caller
  # (`||= true` would silently overwrite a false value).
  options[:include_repro_info] = true unless options.key?(:include_repro_info)
  options[:exclude_533_fields] = ['7','f','b', 'e']

  parts = Array.new

  return "" unless marc

  #245$h GMD
  unless ( marc['245'].blank? || marc['245']['h'].blank? )
    parts.push('(' + marc['245']['h'].gsub(/[^\w\s]/, '').strip.titlecase + ')')
  end

  #250
  if ( marc['250'])
    parts.push( marc['250']['a'] ) unless marc['250']['a'].blank?
    parts.push( marc['250']['b'] ) unless marc['250']['b'].blank?
  end

  # 260
  if ( marc['260'])
    if (marc['260']['b'] =~ /s\.n\./)
      parts.push(marc['260']['a']) unless marc['260']['a'].blank?
    else
      parts.push(marc['260']['b']) unless marc['260']['b'].blank?
    end
    parts.push( marc['260']['c'] ) unless marc['260']['c'].blank?
  end

  # 533
  if options[:include_repro_info] && marc['533']
    marc['533'].subfields.each do |s|
      if ( s.code == 'a' )
        parts.push(s.value.gsub(/[^\w\s]/, '') + ':' )
      elsif (! options[:exclude_533_fields].include?( s.code ))
        parts.push(s.value)
      end
    end
  end

  return nil if parts.length == 0

  return parts.join(' ')
end
#get_title(marc) ⇒ Object
Take the title out of a MARC record.
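# File 'app/mixin_logic/marc_helper.rb', line 189

def get_title(marc)
  # Join 245 $a, $b and $k, then drop trailing ISBD punctuation.
  # String#sub needs a replacement argument; subfields expose #value.
  marc['245'].find_all {|sf| sf.code == "a" || sf.code == "b" || sf.code == "k"}.collect {|sf| sf.value}.join(" ").sub(/\s*[;:\/.,]\s*$/, '')
end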
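The trailing-punctuation cleanup can be seen without a MARC record. This is a sketch under the assumption that the subfield values have already been collected into an array of strings; clean_title is a hypothetical name and is not part of MarcHelper.

```ruby
# Hypothetical helper: join title parts and strip one trailing ISBD
# punctuation mark (; : / . ,) plus surrounding whitespace, using the
# same pattern as get_title.
def clean_title(parts)
  parts.join(' ').sub(/\s*[;:\/.,]\s*$/, '')
end

puts clean_title(['Moby Dick :', 'or, The whale /'])
puts clean_title(['Hamlet.'])
```

Only punctuation at the very end of the joined string is removed; the colon between the two parts is left in place.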
#get_years(marc) ⇒ Object
A MARC record has two dates in it, date1 and date2. Exactly what they represent is something of an esoteric mystery. But this will return them both, in an array.
# File 'app/mixin_logic/marc_helper.rb', line 171

def get_years(marc)
  array = []

  # no marc 008? Weird, but okay.
  return array unless marc['008']

  date1 = marc['008'].value[7,4]
  date1.strip! if date1
  array.push(date1) unless date1.blank?

  date2 = marc['008'].value[11,4]
  date2.strip! if date2
  array.push(date2) unless date2.blank?

  return array
end
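The fixed offsets are easier to see on a raw 008 string. The sketch below uses the same positions as get_years (characters 7 to 10 for date1, 11 to 14 for date2) but works on a plain string; years_from_008 and the two sample 008 values are illustrative assumptions, not data from the source.

```ruby
# Hypothetical helper: pull date1 and date2 out of a raw 008 string
# using the same offsets as get_years, dropping blank values.
def years_from_008(value)
  [value[7, 4], value[11, 4]].map { |d| d.to_s.strip }.reject(&:empty?)
end

serial = '750727c19669999nyuwr p       0   a0eng d' # made-up serial record
book   = '850423s1940    xxu           000 0 eng d' # made-up single-date record

p years_from_008(serial)
p years_from_008(book)
```

A record with only one date leaves positions 11 to 14 blank, so the result has a single element.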
#gmd_values ⇒ Object
AACR2 “General Material Designation”. While these are (I think?) controlled, it’s actually really hard to find the list. Maybe they’re only semi-controlled. ONE list can be found here: www.oclc.org/bibformats/en/onlinecataloging/default.shtm#BCGFECEG
# File 'app/mixin_logic/marc_helper.rb', line 245

def gmd_values
  # 'computer file' is an old one that may still be found in data.
  return ['activity card', 'art original', 'art reproduction', 'braille',
          'chart', 'diorama', 'electronic resource', 'computer file',
          'filmstrip', 'flash card', 'game', 'globe', 'kit', 'manuscript',
          'map', 'microform', 'microscope slides', 'model', 'motion picture',
          'music', 'picture', 'realia', 'slide', 'sound recording',
          'technical drawing', 'text', 'toy', 'transparency', 'videorecording']
end
#service_type_for_856(field, options) ⇒ Object
Take a Ruby MARC field object representing an 856 field, decide what Umlaut service type value to map it to: fulltext, ToC, etc. This is necessarily a heuristic guess; MARC doesn’t have enough granularity to really let us know for sure, although if indicator2 is ‘2’ for ‘related resource’, we decide it is NOT fulltext.
# File 'app/mixin_logic/marc_helper.rb', line 143

def service_type_for_856(field, options)
  options[:default_service_type] ||= "fulltext_title_level"

  # LC records here at hopkins have "Table of contents only" in the 856$3
  # Think that's a convention from LC?
  if (field['3'] && field['3'].downcase =~ /table of contents( only)?/)
    return "table_of_contents"
  elsif (field['3'] && field['3'].downcase =~ /description/)
    # If it contains the word 'description', it's probably an abstract.
    # That's the best we can do, sadly.
    return "abstract"
  elsif (field['3'] && field['3'].downcase == 'sample text')
    # LC records often include these links.
    return "excerpts"
  elsif ( field['u'] =~ /www\.loc\.gov/ )
    # Any other loc.gov link, we know it's not full text, don't put
    # it in full text field, put it as "see also".
    return "highlighted_link"
  elsif field.indicator2 == '2' # 'related resource'
    return "highlighted_link"
  else
    return options[:default_service_type]
  end
end
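The order of the checks matters: $3 is consulted first, then the URL, then the second indicator. The sketch below restates that order against a small stand-in object so it can be exercised without the MARC library. Field856 and guess_service_type are hypothetical names introduced for illustration only.

```ruby
# Hypothetical stand-in for a MARC 856 field: only the pieces the
# heuristic looks at ($3, $u and the second indicator).
Field856 = Struct.new(:materials, :url, :indicator2)

# Sketch of the same decision order as service_type_for_856.
def guess_service_type(field, default = 'fulltext_title_level')
  materials = field.materials.to_s.downcase
  return 'table_of_contents' if materials =~ /table of contents( only)?/
  return 'abstract'          if materials =~ /description/
  return 'excerpts'          if materials == 'sample text'
  # Any other loc.gov link is treated as "see also", not full text.
  return 'highlighted_link'  if field.url.to_s.match?(/www\.loc\.gov/)
  # Second indicator '2' means 'related resource'.
  return 'highlighted_link'  if field.indicator2 == '2'
  default
end

toc = Field856.new('Table of contents only', 'http://www.loc.gov/catdir/toc/x.html', '1')
puts guess_service_type(toc)
puts guess_service_type(Field856.new(nil, 'http://example.org/book', '0'))
```

Because the $3 checks come first, a loc.gov table-of-contents link is reported as a table of contents rather than as a generic highlighted link.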
#should_skip_856_link?(request, marc_record, url) ⇒ Boolean
Used by #add_856_links. Complicated logic to try and avoid presenting a URL from the catalog that duplicates what SFX does, but present a URL from the catalog when it’s really needed.
One reason not to include Catalog links for an article-level citation, even if SFX provided no targets, is maybe SFX provided no targets because SFX knew that the _particular date_ requested is not available. The catalog doesn’t know that, but we don’t want to show a link from the catalog that SFX really already knew wasn’t going to be available.
So:
If this is a journal, skip the URL if it matches in our SfxUrl finder, because that means we think it’s an SFX controlled URL. But if it’s not a journal, use it anyway, because it’s probably an e-book that is not in SFX, even if it’s from a vendor who is in SFX. We use MARC leader byte 7 to tell if it’s a journal. Confusing enough? Not yet! Even if it is a journal, if this isn’t an article-level cite and no other full text has been provided yet, we still include it.
# File 'app/mixin_logic/marc_helper.rb', line 122

def should_skip_856_link?(request, marc_record, url)
  is_journal = (marc_record.leader[7,1] == 's')

  sfx_controlled = SfxUrl.sfx_controls_url?(url)

  # Do NOT skip if it's a title-level citation with no
  # existing full text entries.
  not_title_level_empty = !( request.title_level_citation? && request.get_service_type("fulltext").length == 0 )

  result = ( is_journal && sfx_controlled && not_title_level_empty )

  return result
end
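Stripped of its collaborators, the decision is a three-way conjunction. The sketch below restates it with plain booleans so the cases described above can be checked one by one. skip_link? and its keyword arguments are hypothetical stand-ins for the MARC leader check, the SfxUrl lookup and the two request queries.

```ruby
# Hypothetical pure-boolean restatement of should_skip_856_link?.
def skip_link?(is_journal:, sfx_controlled:, title_level:, fulltext_count:)
  # A title-level citation with no full text yet always keeps the link.
  title_level_and_empty = title_level && fulltext_count.zero?
  is_journal && sfx_controlled && !title_level_and_empty
end

cases = [
  { is_journal: true,  sfx_controlled: true,  title_level: false, fulltext_count: 0 },
  { is_journal: true,  sfx_controlled: true,  title_level: true,  fulltext_count: 0 },
  { is_journal: true,  sfx_controlled: true,  title_level: true,  fulltext_count: 2 },
  { is_journal: false, sfx_controlled: true,  title_level: false, fulltext_count: 0 },
  { is_journal: true,  sfx_controlled: false, title_level: false, fulltext_count: 0 }
]
cases.each { |c| puts "#{c} => #{skip_link?(**c)}" }
```

The link is skipped only for a journal whose URL SFX controls, and even then it is kept when the citation is title-level and nothing else has supplied full text.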
#strip_gmd(arg_string, options = {}) ⇒ Object
Removes something that looks like an AACR2 GMD in square brackets from the string. Pretty kludgy.
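# File 'app/mixin_logic/marc_helper.rb', line 253

def strip_gmd(arg_string, options = {})
  options[:replacement] ||= ':'

  gmd_values.each do |gmd_val|
    # Optional qualifier, e.g. "[text (large print)]"; "braille" was
    # misspelled "braile" here, so that qualifier never matched.
    arg_string = arg_string.sub(/\[#{gmd_val}( \((tactile|braille|large print)\))?\]/, options[:replacement])
  end
  return arg_string
end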
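A quick way to see what the pattern removes is to run it over a few title strings. The sketch below is self-contained and makes several assumptions: SAMPLE_GMDS is a shortened, hypothetical list (the real one comes from gmd_values), strip_sample_gmd is a hypothetical name, and the qualifier is spelled braille.

```ruby
# Hypothetical, shortened GMD list; the real list comes from gmd_values.
SAMPLE_GMDS = ['electronic resource', 'sound recording', 'text'].freeze

# Replace a bracketed GMD (with an optional qualifier) by `replacement`.
def strip_sample_gmd(title, replacement = ':')
  SAMPLE_GMDS.reduce(title) do |str, gmd|
    str.sub(/\[#{Regexp.escape(gmd)}( \((tactile|braille|large print)\))?\]/, replacement)
  end
end

puts strip_sample_gmd('Hamlet [sound recording] a tragedy')
puts strip_sample_gmd('Atlas [text (large print)]', '')
puts strip_sample_gmd('No GMD here')
```

The default replacement of ':' fits the common case where the GMD sits between the title proper and its subtitle; passing an empty string simply deletes the bracketed term.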