Class: PennMARC::Subject

Inherits:
Helper
  • Object
Defined in:
lib/pennmarc/helpers/subject.rb

Overview

This helper extracts subject headings in various ways to facilitate searching, faceting and display of subject values. Michael Gibney did a lot to “clean up” Subject parsing in discovery-app, but much of that work was intended to support features (xfacet) that we will no longer support, and it ties display and xfacet field parsing together too tightly to be preserved. As a result, the display and facet methods below are ported from their state prior to Michael’s 2/2021 subject parsing changes.

Constant Summary

SEARCH_TAGS =

Tags that serve as sources for Subject search values

TODO: why are 541 and 561 included here? These fields include info about source of acquisition.

%w[541 561 600 610 611 630 650 651 653].freeze
VALID_SOURCE_INDICATORS =

Valid indicator 2 values indicating the source thesaurus for subject terms. These are:

  • 0: LCSH

  • 1: LC Children’s

  • 2: MeSH

  • 4: Source not specified (local?)

  • 7: Source specified in ǂ2

%w[0 1 2 4 7].freeze
DISPLAY_TAGS =

Tags that serve as sources for Subject facet values

%w[600 610 611 630 650 651].freeze
LOCAL_TAGS =

Local subject heading tags

%w[690 691 697].freeze
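
The constants above can be read as a two-part filter: a tag list decides which fields are candidates, and indicator 2 decides whether the source thesaurus is recognized. The following is a minimal sketch of that idea only. `Field` is a stand-in struct rather than `MARC::DataField`, and `candidate_subject?` is a hypothetical helper, not a method of this class. The real selection logic is not shown on this page and may differ.

```ruby
# Stand-in for a MARC data field (assumption: the library works with MARC::DataField).
Field = Struct.new(:tag, :indicator2)

SEARCH_TAGS = %w[541 561 600 610 611 630 650 651 653].freeze
VALID_SOURCE_INDICATORS = %w[0 1 2 4 7].freeze
DISPLAY_TAGS = %w[600 610 611 630 650 651].freeze

# Hypothetical helper: true when the field's tag is in the given list and
# its indicator 2 value is one of VALID_SOURCE_INDICATORS.
def candidate_subject?(field, tags:)
  tags.include?(field.tag) && VALID_SOURCE_INDICATORS.include?(field.indicator2)
end

fields = [
  Field.new('650', '0'), # indicator 2 of 0: LCSH
  Field.new('650', '2'), # indicator 2 of 2: MeSH
  Field.new('650', '6'), # indicator 2 value not in the valid list
  Field.new('653', '0'), # tag present in SEARCH_TAGS but not DISPLAY_TAGS
  Field.new('245', '0')  # tag in neither list
]

search_candidates  = fields.select { |f| candidate_subject?(f, tags: SEARCH_TAGS) }
display_candidates = fields.select { |f| candidate_subject?(f, tags: DISPLAY_TAGS) }
```

In this sketch the 653 field survives only the search filter, which mirrors the difference between the two tag lists.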

Constants included from Util

Util::TRAILING_PUNCTUATIONS_PATTERNS

Class Method Summary collapse


Methods included from Util

#append_relator, #append_trailing, #datafield_and_linked_alternate, #field_defined?, #field_or_its_linked_alternate?, #join_and_squish, #join_subfields, #linked_alternate, #linked_alternate_not_6_or_8, #no_subfield_value_matches?, #prefixed_subject_and_alternate, #relator, #relator_join_separator, #relator_term_subfield, #remove_paren_value_from_subfield_i, #subfield_defined?, #subfield_in?, #subfield_not_in?, #subfield_undefined?, #subfield_value?, #subfield_value_in?, #subfield_value_not_in?, #subfield_values, #subfield_values_for, #substring_after, #substring_before, #translate_relator, #trim_punctuation, #trim_trailing, #trim_trailing!, #valid_subject_genre_source_code?

Class Method Details

.childrens_show(record, override: true) ⇒ Array

Get Subjects from “Children” ontology

Parameters:

  • record (MARC::Record)
  • override (Boolean) (defaults to: true)

    remove undesirable terms or not

Returns:

  • (Array)

    array of children’s subject values for display



# File 'lib/pennmarc/helpers/subject.rb', line 98

def childrens_show(record, override: true)
  values = subject_fields(record, type: :display, options: { tags: DISPLAY_TAGS, indicator2: '1' })
           .filter_map { |field|
    term_hash = build_subject_hash(field)
    next if term_hash.blank? || term_hash[:count]&.zero?

    format_term type: :display, term: term_hash
  }.uniq
  override ? HeadingControl.term_override(values) : values
end
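
The display and facet methods on this page share one shape: build a value per field, skip fields that yield nothing, de-duplicate, then optionally apply an override step. The sketch below illustrates that shape with plain strings. `UNDESIRABLE_TERMS` and `format_values` are hypothetical names, and it assumes the override step simply drops listed terms, which may not match what `HeadingControl.term_override` actually does.

```ruby
# Hypothetical removal list, used here only for illustration.
UNDESIRABLE_TERMS = ['Outdated term'].freeze

# Mirrors the filter_map / uniq / override pattern with plain strings.
def format_values(raw_terms, override: true)
  values = raw_terms.filter_map { |term|
    cleaned = term.to_s.strip
    next if cleaned.empty? # skip inputs that produce no usable heading

    cleaned
  }.uniq
  # Assumption: overriding means removing listed terms.
  override ? values - UNDESIRABLE_TERMS : values
end

with_override    = format_values([' Dogs ', '', nil, 'Dogs', 'Outdated term'])
without_override = format_values([' Dogs ', '', nil, 'Dogs', 'Outdated term'], override: false)
```

Because `next` inside `filter_map` yields `nil`, skipped entries never reach the result, so no separate `compact` call is needed.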

.facet(record, override: true) ⇒ Array<String>

Note:

this is ported mostly from MG’s new-style Subject parsing

All Subjects for faceting

Parameters:

  • record (MARC::Record)
  • override (Boolean) (defaults to: true)

    remove undesirable terms or not

Returns:

  • (Array<String>)

    array of all subject values for faceting



# File 'lib/pennmarc/helpers/subject.rb', line 67

def facet(record, override: true)
  values = subject_fields(record, type: :facet).filter_map { |field|
    term_hash = build_subject_hash(field)
    next if term_hash.blank? || term_hash[:count]&.zero?

    format_term type: :facet, term: term_hash
  }.uniq
  override ? HeadingControl.term_override(values) : values
end

.local_show(record, override: true) ⇒ Array

Get Subject values from DISPLAY_TAGS fields where indicator2 is 4, plus values from LOCAL_TAGS fields. Values from fields whose subfield 2 includes “penncoi” (Community of Interest) are excluded.

Parameters:

  • record (MARC::Record)
  • override (Boolean) (defaults to: true)

    to remove undesirable terms

Returns:

  • (Array)

    array of local subject values for display



# File 'lib/pennmarc/helpers/subject.rb', line 131

def local_show(record, override: true)
  local_fields = subject_fields(record, type: :display, options: { tags: DISPLAY_TAGS, indicator2: '4' }) +
                 subject_fields(record, type: :local)
  values = local_fields.filter_map { |field|
    next if subfield_value?(field, '2', /penncoi/)

    term_hash = build_subject_hash(field)
    next if term_hash.blank? || term_hash[:count]&.zero?

    format_term type: :display, term: term_hash
  }.uniq
  override ? HeadingControl.term_override(values) : values
end
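
The selection rule for local subjects can be sketched without the library: a field qualifies if it is a DISPLAY_TAGS field with indicator 2 of 4 or any LOCAL_TAGS field, and it is dropped when subfield 2 matches `penncoi`. `LocalField`, `LocalSubfield`, `subfield_matches?` and `local_subject?` are stand-ins invented for this illustration. The real method relies on `subject_fields` and `Util#subfield_value?`, whose internals are not shown here.

```ruby
DISPLAY_TAGS = %w[600 610 611 630 650 651].freeze
LOCAL_TAGS = %w[690 691 697].freeze

# Stand-ins for MARC::Subfield and MARC::DataField.
LocalSubfield = Struct.new(:code, :value)
LocalField = Struct.new(:tag, :indicator2, :subfields)

# Hypothetical equivalent of a "does this subfield's value match?" check.
def subfield_matches?(field, code, pattern)
  field.subfields.any? { |sf| sf.code == code && sf.value.match?(pattern) }
end

# True for local-source fields that are not Community of Interest headings.
def local_subject?(field)
  from_local_source = (DISPLAY_TAGS.include?(field.tag) && field.indicator2 == '4') ||
                      LOCAL_TAGS.include?(field.tag)
  from_local_source && !subfield_matches?(field, '2', /penncoi/)
end

fields = [
  LocalField.new('650', '4', [LocalSubfield.new('a', 'Local topic')]),
  LocalField.new('650', '0', [LocalSubfield.new('a', 'LCSH topic')]),
  LocalField.new('690', ' ', [LocalSubfield.new('a', 'Community topic'),
                              LocalSubfield.new('2', 'penncoi')]),
  LocalField.new('690', ' ', [LocalSubfield.new('a', 'Departmental topic')])
]

local_headings = fields.select { |f| local_subject?(f) }
                       .map { |f| f.subfields.find { |sf| sf.code == 'a' }.value }
```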

.medical_show(record, override: true) ⇒ Array

Get Subjects from “MeSH” ontology

Parameters:

  • record (MARC::Record)
  • override (Boolean) (defaults to: true)

    remove undesirable terms or not

Returns:

  • (Array)

    array of MeSH subject values for display



# File 'lib/pennmarc/helpers/subject.rb', line 114

def medical_show(record, override: true)
  values = subject_fields(record, type: :display, options: { tags: DISPLAY_TAGS, indicator2: '2' })
           .filter_map { |field|
    term_hash = build_subject_hash(field)
    next if term_hash.blank? || term_hash[:count]&.zero?

    format_term type: :display, term: term_hash
  }.uniq
  override ? HeadingControl.term_override(values) : values
end

.search(record, relator_map: Mappers.relator) ⇒ Array<String>

TODO:

this includes subfields that may not be desired like 1 (uri) and 2 (source code) but this might be OK for a search (non-display) field?

All Subjects for searching. This includes most subfield content from any field contained in SEARCH_TAGS or 69X, including any linked 880 fields. Fields must have an indicator2 value in VALID_SOURCE_INDICATORS.

Parameters:

  • relator_map (Hash) (defaults to: Mappers.relator)
  • record (MARC::Record)

Returns:

  • (Array<String>)

    array of all subject values for search



# File 'lib/pennmarc/helpers/subject.rb', line 36

def search(record, relator_map: Mappers.relator)
  subject_fields(record, type: :search).filter_map { |field|
    subj_parts = field.filter_map do |subfield|
      # TODO: use term hash here? pro/chr would be rejected...
      # TODO: should we care about punctuation in a search field? relator mapping?
      case subfield.code
      when '5', '6', '8', '7' then next
      when 'a'
        # remove %PRO or PRO or %CHR or CHR
        # remove any ? at the end
        subfield.value.gsub(/^%?(PRO|CHR)/, '').gsub(/\?$/, '').strip
      when '4'
        # sf 4 should contain a 3-letter code or URI "that specifies the relationship from the entity described
        # in the record to the entity referenced in the field"
        "#{subfield.value} #{translate_relator(subfield.value.to_sym, relator_map)}".strip
      else
        subfield.value
      end
    end
    next if subj_parts.empty?

    join_and_squish subj_parts
  }.uniq
end
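
The per-subfield handling in `search` can be tried in isolation. The sketch below reuses the same regular expressions and `case` branches, but `SearchSubfield`, `RELATOR_MAP` and `search_value` are hypothetical stand-ins. The hash lookup replaces `translate_relator`, and the join step only approximates `join_and_squish`, so treat the output as illustrative.

```ruby
# Stand-in for MARC::Subfield.
SearchSubfield = Struct.new(:code, :value)

# Hypothetical relator lookup standing in for Mappers.relator.
RELATOR_MAP = { aut: 'Author' }.freeze

def search_value(subfields, relator_map: RELATOR_MAP)
  parts = subfields.filter_map do |subfield|
    case subfield.code
    when '5', '6', '7', '8' then next # control and linkage subfields are skipped
    when 'a'
      # Strip a leading PRO/CHR marker (with optional %) and a trailing ?
      subfield.value.gsub(/^%?(PRO|CHR)/, '').gsub(/\?$/, '').strip
    when '4'
      # Keep the relator code and append its label when one is known
      "#{subfield.value} #{relator_map[subfield.value.to_sym]}".strip
    else
      subfield.value
    end
  end
  return nil if parts.empty?

  # Approximation of join_and_squish: join with spaces, collapse runs of spaces
  parts.join(' ').squeeze(' ').strip
end

heading = search_value([
  SearchSubfield.new('6', '880-01'),
  SearchSubfield.new('a', 'CHR Smith, Jane?'),
  SearchSubfield.new('x', 'Biography'),
  SearchSubfield.new('4', 'aut')
])
# heading is "Smith, Jane Biography aut Author"
```

One thing the sketch makes easy to check: the prefix pattern is case-sensitive and not anchored to a word boundary, so an uppercase value such as `PROTEINS` would lose its first three letters. Whether that matters for real data is a question for the maintainers.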

.show(record, override: true) ⇒ Array

All Subjects for display. This includes all DISPLAY_TAGS and LOCAL_TAGS. For tags that specify a source, only those with an allowed source code (see ALLOWED_SOURCE_CODES) are included.

Parameters:

  • record (MARC::Record)
  • override (Boolean) (defaults to: true)

    to remove undesirable terms or not

Returns:

  • (Array)

    array of all subject values for display



# File 'lib/pennmarc/helpers/subject.rb', line 83

def show(record, override: true)
  values = subject_fields(record, type: :all).filter_map { |field|
    term_hash = build_subject_hash(field)
    next if term_hash.blank? || term_hash[:count]&.zero?

    format_term type: :display, term: term_hash
  }.uniq
  override ? HeadingControl.term_override(values) : values
end