Class: NacoNormalizer

Inherits:
Object
Defined in:
lib/naconormalizer.rb,
lib/naconormalizer/version.rb

Overview

A tiny shim around OCLC's Java code that performs NACO normalization, which libraries (and others) use to normalize author and title strings for sorting.

See http://www.loc.gov/aba/pcc/naco/normrule-2.html

Java code adapted from https://code.google.com/p/oclcnaconormalizer/ and copyright OCLC

Author:

  • Bill Dueber

Constant Summary

OCLCNormalizer =
org.oclc.util::NacoNormalize
Defaults =
{ :keep_caps => false, :strip_html => true, :keep_first_comma => true }
VERSION =
"0.9.1"

Instance Method Summary

Constructor Details

#initialize(opts = {}) ⇒ NacoNormalizer

Create a new normalizer that uses the passed options, falling back to Defaults for any option that is omitted.

Examples:

author_normalizer = NacoNormalizer.new
title_normalizer  = NacoNormalizer.new(:keep_first_comma => false)

sortable_author = author_normalizer.normalize(author_name)
sortable_title  = title_normalizer.normalize(title)

Parameters:

  • opts (Hash) (defaults to: {})

    The hash of options

Options Hash (opts):

  • :keep_caps (Boolean) — default: false

Preserve capital letters instead of lowercasing them

  • :keep_first_comma (Boolean) — default: true

Keep the first comma; useful for "Lastname, Firstname" data

  • :strip_html (Boolean) — default: true

    Strip any spurious HTML out of the passed string when normalizing
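
The way these options combine with the defaults can be sketched in plain Ruby. This is a minimal illustration, assuming the intent is that any option the caller passes takes precedence over the default; `DEFAULTS` and `resolve_options` are hypothetical names used only here and are not part of the gem.

```ruby
# Hypothetical stand-in for the Defaults constant listed in the constant summary.
DEFAULTS = { :keep_caps => false, :strip_html => true, :keep_first_comma => true }

# Hash#merge returns a new hash in which the argument's values win on key
# conflicts, so merging the caller's options into the defaults lets the
# caller override them.
def resolve_options(opts = {})
  DEFAULTS.merge(opts)
end

author_opts = resolve_options
title_opts  = resolve_options(:keep_first_comma => false)
```

The receiver matters here: with `Hash#merge`, the hash passed as the argument overrides the hash it is called on, so reversing the two would make the defaults override the caller.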



# File 'lib/naconormalizer.rb', line 35

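def initialize(opts={})
  # Merge the passed options into the defaults so that caller-supplied
  # values win; opts.merge(Defaults) would let the defaults overwrite them.
  opts = Defaults.merge(opts)
  @keep_caps = opts[:keep_caps]
  @strip_html = opts[:strip_html]
  @keep_first_comma = opts[:keep_first_comma]
end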

Instance Method Details

#normalize(str, keep_first_comma = @keep_first_comma, keep_caps = @keep_caps, strip_html = @strip_html) ⇒ String

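Normalize a string. Each flag defaults to the value set in the constructor and can be overridden for a single call by passing it positionally.

Parameters:

  • str (String)

    The string to normalize

  • keep_first_comma (Boolean) (defaults to: @keep_first_comma)

    Keep the first comma for this call

  • keep_caps (Boolean) (defaults to: @keep_caps)

    Preserve capital letters for this call

  • strip_html (Boolean) (defaults to: @strip_html)

    Strip HTML from the string for this call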

Returns:

  • (String)

    The normalized string



# File 'lib/naconormalizer.rb', line 45

def normalize(str, keep_first_comma = @keep_first_comma, keep_caps = @keep_caps, strip_html = @strip_html)
  # The Java method takes the flags in a different order than this method does.
  OCLCNormalizer.nacoNormalize(str, keep_caps, strip_html, keep_first_comma)
end
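
Because the optional parameters default to instance variables, a single call can override a flag without building a new normalizer. The sketch below illustrates that behavior in plain Ruby. `SketchNormalizer` and `resolved_flags` are hypothetical stand-ins that mirror the parameter order of #normalize but return the resolved flags instead of calling the Java code, so the example runs without JRuby or the OCLC classes.

```ruby
# Hypothetical stand-in that mirrors the signature of #normalize.
# It does NOT perform NACO normalization; it only reports which flags
# a call would end up using.
class SketchNormalizer
  DEFAULTS = { :keep_caps => false, :strip_html => true, :keep_first_comma => true }

  def initialize(opts = {})
    opts = DEFAULTS.merge(opts)
    @keep_caps = opts[:keep_caps]
    @strip_html = opts[:strip_html]
    @keep_first_comma = opts[:keep_first_comma]
  end

  # Default values are evaluated at call time in the context of the instance,
  # so an omitted argument falls back to the constructor's setting.
  def resolved_flags(str, keep_first_comma = @keep_first_comma, keep_caps = @keep_caps, strip_html = @strip_html)
    { :keep_first_comma => keep_first_comma, :keep_caps => keep_caps, :strip_html => strip_html }
  end
end

sketch = SketchNormalizer.new
constructor_flags = sketch.resolved_flags("Smith, John")        # all flags from the constructor
override_flags    = sketch.resolved_flags("Smith, John", false) # keep_first_comma overridden once
```

Since the overrides are positional, changing a later flag such as strip_html requires passing the earlier ones explicitly as well.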