Method: Addressable::URI.normalize_component

Defined in:
lib/addressable/uri.rb

.normalize_component(component, character_class = CharacterClasses::RESERVED + CharacterClasses::UNRESERVED) ⇒ String

Normalizes the encoding of a URI component.

The character_class argument selects the characters that are left
unencoded. If a Regexp is passed instead of a String, it must match the
characters to be percent encoded: the value /[^b-zB-Z0-9]/ has the same
effect as the String "b-zB-Z0-9". A set of useful String values may be
found in the Addressable::URI::CharacterClasses module. The default
value is the reserved plus unreserved character classes specified in
RFC 3986 (http://www.ietf.org/rfc/rfc3986.txt).

Examples:

Addressable::URI.normalize_component("simpl%65/%65xampl%65", "b-zB-Z")
=> "simple%2Fex%61mple"
Addressable::URI.normalize_component(
  "simpl%65/%65xampl%65", /[^b-zB-Z]/
)
=> "simple%2Fex%61mple"
Addressable::URI.normalize_component(
  "simpl%65/%65xampl%65",
  Addressable::URI::CharacterClasses::UNRESERVED
)
=> "simple%2Fexample"

Parameters:

  • component (String, #to_str)

    The URI component to encode.

  • character_class (String, Regexp) (defaults to: CharacterClasses::RESERVED + CharacterClasses::UNRESERVED)

    The characters which are not percent encoded. If a String is passed, the String must be formatted as a regular expression character class. (Do not include the surrounding square brackets.) For example, "b-zB-Z0-9" would cause everything but the letters ‘b’ through ‘z’ and the numbers ‘0’ through ‘9’ to be percent encoded.
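The relationship between the String and Regexp forms of character_class can be sketched without the library. The helper name to_encode_pattern below is hypothetical and exists only for illustration. It mirrors the conversion visible in the method source, where a String is wrapped in a negated character class, and it assumes nothing else about Addressable's internals.

```ruby
# Hypothetical helper, not part of Addressable: shows how a String
# character class becomes a Regexp matching the characters to encode.
def to_encode_pattern(character_class)
  case character_class
  when Regexp then character_class
  when String then /[^#{character_class}]/ # negate: match what is NOT allowed
  else
    raise TypeError,
      "Expected String or Regexp, got #{character_class.inspect}"
  end
end

from_string = to_encode_pattern("b-zB-Z0-9")
from_regexp = to_encode_pattern(/[^b-zB-Z0-9]/)

# Both forms describe the same pattern.
puts from_string.source == from_regexp.source      # prints true

# Characters that would be percent encoded under this class.
puts "simple/example 9".scan(from_string).inspect  # prints ["/", "a", " "]
```

Note that the letter 'a' is matched because the class starts at 'b', which is why the documented examples yield "ex%61mple".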

Returns:

  • (String)

    The normalized component, or nil if component is nil.



# File 'lib/addressable/uri.rb', line 412

def self.normalize_component(component, character_class=
    CharacterClasses::RESERVED + CharacterClasses::UNRESERVED)
  return nil if component.nil?
  if !component.respond_to?(:to_str)
    raise TypeError, "Can't convert #{component.class} into String."
  end
  component = component.to_str
  if ![String, Regexp].include?(character_class.class)
    raise TypeError,
      "Expected String or Regexp, got #{character_class.inspect}"
  end
  if character_class.kind_of?(String)
    character_class = /[^#{character_class}]/
  end
  if component.respond_to?(:force_encoding)
    # We can't perform regexps on invalid UTF sequences, but
    # here we need to, so switch to ASCII.
    component = component.dup
    component.force_encoding(Encoding::ASCII_8BIT)
  end
  unencoded = self.unencode_component(component)
  begin
    encoded = self.encode_component(
      Addressable::IDNA.unicode_normalize_kc(unencoded),
      character_class
    )
  rescue ArgumentError
    encoded = self.encode_component(unencoded)
  end
  return encoded
end
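
The flow of the method above (decode every percent escape, then re-encode whatever falls outside the character class) can be approximated with the standard library alone. This is a minimal sketch under stated assumptions: normalize_sketch is a hypothetical stand-in, it omits the Unicode NFKC normalization step and the ArgumentError fallback, and it is not how Addressable implements unencode_component or encode_component.

```ruby
# Hypothetical stand-in for illustration only; not part of Addressable.
def normalize_sketch(component, character_class)
  return nil if component.nil?

  # A String class is turned into a negated Regexp, as in the source above.
  pattern =
    character_class.is_a?(String) ? /[^#{character_class}]/ : character_class

  # Work on a binary copy so byte-level substitution is safe for any input.
  decoded = component.to_str.b.gsub(/%([0-9a-fA-F]{2})/) {
    Regexp.last_match(1).hex.chr
  }

  # Re-encode every byte the pattern matches as an uppercase %XX escape.
  encoded = decoded.gsub(pattern) { |byte| format("%%%02X", byte.ord) }
  encoded.force_encoding(Encoding::UTF_8)
end

puts normalize_sketch("simpl%65/%65xampl%65", "b-zB-Z")
# prints simple%2Fex%61mple
puts normalize_sketch("simpl%65/%65xampl%65", /[^b-zB-Z]/)
# prints simple%2Fex%61mple
```

Decoding first is what makes the result canonical: "%65" and "e" end up as the same output regardless of how the input spelled them.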