Module: XmlChar

Defined in:
lib/xml_char.rb

Overview

Purpose

This module provides module-level methods that escape special characters in a string so that it can be placed safely in an XML stream.

A true object-oriented version would reopen String and Fixnum and add the methods there. Personally, I prefer to keep special-purpose functions that operate on standard objects in their own namespace, where they cannot cause name clashes.
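
To make the trade-off concrete, here is a minimal sketch of the two calling styles. EscapeSketch and xml_escaped_sketch are hypothetical names invented for this illustration; the stand-in handles only the five predefined entities and is not the module's actual implementation.

```ruby
# Hypothetical stand-in for the real module: it handles only the five
# predefined XML entities, which is enough to show the calling style.
module EscapeSketch
  ENTITIES = {
    "&" => "&amp;", "<" => "&lt;", ">" => "&gt;",
    "'" => "&apos;", '"' => "&quot;"
  }.freeze

  # Namespaced style: String itself is left untouched.
  def self.xml_string(text)
    text.gsub(/[&<>'"]/, ENTITIES)
  end
end

# Reopening style: convenient to call, but the method name now lives on
# every String in the program and can collide with other libraries.
class String
  def xml_escaped_sketch
    EscapeSketch.xml_string(self)
  end
end

puts EscapeSketch.xml_string("a < b")   # prints: a &lt; b
puts "a < b".xml_escaped_sketch         # prints: a &lt; b
```

Both calls return the same result; the difference is only in where the method name is registered.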

Usage

unclean_string = "<ShowMe>this < that && that > this</ShowMe>"
XmlChar::xml_string(unclean_string)
=> "&lt;ShowMe&gt;this &lt; that &amp;&amp; that &gt; this&lt;/ShowMe&gt;"

Constant Summary

CP1252 =
{ # :nodoc:
128 => 8364, # euro sign
130 => 8218, # single low-9 quotation mark
131 =>  402, # latin small letter f with hook
132 => 8222, # double low-9 quotation mark
133 => 8230, # horizontal ellipsis
134 => 8224, # dagger
135 => 8225, # double dagger
136 =>  710, # modifier letter circumflex accent
137 => 8240, # per mille sign
138 =>  352, # latin capital letter s with caron
139 => 8249, # single left-pointing angle quotation mark
140 =>  338, # latin capital ligature oe
142 =>  381, # latin capital letter z with caron
145 => 8216, # left single quotation mark
146 => 8217, # right single quotation mark
147 => 8220, # left double quotation mark
148 => 8221, # right double quotation mark
149 => 8226, # bullet
150 => 8211, # en dash
151 => 8212, # em dash
152 =>  732, # small tilde
153 => 8482, # trade mark sign
154 =>  353, # latin small letter s with caron
155 => 8250, # single right-pointing angle quotation mark
156 =>  339, # latin small ligature oe
158 =>  382, # latin small letter z with caron
159 =>  376 # latin capital letter y with diaeresis
}
PREDEFINED =
{    # :nodoc:
?' => '&apos;', # single quote
?" => '&quot;', # double quote
?& => '&amp;',  # ampersand
?< => '&lt;',   # left angle bracket
?> => '&gt;'    # right angle bracket
}
VALID =
[[0x9, 0xA, 0xD], (0x20..0xD7FF),   # :nodoc:
(0xE000..0xFFFD), (0x10000..0x10FFFF)]
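
The VALID list holds the same ranges as the Char production of XML 1.0: three permitted control characters plus three code point ranges. As a rough illustration of how such a mixed list of arrays and ranges can be consulted, consider the sketch below. VALID_SKETCH and valid_xml_code_point? are hypothetical names used only here.

```ruby
# Same shape as VALID: one Array of individual code points plus three Ranges.
VALID_SKETCH = [
  [0x9, 0xA, 0xD],        # tab, line feed, carriage return
  (0x20..0xD7FF),         # everything up to the surrogate block
  (0xE000..0xFFFD),       # after the surrogates, excluding 0xFFFE and 0xFFFF
  (0x10000..0x10FFFF)     # supplementary planes
].freeze

# Array and Range both respond to include?, so one block covers every entry.
def valid_xml_code_point?(code_point)
  VALID_SKETCH.any? { |set| set.include?(code_point) }
end

puts valid_xml_code_point?(0x41)    # prints: true  (letter A)
puts valid_xml_code_point?(0x0B)    # prints: false (vertical tab)
puts valid_xml_code_point?(0xD800)  # prints: false (surrogate)
```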

Class Method Summary

Class Method Details

.xml_char(normal_char) ⇒ Object

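Creates an XML-escaped version of normal_char, an integer character code. Codes listed in CP1252 are first mapped to their Unicode equivalents. The five predefined XML characters become named entities, other ASCII characters are returned unchanged, valid non-ASCII code points become numeric character references, and anything outside the valid XML character range is replaced with "*".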

Contract

require: normal_char && normal_char.integer?


# File 'lib/xml_char.rb', line 100

def self.xml_char(normal_char)
  char_value = XmlChar::CP1252[normal_char] || normal_char
  if XmlChar::VALID.find { |range| range.include? char_value }
    XmlChar::PREDEFINED[char_value] ||
      (char_value < 0x80 ? char_value.chr : "&##{char_value};")
  else
    "*"
  end
end
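
The branches of xml_char are easier to follow with a few worked inputs. The sketch below is a self-contained approximation, not the file's own code: XmlCharSketch is a hypothetical name, the CP1252 table is abridged to two entries, and the entity table is keyed by integer code point. That last choice is an assumption made so the sketch runs on Ruby 1.9 and later, where a character literal such as ?& evaluates to a one-character string rather than an integer; the page does not state which Ruby version the original targets.

```ruby
module XmlCharSketch
  # Abridged: only two of the Windows-1252 remappings are reproduced here.
  CP1252 = { 128 => 8364, 146 => 8217 }.freeze

  # Keyed by integer code point (assumption, see the note above).
  PREDEFINED = {
    39 => '&apos;', 34 => '&quot;', 38 => '&amp;', 60 => '&lt;', 62 => '&gt;'
  }.freeze

  VALID = [[0x9, 0xA, 0xD], (0x20..0xD7FF),
           (0xE000..0xFFFD), (0x10000..0x10FFFF)].freeze

  def self.xml_char(code_point)
    value = CP1252[code_point] || code_point          # remap Windows-1252 extras
    return "*" unless VALID.any? { |set| set.include?(value) }
    # Named entity, plain ASCII character, or numeric character reference.
    PREDEFINED[value] || (value < 0x80 ? value.chr : "&##{value};")
  end
end

puts XmlCharSketch.xml_char(65)   # prints: A
puts XmlCharSketch.xml_char(38)   # prints: &amp;
puts XmlCharSketch.xml_char(128)  # prints: &#8364;
puts XmlCharSketch.xml_char(233)  # prints: &#233;
puts XmlCharSketch.xml_char(0)    # prints: *
```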

.xml_string(normal_string) ⇒ Object

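Creates an XML-escaped version of normal_string. The string is first decoded as UTF-8, which also covers plain ASCII. If that decoding raises an error, the string is processed byte by byte instead, as ISO-8859-1 or Windows-1252.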

Contract

require: !normal_string.nil?


# File 'lib/xml_char.rb', line 87

def self.xml_string(normal_string)
  normal_string.unpack('U*').map {|n| XmlChar::xml_char(n)}.join # ASCII, UTF-8
rescue
  normal_string.unpack('C*').map {|n| XmlChar::xml_char(n)}.join # ISO-8859-1, WIN-1252
end
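
The decoding fallback in xml_string can be seen in isolation with the sketch below. code_points is a hypothetical helper, not part of the module, and it rescues ArgumentError specifically (the error String#unpack raises for malformed UTF-8), whereas the method above uses a bare rescue.

```ruby
# Returns the integer codes for text: Unicode code points when the bytes
# are well-formed UTF-8, raw byte values otherwise.
def code_points(text)
  text.unpack('U*')   # raises ArgumentError on malformed UTF-8
rescue ArgumentError
  text.unpack('C*')   # fall back to one code per byte
end

utf8_text = "caf\u00E9"                          # e-acute encoded as two UTF-8 bytes
cp1252_bytes = [105, 116, 146, 115].pack('C*')   # "it's" with byte 146 as the apostrophe

p code_points(utf8_text)     # prints: [99, 97, 102, 233]
p code_points(cp1252_bytes)  # prints: [105, 116, 146, 115]
```

In the second case the byte 146 is not valid UTF-8 on its own, so the byte-wise path is taken. xml_char then maps 146 to 8217 through CP1252 and emits it as a numeric character reference.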