Module: Escape
- Defined in: lib/escape.rb
Overview
Escape module provides several escape functions.
- URI
- HTML
- shell command
Constant Summary
- HTML_TEXT_ESCAPE_HASH =
{ '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', }
- HTML_ATTR_ESCAPE_HASH =
{ '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;', }
Class Method Summary
- .html_attr(str) ⇒ Object
  Escape.html_attr encodes a string as a double-quoted HTML attribute using character references.
- .html_form(pairs, sep = '&') ⇒ Object
  Escape.html_form composes HTML form key-value pairs as an x-www-form-urlencoded string.
- .html_form_fast(pairs, sep = ';') ⇒ Object
- .html_text(str) ⇒ Object
  Escape.html_text escapes a string for use as HTML text using character references.
- .shell_command(command) ⇒ Object
  Escape.shell_command composes a sequence of words into a single shell command line.
- .shell_single_word(str) ⇒ Object
  Escape.shell_single_word quotes shell meta characters.
- .uri_path(str) ⇒ Object
  Escape.uri_path escapes a URI path using percent-encoding.
- .uri_segment(str) ⇒ Object
  Escape.uri_segment escapes a URI segment using percent-encoding.
Class Method Details
.html_attr(str) ⇒ Object
Escape.html_attr encodes a string as a double-quoted HTML attribute using character references.
Escape.html_attr("abc") #=> "\"abc\""
Escape.html_attr("a&b") #=> "\"a&amp;b\""
Escape.html_attr("ab&<>\"c") #=> "\"ab&amp;&lt;&gt;&quot;c\""
Escape.html_attr("a'c") #=> "\"a'c\""
It escapes 4 characters:
- ‘&’ to ‘&amp;’
- ‘<’ to ‘&lt;’
- ‘>’ to ‘&gt;’
- ‘"’ to ‘&quot;’
# File 'lib/escape.rb', line 244
def html_attr(str)
  '"' + str.gsub(/[&<>"]/) {|ch| HTML_ATTR_ESCAPE_HASH[ch] } + '"'
end
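Because html_attr returns the value together with its surrounding double quotes, a template should not add quotes of its own. The following is a minimal sketch of that usage. It carries a local copy of the constant and method so it runs without the library installed, and `link_tag` is a hypothetical helper that is not part of Escape.

```ruby
# Local copy of the constant and method from the listing above.
HTML_ATTR_ESCAPE_HASH = {
  '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;'
}

def html_attr(str)
  '"' + str.gsub(/[&<>"]/) { |ch| HTML_ATTR_ESCAPE_HASH[ch] } + '"'
end

# Hypothetical helper (not part of the library): builds an opening <a> tag.
# Note that no quotes are written around the interpolated values.
def link_tag(href, title)
  "<a href=#{html_attr(href)} title=#{html_attr(title)}>"
end

puts link_tag("/search?q=a&b", 'say "hi"')
# prints: <a href="/search?q=a&amp;b" title="say &quot;hi&quot;">
```

Single quotes are left untouched, which is safe only because the attribute is always delimited by double quotes.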
.html_form(pairs, sep = '&') ⇒ Object
Escape.html_form composes HTML form key-value pairs as an x-www-form-urlencoded string.
Escape.html_form takes an array of string pairs or a hash from string to string.
Escape.html_form([["a","b"], ["c","d"]]) #=> "a=b&c=d"
Escape.html_form({"a"=>"b", "c"=>"d"}) #=> "a=b&c=d"
In the array form, the same key can be used more than once. (This is required for an HTML form that contains checkboxes or a select element with the multiple attribute.)
Escape.html_form([["k","1"], ["k","2"]]) #=> "k=1&k=2"
If the strings contain characters that must be escaped in x-www-form-urlencoded, they are escaped using %-encoding.
Escape.html_form([["k=","&;="]]) #=> "k%3D=%26%3B%3D"
The separator can be specified by the optional second argument.
Escape.html_form([["a","b"], ["c","d"]], ";") #=> "a=b;c=d"
See HTML 4.01 for details.
# File 'lib/escape.rb', line 164
def html_form(pairs, sep='&')
  r = ''
  first = true
  pairs.each {|k, v|
    # query-chars - pct-encoded - x-www-form-urlencoded-delimiters =
    #   unreserved / "!" / "$" / "'" / "(" / ")" / "*" / "," / ":" / "@" / "/" / "?"
    # query-char - pct-encoded = unreserved / sub-delims / ":" / "@" / "/" / "?"
    # query-char = pchar / "/" / "?" = unreserved / pct-encoded / sub-delims / ":" / "@" / "/" / "?"
    # unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    # sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
    # x-www-form-urlencoded-delimiters = "&" / "+" / ";" / "="
    r << sep if !first
    first = false
    k.each_byte {|byte|
      ch = byte.chr
      if %r{[^0-9A-Za-z\-\._~:/?@!\$'()*,]}n =~ ch
        r << "%" << ch.unpack("H2")[0].upcase
      else
        r << ch
      end
    }
    r << '='
    v.each_byte {|byte|
      ch = byte.chr
      if %r{[^0-9A-Za-z\-\._~:/?@!\$'()*,]}n =~ ch
        r << "%" << ch.unpack("H2")[0].upcase
      else
        r << ch
      end
    }
  }
  r
end
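The per-byte rule in the listing above can be condensed into a short standalone sketch. It is an illustration under assumptions, not the library's code: `FORM_SAFE`, `form_escape`, and `form_encode` are hypothetical names, and the safe-character set is copied from the regular expression in the listing.

```ruby
# Bytes that pass through unescaped, copied from the listing's character class.
FORM_SAFE = %r{\A[0-9A-Za-z\-._~:/?@!\$'()*,]\z}

# Hypothetical helper: percent-encodes every byte outside the safe set.
def form_escape(str)
  str.each_byte.map { |byte|
    if byte < 128 && FORM_SAFE.match?(byte.chr)
      byte.chr
    else
      format('%%%02X', byte)
    end
  }.join
end

# Hypothetical helper: accepts an array of pairs or a hash, like html_form.
def form_encode(pairs, sep = '&')
  pairs.map { |k, v| "#{form_escape(k)}=#{form_escape(v)}" }.join(sep)
end

puts form_encode([["k=", "&;="]])            # prints: k%3D=%26%3B%3D
puts form_encode([["a", "b"], ["c", "d"]], ";")  # prints: a=b;c=d
puts form_encode([["q", "a b"]])             # prints: q=a%20b
```

As the last line shows, this rule encodes a space as %20, since neither a space nor "+" is in the safe set.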
.html_form_fast(pairs, sep = ';') ⇒ Object
# File 'lib/escape.rb', line 120
def html_form_fast(pairs, sep=';')
  pairs.map {|k, v|
    # query-chars - pct-encoded - x-www-form-urlencoded-delimiters =
    #   unreserved / "!" / "$" / "'" / "(" / ")" / "*" / "," / ":" / "@" / "/" / "?"
    # query-char - pct-encoded = unreserved / sub-delims / ":" / "@" / "/" / "?"
    # query-char = pchar / "/" / "?" = unreserved / pct-encoded / sub-delims / ":" / "@" / "/" / "?"
    # unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    # sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
    # x-www-form-urlencoded-delimiters = "&" / "+" / ";" / "="
    k = k.gsub(%r{[^0-9A-Za-z\-\._~:/?@!\$'()*,]}n) {
      '%' + $&.unpack("H2")[0].upcase
    }
    v = v.gsub(%r{[^0-9A-Za-z\-\._~:/?@!\$'()*,]}n) {
      '%' + $&.unpack("H2")[0].upcase
    }
    "#{k}=#{v}"
  }.join(sep)
end
.html_text(str) ⇒ Object
Escape.html_text escapes a string for use as HTML text using character references.
It escapes 3 characters:
- ‘&’ to ‘&amp;’
- ‘<’ to ‘&lt;’
- ‘>’ to ‘&gt;’
Escape.html_text("abc") #=> "abc"
Escape.html_text("a & b < c > d") #=> "a &amp; b &lt; c &gt; d"
This function is not appropriate for escaping HTML element attributes, because quotes are not escaped.
# File 'lib/escape.rb', line 218
def html_text(str)
  str.gsub(/[&<>]/) {|ch| HTML_TEXT_ESCAPE_HASH[ch] }
end
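The warning about attributes can be made concrete with a small sketch. It uses a local copy of the constant and method so it runs on its own; the sample strings are illustrative assumptions.

```ruby
# Local copy of the constant and method from the listing above.
HTML_TEXT_ESCAPE_HASH = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' }

def html_text(str)
  str.gsub(/[&<>]/) { |ch| HTML_TEXT_ESCAPE_HASH[ch] }
end

# Safe use: the value becomes element content.
comment = '<b>1 < 2</b> & "quoted"'
puts "<p>#{html_text(comment)}</p>"
# prints: <p>&lt;b&gt;1 &lt; 2&lt;/b&gt; &amp; "quoted"</p>

# Unsafe use: double quotes pass through unchanged, so the value can
# terminate a double-quoted attribute and inject another one.
value = 'x" onclick="alert(1)'
puts "<input value=\"#{html_text(value)}\">"
# prints: <input value="x" onclick="alert(1)">
```

For attribute values, Escape.html_attr is the function that also escapes the double quote.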
.shell_command(command) ⇒ Object
Escape.shell_command composes a sequence of words into a single shell command line. All shell meta characters are quoted and the words are joined with single spaces.
Escape.shell_command(["ls", "/"]) #=> "ls /"
Escape.shell_command(["echo", "*"]) #=> "echo '*'"
Note that system(*command) and system(Escape.shell_command(command)) are roughly the same. There are two exceptions:
- The first is that the latter may invoke /bin/sh.
- The second is the interpretation of an array with only one element: the element is parsed by the shell with the former, but it is recognized as a single word with the latter. For example, system(*["echo foo"]) invokes the echo command with the argument "foo", but system(Escape.shell_command(["echo foo"])) invokes a command named "echo foo" without arguments (and it probably fails).
# File 'lib/escape.rb', line 52
def shell_command(command)
  command.map {|word| shell_single_word(word) }.join(' ')
end
.shell_single_word(str) ⇒ Object
# File 'lib/escape.rb', line 65
def shell_single_word(str)
  if str.empty?
    "''"
  elsif %r{\A[0-9A-Za-z+,./:=@_-]+\z} =~ str
    str
  else
    result = ''
    str.scan(/('+)|[^']+/) {
      if $1
        result << %q{\'} * $1.length
      else
        result << "'#{$&}'"
      end
    }
    result
  end
end
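One way to sanity-check the quoting is a round trip through Ruby's standard Shellwords library, which splits a string into words using Bourne-shell-like rules. The sketch below does this with local, lightly adapted copies of the two methods above (String.new replaces the '' literal so the buffer is always mutable). It assumes Shellwords.split is an adequate stand-in for a real shell for these inputs, and it never spawns a process.

```ruby
require 'shellwords'

# Local copies of the methods from the listings above.
def shell_single_word(str)
  if str.empty?
    "''"
  elsif %r{\A[0-9A-Za-z+,./:=@_-]+\z} =~ str
    str
  else
    result = String.new
    str.scan(/('+)|[^']+/) {
      if $1
        result << %q{\'} * $1.length   # each quote becomes \'
      else
        result << "'#{$&}'"            # everything else is single-quoted
      end
    }
    result
  end
end

def shell_command(command)
  command.map { |word| shell_single_word(word) }.join(' ')
end

words = ["grep", "it's here", "*.txt"]
line = shell_command(words)
puts line
# prints: grep 'it'\''s here' '*.txt'

# Splitting the composed line gives back the original words.
puts Shellwords.split(line) == words
# prints: true

# A one-element array stays a single word, as described for shell_command.
puts shell_command(["echo foo"])
# prints: 'echo foo'
```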
.uri_path(str) ⇒ Object
Escape.uri_path escapes a URI path using percent-encoding. The given path should be a sequence of (non-escaped) segments separated by “/”. The segments cannot contain “/”.
Escape.uri_path("a/b/c") #=> "a/b/c"
Escape.uri_path("a?b/c?d/e?f") #=> "a%3Fb/c%3Fd/e%3Ff"
The path is the part of a URI after the authority and before the query, as follows.
scheme://authority/path?query#fragment
See RFC 3986 for details of URI.
Note that this function is not appropriate for converting an OS path to a URI.
# File 'lib/escape.rb', line 115
def uri_path(str)
  str.gsub(%r{[^/]+}n) { uri_segment($&) }
end
.uri_segment(str) ⇒ Object
Escape.uri_segment escapes a URI segment using percent-encoding.
Escape.uri_segment("a/b") #=> "a%2Fb"
A segment is a “/”-separated element of a URI after the authority and before the query, as follows.
scheme://authority/segment1/segment2/.../segmentN?query#fragment
See RFC 3986 for details of URI.
# File 'lib/escape.rb', line 92
def uri_segment(str)
  # pchar - pct-encoded = unreserved / sub-delims / ":" / "@"
  # unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
  # sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
  str.gsub(%r{[^A-Za-z0-9\-._~!$&'()*+,;=:@]}n) {
    '%' + $&.unpack("H2")[0].upcase
  }
end
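The difference between escaping a segment and escaping a whole path is easiest to see side by side. The sketch below is a standalone approximation rather than the library's code: it works on a binary copy of the input (str.b) so that multibyte characters are escaped byte by byte, and `path_from_segments` is a hypothetical helper that is not part of Escape.

```ruby
# Bytes outside "pchar minus pct-encoded", as in the listing above.
SEGMENT_UNSAFE = %r{[^A-Za-z0-9\-._~!$&'()*+,;=:@]}

def uri_segment(str)
  str.b.gsub(SEGMENT_UNSAFE) { |ch| format('%%%02X', ch.ord) }
end

# Escapes each run of non-slash characters and keeps the slashes.
def uri_path(str)
  str.b.gsub(%r{[^/]+}) { |segment| uri_segment(segment) }
end

# Hypothetical helper: builds a path from raw segments that may contain "/".
def path_from_segments(segments)
  "/" + segments.map { |s| uri_segment(s) }.join("/")
end

puts uri_segment("a/b")          # prints: a%2Fb
puts uri_path("a?b/c?d/e?f")     # prints: a%3Fb/c%3Fd/e%3Ff
puts uri_segment("é")            # prints: %C3%A9
puts path_from_segments(["docs", "a/b", "50% off"])
# prints: /docs/a%2Fb/50%25%20off
```

Escaping segment by segment is the safer choice when a segment may itself contain "/", since uri_path treats every slash as a separator.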