Module: Escape

Defined in: lib/vendor/escape/lib/escape.rb

Overview

The Escape module provides escape functions for the following:

URI
HTML
shell command

Constant Summary

HTML_TEXT_ESCAPE_HASH =

{
  '&' => '&amp;',
  '<' => '&lt;',
  '>' => '&gt;',
}

HTML_ATTR_ESCAPE_HASH =

{
  '&' => '&amp;',
  '<' => '&lt;',
  '>' => '&gt;',
  '"' => '&quot;',
}

Class Method Summary

.html_attr(str) ⇒ Object

Escape.html_attr encodes a string as a double-quoted HTML attribute using character references.

.html_form(pairs, sep = '&') ⇒ Object

Escape.html_form composes HTML form key-value pairs as an x-www-form-urlencoded string.

.html_form_fast(pairs, sep = ';') ⇒ Object

Variant of Escape.html_form built on gsub and join; the separator defaults to ';'.

.html_text(str) ⇒ Object

Escape.html_text escapes a string appropriate for HTML text using character references.

.shell_command(command) ⇒ Object

Escape.shell_command composes a sequence of words into a single shell command line.

.shell_single_word(str) ⇒ Object

Escape.shell_single_word quotes shell meta characters.

.uri_path(str) ⇒ Object

Escape.uri_path escapes a URI path using percent-encoding.

.uri_segment(str) ⇒ Object

Escape.uri_segment escapes a URI segment using percent-encoding.

Class Method Details

.html_attr(str) ⇒ `Object`

Escape.html_attr encodes a string as a double-quoted HTML attribute using character references.

Escape.html_attr("abc") #=> "\"abc\""
Escape.html_attr("a&b") #=> "\"a&amp;b\""
Escape.html_attr("ab&<>\"c") #=> "\"ab&amp;&lt;&gt;&quot;c\""
Escape.html_attr("a'c") #=> "\"a'c\""

It escapes 4 characters:

‘&’ to ‘&amp;’
‘<’ to ‘&lt;’
‘>’ to ‘&gt;’
‘"’ to ‘&quot;’




# File 'lib/vendor/escape/lib/escape.rb', line 244

def html_attr(str)
  '"' + str.gsub(/[&<>"]/) {|ch| HTML_ATTR_ESCAPE_HASH[ch] } + '"'
end
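
A short usage sketch may help here: because the return value already includes the surrounding double quotes, it can be interpolated directly after `=` in a tag. The `EscapeSketch` module below is a hypothetical stand-in that copies the listing above so the example runs without the library; it is not part of the vendored code.

```ruby
# Hypothetical stand-in mirroring the html_attr listing above,
# so this example runs without loading lib/vendor/escape.
module EscapeSketch
  ATTR_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

  def self.html_attr(str)
    '"' + str.gsub(/[&<>"]/) { |ch| ATTR_ESCAPES[ch] } + '"'
  end
end

title = %q{Tom & "Jerry"}

# html_attr supplies the quotes, so none are written around the interpolation.
tag = "<img alt=#{EscapeSketch.html_attr(title)}>"
puts tag  # prints: <img alt="Tom &amp; &quot;Jerry&quot;">
```

Writing `alt="#{...}"` with explicit quotes would double them, which is the main thing to watch for when switching from html_text to html_attr.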

.html_form(pairs, sep = '&') ⇒ `Object`

Escape.html_form composes HTML form key-value pairs as an x-www-form-urlencoded string.

Escape.html_form takes either an array of string pairs or a hash that maps strings to strings.

Escape.html_form([["a","b"], ["c","d"]]) #=> "a=b&c=d"
Escape.html_form({"a"=>"b", "c"=>"d"}) #=> "a=b&c=d"

In the array form, the same key can be used more than once. (This is required for an HTML form that contains checkboxes or a select element with the multiple attribute.)

Escape.html_form([["k","1"], ["k","2"]]) #=> "k=1&k=2"

If the strings contain characters that must be escaped in x-www-form-urlencoded, they are escaped using %-encoding.

Escape.html_form([["k=","&;="]]) #=> "k%3D=%26%3B%3D"

The separator can be specified by the optional second argument.

Escape.html_form([["a","b"], ["c","d"]], ";") #=> "a=b;c=d"

See HTML 4.01 for details.

# File 'lib/vendor/escape/lib/escape.rb', line 164

def html_form(pairs, sep='&')
  r = ''
  first = true
  pairs.each {|k, v|
    # query-chars - pct-encoded - x-www-form-urlencoded-delimiters =
    #   unreserved / "!" / "$" / "'" / "(" / ")" / "*" / "," / ":" / "@" / "/" / "?"
    # query-char - pct-encoded = unreserved / sub-delims / ":" / "@" / "/" / "?"
    # query-char = pchar / "/" / "?" = unreserved / pct-encoded / sub-delims / ":" / "@" / "/" / "?"
    # unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    # sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
    # x-www-form-urlencoded-delimiters = "&" / "+" / ";" / "="
    r << sep if !first
    first = false
    k.each_byte {|byte|
      ch = byte.chr
      if %r{[^0-9A-Za-z\-\._~:/?@!\$'()*,]}n =~ ch
        r << "%" << ch.unpack("H2")[0].upcase
      else
        r << ch
      end
    }
    r << '='
    v.each_byte {|byte|
      ch = byte.chr
      if %r{[^0-9A-Za-z\-\._~:/?@!\$'()*,]}n =~ ch
        r << "%" << ch.unpack("H2")[0].upcase
      else
        r << ch
      end
    }
  }
  r
end
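
The listing encodes byte by byte, so a space becomes `%20` and a multibyte character becomes one `%XX` group per byte. The sketch below contrasts that with `URI.encode_www_form` from Ruby's standard library, which writes a space as `+`. `EscapeSketch.html_form` is a hypothetical condensed stand-in for the listing, not the vendored code, and it assumes that working on a binary copy (`String#b`) is equivalent to the `each_byte` loop.

```ruby
require 'uri'

# Hypothetical condensed stand-in for the html_form listing above.
# Assumption: gsub on a binary copy (String#b) matches the each_byte loop.
module EscapeSketch
  UNSAFE = %r{[^0-9A-Za-z\-._~:/?@!\$'()*,]}n

  def self.html_form(pairs, sep = '&')
    pairs.map { |k, v|
      # Percent-encode every byte outside the allowed set, key and value alike.
      [k, v].map { |s| s.b.gsub(UNSAFE) { |ch| format('%%%02X', ch.ord) } }.join('=')
    }.join(sep)
  end
end

pairs = [['q', 'a b'], ['name', "caf\u00e9"]]

ours   = EscapeSketch.html_form(pairs)
stdlib = URI.encode_www_form(pairs)

puts ours    # prints: q=a%20b&name=caf%C3%A9
puts stdlib  # prints: q=a+b&name=caf%C3%A9
```

Both spellings of a space are commonly accepted by form decoders; the difference matters mainly when comparing encoded strings byte for byte.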

.html_form_fast(pairs, sep = ';') ⇒ `Object`

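Not documented in the source (the docstring is only the RDoc :stopdoc: directive). Judging from the listing below, it applies the same percent-encoding rules as Escape.html_form using gsub and join, with ';' as the default separator.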

# File 'lib/vendor/escape/lib/escape.rb', line 120

def html_form_fast(pairs, sep=';')
  pairs.map {|k, v|
    # query-chars - pct-encoded - x-www-form-urlencoded-delimiters =
    #   unreserved / "!" / "$" / "'" / "(" / ")" / "*" / "," / ":" / "@" / "/" / "?"
    # query-char - pct-encoded = unreserved / sub-delims / ":" / "@" / "/" / "?"
    # query-char = pchar / "/" / "?" = unreserved / pct-encoded / sub-delims / ":" / "@" / "/" / "?"
    # unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    # sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
    # x-www-form-urlencoded-delimiters = "&" / "+" / ";" / "="
    k = k.gsub(%r{[^0-9A-Za-z\-\._~:/?@!\$'()*,]}n) {
      '%' + $&.unpack("H2")[0].upcase
    }
    v = v.gsub(%r{[^0-9A-Za-z\-\._~:/?@!\$'()*,]}n) {
      '%' + $&.unpack("H2")[0].upcase
    }
    "#{k}=#{v}"
  }.join(sep)
end

.html_text(str) ⇒ `Object`

Escape.html_text escapes a string appropriate for HTML text using character references.

It escapes 3 characters:

‘&’ to ‘&amp;’
‘<’ to ‘&lt;’
‘>’ to ‘&gt;’

Escape.html_text("abc") #=> "abc"
Escape.html_text("a & b < c > d") #=> "a &amp; b &lt; c &gt; d"

This function is not appropriate for escaping HTML element attributes because quotes are not escaped.




# File 'lib/vendor/escape/lib/escape.rb', line 218

def html_text(str)
  str.gsub(/[&<>]/) {|ch| HTML_TEXT_ESCAPE_HASH[ch] }
end
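
The warning about attributes can be made concrete with a small sketch. `EscapeSketch.html_text` is a hypothetical stand-in that copies the listing above, and the input value is invented for illustration.

```ruby
# Hypothetical stand-in mirroring the html_text listing above.
module EscapeSketch
  TEXT_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' }.freeze

  def self.html_text(str)
    str.gsub(/[&<>]/) { |ch| TEXT_ESCAPES[ch] }
  end
end

value = %q{x" onclick="alert(1)}

# Fine as element content: double quotes are harmless there.
as_text = "<p>#{EscapeSketch.html_text(value)}</p>"

# Unsafe inside an attribute: the unescaped '"' ends the value early,
# and the rest of the input is parsed as a new attribute.
as_attr = %Q{<input value="#{EscapeSketch.html_text(value)}">}

puts as_text  # prints: <p>x" onclick="alert(1)</p>
puts as_attr  # prints: <input value="x" onclick="alert(1)">
```

For attribute values, Escape.html_attr is the function that also escapes the double quote.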

.shell_command(command) ⇒ `Object`

Escape.shell_command composes a sequence of words into a single shell command line. All shell meta characters are quoted and the words are joined with single spaces.

Escape.shell_command(["ls", "/"]) #=> "ls /"
Escape.shell_command(["echo", "*"]) #=> "echo '*'"

Note that system(*command) and system(Escape.shell_command(command)) are roughly the same. There are two exceptions:

The first is that the latter may invoke /bin/sh.
The second is the interpretation of an array with only one element: the element is parsed by the shell with the former, but it is treated as a single word with the latter. For example, system(*["echo foo"]) invokes the echo command with the argument "foo", but system(Escape.shell_command(["echo foo"])) invokes the "echo foo" command without arguments (and probably fails).




# File 'lib/vendor/escape/lib/escape.rb', line 52

def shell_command(command)
  command.map {|word| shell_single_word(word) }.join(' ')
end

.shell_single_word(str) ⇒ `Object`

Escape.shell_single_word quotes shell meta characters.

The result is always a single shell word, even if the argument is "". Escape.shell_single_word("") returns "''".

Escape.shell_single_word("") #=> "''"
Escape.shell_single_word("foo") #=> "foo"
Escape.shell_single_word("*") #=> "'*'"

# File 'lib/vendor/escape/lib/escape.rb', line 65

def shell_single_word(str)
  if str.empty?
    "''"
  elsif %r{\A[0-9A-Za-z+,./:=@_-]+\z} =~ str
    str
  else
    result = ''
    str.scan(/('+)|[^']+/) {
      if $1
        result << %q{\'} * $1.length
      else
        result << "'#{$&}'"
      end
    }
    result
  end
end
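
One way to check the quoting without running a shell is to split the composed line again with `Shellwords.split` from Ruby's standard library, which applies POSIX-shell-like word splitting. The sketch below does that round trip. `EscapeSketch` is a hypothetical stand-in that restates the two shell listings, and the word list is invented for illustration.

```ruby
require 'shellwords'

# Hypothetical stand-in restating the shell_single_word and
# shell_command listings above.
module EscapeSketch
  def self.shell_single_word(str)
    return "''" if str.empty?
    return str if %r{\A[0-9A-Za-z+,./:=@_-]+\z} =~ str

    result = +''
    str.scan(/('+)|[^']+/) do
      # Runs of apostrophes become \' each; everything else is single-quoted.
      result << ($1 ? %q{\'} * $1.length : "'#{$&}'")
    end
    result
  end

  def self.shell_command(command)
    command.map { |word| shell_single_word(word) }.join(' ')
  end
end

words = ['echo', '*', "it's", 'a b']
line = EscapeSketch.shell_command(words)

puts line                            # prints: echo '*' 'it'\''s' 'a b'
puts Shellwords.split(line) == words # prints: true
```

The apostrophe case shows why the listing scans for runs of `'`: a single quote cannot appear inside a single-quoted shell word, so the quoting is closed, a backslash-escaped `'` is emitted, and the quoting is reopened.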

.uri_path(str) ⇒ `Object`

Escape.uri_path escapes a URI path using percent-encoding. The given path should be a sequence of (non-escaped) segments separated by "/". The segments cannot contain "/".

Escape.uri_path("a/b/c") #=> "a/b/c"
Escape.uri_path("a?b/c?d/e?f") #=> "a%3Fb/c%3Fd/e%3Ff"

The path is the part of a URI after the authority and before the query, as follows.

scheme://authority/path?query#fragment

See RFC 3986 for details of URI.

Note that this function is not appropriate for converting an OS path to a URI.




# File 'lib/vendor/escape/lib/escape.rb', line 115

def uri_path(str)
  str.gsub(%r{[^/]+}n) { uri_segment($&) }
end

.uri_segment(str) ⇒ `Object`

Escape.uri_segment escapes URI segment using percent-encoding.

Escape.uri_segment("a/b") #=> "a%2Fb"

A segment is one of the "/"-separated elements of a URI after the authority and before the query, as follows.

scheme://authority/segment1/segment2/.../segmentN?query#fragment

See RFC 3986 for details of URI.

# File 'lib/vendor/escape/lib/escape.rb', line 92

def uri_segment(str)
  # pchar - pct-encoded = unreserved / sub-delims / ":" / "@"
  # unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
  # sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
  str.gsub(%r{[^A-Za-z0-9\-._~!$&'()*+,;=:@]}n) {
    '%' + $&.unpack("H2")[0].upcase
  }
end
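
When a segment may itself contain "/", escaping each segment separately and joining the results keeps that character from being read as a separator. The sketch below shows this next to uri_path, which cannot tell the two cases apart. `EscapeSketch` is a hypothetical stand-in for the two URI listings; it assumes `String#b` so that every byte is escaped individually, and the segment values are invented for illustration.

```ruby
# Hypothetical stand-in for the uri_segment and uri_path listings above.
# Assumption: String#b is used so that each byte is escaped individually.
module EscapeSketch
  def self.uri_segment(str)
    str.b.gsub(%r{[^A-Za-z0-9\-._~!\$&'()*+,;=:@]}n) { |ch| format('%%%02X', ch.ord) }
  end

  def self.uri_path(str)
    # Every run of non-slash characters is treated as one segment.
    str.gsub(%r{[^/]+}) { |segment| uri_segment(segment) }
  end
end

segments = ['docs', 'a/b', '50% off?']

# Escape per segment, then join: the "/" inside 'a/b' becomes %2F.
path = '/' + segments.map { |s| EscapeSketch.uri_segment(s) }.join('/')
puts path                                # prints: /docs/a%2Fb/50%25%20off%3F

# uri_path sees 'a/b' as two segments and leaves the slash alone.
puts EscapeSketch.uri_path('/docs/a/b')  # prints: /docs/a/b
```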
