Class: Hermeneutics::URLText
- Inherits: Object
  - Object
  - Hermeneutics::URLText
- Defined in: lib/hermeneutics/escape.rb
Overview
URL-able representation
What’s actually happening
URLs may not contain spaces or several other characters such as slashes, ampersands etc. These characters are masked by a percent sign followed by two hex digits representing the ASCII code. Eight-bit characters should be masked the same way.
A URL does not store encoding information by itself. A locator may look like either of these:
http://www.example.com/subdir/index.html?umlfield=%C3%BCber+alles
http://www.example.com/subdir/index.html?umlfield=%FCber+alles
The CGI program reading the URL has to decide by itself how to treat it.
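To make the ambiguity concrete, the following minimal sketch decodes both forms by hand. It does not use the Hermeneutics API; `percent_decode` is a hypothetical helper written only for this illustration, and the caller has to supply the charset it assumes.

```ruby
# Hypothetical helper, not part of Hermeneutics: undo "+" and %XX masking
# on the raw bytes, then tag the result with the charset the caller assumes.
def percent_decode(str, encoding)
  bytes = str.tr("+", " ").b.gsub(/%([0-9A-Fa-f]{2})/) { $1.hex.chr }
  bytes.force_encoding(encoding)
end

utf8   = percent_decode("%C3%BCber+alles", Encoding::UTF_8)
latin1 = percent_decode("%FCber+alles", Encoding::ISO_8859_1)
wrong  = percent_decode("%FCber+alles", Encoding::UTF_8)

puts utf8                    # the UTF-8 form, read as UTF-8
puts latin1.encode("UTF-8")  # the Latin-1 form, read as Latin-1
puts wrong.valid_encoding?   #=> false (Latin-1 bytes read as UTF-8)
```

Both URLs carry the same text; only the assumed charset tells the reader which bytes to expect.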
Examples
URLText.encode "'Stop!' said Fred." #=> "%27Stop%21%27+said+Fred."
URLText.decode "%27Stop%21%27+said+Fred%2e"
#=> "'Stop!' said Fred."
Defined Under Namespace
Classes: Dict
Constant Summary
- PAIR_SET = "="
- PAIR_SEP = "&"
Instance Attribute Summary
- #keep_8bit ⇒ Object
  Returns the value of attribute keep_8bit.
- #keep_space ⇒ Object
  Returns the value of attribute keep_space.
- #mask_space ⇒ Object
  Returns the value of attribute mask_space.
Class Method Summary
- .decode(str) ⇒ Object
  decode( str) -> str
  decode( str, encoding) -> str
- .decode_hash(qstr) ⇒ Object
  decode_hash( str) -> hash
  decode_hash( str) { |key,val| … } -> nil or int
- .encode(str) ⇒ Object
- .encode_hash(hash) ⇒ Object
- .mkurl(path, hash, anchor = nil) ⇒ Object
- .std ⇒ Object
Instance Method Summary
- #decode(str) ⇒ Object
- #decode_hash(qstr, &block) ⇒ Object
- #encode(str) ⇒ Object
  encode( str) -> str
- #encode_hash(hash) ⇒ Object
  encode_hash( hash) -> str
- #initialize(keep_8bit: nil, keep_space: nil, mask_space: nil) ⇒ URLText (constructor)
  new( hash) -> urltext
- #mkurl(path, hash = nil, anchor = nil) ⇒ Object
  mkurl( path, hash, anchor = nil) -> str
Constructor Details
#initialize(keep_8bit: nil, keep_space: nil, mask_space: nil) ⇒ URLText
Call sequence:
new( hash) -> urltext
Creates a URLText converter.
The parameters may be given as keyword arguments or as a hash.
utx = URLText.new keep_8bit: true, keep_space: false
See the encode method for an explanation of these parameters.
# File 'lib/hermeneutics/escape.rb', line 267
def initialize keep_8bit: nil, keep_space: nil, mask_space: nil
  @keep_8bit = keep_8bit
  @keep_space = keep_space
  @mask_space = mask_space
end
Instance Attribute Details
#keep_8bit ⇒ Object
Returns the value of attribute keep_8bit.
# File 'lib/hermeneutics/escape.rb', line 254
def keep_8bit
  @keep_8bit
end
#keep_space ⇒ Object
Returns the value of attribute keep_space.
# File 'lib/hermeneutics/escape.rb', line 254
def keep_space
  @keep_space
end
#mask_space ⇒ Object
Returns the value of attribute mask_space.
# File 'lib/hermeneutics/escape.rb', line 254
def mask_space
  @mask_space
end
Class Method Details
.decode(str) ⇒ Object
Call sequence:
decode( str) -> str
decode( str, encoding) -> str
Decode the contained string.
utx = URLText.new
utx.decode "%27Stop%21%27+said+Fred%2e" #=> "'Stop!' said Fred."
The encoding will be kept. That means that an invalidly encoded string could be produced.
a = "bl%F6d"
a.encode! "utf-8"
d = utx.decode a
d =~ /./ #=> "invalid byte sequence in UTF-8 (ArgumentError)"
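One way for a caller to cope with this is to validate the decoded string and fall back to a second charset. The sketch below is an assumption about how a calling program might do that; `repair_decoded` and the ISO-8859-1 fallback are hypothetical choices made for illustration, not library behavior.

```ruby
# Hypothetical caller-side helper: accept the string if its bytes are valid
# in the tagged encoding, otherwise reinterpret the same bytes in the
# fallback charset and transcode them to UTF-8.
def repair_decoded(str, fallback: Encoding::ISO_8859_1)
  return str if str.valid_encoding?
  str.dup.force_encoding(fallback).encode(Encoding::UTF_8)
end

# The same bytes that decoding "bl%F6d" yields, tagged as UTF-8.
broken = "bl\xF6d".dup.force_encoding(Encoding::UTF_8)

puts broken.valid_encoding?   #=> false
puts repair_decoded(broken)
```

Checking `valid_encoding?` before applying a regular expression avoids the ArgumentError shown above.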
# File 'lib/hermeneutics/escape.rb', line 457
def decode str
  r = str.new_string
  r.tr! "+", " "
  r.gsub! /(?:%([0-9A-F]{2}))/i do $1.hex.chr end
  r.force_encoding str.encoding
  r
end
.decode_hash(qstr) ⇒ Object
Call sequence:
decode_hash( str) -> hash
decode_hash( str) { |key,val| ... } -> nil or int
Decode a URL-style encoded string to a Hash. In case a block is given, each key-value pair is yielded to it and the number of pairs is returned, or nil if there were none.
str = "a=%3B%3B%3B&x=%26auml%3B%26ouml%3B%26uuml%3B"
URLText.decode_hash str do |k,v|
puts "#{k} = #{v}"
end
Output:
a = ;;;
x = &auml;&ouml;&uuml;
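The two calling styles can be sketched without the library as follows. This is a simplified stand-in under stated assumptions: `unescape`, `each_query_pair` and `decode_query` are hypothetical names, and a plain Hash takes the place of the library's Dict class, whose behavior is not shown on this page.

```ruby
PAIR_SET = "="
PAIR_SEP = "&"

# Hypothetical helper: undo "+" and %XX masking, keep the input's encoding.
def unescape(str)
  str.tr("+", " ").b.gsub(/%([0-9A-Fa-f]{2})/) { $1.hex.chr }
     .force_encoding(str.encoding)
end

# Hypothetical helper: split the query string and yield decoded pairs.
def each_query_pair(qstr)
  qstr.split(PAIR_SEP).each do |pair|
    next if pair.empty?
    key, val = pair.split(PAIR_SET, 2)
    yield unescape(key), unescape(val.to_s)
  end
end

# With a block: yield each pair, return the pair count (nil if zero).
# Without a block: collect the pairs into a plain Hash.
def decode_query(qstr)
  if block_given?
    count = 0
    each_query_pair(qstr) { |k, v| yield k, v; count += 1 }
    count.nonzero?
  else
    hash = {}
    each_query_pair(qstr) { |k, v| hash[k] = v }
    hash
  end
end

p decode_query("a=%3B%3B%3B&x=%26auml%3B")
p decode_query("a=1&b=2") { |k, v| puts "#{k} = #{v}" }
```

Note that percent-decoding leaves HTML entities such as `&auml;` untouched; turning them into characters would be a separate step.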
# File 'lib/hermeneutics/escape.rb', line 482
def decode_hash qstr
  if block_given? then
    i = 0
    each_pair qstr do |k,v|
      yield k, v
      i += 1
    end
    i.nonzero?
  else
    Dict.create do |h|
      each_pair qstr do |k,v|
        h.parse k, v
      end
    end
  end
end
.encode(str) ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 428
def encode str
  std.encode str
end
.encode_hash(hash) ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 432
def encode_hash hash
  std.encode_hash hash
end
.mkurl(path, hash, anchor = nil) ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 436
def mkurl path, hash, anchor = nil
  std.mkurl path, hash, anchor
end
.std ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 424
def std
  @std ||= new
end
Instance Method Details
#decode(str) ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 414
def decode str
  self.class.decode str
end
#decode_hash(qstr, &block) ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 418
def decode_hash qstr, &block
  self.class.decode_hash qstr, &block
end
#encode(str) ⇒ Object
Call sequence:
encode( str) -> str
Create a string that contains %XX-encoded bytes.
utx = URLText.new
utx.encode "'Stop!' said Fred." #=> "%27Stop%21%27+said+Fred."
The result will not contain any 8-bit characters unless keep_8bit is set. The result will be in the same encoding as the argument, although this normally makes no difference because the result is plain ASCII.
utx = URLText.new keep_8bit: true
s = "< ä >".encode "UTF-8"
utx.encode s #=> "%3C+\u{e4}+%3E" in UTF-8
s = "< ä >".encode "ISO-8859-1"
utx.encode s #=> "%3C+\xe4+%3E" in ISO-8859-1
A space “ ” will not be replaced by a plus “+” if keep_space is set.
utx = URLText.new keep_space: true
s = "< x >"
utx.encode s #=> "%3C x %3E"
When mask_space is set, a space will be represented as “%20”. As the source below shows, this takes precedence over keep_space.
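How the two space-related options interact can be illustrated with a small stand-alone sketch. It handles only the always-mask-8-bit case and is not the library's code; `percent_escape` is a hypothetical name, and its keyword arguments merely mirror the attribute names described above.

```ruby
# Hypothetical stand-in: mask every byte outside [a-zA-Z0-9_.-] as %XX,
# treating the space according to keep_space and mask_space.
def percent_escape(str, keep_space: false, mask_space: false)
  escaped = str.b.gsub(/[^a-zA-Z0-9_.-]/) do |c|
    if c == " " && !mask_space
      keep_space ? c : "+"
    else
      "%%%02X" % c.ord
    end
  end
  escaped.force_encoding(Encoding::US_ASCII)
end

puts percent_escape("< x >")                    #=> %3C+x+%3E
puts percent_escape("< x >", keep_space: true)  #=> %3C x %3E
puts percent_escape("< x >", mask_space: true)  #=> %3C%20x%20%3E
```

Working on the binary copy (`str.b`) means a multi-byte character is masked byte by byte, so "ä" in UTF-8 becomes "%C3%A4".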
# File 'lib/hermeneutics/escape.rb', line 301
def encode str
  r = str.new_string
  r.force_encoding Encoding::ASCII_8BIT unless @keep_8bit
  r.gsub! %r/([^a-zA-Z0-9_.-])/ do |c|
    if c == " " and not @mask_space then
      @keep_space ? c : "+"
    elsif not @keep_8bit or c.ascii_only? then
      "%%%02X" % c.ord
    else
      c
    end
  end
  r.encode! str.encoding
end
#encode_hash(hash) ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 381
def encode_hash hash
  hash.map { |(k,v)|
    case v
      when nil   then next
      when true  then v = k
      when false then v = ""
    end
    [k, v].map { |x| encode x.to_s }.join PAIR_SET
  }.compact.join PAIR_SEP
end
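The source of encode_hash treats three values specially: a nil value drops the pair, true repeats the key as the value, and false yields an empty value. The following sketch reproduces that rule outside the library. `build_query` is a hypothetical name, and Ruby's standard `URI.encode_www_form_component` stands in for the library's own encode method, so the exact set of masked characters may differ.

```ruby
require "uri"

# Hypothetical stand-in for the value handling seen in encode_hash.
def build_query(hash)
  hash.map { |k, v|
    next if v.nil?          # nil: the pair is left out entirely
    v = k  if v == true     # true: the key doubles as the value
    v = "" if v == false    # false: the key gets an empty value
    [k, v].map { |x| URI.encode_www_form_component(x.to_s) }.join("=")
  }.compact.join("&")
end

puts build_query("q" => "a b", "skip" => nil, "flag" => true, "empty" => false)
#=> q=a+b&flag=flag&empty=
```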
#mkurl(path, hash = nil, anchor = nil) ⇒ Object
# File 'lib/hermeneutics/escape.rb', line 402
def mkurl path, hash = nil, anchor = nil
  unless Hash === hash then
    hash, anchor = anchor, hash
  end
  r = "#{path}"
  r << "?#{encode_hash hash}" if hash
  r << "##{anchor}" if anchor
  r
end
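The argument swap in mkurl means the second argument may be the anchor when no parameter hash is given. A minimal sketch of that calling convention follows. `make_url` is a hypothetical name, and the standard library's `URI.encode_www_form` is used in place of encode_hash, so this illustrates the argument handling rather than the library's exact output.

```ruby
require "uri"

# Hypothetical stand-in: build "path?query#anchor", accepting the anchor
# in second position when no Hash of parameters is passed.
def make_url(path, params = nil, anchor = nil)
  params, anchor = anchor, params unless params.is_a?(Hash)
  url = path.to_s.dup
  url << "?" << URI.encode_www_form(params) if params
  url << "#" << anchor.to_s if anchor
  url
end

puts make_url("/index.html", { "q" => "a b" }, "top")  #=> /index.html?q=a+b#top
puts make_url("/index.html", "top")                    #=> /index.html#top
puts make_url("/index.html")                           #=> /index.html
```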