Module: URI
- Extended by:
- Escape
- Includes:
- REGEXP
- Included in:
- Generic
- Defined in:
- lib/extensions/uri/uri.rb,
lib/extensions/uri/uri/ftp.rb,
lib/extensions/uri/uri/http.rb,
lib/extensions/uri/uri/ldap.rb,
lib/extensions/uri/uri/https.rb,
lib/extensions/uri/uri/ldaps.rb,
lib/extensions/uri/uri/common.rb,
lib/extensions/uri/uri/mailto.rb,
lib/extensions/uri/uri/generic.rb
Overview
uri/common.rb
- Author
-
Akira Yamada <[email protected]>
- Revision
-
$Id: common.rb 31799 2011-05-29 22:49:36Z yugui $
- License
-
You can redistribute it and/or modify it under the same terms as Ruby.
Defined Under Namespace
Modules: Escape, REGEXP, Util Classes: BadURIError, Error, FTP, Generic, HTTP, HTTPS, InvalidComponentError, InvalidURIError, LDAP, LDAPS, MailTo, Parser
Constant Summary
- VERSION_CODE =
'000911'.freeze
- VERSION =
VERSION_CODE.scan(/../).collect{|n| n.to_i}.join('.').freeze
- DEFAULT_PARSER =
Parser.new
- TBLENCWWWCOMP_ =
{}
- TBLDECWWWCOMP_ =
{}
- HTML5ASCIIINCOMPAT =
[Encoding::UTF_7, Encoding::UTF_16BE, Encoding::UTF_16LE,
[]
- WFKV_ =
'(?:%\h\h|[^%#=;&])'
- @@schemes =
{}
Class Method Summary
- .decode_www_form(str, enc = "UTF-8") ⇒ Object
Decode URL-encoded form data from given str.
- .decode_www_form_component(str, enc = "UTF-8") ⇒ Object
Decode given str of URL-encoded form data.
- .encode_www_form(enum) ⇒ Object
Generate URL-encoded form data from given enum.
- .encode_www_form_component(str) ⇒ Object
Encode given str to URL-encoded form data.
- .extract(str, schemes = nil, &block) ⇒ Object
Synopsis.
- .join(*str) ⇒ Object
Synopsis.
- .parse(uri) ⇒ Object
Synopsis.
- .regexp(schemes = nil) ⇒ Object
Synopsis.
- .scheme_list ⇒ Object
- .split(uri) ⇒ Object
Synopsis.
Methods included from Escape
Class Method Details
.decode_www_form(str, enc = "UTF-8") ⇒ Object
Decodes URL-encoded form data from the given str.
This decodes application/x-www-form-urlencoded data and returns an array of key-value arrays. Internally it uses URI.decode_www_form_component.
The charset hack is not supported yet because the mapping from a given charset to Ruby's encoding is not yet clear. See also www.w3.org/TR/html5/syntax.html#character-encodings-0
See www.w3.org/TR/html5/forms.html#url-encoded-form-data
ary = URI.decode_www_form("a=1&a=2&b=3")
p ary #=> [['a', '1'], ['a', '2'], ['b', '3']]
p ary.assoc('a').last #=> '1'
p ary.assoc('b').last #=> '3'
p ary.rassoc('2').first #=> 'a'
p Hash[ary] # => {"a"=>"2", "b"=>"3"}
See URI.decode_www_form_component, URI.encode_www_form
# File 'lib/extensions/uri/uri/common.rb', line 836
def self.decode_www_form(str, enc="UTF-8") #Encoding::UTF_8)
  return [] if str.empty?
  unless /\A#{WFKV_}*=#{WFKV_}*(?:[;&]#{WFKV_}*=#{WFKV_}*)*\z/o =~ str
    raise ArgumentError, "invalid data of application/x-www-form-urlencoded (#{str})"
  end
  ary = []
  $&.scan(/([^=;&]+)=([^;&]*)/) do
    ary << [decode_www_form_component($1, enc), decode_www_form_component($2, enc)]
  end
  ary
end
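Because the result is an array of pairs, repeated keys survive decoding but are lost when the array is turned into a Hash. The following is a minimal sketch of one way to keep every value; `group_form_pairs` is a hypothetical helper written for this illustration and is not part of URI.

```ruby
require "uri"

# Hypothetical helper (not part of URI): collect repeated keys into arrays.
def group_form_pairs(query)
  URI.decode_www_form(query).each_with_object({}) do |(key, value), grouped|
    (grouped[key] ||= []) << value
  end
end

pairs = URI.decode_www_form("a=1&a=2&b=3")
p pairs                            # array of [key, value] pairs, in input order
p Hash[pairs]                      # the later "a" pair overwrites the earlier one
p group_form_pairs("a=1&a=2&b=3")  # both values of "a" are kept
```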
.decode_www_form_component(str, enc = "UTF-8") ⇒ Object
Decodes the given str of URL-encoded form data.
This decodes + to SP (space).
See URI.encode_www_form_component, URI.decode_www_form
# File 'lib/extensions/uri/uri/common.rb', line 761
def self.decode_www_form_component(str, enc="UTF-8") #Encoding::UTF_8)
  if TBLDECWWWCOMP_.empty?
    tbl = {}
    256.times do |i|
      h, l = i>>4, i&15
      tbl['%%%X%X' % [h, l]] = i.chr
      tbl['%%%x%X' % [h, l]] = i.chr
      tbl['%%%X%x' % [h, l]] = i.chr
      tbl['%%%x%x' % [h, l]] = i.chr
    end
    tbl['+'] = ' '
    begin
      TBLDECWWWCOMP_.replace(tbl)
      TBLDECWWWCOMP_.freeze
    rescue
    end
  end
  raise ArgumentError, "invalid %-encoding (#{str})" unless /\A(?:%\h\h|[^%]+)*\z/ =~ str
  str.gsub(/\+|%\h\h/, TBLDECWWWCOMP_).force_encoding(enc)
end
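As the source above shows, a malformed percent escape raises ArgumentError. A short sketch of both the normal case and the error case follows; `safe_decode_component` is a hypothetical wrapper introduced only for this example.

```ruby
require "uri"

# Hypothetical helper (not part of URI): return nil instead of raising
# when the input contains a malformed %-escape.
def safe_decode_component(str)
  URI.decode_www_form_component(str)
rescue ArgumentError
  nil
end

p URI.decode_www_form_component("a+b%26c%3Dd")  # "a b&c=d"
p safe_decode_component("100%zz")               # nil, because "%zz" is not valid
```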
.encode_www_form(enum) ⇒ Object
Generates URL-encoded form data from the given enum.
This generates application/x-www-form-urlencoded data, as defined in HTML5, from a given Enumerable object.
Internally it uses URI.encode_www_form_component(str).
This does not convert the encodings of the given items, so convert them before calling this method if you want to send data in an encoding other than the original, or in mixed encodings. (Strings encoded in an HTML5 ASCII-incompatible encoding are converted to UTF-8.)
This does not handle files. To send a file, use multipart/form-data.
See www.w3.org/TR/html5/forms.html#url-encoded-form-data
See URI.encode_www_form_component, URI.decode_www_form
# File 'lib/extensions/uri/uri/common.rb', line 799
def self.encode_www_form(enum)
  str = nil
  enum.each do |k,v|
    if str
      str << '&'
    else
      str = nil.to_s
    end
    str << encode_www_form_component(k)
    str << '='
    str << encode_www_form_component(v)
  end
  str
end
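A typical use is building a query string from key-value pairs. The sketch below assumes a made-up endpoint and a hypothetical helper named `build_search_url`; neither comes from the library.

```ruby
require "uri"

# Hypothetical helper (not part of URI): append encoded parameters to a base URL.
def build_search_url(base, params)
  "#{base}?#{URI.encode_www_form(params)}"
end

params = [["q", "ruby uri"], ["lang", "en"], ["tag", "a&b"]]
p URI.encode_www_form(params)  # "q=ruby+uri&lang=en&tag=a%26b"

# The host below is a placeholder used only for illustration.
p build_search_url("http://www.example.com/search", params)
```

Any object that yields key-value pairs from `each` works, so a Hash can be passed in place of the array of pairs.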
.encode_www_form_component(str) ⇒ Object
Encodes the given str to URL-encoded form data.
This does not convert *, -, ., 0-9, A-Z, _, a-z; it converts SP to + and all other characters to %XX.
See www.w3.org/TR/html5/forms.html#url-encoded-form-data
See URI.decode_www_form_component, URI.encode_www_form
# File 'lib/extensions/uri/uri/common.rb', line 732
def self.encode_www_form_component(str)
  if TBLENCWWWCOMP_.empty?
    tbl = {}
    256.times do |i|
      tbl[i.chr] = '%%%02X' % i
    end
    tbl[' '] = '+'
    begin
      TBLENCWWWCOMP_.replace(tbl)
      TBLENCWWWCOMP_.freeze
    rescue
    end
  end
  str = str.to_s
  if HTML5ASCIIINCOMPAT.include?(str.encoding)
    str = str.encode("UTF-8") #Encoding::UTF_8)
  else
    str = str.dup
  end
  str.force_encoding("ASCII-8BIT") #Encoding::ASCII_8BIT)
  str.gsub!(/[^*\-.0-9A-Z_a-z]/, TBLENCWWWCOMP_)
  str.force_encoding("US-ASCII") #Encoding::US_ASCII)
end
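The three cases described above (characters left alone, SP to +, everything else to %XX) can be checked directly. This is a small illustrative sketch; the sample strings are made up.

```ruby
require "uri"

p URI.encode_www_form_component("*-._09AZaz")  # unchanged: every character is in the safe set
p URI.encode_www_form_component("a b&c=d")     # "a+b%26c%3Dd"
p URI.encode_www_form_component("~/")          # "%7E%2F"

# Encoding followed by decoding should give back the original string.
original = "name=J. Doe & co"
encoded = URI.encode_www_form_component(original)
p URI.decode_www_form_component(encoded) == original  # true
```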
.extract(str, schemes = nil, &block) ⇒ Object
Synopsis
URI::extract(str[, schemes][,&blk])
Args
str
-
String to extract URIs from.
schemes
-
Limit URI matching to specific schemes.
Description
Extracts URIs from a string. If a block is given, iterates through all matched URIs and returns nil; otherwise returns an array of the matches.
Usage
require "uri"
URI.extract("text here http://foo.example.org/bla and here mailto:[email protected] and here also.")
# => ["http://foo.example.org/bla", "mailto:[email protected]"]
# File 'lib/extensions/uri/uri/common.rb', line 680
def self.extract(str, schemes = nil, &block)
  DEFAULT_PARSER.extract(str, schemes, &block)
end
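The schemes argument and the block form are not shown in the usage above. The following sketch illustrates both, assuming a Ruby in which URI.extract is still available; the sample text and addresses are made up for the example.

```ruby
require "uri"

# Sample text with placeholder addresses (not taken from the documentation).
text = "see http://foo.example.org/bla or mailto:user@example.org today"

p URI.extract(text)            # both URIs, in order of appearance
p URI.extract(text, ["http"])  # only the http URI

# Block form: each match is yielded and the method itself returns nil.
found = []
result = URI.extract(text) { |uri| found << uri }
p result  # nil
p found   # same URIs as the first call
```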
.join(*str) ⇒ Object
Synopsis
URI::join(str[, str, ...])
Args
str
-
String(s) to work with
Description
Joins URIs.
Usage
require 'uri'
p URI.join("http://localhost/","main.rbx")
# => #<URI::HTTP:0x2022ac02 URL:http://localhost/main.rbx>
# File 'lib/extensions/uri/uri/common.rb', line 652
def self.join(*str)
  DEFAULT_PARSER.join(*str)
end
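Joining resolves each later string relative to the earlier ones, much as a browser resolves a link against the current page. A brief sketch with a made-up base URL:

```ruby
require "uri"

# Placeholder base URL used only for illustration.
base = "http://localhost/docs/guide/"

p URI.join(base, "intro.html").to_s  # "http://localhost/docs/guide/intro.html"
p URI.join(base, "../api/").to_s     # "http://localhost/docs/api/"
p URI.join(base, "/main.rbx").to_s   # "http://localhost/main.rbx"

# The result is a URI object, not a String.
p URI.join("http://localhost/", "main.rbx").class  # URI::HTTP
```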
.parse(uri) ⇒ Object
Synopsis
URI::parse(uri_str)
Args
uri_str
-
String with URI.
Description
Creates an instance of one of URI's subclasses from the string.
Raises
URI::InvalidURIError
Raised if the given URI is not valid.
Usage
require 'uri'
uri = URI.parse("http://www.ruby-lang.org/")
p uri
# => #<URI::HTTP:0x202281be URL:http://www.ruby-lang.org/>
p uri.scheme
# => "http"
p uri.host
# => "www.ruby-lang.org"
# File 'lib/extensions/uri/uri/common.rb', line 627
def self.parse(uri)
  DEFAULT_PARSER.parse(uri)
end
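The parsed object exposes more components than the usage above prints, and invalid input has to be rescued. The sketch below shows both; `parse_or_nil` is a hypothetical helper, not part of URI.

```ruby
require "uri"

# Hypothetical helper (not part of URI): return nil for strings URI.parse rejects.
def parse_or_nil(str)
  URI.parse(str)
rescue URI::InvalidURIError
  nil
end

uri = URI.parse("http://www.ruby-lang.org/en/?q=uri#top")
p uri.class     # URI::HTTP
p uri.host      # "www.ruby-lang.org"
p uri.port      # 80, the default port for http
p uri.path      # "/en/"
p uri.query     # "q=uri"
p uri.fragment  # "top"

p parse_or_nil("http://bad host/")  # nil, a space is not allowed in the host
```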
.regexp(schemes = nil) ⇒ Object
Synopsis
URI::regexp([match_schemes])
Args
match_schemes
-
Array of schemes. If given, the resulting regexp matches only URIs whose scheme is one of the match_schemes.
Description
Returns a Regexp object which matches URI-like strings. The Regexp object returned by this method includes an arbitrary number of capture groups (parentheses). Never rely on their number.
Usage
require 'uri'
# extract first URI from html_string
html_string.slice(URI.regexp)
# remove ftp URIs
html_string.sub(URI.regexp(['ftp']), '')
# You should not rely on the number of parentheses
html_string.scan(URI.regexp) do |*matches|
p $&
end
# File 'lib/extensions/uri/uri/common.rb', line 715
def self.regexp(schemes = nil)
  DEFAULT_PARSER.make_regexp(schemes)
end
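The usage above refers to an html_string that is never defined. Here is a self-contained sketch of the same operations, assuming a Ruby in which URI.regexp is still available; the sample string is made up.

```ruby
require "uri"

# Placeholder input (not from the documentation).
html_string = "mirror ftp://ftp.example.org/pub site http://www.example.org/"

# First URI-like substring of any scheme.
p html_string.slice(URI.regexp)

# Remove every ftp URI, leaving other schemes untouched.
p html_string.gsub(URI.regexp(["ftp"]), "")

# Use $& (the whole match) rather than the capture groups.
html_string.scan(URI.regexp) { p $& }
```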
.scheme_list ⇒ Object
# File 'lib/extensions/uri/uri/common.rb', line 540
def self.scheme_list
  @@schemes
end
.split(uri) ⇒ Object
Synopsis
URI::split(uri)
Args
uri
-
String with URI.
Description
Splits the string into the following parts and returns them as an array:
* Scheme
* Userinfo
* Host
* Port
* Registry
* Path
* Opaque
* Query
* Fragment
Usage
require 'uri'
p URI.split("http://www.ruby-lang.org/")
# => ["http", nil, "www.ruby-lang.org", nil, nil, "/", nil, nil, nil]
# File 'lib/extensions/uri/uri/common.rb', line 592
def self.split(uri)
  DEFAULT_PARSER.split(uri)
end
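Since the array always has nine elements in the order listed above, it can be destructured into named variables. A minimal sketch with a made-up URI that fills most of the slots:

```ruby
require "uri"

# Placeholder URI chosen so that most components are present.
parts = URI.split("http://user@www.example.org:8080/docs/index.html?q=uri#top")
scheme, userinfo, host, port, registry, path, opaque, query, fragment = parts

p scheme    # "http"
p userinfo  # "user"
p host      # "www.example.org"
p port      # "8080", a String rather than an Integer
p path      # "/docs/index.html"
p query     # "q=uri"
p fragment  # "top"
p registry  # nil
p opaque    # nil
```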