Module: URI
- Extended by:
- Escape
- Includes:
- REGEXP
- Included in:
- Generic
- Defined in:
- lib/uri.rb,
lib/uri/ftp.rb,
lib/uri/ldap.rb,
lib/uri/http.rb,
lib/uri/https.rb,
lib/uri/ldaps.rb,
lib/uri/common.rb,
lib/uri/mailto.rb,
lib/uri/generic.rb
Overview
uri/common.rb
- Author: Akira Yamada <[email protected]>
- Revision: $Id: common.rb 44187 2013-12-13 23:22:41Z hsbt $
- License: You can redistribute it and/or modify it under the same terms as Ruby.
See URI for general documentation
Defined Under Namespace
Modules: Escape, REGEXP, Util
Classes: BadURIError, Error, FTP, Generic, HTTP, HTTPS, InvalidComponentError, InvalidURIError, LDAP, LDAPS, MailTo, Parser
Constant Summary
- VERSION_CODE =
'000911'.freeze
- VERSION =
VERSION_CODE.scan(/../).collect{|n| n.to_i}.join('.').freeze
- DEFAULT_PARSER =
URI::Parser.new
Parser.new
- TBLENCWWWCOMP_ =
{}
- TBLDECWWWCOMP_ =
{}
- HTML5ASCIIINCOMPAT =
[Encoding::UTF_7, Encoding::UTF_16BE, Encoding::UTF_16LE, Encoding::UTF_32BE, Encoding::UTF_32LE]
- @@schemes =
{}
Class Method Summary
- .decode_www_form(str, enc = Encoding::UTF_8, separator: '&', use__charset_: false, isindex: false) ⇒ Object
Decode URL-encoded form data from the given str.
- .decode_www_form_component(str, enc = Encoding::UTF_8) ⇒ Object
Decode the given str of URL-encoded form data.
- .encode_www_form(enum, enc = nil) ⇒ Object
Generate URL-encoded form data from the given enum.
- .encode_www_form_component(str, enc = nil) ⇒ Object
Encode the given str to URL-encoded form data.
- .extract(str, schemes = nil, &block) ⇒ Object
Extracts URIs from a string.
- .join(*str) ⇒ Object
Joins URIs.
- .parse(uri) ⇒ Object
Creates an instance of one of URI's subclasses from the string.
- .regexp(schemes = nil) ⇒ Object
Returns a Regexp object which matches URI-like strings.
- .scheme_list ⇒ Object
Returns a Hash of the defined schemes.
- .split(uri) ⇒ Object
Splits a URI string into its component parts.
Methods included from Escape
Class Method Details
.decode_www_form(str, enc = Encoding::UTF_8, separator: '&', use__charset_: false, isindex: false) ⇒ Object
Decode URL-encoded form data from the given str.
This decodes application/x-www-form-urlencoded data and returns an array of [key, value] arrays.
This follows url.spec.whatwg.org/#concept-urlencoded-parser, so by default it splits only on & and does not treat ; as a separator.
ary = URI.decode_www_form("a=1&a=2&b=3")
p ary                   #=> [['a', '1'], ['a', '2'], ['b', '3']]
p ary.assoc('a').last   #=> '1'
p ary.assoc('b').last   #=> '3'
p ary.rassoc('2').first #=> 'a'
p Hash[ary]             #=> {"a"=>"2", "b"=>"3"}
See URI.decode_www_form_component, URI.encode_www_form
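Because the result is an array of pairs, repeated keys survive decoding. The following is a minimal sketch of grouping them by key and of passing a non-default separator through the `separator:` keyword from the signature above. The sample strings and the names `grouped` and `semi` are made up for illustration.

```ruby
require "uri"

# Decode a query string that repeats the key "a".
pairs = URI.decode_www_form("a=1&a=2&b=3")

# Group values by key so repeated keys are kept
# (Hash[pairs] would keep only the last value for "a").
grouped = pairs.each_with_object(Hash.new { |h, k| h[k] = [] }) do |(key, value), acc|
  acc[key] << value
end
p grouped

# Opt in to ";" as the separator through the keyword argument.
semi = URI.decode_www_form("x=1;y=2", separator: ";")
p semi
```

Grouping is one possible choice here; code that only needs the first value for a key can keep using `assoc` as in the example above.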
# File 'lib/uri/common.rb', line 968
def self.decode_www_form(str, enc=Encoding::UTF_8, separator: '&', use__charset_: false, isindex: false)
  raise ArgumentError, "the input of #{self.name}.#{__method__} must be ASCII only string" unless str.ascii_only?
  ary = []
  return ary if str.empty?
  enc = Encoding.find(enc)
  str.b.each_line(separator) do |string|
    string.chomp!(separator)
    key, sep, val = string.partition('=')
    if isindex
      if sep.empty?
        val = key
        key = ''
      end
      isindex = false
    end

    if use__charset_ and key == '_charset_' and e = get_encoding(val)
      enc = e
      use__charset_ = false
    end

    key.gsub!(/\+|%\h\h/, TBLDECWWWCOMP_)
    if val
      val.gsub!(/\+|%\h\h/, TBLDECWWWCOMP_)
    else
      val = ''
    end

    ary << [key, val]
  end
  ary.each do |k, v|
    k.force_encoding(enc)
    k.scrub!
    v.force_encoding(enc)
    v.scrub!
  end
  ary
end
.decode_www_form_component(str, enc = Encoding::UTF_8) ⇒ Object
Decode the given str of URL-encoded form data.
This decodes + to SP (ASCII space).
See URI.encode_www_form_component, URI.decode_www_form
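As a small illustration of the decoding rules, the sketch below decodes one component and shows that malformed percent-encoding is rejected with the ArgumentError raised in the source listing. The input strings and variable names are illustrative only.

```ruby
require "uri"

# "+" becomes a space, "%21" becomes "!", and "%3D" becomes "=".
decoded = URI.decode_www_form_component("a+b%21%3D")
p decoded

# A "%" that is not followed by two hex digits is rejected.
begin
  URI.decode_www_form_component("100%")
  malformed_rejected = false
rescue ArgumentError => e
  malformed_rejected = true
  puts e.message
end
```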
# File 'lib/uri/common.rb', line 900
def self.decode_www_form_component(str, enc=Encoding::UTF_8)
  raise ArgumentError, "invalid %-encoding (#{str})" unless /\A[^%]*(?:%\h\h[^%]*)*\z/ =~ str
  str.b.gsub(/\+|%\h\h/, TBLDECWWWCOMP_).force_encoding(enc)
end
.encode_www_form(enum, enc = nil) ⇒ Object
Generate URL-encoded form data from the given enum.
This generates application/x-www-form-urlencoded data, as defined in HTML5, from the given Enumerable object.
This internally uses URI.encode_www_form_component(str).
This method does not convert the encoding of the given items, so convert them before calling it if you want to send data in an encoding other than the original one, or as mixed-encoding data. (Strings encoded in an HTML5 ASCII-incompatible encoding are converted to UTF-8.)
This method does not handle files. To send a file, use multipart/form-data.
This follows url.spec.whatwg.org/#concept-urlencoded-serializer
URI.encode_www_form([["q", "ruby"], ["lang", "en"]])
#=> "q=ruby&lang=en"
URI.encode_www_form("q" => "ruby", "lang" => "en")
#=> "q=ruby&lang=en"
URI.encode_www_form("q" => ["ruby", "perl"], "lang" => "en")
#=> "q=ruby&q=perl&lang=en"
URI.encode_www_form([["q", "ruby"], ["q", "perl"], ["lang", "en"]])
#=> "q=ruby&q=perl&lang=en"
See URI.encode_www_form_component, URI.decode_www_form
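The examples above do not cover nil values or characters that need escaping. As a hedged sketch based on the source listing for this method, a pair whose value is nil is emitted as a bare key, and decoding that output yields an empty string rather than nil. The `form` data below is made up for illustration.

```ruby
require "uri"

# One pair with a nil value, one pair with a space and an ampersand.
form = [["q", nil], ["a b", "c&d"]]

encoded = URI.encode_www_form(form)
p encoded

# Round-trip through the decoder; the nil value comes back as "".
decoded = URI.decode_www_form(encoded)
p decoded
```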
# File 'lib/uri/common.rb', line 932
def self.encode_www_form(enum, enc=nil)
  enum.map do |k,v|
    if v.nil?
      encode_www_form_component(k, enc)
    elsif v.respond_to?(:to_ary)
      v.to_ary.map do |w|
        str = encode_www_form_component(k, enc)
        unless w.nil?
          str << '='
          str << encode_www_form_component(w, enc)
        end
      end.join('&')
    else
      str = encode_www_form_component(k, enc)
      str << '='
      str << encode_www_form_component(v, enc)
    end
  end.join('&')
end
.encode_www_form_component(str, enc = nil) ⇒ Object
Encode the given str to URL-encoded form data.
This method leaves *, -, ., 0-9, A-Z, _, and a-z unchanged, converts SP (ASCII space) to +, and converts all other bytes to %XX.
If enc is given, str is converted to that encoding before percent-encoding.
This is an implementation of www.w3.org/TR/html5/forms.html#url-encoded-form-data
See URI.decode_www_form_component, URI.encode_www_form
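The rules above can be seen on a short string that mixes unreserved characters, a space, and bytes that must be escaped. This is a sketch with made-up input; the second call assumes ISO-8859-1 as an example target encoding for the `enc` argument.

```ruby
require "uri"

# "*" is kept, the space becomes "+", and "~", "/" and the
# UTF-8 bytes of "é" (written as \u00E9) become %XX escapes.
plain = URI.encode_www_form_component("a b*c~d/\u00E9")
p plain

# With enc given, the string is transcoded first, so "é" is
# escaped as a single ISO-8859-1 byte.
latin1 = URI.encode_www_form_component("\u00E9", Encoding::ISO_8859_1)
p latin1
```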
# File 'lib/uri/common.rb', line 882
def self.encode_www_form_component(str, enc=nil)
  str = str.to_s.dup
  if str.encoding != Encoding::ASCII_8BIT
    if enc && enc != Encoding::ASCII_8BIT
      str.encode!(Encoding::UTF_8, invalid: :replace, undef: :replace)
      str.encode!(enc, fallback: ->(x){"&#{x.ord};"})
    end
    str.force_encoding(Encoding::ASCII_8BIT)
  end
  str.gsub!(/[^*\-.0-9A-Z_a-z]/, TBLENCWWWCOMP_)
  str.force_encoding(Encoding::US_ASCII)
end
.extract(str, schemes = nil, &block) ⇒ Object
Synopsis
URI::extract(str[, schemes][,&blk])
Args
str
-
String to extract URIs from.
schemes
-
Limit URI matching to specific schemes.
Description
Extracts URIs from a string. If a block is given, yields each matched URI and returns nil; otherwise returns an array of the matches.
Usage
require "uri"
URI.extract("text here http://foo.example.org/bla and here mailto:[email protected] and here also.")
# => ["http://foo.example.org/bla", "mailto:[email protected]"]
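The usage above shows only the array-returning form. The sketch below, with a made-up `text` string, illustrates the `schemes` filter and the block form described in this section.

```ruby
require "uri"

text = "see http://a.example/x and ftp://b.example/y"

# Restrict matching to the http scheme.
http_only = URI.extract(text, ["http"])
p http_only

# Block form: each match is yielded and the method itself returns nil.
found = []
result = URI.extract(text) { |uri| found << uri }
p found
p result
```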
# File 'lib/uri/common.rb', line 812
def self.extract(str, schemes = nil, &block)
  DEFAULT_PARSER.extract(str, schemes, &block)
end
.join(*str) ⇒ Object
Synopsis
URI::join(str[, str, ...])
Args
str
-
String(s) to work with
Description
Joins URIs.
Usage
require 'uri'
p URI.join("http://example.com/","main.rbx")
# => #<URI::HTTP:0x2022ac02 URL:http://example.com/main.rbx>
p URI.join('http://example.com', 'foo')
# => #<URI::HTTP:0x01ab80a0 URL:http://example.com/foo>
p URI.join('http://example.com', '/foo', '/bar')
# => #<URI::HTTP:0x01aaf0b0 URL:http://example.com/bar>
p URI.join('http://example.com', '/foo', 'bar')
# => #<URI::HTTP:0x801a92af0 URL:http://example.com/bar>
p URI.join('http://example.com', '/foo/', 'bar')
# => #<URI::HTTP:0x80135a3a0 URL:http://example.com/foo/bar>
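The difference between the last two cases comes down to whether the base path ends in a slash. As a small sketch with illustrative URLs, converting the results with `to_s` makes the resolved paths easy to compare, including a `..` segment.

```ruby
require "uri"

# A ".." segment climbs out of the base directory "/a/".
up = URI.join("http://example.com/a/b", "../c")

# A trailing slash on the base keeps "foo" as a directory...
appended = URI.join("http://example.com", "/foo/", "bar")

# ...while without it, "foo" is replaced by "bar".
replaced = URI.join("http://example.com", "/foo", "bar")

puts up, appended, replaced
```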
# File 'lib/uri/common.rb', line 784
def self.join(*str)
  DEFAULT_PARSER.join(*str)
end
.parse(uri) ⇒ Object
Synopsis
URI::parse(uri_str)
Args
uri_str
-
String with URI.
Description
Creates an instance of one of URI's subclasses from the string, chosen according to the URI's scheme.
Raises
URI::InvalidURIError
Raised if the given URI is not valid.
Usage
require 'uri'
uri = URI.parse("http://www.ruby-lang.org/")
p uri
# => #<URI::HTTP:0x202281be URL:http://www.ruby-lang.org/>
p uri.scheme
# => "http"
p uri.host
# => "www.ruby-lang.org"
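Beyond scheme and host, the returned object exposes the other components, and invalid input raises the error listed under Raises. The following is a sketch with made-up URLs; it assumes a string containing a space as an example of an invalid URI.

```ruby
require "uri"

uri = URI.parse("http://www.ruby-lang.org:8080/en/?q=1#top")

# The class depends on the scheme; the port is returned as an Integer.
parts = [uri.class, uri.port, uri.path, uri.query, uri.fragment]
p parts

# A space inside the host makes the string an invalid URI.
begin
  URI.parse("http://exa mple.com/")
  rejected = false
rescue URI::InvalidURIError => e
  rejected = true
  puts e.message
end
```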
# File 'lib/uri/common.rb', line 746
def self.parse(uri)
  DEFAULT_PARSER.parse(uri)
end
.regexp(schemes = nil) ⇒ Object
Synopsis
URI::regexp([match_schemes])
Args
match_schemes
-
Array of schemes. If given, the resulting regexp matches only URIs whose scheme is one of match_schemes.
Description
Returns a Regexp object which matches URI-like strings. The returned Regexp contains an arbitrary number of capture groups (parentheses). Never rely on their number.
Usage
require 'uri'
# extract first URI from html_string
html_string.slice(URI.regexp)
# remove ftp URIs
html_string.gsub(URI.regexp(['ftp']), '')
# You should not rely on the number of parentheses
html_string.scan(URI.regexp) do |*matches|
p $&
end
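The usage above leaves html_string undefined. A self-contained sketch, using a made-up `text` string in its place, could look like this; it reads only the whole match and never the capture groups.

```ruby
require "uri"

text = "visit http://example.com/a or ftp://ftp.example.com/f today"

# First URI of any scheme.
first_uri = text.slice(URI.regexp)
p first_uri

# First URI whose scheme is ftp.
ftp_only = text.slice(URI.regexp(["ftp"]))
p ftp_only
```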
# File 'lib/uri/common.rb', line 847
def self.regexp(schemes = nil)
  DEFAULT_PARSER.make_regexp(schemes)
end
.scheme_list ⇒ Object
Returns a Hash of the defined schemes.
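As a brief sketch of how the Hash might be used, the keys are upper-case scheme names and the values are the classes that handle them. The variable names are illustrative, and the exact set of keys depends on which scheme files have been loaded.

```ruby
require "uri"

schemes = URI.scheme_list

# List the registered scheme names.
p schemes.keys.sort

# Look up the class that handles a given scheme.
http_class = schemes["HTTP"]
p http_class
```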
# File 'lib/uri/common.rb', line 659
def self.scheme_list
  @@schemes
end
.split(uri) ⇒ Object
Synopsis
URI::split(uri)
Args
uri
-
String with URI.
Description
Splits the string into the following parts and returns them as an array, in this order:
* Scheme
* Userinfo
* Host
* Port
* Registry
* Path
* Opaque
* Query
* Fragment
Usage
require 'uri'
p URI.split("http://www.ruby-lang.org/")
# => ["http", nil, "www.ruby-lang.org", nil, nil, "/", nil, nil, nil]
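Since the array always has nine elements in the order listed above, it can be destructured directly. The following is a sketch with a made-up URL; note that the port comes back as a String here, not an Integer.

```ruby
require "uri"

# Destructure the nine parts in the documented order.
scheme, userinfo, host, port, registry, path, opaque, query, fragment =
  URI.split("http://user@www.example.com:8080/p?q=1#f")

p [scheme, userinfo, host, port, path, query, fragment]
p [registry, opaque] # both nil for this kind of URI
```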
# File 'lib/uri/common.rb', line 711
def self.split(uri)
  DEFAULT_PARSER.split(uri)
end