Module: URI
- Extended by:
- Escape
- Includes:
- REGEXP
- Defined in:
- lib/uri.rb,
lib/uri/ftp.rb,
lib/uri/http.rb,
lib/uri/ldap.rb,
lib/open-uri.rb,
lib/uri/ldaps.rb,
lib/uri/https.rb,
lib/uri/common.rb,
lib/uri/mailto.rb,
lib/uri/generic.rb
Overview
uri/common.rb
Author |
Akira Yamada <akira@ruby-lang.org> |
Revision |
$Id: common.rb 27285 2010-04-10 22:05:02Z naruse $ |
License |
You can redistribute it and/or modify it under the same terms as Ruby. |
Defined Under Namespace
Modules: Escape, REGEXP, Util Classes: BadURIError, Error, FTP, Generic, HTTP, HTTPS, InvalidComponentError, InvalidURIError, LDAP, LDAPS, MailTo, Parser
Constant Summary
- VERSION_CODE =
'000911'.freeze
- VERSION =
VERSION_CODE.scan(/../).collect{|n| n.to_i}.join('.').freeze
- DEFAULT_PARSER =
Parser.new
- TBLENCWWWCOMP_ =
{}
- TBLDECWWWCOMP_ =
{}
- HTML5ASCIIINCOMPAT =
[Encoding::UTF_16BE, Encoding::UTF_16LE, Encoding::UTF_32BE, Encoding::UTF_32LE]
- WFKV_ =
'(?:%\h\h|[^%#=;&])'
- @@schemes =
{}
Class Method Summary
-
+ (Object) decode_www_form(str, enc = Encoding::UTF_8)
Decode URL-encoded form data from given str.
-
+ (Object) decode_www_form_component(str, enc = Encoding::UTF_8)
Decode given str of URL-encoded form data.
-
+ (Object) encode_www_form(enum)
Generate URL-encoded form data from given enum.
-
+ (Object) encode_www_form_component(str)
Encode given str to URL-encoded form data.
-
+ (Object) extract(str, schemes = nil, &block)
Extracts URIs from a string.
-
+ (Object) join(*str)
Joins URIs.
-
+ (Object) parse(uri)
Creates an instance of one of URI's subclasses from a string.
-
+ (Object) regexp(schemes = nil)
Returns a Regexp object that matches URI-like strings.
- + (Object) scheme_list
-
+ (Object) split(uri)
Splits a URI string into its component parts.
Methods included from Escape
Class Method Details
+ (Object) decode_www_form(str, enc = Encoding::UTF_8)
Decode URL-encoded form data from given str.
This decodes application/x-www-form-urlencoded data and returns an array of key-value arrays. Internally it uses URI.decode_www_form_component.
The charset hack is not currently supported, because the mapping from a given charset to Ruby's encoding is not yet clear. See also www.w3.org/TR/html5/syntax.html#character-encodings-0
This refers to www.w3.org/TR/html5/forms.html#url-encoded-form-data
ary = URI.decode_www_form("a=1&a=2&b=3")
p ary #=> [['a', '1'], ['a', '2'], ['b', '3']]
p ary.assoc('a').last #=> '1'
p ary.assoc('b').last #=> '3'
p ary.rassoc('2').first #=> 'a'
p Hash[ary] # => {"a"=>"2", "b"=>"3"}
See URI.decode_www_form_component, URI.encode_www_form
# File 'lib/uri/common.rb', line 826
def self.decode_www_form(str, enc=Encoding::UTF_8)
  return [] if str.empty?
  unless /\A#{WFKV_}*=#{WFKV_}*(?:[;&]#{WFKV_}*=#{WFKV_}*)*\z/o =~ str
    raise ArgumentError, "invalid data of application/x-www-form-urlencoded (#{str})"
  end
  ary = []
  $&.scan(/([^=;&]+)=([^;&]*)/) do
    ary << [decode_www_form_component($1, enc), decode_www_form_component($2, enc)]
  end
  ary
end
+ (Object) decode_www_form_component(str, enc = Encoding::UTF_8)
Decode given str of URL-encoded form data.
This decodes + to SP (space).
See URI.encode_www_form_component, URI.decode_www_form
# File 'lib/uri/common.rb', line 756
def self.decode_www_form_component(str, enc=Encoding::UTF_8)
  if TBLDECWWWCOMP_.empty?
    256.times do |i|
      h, l = i>>4, i&15
      TBLDECWWWCOMP_['%%%X%X' % [h, l]] = i.chr
      TBLDECWWWCOMP_['%%%x%X' % [h, l]] = i.chr
      TBLDECWWWCOMP_['%%%X%x' % [h, l]] = i.chr
      TBLDECWWWCOMP_['%%%x%x' % [h, l]] = i.chr
    end
    TBLDECWWWCOMP_['+'] = ' '
    TBLDECWWWCOMP_.freeze
  end
  raise ArgumentError, "invalid %-encoding (#{str})" unless /\A(?:%\h\h|[^%]+)*\z/ =~ str
  str.gsub(/\+|%\h\h/, TBLDECWWWCOMP_).force_encoding(enc)
end
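As a rough illustration of the decoding rules described above, the following sketch decodes one component and shows the ArgumentError raised for a malformed percent sequence. The input strings and the local variable names (decoded, error_message) are made up for the example.

```ruby
require 'uri'

# '+' decodes to a space and each %XX sequence decodes to the matching byte.
decoded = URI.decode_www_form_component("a+b%26c%3Dd")
p decoded                               # => "a b&c=d"
p decoded.encoding == Encoding::UTF_8   # => true (the default enc)

# A '%' that is not followed by two hex digits is rejected.
error_message = nil
begin
  URI.decode_www_form_component("100%")
rescue ArgumentError => e
  error_message = e.message
end
p error_message
```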
+ (Object) encode_www_form(enum)
Generate URL-encoded form data from given enum.
This generates application/x-www-form-urlencoded data, as defined in HTML5, from a given Enumerable object.
Internally it uses URI.encode_www_form_component(str).
This doesn't convert the encodings of the given items, so convert them before calling this method if you want to send data in an encoding other than the original one, or as mixed-encoding data. (Strings encoded in an HTML5 ASCII-incompatible encoding are converted to UTF-8.)
This doesn't handle files. To send a file, use multipart/form-data.
This refers to www.w3.org/TR/html5/forms.html#url-encoded-form-data
See URI.encode_www_form_component, URI.decode_www_form
# File 'lib/uri/common.rb', line 789
def self.encode_www_form(enum)
  str = nil
  enum.each do |k,v|
    if str
      str << '&'
    else
      str = nil.to_s
    end
    str << encode_www_form_component(k)
    str << '='
    str << encode_www_form_component(v)
  end
  str
end
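A minimal usage sketch for this method, assuming string keys and values. The sample data and the variable names (pairs, query, params, from_hash) are invented for illustration.

```ruby
require 'uri'

# Any Enumerable that yields key-value pairs works, such as an array of pairs...
pairs = [["q", "ruby lang"], ["page", "2"]]
query = URI.encode_www_form(pairs)
p query      # => "q=ruby+lang&page=2"

# ...or a Hash.
params = { "a" => "1", "b" => "x&y" }
from_hash = URI.encode_www_form(params)
p from_hash  # => "a=1&b=x%26y"

# Decoding the generated string gives back the original pairs.
p URI.decode_www_form(query) == pairs  # => true
```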
+ (Object) encode_www_form_component(str)
Encode given str to URL-encoded form data.
This leaves *, -, ., 0-9, A-Z, _, and a-z unchanged, converts SP (space) to +, and converts every other byte to %XX.
This refers to www.w3.org/TR/html5/forms.html#url-encoded-form-data
See URI.decode_www_form_component, URI.encode_www_form
# File 'lib/uri/common.rb', line 732
def self.encode_www_form_component(str)
  if TBLENCWWWCOMP_.empty?
    256.times do |i|
      TBLENCWWWCOMP_[i.chr] = '%%%02X' % i
    end
    TBLENCWWWCOMP_[' '] = '+'
    TBLENCWWWCOMP_.freeze
  end
  str = str.to_s
  if HTML5ASCIIINCOMPAT.include?(str.encoding)
    str = str.encode(Encoding::UTF_8)
  else
    str = str.dup
  end
  str.force_encoding(Encoding::ASCII_8BIT)
  str.gsub!(/[^*\-.0-9A-Z_a-z]/, TBLENCWWWCOMP_)
  str.force_encoding(Encoding::US_ASCII)
end
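The conversion rules above can be seen in a short sketch. The input strings and variable names (reserved, untouched, multibyte) are arbitrary examples, not part of the original documentation.

```ruby
require 'uri'

# Space becomes '+'; '&' and '=' are percent-encoded.
reserved = URI.encode_www_form_component("a b&c=d")
p reserved    # => "a+b%26c%3Dd"

# The characters *, -, ., _, digits and ASCII letters pass through unchanged.
untouched = URI.encode_www_form_component("*-._09AZaz")
p untouched   # => "*-._09AZaz"

# A non-ASCII character is encoded byte by byte (U+3042 is three bytes in UTF-8).
multibyte = URI.encode_www_form_component("\u3042")
p multibyte   # => "%E3%81%82"
```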
+ (Object) extract(str, schemes = nil, &block)
Synopsis
URI::extract(str[, schemes][,&blk])
Args
str |
String to extract URIs from. |
schemes |
Limit URI matching to specific schemes. |
Description
Extracts URIs from a string. If a block is given, yields each matched URI and returns nil; otherwise returns an array of the matches.
Usage
require "uri"
URI.extract("text here http://foo.example.org/bla and here mailto:test@example.com and here also.")
# => ["http://foo.example.org/bla", "mailto:test@example.com"]
# File 'lib/uri/common.rb', line 680
def self.extract(str, schemes = nil, &block)
  DEFAULT_PARSER.extract(str, schemes, &block)
end
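The schemes argument and the block form are not covered by the usage above, so here is a small sketch of both. It reuses the sample text from the usage section; the variable names (text, http_only, found, result) are illustrative only.

```ruby
require 'uri'

text = "text here http://foo.example.org/bla and here mailto:test@example.com and here also."

# Restrict matching to the http scheme.
http_only = URI.extract(text, ["http"])
p http_only  # => ["http://foo.example.org/bla"]

# With a block, each match is yielded and the return value is nil.
found = []
result = URI.extract(text) { |uri| found << uri }
p result     # => nil
p found      # => ["http://foo.example.org/bla", "mailto:test@example.com"]
```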
+ (Object) join(*str)
Synopsis
URI::join(str[, str, ...])
Args
str |
String(s) to work with |
Description
Joins URIs.
Usage
require 'uri'
p URI.join("http://localhost/","main.rbx")
# => #<URI::HTTP:0x2022ac02 URL:http://localhost/main.rbx>
# File 'lib/uri/common.rb', line 652
def self.join(*str)
  DEFAULT_PARSER.join(*str)
end
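To make "Joins URIs" a little more concrete, the sketch below resolves a few relative references against a base. The base URL and the variable names (base, sibling, parent, rooted, chained) are assumptions chosen for the example.

```ruby
require 'uri'

base = "http://example.com/a/b/"

# A plain relative reference is resolved against the base path.
sibling = URI.join(base, "c").to_s
p sibling  # => "http://example.com/a/b/c"

# ".." moves up one path segment.
parent = URI.join(base, "../c").to_s
p parent   # => "http://example.com/a/c"

# A reference starting with "/" replaces the whole path.
rooted = URI.join(base, "/c").to_s
p rooted   # => "http://example.com/c"

# More than two arguments are merged from left to right.
chained = URI.join("http://example.com/", "a/", "b").to_s
p chained  # => "http://example.com/a/b"
```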
+ (Object) parse(uri)
Synopsis
URI::parse(uri_str)
Args
uri_str |
String with URI. |
Description
Creates an instance of one of URI's subclasses from the string.
Raises
URI::InvalidURIError
Raised if the given URI is not valid.
Usage
require 'uri'
uri = URI.parse("http://www.ruby-lang.org/")
p uri
# => #<URI::HTTP:0x202281be URL:http://www.ruby-lang.org/>
p uri.scheme
# => "http"
p uri.host
# => "www.ruby-lang.org"
# File 'lib/uri/common.rb', line 627
def self.parse(uri)
  DEFAULT_PARSER.parse(uri)
end
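A brief sketch of reading further components from a parsed URI and of handling URI::InvalidURIError. The URLs and the variable names (uri, parse_error) are made up for the example; the string containing a space is just one input that is rejected.

```ruby
require 'uri'

uri = URI.parse("https://www.ruby-lang.org:8443/en/?q=uri#top")
p uri.class     # => URI::HTTPS
p uri.port      # => 8443
p uri.path      # => "/en/"
p uri.query     # => "q=uri"
p uri.fragment  # => "top"

# An unescaped space is not allowed in a URI, so parsing fails.
parse_error = nil
begin
  URI.parse("http://www.ruby-lang.org/a b")
rescue URI::InvalidURIError => e
  parse_error = e
end
p parse_error.class  # => URI::InvalidURIError
```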
+ (Object) regexp(schemes = nil)
Synopsis
URI::regexp([match_schemes])
Args
match_schemes |
Array of schemes. If given, the resulting regexp matches only URIs whose scheme is one of match_schemes. |
Description
Returns a Regexp object that matches URI-like strings. The returned Regexp contains an arbitrary number of capture groups (parentheses); never rely on their number.
Usage
require 'uri'
# extract first URI from html_string
html_string.slice(URI.regexp)
# remove ftp URIs
html_string.gsub(URI.regexp(['ftp']), '')
# You should not rely on the number of parentheses
html_string.scan(URI.regexp) do |*matches|
p $&
end
# File 'lib/uri/common.rb', line 715
def self.regexp(schemes = nil)
  DEFAULT_PARSER.make_regexp(schemes)
end
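The usage above assumes an existing html_string. A self-contained sketch with a made-up sample string might look like the following; the text and the variable names (text, first_uri, without_ftp, all_uris) are hypothetical, and the whole match is read from $& rather than from capture groups, as the description recommends.

```ruby
require 'uri'

text = "Visit http://www.ruby-lang.org/ or ftp://ftp.ruby-lang.org/pub/ for downloads."

# Extract the first URI-like substring.
first_uri = text.slice(URI.regexp)
p first_uri    # => "http://www.ruby-lang.org/"

# Remove every ftp URI.
without_ftp = text.gsub(URI.regexp(["ftp"]), "")
p without_ftp

# Collect whole matches via $& instead of relying on capture groups.
all_uris = []
text.scan(URI.regexp) { all_uris << $& }
p all_uris     # => ["http://www.ruby-lang.org/", "ftp://ftp.ruby-lang.org/pub/"]
```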
+ (Object) scheme_list
# File 'lib/uri/common.rb', line 540
def self.scheme_list
  @@schemes
end
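This method has no description here. As far as the standard library behaves once 'uri' is required, the returned Hash maps upper-case scheme names to the classes that handle them; the sketch below relies on that assumption, and the variable name schemes is illustrative.

```ruby
require 'uri'

schemes = URI.scheme_list

# Keys are upper-case scheme names, values are the handling classes.
p schemes["HTTP"]       # => URI::HTTP
p schemes["MAILTO"]     # => URI::MailTo

# Lower-case names are not keys of the Hash.
p schemes.key?("http")  # => false
```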
+ (Object) split(uri)
Synopsis
URI::split(uri)
Args
uri |
String with URI. |
Description
Splits the string into the following parts and returns them as an array:
* Scheme
* Userinfo
* Host
* Port
* Registry
* Path
* Opaque
* Query
* Fragment
Usage
require 'uri'
p URI.split("http://www.ruby-lang.org/")
# => ["http", nil, "www.ruby-lang.org", nil, nil, "/", nil, nil, nil]
# File 'lib/uri/common.rb', line 592
def self.split(uri)
  DEFAULT_PARSER.split(uri)
end
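Because the result always has nine elements in the order listed above, it can be destructured directly. The following sketch uses a made-up URL that fills more of the slots; the variable names simply mirror the part names.

```ruby
require 'uri'

scheme, userinfo, host, port, registry, path, opaque, query, fragment =
  URI.split("http://user@www.ruby-lang.org:8080/en/?q=uri#top")

p scheme    # => "http"
p userinfo  # => "user"
p host      # => "www.ruby-lang.org"
p port      # => "8080" (a String here, not an Integer)
p registry  # => nil
p path      # => "/en/"
p opaque    # => nil
p query     # => "q=uri"
p fragment  # => "top"
```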