Module: Bundler::URI

Includes:
RFC2396_REGEXP
Included in:
Generic
Defined in:
lib/bundler/vendor/uri/lib/uri.rb,
lib/bundler/vendor/uri/lib/uri/ws.rb,
lib/bundler/vendor/uri/lib/uri/ftp.rb,
lib/bundler/vendor/uri/lib/uri/wss.rb,
lib/bundler/vendor/uri/lib/uri/file.rb,
lib/bundler/vendor/uri/lib/uri/http.rb,
lib/bundler/vendor/uri/lib/uri/ldap.rb,
lib/bundler/vendor/uri/lib/uri/https.rb,
lib/bundler/vendor/uri/lib/uri/ldaps.rb,
lib/bundler/vendor/uri/lib/uri/common.rb,
lib/bundler/vendor/uri/lib/uri/mailto.rb,
lib/bundler/vendor/uri/lib/uri/generic.rb,
lib/bundler/vendor/uri/lib/uri/version.rb,
lib/bundler/vendor/uri/lib/uri/rfc2396_parser.rb,
lib/bundler/vendor/uri/lib/uri/rfc3986_parser.rb

Overview

uri/common.rb

Author

Akira Yamada <[email protected]>

License

You can redistribute it and/or modify it under the same terms as Ruby.

See Bundler::URI for general documentation

Defined Under Namespace

Modules: RFC2396_REGEXP, Util

Classes: BadURIError, Error, FTP, File, Generic, HTTP, HTTPS, InvalidComponentError, InvalidURIError, LDAP, LDAPS, MailTo, RFC2396_Parser, RFC3986_Parser, WS, WSS

Constant Summary

REGEXP =
RFC2396_REGEXP
Parser =
RFC2396_Parser
RFC3986_PARSER =
RFC3986_Parser.new
DEFAULT_PARSER =
Parser.new
TBLENCWWWCOMP_ =
{}
TBLENCURICOMP_ =
TBLENCWWWCOMP_.dup.freeze
TBLDECWWWCOMP_ =
{}
VERSION_CODE =
'001201'.freeze
VERSION =
VERSION_CODE.scan(/../).collect{|n| n.to_i}.join('.').freeze

Class Method Details

.decode_uri_component(str, enc = Encoding::UTF_8) ⇒ Object

Decodes given str of URL-encoded data.

This does not decode + to SP.



# File 'lib/bundler/vendor/uri/lib/uri/common.rb', line 351

def self.decode_uri_component(str, enc=Encoding::UTF_8)
  _decode_uri_component(/%\h\h/, str, enc)
end

.decode_www_form(str, enc = Encoding::UTF_8, separator: '&', use__charset_: false, isindex: false) ⇒ Object

Decodes URL-encoded form data from given str.

This decodes application/x-www-form-urlencoded data and returns an array of key-value arrays.

This follows the parser described at url.spec.whatwg.org/#concept-urlencoded-parser, so it supports only the & separator and does not support the ; separator.

ary = Bundler::URI.decode_www_form("a=1&a=2&b=3")
ary                   #=> [['a', '1'], ['a', '2'], ['b', '3']]
ary.assoc('a').last   #=> '1'
ary.assoc('b').last   #=> '3'
ary.rassoc('2').first #=> 'a'
Hash[ary]             #=> {"a"=>"2", "b"=>"3"}

See Bundler::URI.decode_www_form_component, Bundler::URI.encode_www_form.
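Because Hash[ary] keeps only the last value for a repeated key, it can help to group the pairs instead. The following is a minimal sketch, assuming the standard library's URI (which Bundler::URI vendors) behaves the same way; it uses the stdlib constant so that it runs outside Bundler, and the variable names are illustrative only.

```ruby
require "uri"

# Assumption: stdlib URI stands in for Bundler::URI (same method names).
pairs = URI.decode_www_form("a=1&a=2&b=3&c")

# Group repeated keys so that later values do not overwrite earlier ones.
grouped = pairs.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |(key, value), acc|
  acc[key] << value
end

p pairs    # a key without "=" decodes to an empty-string value
p grouped
```

Note that a bare key such as "c" yields an empty string as its value, not nil.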

Raises:

  • (ArgumentError)


# File 'lib/bundler/vendor/uri/lib/uri/common.rb', line 438

def self.decode_www_form(str, enc=Encoding::UTF_8, separator: '&', use__charset_: false, isindex: false)
  raise ArgumentError, "the input of #{self.name}.#{__method__} must be ASCII only string" unless str.ascii_only?
  ary = []
  return ary if str.empty?
  enc = Encoding.find(enc)
  str.b.each_line(separator) do |string|
    string.chomp!(separator)
    key, sep, val = string.partition('=')
    if isindex
      if sep.empty?
        val = key
        key = +''
      end
      isindex = false
    end

    if use__charset_ and key == '_charset_' and e = get_encoding(val)
      enc = e
      use__charset_ = false
    end

    key.gsub!(/\+|%\h\h/, TBLDECWWWCOMP_)
    if val
      val.gsub!(/\+|%\h\h/, TBLDECWWWCOMP_)
    else
      val = +''
    end

    ary << [key, val]
  end
  ary.each do |k, v|
    k.force_encoding(enc)
    k.scrub!
    v.force_encoding(enc)
    v.scrub!
  end
  ary
end

.decode_www_form_component(str, enc = Encoding::UTF_8) ⇒ Object

Decodes given str of URL-encoded form data.

This decodes + to SP.

See Bundler::URI.encode_www_form_component, Bundler::URI.decode_www_form.
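A short sketch of the decoding rules, assuming the standard library's URI (the code that Bundler::URI vendors) as a stand-in so that it runs outside Bundler. It shows "+" becoming a space, "%2B" becoming a literal plus sign, and a malformed percent sequence being rejected.

```ruby
require "uri"

# Assumption: stdlib URI stands in for Bundler::URI (same method names).
decoded = URI.decode_www_form_component("caf%C3%A9+au+lait%2B")
puts decoded

# A stray "%" that is not followed by two hex digits is rejected.
begin
  URI.decode_www_form_component("100%")
rescue ArgumentError => e
  malformed_error = e.message
  puts "rejected: #{malformed_error}"
end
```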



# File 'lib/bundler/vendor/uri/lib/uri/common.rb', line 337

def self.decode_www_form_component(str, enc=Encoding::UTF_8)
  _decode_uri_component(/\+|%\h\h/, str, enc)
end

.encode_uri_component(str, enc = nil) ⇒ Object

Encodes str using URL encoding.

This encodes SP to %20 instead of +.



# File 'lib/bundler/vendor/uri/lib/uri/common.rb', line 344

def self.encode_uri_component(str, enc=nil)
  _encode_uri_component(/[^*\-.0-9A-Z_a-z]/, TBLENCURICOMP_, str, enc)
end

.encode_www_form(enum, enc = nil) ⇒ Object

Generates URL-encoded form data from given enum.

This generates application/x-www-form-urlencoded data, as defined in HTML5, from a given Enumerable object.

This internally uses Bundler::URI.encode_www_form_component(str).

This method does not transcode the given items. Convert them before calling it if you need to send data in an encoding other than the original one, or if the items have mixed encodings. (Strings in an HTML5 ASCII-incompatible encoding are converted to UTF-8.)

This method doesn’t handle files. When you send a file, use multipart/form-data.

This follows the serializer described at url.spec.whatwg.org/#concept-urlencoded-serializer.

Bundler::URI.encode_www_form([["q", "ruby"], ["lang", "en"]])
#=> "q=ruby&lang=en"
Bundler::URI.encode_www_form("q" => "ruby", "lang" => "en")
#=> "q=ruby&lang=en"
Bundler::URI.encode_www_form("q" => ["ruby", "perl"], "lang" => "en")
#=> "q=ruby&q=perl&lang=en"
Bundler::URI.encode_www_form([["q", "ruby"], ["q", "perl"], ["lang", "en"]])
#=> "q=ruby&q=perl&lang=en"

See Bundler::URI.encode_www_form_component, Bundler::URI.decode_www_form.
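The sketch below covers two cases the examples above leave out: a nil value, which is emitted as a bare key, and a value that needs escaping. It assumes the standard library's URI (the code Bundler::URI vendors) as a stand-in so that it runs outside Bundler.

```ruby
require "uri"

# Assumption: stdlib URI stands in for Bundler::URI (same method names).
query = URI.encode_www_form([["flag", nil], ["q", "ruby & rails"], ["tag", ["a", "b"]]])
puts query

# Decoding the result gives back key-value pairs; the nil value
# comes back as an empty string rather than nil.
round_trip = URI.decode_www_form(query)
p round_trip
```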



# File 'lib/bundler/vendor/uri/lib/uri/common.rb', line 402

def self.encode_www_form(enum, enc=nil)
  enum.map do |k,v|
    if v.nil?
      encode_www_form_component(k, enc)
    elsif v.respond_to?(:to_ary)
      v.to_ary.map do |w|
        str = encode_www_form_component(k, enc)
        unless w.nil?
          str << '='
          str << encode_www_form_component(w, enc)
        end
      end.join('&')
    else
      str = encode_www_form_component(k, enc)
      str << '='
      str << encode_www_form_component(v, enc)
    end
  end.join('&')
end

.encode_www_form_component(str, enc = nil) ⇒ Object

Encodes given str to URL-encoded form data.

This method doesn’t convert *, -, ., 0-9, A-Z, _, a-z, but does convert SP (ASCII space) to + and converts others to %XX.

If enc is given, str is converted to that encoding before percent-encoding.

This is an implementation of www.w3.org/TR/2013/CR-html5-20130806/forms.html#url-encoded-form-data.

See Bundler::URI.decode_www_form_component, Bundler::URI.encode_www_form.
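The character rules above can be checked with a small sketch. It assumes the standard library's URI (the code Bundler::URI vendors) as a stand-in so that it runs outside Bundler; the input strings are arbitrary.

```ruby
require "uri"

# Assumption: stdlib URI stands in for Bundler::URI (same method names).
# Characters in the unconverted set pass through untouched.
untouched = URI.encode_www_form_component("*-._09AZaz")

# Space becomes "+", everything else outside the set becomes %XX.
encoded = URI.encode_www_form_component("a b&c=d~")

puts untouched
puts encoded
```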



# File 'lib/bundler/vendor/uri/lib/uri/common.rb', line 328

def self.encode_www_form_component(str, enc=nil)
  _encode_uri_component(/[^*\-.0-9A-Z_a-z]/, TBLENCWWWCOMP_, str, enc)
end

.extract(str, schemes = nil, &block) ⇒ Object

Synopsis

Bundler::URI::extract(str[, schemes][,&blk])

Args

str

String to extract URIs from.

schemes

Limit Bundler::URI matching to specific schemes.

Description

Extracts URIs from a string. If a block is given, iterates through all matched URIs and returns nil; otherwise returns an array of the matches.

Usage

require "bundler/vendor/uri/lib/uri"

Bundler::URI.extract("text here http://foo.example.org/bla and here mailto:[email protected] and here also.")
# => ["http://foo.example.org/bla", "mailto:[email protected]"]


# File 'lib/bundler/vendor/uri/lib/uri/common.rb', line 257

def self.extract(str, schemes = nil, &block)
  warn "Bundler::URI.extract is obsolete", uplevel: 1 if $VERBOSE
  DEFAULT_PARSER.extract(str, schemes, &block)
end

.for(scheme, *arguments, default: Generic) ⇒ Object

Construct a Bundler::URI instance, using the scheme to detect the appropriate class from Bundler::URI.scheme_list.
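A minimal sketch of the class lookup, assuming the standard library's URI (the code Bundler::URI vendors) as a stand-in and a version recent enough to provide URI.for. The positional arguments after the scheme are assumed to follow the order returned by split: userinfo, host, port, registry, path, opaque, query, fragment.

```ruby
require "uri"

# Assumption: stdlib URI stands in for Bundler::URI (same method names).
# A registered scheme resolves to its dedicated class.
known = URI.for("https", nil, "example.com", nil, nil, "/", nil, nil, nil)

# An unregistered scheme falls back to the default class (Generic).
unknown = URI.for("nosuchscheme", nil, "example.com", nil, nil, "/", nil, nil, nil)

puts known.class
puts unknown.class
```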



# File 'lib/bundler/vendor/uri/lib/uri/common.rb', line 95

def self.for(scheme, *arguments, default: Generic)
  const_name = scheme.to_s.upcase

  uri_class = INITIAL_SCHEMES[const_name]
  uri_class ||= if /\A[A-Z]\w*\z/.match?(const_name) && Schemes.const_defined?(const_name, false)
    Schemes.const_get(const_name, false)
  end
  uri_class ||= default

  return uri_class.new(scheme, *arguments)
end

.join(*str) ⇒ Object

Synopsis

Bundler::URI::join(str[, str, ...])

Args

str

String(s) to work with, will be converted to RFC3986 URIs before merging.

Description

Joins URIs.

Usage

require 'bundler/vendor/uri/lib/uri'

Bundler::URI.join("http://example.com/","main.rbx")
# => #<Bundler::URI::HTTP http://example.com/main.rbx>

Bundler::URI.join('http://example.com', 'foo')
# => #<Bundler::URI::HTTP http://example.com/foo>

Bundler::URI.join('http://example.com', '/foo', '/bar')
# => #<Bundler::URI::HTTP http://example.com/bar>

Bundler::URI.join('http://example.com', '/foo', 'bar')
# => #<Bundler::URI::HTTP http://example.com/bar>

Bundler::URI.join('http://example.com', '/foo/', 'bar')
# => #<Bundler::URI::HTTP http://example.com/foo/bar>


# File 'lib/bundler/vendor/uri/lib/uri/common.rb', line 229

def self.join(*str)
  RFC3986_PARSER.join(*str)
end

.parse(uri) ⇒ Object

Synopsis

Bundler::URI::parse(uri_str)

Args

uri_str

String containing a URI.

Description

Creates an instance of one of Bundler::URI’s subclasses from the string.

Raises

Bundler::URI::InvalidURIError

Raised if the given URI is not valid.

Usage

require 'bundler/vendor/uri/lib/uri'

uri = Bundler::URI.parse("http://www.ruby-lang.org/")
# => #<Bundler::URI::HTTP http://www.ruby-lang.org/>
uri.scheme
# => "http"
uri.host
# => "www.ruby-lang.org"

It’s recommended to first ::escape the provided uri_str if there are any invalid Bundler::URI characters.
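When the input comes from an untrusted source, the InvalidURIError described above can be rescued. The sketch below assumes the standard library's URI (the code Bundler::URI vendors) as a stand-in so that it runs outside Bundler; parse_or_nil is a hypothetical helper, not part of the library.

```ruby
require "uri"

# Hypothetical helper: returns a URI object, or nil when the string is invalid.
# Assumption: stdlib URI stands in for Bundler::URI (same method names).
def parse_or_nil(text)
  URI.parse(text)
rescue URI::InvalidURIError
  nil
end

valid = parse_or_nil("http://www.ruby-lang.org/")
invalid = parse_or_nil("http://exa mple.com/") # a raw space is not allowed

p valid
p invalid
```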



# File 'lib/bundler/vendor/uri/lib/uri/common.rb', line 192

def self.parse(uri)
  RFC3986_PARSER.parse(uri)
end

.regexp(schemes = nil) ⇒ Object

Synopsis

Bundler::URI::regexp([match_schemes])

Args

match_schemes

Array of schemes. If given, the resulting regexp matches only URIs whose scheme is one of match_schemes.

Description

Returns a Regexp object that matches URI-like strings. The returned Regexp contains an arbitrary number of capture groups (parentheses); never rely on their number.

Usage

require 'bundler/vendor/uri/lib/uri'

# extract first Bundler::URI from html_string
html_string.slice(Bundler::URI.regexp)

# remove ftp URIs
html_string.sub(Bundler::URI.regexp(['ftp']), '')

# You should not rely on the number of parentheses
html_string.scan(Bundler::URI.regexp) do |*matches|
  p $&
end


# File 'lib/bundler/vendor/uri/lib/uri/common.rb', line 294

def self.regexp(schemes = nil)
  warn "Bundler::URI.regexp is obsolete", uplevel: 1 if $VERBOSE
  DEFAULT_PARSER.make_regexp(schemes)
end

.register_scheme(scheme, klass) ⇒ Object

Register the given klass to be instantiated when parsing URLs with the given scheme. Note that currently only schemes which after .upcase are valid constant names can be registered (no -/+/. allowed).
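A minimal sketch of registering a custom scheme, assuming the standard library's URI (the code Bundler::URI vendors) as a stand-in and a version recent enough to provide register_scheme. The MyApp class, the "myapp" scheme and the port number are all hypothetical.

```ruby
require "uri"

# Hypothetical URI class for an illustrative "myapp" scheme.
# Assumption: stdlib URI stands in for Bundler::URI (same method names).
class MyApp < URI::Generic
  DEFAULT_PORT = 4000
end

URI.register_scheme("MYAPP", MyApp)

# Parsing a string with the registered scheme now yields the custom class.
uri = URI.parse("myapp://example.com/path")
puts uri.class
puts uri.port

# The scheme also appears in scheme_list under its upcased name.
registered = URI.scheme_list["MYAPP"]
puts registered
```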



# File 'lib/bundler/vendor/uri/lib/uri/common.rb', line 76

def self.register_scheme(scheme, klass)
  Schemes.const_set(scheme.to_s.upcase, klass)
end

.scheme_list ⇒ Object

Returns a Hash of the defined schemes.



# File 'lib/bundler/vendor/uri/lib/uri/common.rb', line 81

def self.scheme_list
  Schemes.constants.map { |name|
    [name.to_s.upcase, Schemes.const_get(name)]
  }.to_h
end

.split(uri) ⇒ Object

Synopsis

Bundler::URI::split(uri)

Args

uri

String containing a URI.

Description

Splits the string into the following parts and returns them as an array:

  • Scheme

  • Userinfo

  • Host

  • Port

  • Registry

  • Path

  • Opaque

  • Query

  • Fragment

Usage

require 'bundler/vendor/uri/lib/uri'

Bundler::URI.split("http://www.ruby-lang.org/")
# => ["http", nil, "www.ruby-lang.org", nil, nil, "/", nil, nil, nil]
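Since the result is positional, pairing it with the part names listed above can make it easier to read. The sketch below assumes the standard library's URI (the code Bundler::URI vendors) as a stand-in so that it runs outside Bundler; component_names is an illustrative local variable, not a library constant.

```ruby
require "uri"

# Part names in the order documented for split (illustrative local variable).
component_names = %i[scheme userinfo host port registry path opaque query fragment]

# Assumption: stdlib URI stands in for Bundler::URI (same method names).
parts = URI.split("http://user@www.ruby-lang.org:8080/en/?q=1#top")
named = component_names.zip(parts).to_h

p named
```

Note that split returns the port as a string; it is the URI objects built by parse that expose it as an integer.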


# File 'lib/bundler/vendor/uri/lib/uri/common.rb', line 155

def self.split(uri)
  RFC3986_PARSER.split(uri)
end