Class: NewsScraper::URIParser

Inherits:
Object
Defined in:
lib/news_scraper/uri_parser.rb

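Instance Method Summary

  • #host ⇒ Object: Returns the URI’s host, removing paths, params, and schemes
  • #initialize(url) ⇒ URIParser (constructor): Initialize a URIParser
  • #with_scheme ⇒ Object: Returns the URI with a scheme, adding http:// if no scheme is present
  • #without_scheme ⇒ Object: Removes the scheme from the URI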

Constructor Details

#initialize(url) ⇒ URIParser

Initialize a URIParser

Params

  • url: the URL string to parse into a URI



# File 'lib/news_scraper/uri_parser.rb', line 10

def initialize(url)
  @uri = URI.parse(url)
end
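
To illustrate what the constructor stores, the sketch below calls Ruby's standard URI.parse directly, once on a URL with a scheme and once on a bare domain. The example URLs are illustrative assumptions, not taken from the gem.

```ruby
require 'uri'

# A URL with a scheme is split into scheme, host, and path components.
full = URI.parse('https://www.google.ca/search')
full.scheme # "https"
full.host   # "www.google.ca"

# A schemeless string is still accepted, but it is parsed as a relative
# reference: the whole string ends up in the path and there is no host.
bare = URI.parse('google.ca')
bare.scheme # nil
bare.host   # nil
bare.path   # "google.ca"
```

Since a schemeless input has no parsed host, working from the string form of the URI (as the instance methods on this page do) handles both kinds of input.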

Instance Method Details

#host ⇒ Object

Returns the URI’s host, removing paths, params, and schemes

Returns

  • The host as a lowercase string, e.g. https://www.google.ca/search will return google.ca


# File 'lib/news_scraper/uri_parser.rb', line 37

def host
  without_scheme.downcase.match(/^(?:[\w\d-]+\.)?(?<host>[\w\d-]+\.\w{2,})/)['host']
end
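
A minimal usage sketch for #host follows. It reproduces a trimmed copy of the class from the listings on this page so that it runs without the gem installed; the example URLs are illustrative assumptions.

```ruby
require 'uri'

module NewsScraper
  # Trimmed copy of the class shown on this page (only what #host needs).
  class URIParser
    def initialize(url)
      @uri = URI.parse(url)
    end

    def without_scheme
      @uri.scheme ? @uri.to_s.gsub(%r{^#{@uri.scheme}://}, '') : @uri.to_s
    end

    def host
      without_scheme.downcase.match(/^(?:[\w\d-]+\.)?(?<host>[\w\d-]+\.\w{2,})/)['host']
    end
  end
end

# The scheme, one leading label (www.), the path, and the query are dropped.
with_path = NewsScraper::URIParser.new('https://www.google.ca/search?q=ruby').host
# A bare domain is returned unchanged.
bare = NewsScraper::URIParser.new('google.ca').host

puts with_path # google.ca
puts bare      # google.ca
```

Two limits follow from the pattern as written. It skips at most one leading label, so news.bbc.co.uk comes back as bbc.co. An input with no dot-separated host, such as localhost, makes match return nil, and the ['host'] lookup then raises NoMethodError.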

#with_scheme ⇒ Object

Returns the URI with a scheme, adding http:// if no scheme is present

Returns

  • A URI string, with http:// if no scheme was specified



# File 'lib/news_scraper/uri_parser.rb', line 28

def with_scheme
  @uri.scheme ? @uri.to_s : "http://#{@uri}"
end
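
The following sketch shows #with_scheme on an input without a scheme and on one that already has a scheme. The class body is a trimmed copy of the listings on this page, and the example URLs are illustrative assumptions.

```ruby
require 'uri'

module NewsScraper
  # Trimmed copy of the class shown on this page (only what #with_scheme needs).
  class URIParser
    def initialize(url)
      @uri = URI.parse(url)
    end

    def with_scheme
      @uri.scheme ? @uri.to_s : "http://#{@uri}"
    end
  end
end

# No scheme present, so http:// is prepended.
added = NewsScraper::URIParser.new('google.ca').with_scheme
# A scheme is already present, so the string is returned as is.
kept = NewsScraper::URIParser.new('https://google.ca').with_scheme

puts added # http://google.ca
puts kept  # https://google.ca
```

An existing scheme is never rewritten, so an https URL stays https.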

#without_scheme ⇒ Object

Removes the scheme from the URI

Returns

  • A schemeless URI string, e.g. https://google.ca will return google.ca



# File 'lib/news_scraper/uri_parser.rb', line 19

def without_scheme
  @uri.scheme ? @uri.to_s.gsub(%r{^#{@uri.scheme}://}, '') : @uri.to_s
end
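
As a usage sketch for #without_scheme, the example below strips the scheme from a full URL and passes a bare domain through unchanged. The class body is a trimmed copy of the listings on this page, and the example URLs are illustrative assumptions.

```ruby
require 'uri'

module NewsScraper
  # Trimmed copy of the class shown on this page (only what #without_scheme needs).
  class URIParser
    def initialize(url)
      @uri = URI.parse(url)
    end

    def without_scheme
      @uri.scheme ? @uri.to_s.gsub(%r{^#{@uri.scheme}://}, '') : @uri.to_s
    end
  end
end

# Only the leading "https://" is removed; path and query are kept.
stripped = NewsScraper::URIParser.new('https://google.ca/news?page=2').without_scheme
# With no scheme to remove, the input comes back unchanged.
unchanged = NewsScraper::URIParser.new('google.ca').without_scheme

puts stripped  # google.ca/news?page=2
puts unchanged # google.ca
```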