Module: Html2rss::Utils

Defined in: lib/html2rss/utils.rb

Overview

The collecting tank for utility methods.

Class Method Summary

.build_absolute_url_from_relative(url, base_url) ⇒ Addressable::URI

.build_regexp_from_string(string) ⇒ Regexp
Parses the given String and builds a Regexp out of it.

.guess_content_type_from_url(url) ⇒ String
Guesses the content type based on the file extension of the URL.

.sanitize_url(url) ⇒ Addressable::URI?
Collapses whitespace runs to a single space, strips leading and trailing whitespace, then parses and normalizes the given URL.

.titleized_channel_url(url) ⇒ String
Builds a titleized representation of the URL with prefixed host.

.titleized_url(url) ⇒ String
Builds a titleized representation of the URL.

.use_zone(time_zone, default_time_zone: Time.now.getlocal.zone) { ... } ⇒ Object
Allows override of time zone locally inside supplied block; resets previous time zone when done.

Class Method Details

.build_absolute_url_from_relative(url, base_url) ⇒ `Addressable::URI`

Builds an absolute URL by joining the relative url onto base_url. A url that is already absolute is returned as parsed, without joining.

Parameters:

url (String, Addressable::URI)
base_url (String, Addressable::URI)

Returns:

(Addressable::URI)

# File 'lib/html2rss/utils.rb', line 18

def self.build_absolute_url_from_relative(url, base_url)
  url = Addressable::URI.parse(url)
  return url if url.absolute?

  base_uri = Addressable::URI.parse(base_url)
  base_uri.path = '/' if base_uri.path.empty?

  base_uri.join(url).normalize
end
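
The joining logic above can be tried without installing any gems. The following is a minimal sketch, not the library's code: it assumes the standard library's URI as a stand-in for Addressable::URI (so edge cases may resolve differently), and the name absolute_url_sketch is hypothetical.

```ruby
require 'uri'

# Standalone sketch: stdlib URI stands in for Addressable::URI.
def absolute_url_sketch(url, base_url)
  url = URI.parse(url.to_s)
  return url if url.absolute?

  base_uri = URI.parse(base_url.to_s)
  # An empty base path is treated as the site root before joining.
  base_uri.path = '/' if base_uri.path.empty?

  base_uri.merge(url).normalize
end

puts absolute_url_sketch('articles/1', 'https://example.com')
# prints https://example.com/articles/1
puts absolute_url_sketch('post.html', 'https://example.com/blog/index.html')
# prints https://example.com/blog/post.html
puts absolute_url_sketch('https://other.example/a', 'https://example.com')
# prints https://other.example/a
```

Note that the relative path replaces the last segment of the base path, which is why post.html lands under /blog/.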

.build_regexp_from_string(string) ⇒ `Regexp`

Parses the given String and builds a Regexp out of it.

To maintain backwards compatibility, one pair of surrounding slashes (‘/’) is removed from the String before the Regexp is built.

Parameters:

string (String)

Returns:

(Regexp)

Raises:

(ArgumentError)

# File 'lib/html2rss/utils.rb', line 94

def self.build_regexp_from_string(string)
  raise ArgumentError, 'must be a string!' unless string.is_a?(String)

  string = string[1..-2] if string.start_with?('/') && string.end_with?('/')
  Regexp::Parser.parse(string, options: ::Regexp::EXTENDED | ::Regexp::IGNORECASE).to_re
end
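
To illustrate the slash stripping and the two flags, here is a minimal sketch under one stated assumption: Regexp.new replaces Regexp::Parser (from the regexp_parser gem), so patterns that the parser handles specially may behave differently. The name regexp_from_string_sketch is hypothetical.

```ruby
# Standalone sketch: Regexp.new stands in for Regexp::Parser.
def regexp_from_string_sketch(string)
  raise ArgumentError, 'must be a string!' unless string.is_a?(String)

  # Strip one pair of surrounding slashes, e.g. '/foo/' becomes 'foo'.
  string = string[1..-2] if string.start_with?('/') && string.end_with?('/')
  Regexp.new(string, Regexp::EXTENDED | Regexp::IGNORECASE)
end

with_slashes = regexp_from_string_sketch('/rss \d+/')
without_slashes = regexp_from_string_sketch('rss \d+')

puts with_slashes == without_slashes # prints true, both have the source 'rss \d+'
puts with_slashes.match?('RSS2')     # prints true
```

The second result follows from the flags: IGNORECASE lets rss match RSS, and EXTENDED makes the literal space in the pattern insignificant.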

.guess_content_type_from_url(url) ⇒ `String`

Guesses the content type based on the file extension of the URL.

Parameters:

url (Addressable::URI)

Returns:

(String): guessed content type, defaults to ‘application/octet-stream’

# File 'lib/html2rss/utils.rb', line 106

def self.guess_content_type_from_url(url)
  url = url.path.split('?').first

  content_type = MIME::Types.type_for(File.extname(url).delete('.'))
  content_type.first&.to_s || 'application/octet-stream'
end
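
The extension lookup with a fallback can be sketched as follows. This is an illustration only: the SKETCH_TYPES table and the name guess_content_type_sketch are hypothetical, and the real method queries the MIME::Types registry rather than a fixed hash.

```ruby
# Hypothetical lookup table for illustration; not the MIME::Types registry.
SKETCH_TYPES = {
  'mp3' => 'audio/mpeg',
  'jpg' => 'image/jpeg',
  'pdf' => 'application/pdf'
}.freeze

def guess_content_type_sketch(path)
  # to_s guards against an empty path, where split returns an empty array.
  file_name = path.split('?').first.to_s
  extension = File.extname(file_name).delete('.')

  # Unknown or missing extensions fall back to the generic binary type.
  SKETCH_TYPES.fetch(extension, 'application/octet-stream')
end

puts guess_content_type_sketch('/media/episode.mp3') # prints audio/mpeg
puts guess_content_type_sketch('/about')             # prints application/octet-stream
```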

.sanitize_url(url) ⇒ `Addressable::URI`?

Collapses whitespace runs to a single space, strips leading and trailing whitespace, then parses and normalizes the given URL.

Parameters:

url (String)

Returns:

(Addressable::URI, nil): normalized URL, or nil if input is empty

# File 'lib/html2rss/utils.rb', line 32

def self.sanitize_url(url)
  url = url.to_s.gsub(/\s+/, ' ').strip
  return if url.empty?

  Addressable::URI.parse(url).normalize
end
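
The whitespace handling and the nil return can be seen in this minimal sketch. It assumes the standard library's URI in place of Addressable::URI, which is stricter: stdlib URI raises on a URL that still contains an inner space, so the examples avoid that case. The name sanitize_url_sketch is hypothetical.

```ruby
require 'uri'

# Standalone sketch: stdlib URI stands in for Addressable::URI.
def sanitize_url_sketch(url)
  # Collapse whitespace runs, then drop leading and trailing whitespace.
  url = url.to_s.gsub(/\s+/, ' ').strip
  return if url.empty?

  URI.parse(url).normalize
end

puts sanitize_url_sketch("  https://Example.COM/feed \n") # prints https://example.com/feed
p sanitize_url_sketch('   ')                              # prints nil
p sanitize_url_sketch(nil)                                # prints nil
```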

.titleized_channel_url(url) ⇒ `String`

Builds a titleized representation of the URL with prefixed host.

Parameters:

url (Addressable::URI)

Returns:

(String)

# File 'lib/html2rss/utils.rb', line 62

def self.titleized_channel_url(url)
  nicer_path = CGI.unescapeURIComponent(url.path).split('/').reject(&:empty?)
  host = url.host

  nicer_path.any? ? "#{host}: #{nicer_path.map(&:capitalize).join(' ')}" : host
end
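
A short sketch of the resulting titles, assuming stdlib URI objects instead of Addressable::URI and URI.decode_www_form_component instead of CGI.unescapeURIComponent (the stand-in also turns '+' into a space, which the original does not). The name titleized_channel_url_sketch is hypothetical.

```ruby
require 'uri'

def titleized_channel_url_sketch(url)
  # Stand-in decoder; see the note above about '+'.
  nicer_path = URI.decode_www_form_component(url.path).split('/').reject(&:empty?)
  host = url.host

  nicer_path.any? ? "#{host}: #{nicer_path.map(&:capitalize).join(' ')}" : host
end

puts titleized_channel_url_sketch(URI.parse('https://example.com/news/latest-posts'))
# prints example.com: News Latest-posts
puts titleized_channel_url_sketch(URI.parse('https://example.com'))
# prints example.com
```

Only the first character of each path segment is capitalized, so hyphenated segments keep their hyphen and their lowercase second word.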

.titleized_url(url) ⇒ `String`

Builds a titleized representation of the URL.

Parameters:

url (Addressable::URI)

Returns:

(String)

# File 'lib/html2rss/utils.rb', line 73

def self.titleized_url(url)
  return '' if url.path.empty?

  nicer_path = CGI.unescapeURIComponent(url.path)
                  .split('/')
                  .flat_map do |part|
    part.gsub(/[^a-zA-Z0-9\.]/, ' ').gsub(/\s+/, ' ').split
  end

  nicer_path.map!(&:capitalize)
  File.basename nicer_path.join(' '), '.*'
end
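
The steps above (decode, split into words, capitalize, drop the file extension) can be traced with this minimal sketch. It assumes stdlib URI objects and URI.decode_www_form_component as stand-ins for Addressable::URI and CGI.unescapeURIComponent; the name titleized_url_sketch is hypothetical.

```ruby
require 'uri'

def titleized_url_sketch(url)
  return '' if url.path.empty?

  words = URI.decode_www_form_component(url.path)
             .split('/')
             .flat_map do |part|
    # Everything except letters, digits, and dots becomes a word break.
    part.gsub(/[^a-zA-Z0-9.]/, ' ').split
  end

  words.map!(&:capitalize)
  # The '.*' suffix makes File.basename strip the trailing extension.
  File.basename(words.join(' '), '.*')
end

puts titleized_url_sketch(URI.parse('https://example.com/foo-bar/baz%20qux.txt'))
# prints Foo Bar Baz Qux
puts titleized_url_sketch(URI.parse('https://example.com')).inspect
# prints ""
```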

.use_zone(time_zone, default_time_zone: Time.now.getlocal.zone) { ... } ⇒ `Object`

Allows override of time zone locally inside supplied block; resets previous time zone when done.

Parameters:

time_zone (String)
default_time_zone (String) (defaults to: Time.now.getlocal.zone)

Yields:

block to execute with the given time zone

Returns:

(Object): whatever the given block returns

# File 'lib/html2rss/utils.rb', line 46

def self.use_zone(time_zone, default_time_zone: Time.now.getlocal.zone)
  raise ArgumentError, 'a block is required' unless block_given?

  time_zone = TZInfo::Timezone.get(time_zone)

  prev_tz = ENV.fetch('TZ', default_time_zone)
  ENV['TZ'] = time_zone.name
  yield
ensure
  ENV['TZ'] = prev_tz if prev_tz
end
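
The set-and-restore pattern can be exercised with this minimal sketch. It skips the TZInfo::Timezone.get lookup, so the zone name is written to ENV['TZ'] without validation; the name with_zone_sketch is hypothetical.

```ruby
# Standalone sketch: same ENV['TZ'] handling, without TZInfo validation.
def with_zone_sketch(zone_name, default_time_zone: Time.now.getlocal.zone)
  raise ArgumentError, 'a block is required' unless block_given?

  prev_tz = ENV.fetch('TZ', default_time_zone)
  ENV['TZ'] = zone_name
  yield
ensure
  # Runs on normal return and on exceptions alike.
  ENV['TZ'] = prev_tz if prev_tz
end

ENV['TZ'] = 'UTC'
inside = with_zone_sketch('Europe/Berlin') { ENV['TZ'] }

puts inside    # prints Europe/Berlin
puts ENV['TZ'] # prints UTC
```

One consequence of this pattern, as written in the source above: when TZ was not set beforehand, the ensure clause appears to leave it set to default_time_zone rather than unsetting it.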
