Class: PublicSuffixService::RuleList

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/public_suffix_service/rule_list.rb

Overview

A RuleList is a collection of one or more Rule objects.

Given a RuleList, you can add or remove rules, iterate over all items in the list, or search for the rule that best matches a specific domain name.

# Create a new list
list = PublicSuffixService::RuleList.new

# Push two rules to the list
list << PublicSuffixService::Rule.factory("it")
list << PublicSuffixService::Rule.factory("com")

# Get the size of the list
list.size
# => 2

# Search for the rule matching the given domain
list.find("example.com")
# => #<PublicSuffixService::Rule::Normal>
list.find("example.org")
# => nil

You can create as many RuleList instances as you want. The RuleList.default rule list is used to tokenize and validate a domain.

RuleList includes the Enumerable module.

Constant Summary

@@default =
nil

Constructor Details

#initialize {|self| ... } ⇒ RuleList

Initializes an empty PublicSuffixService::RuleList.

Yields:

  • (self)

    Yields on self.

Yield Parameters:

  • self (PublicSuffixService::RuleList)

    The newly created instance.

# File 'lib/public_suffix_service/rule_list.rb', line 63

def initialize(&block)
  @list    = []
  @indexes = {}
  yield(self) if block_given?
  create_index!
end
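
The block form lets a caller fill the list first and have the index built once, after the block returns. The following is only a minimal sketch of that pattern: `TinyList`, `push` and `index_builds` are hypothetical names, not part of the library.

```ruby
# Hypothetical stand-in that mirrors the constructor pattern shown above.
# It is not the library's RuleList.
class TinyList
  attr_reader :items, :index_builds

  def initialize
    @items = []
    @index_builds = 0
    yield(self) if block_given? # the caller populates the list here
    build_index                 # the index is built once, afterwards
  end

  # Appends an item without touching the index.
  def push(item)
    @items << item
    self
  end

  private

  # Placeholder for real index construction; it only counts invocations.
  def build_index
    @index_builds += 1
  end
end

list = TinyList.new do |l|
  l.push("it")
  l.push("com")
end
```

Here `list.items` holds both entries and the index was built a single time.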

Instance Attribute Details

#indexes ⇒ Hash (readonly)

Gets the naive index: a hash whose keys are the first label of every rule and whose values are arrays of integers (the positions of those rules in @list).

Returns:

  • (Hash)


# File 'lib/public_suffix_service/rule_list.rb', line 55

def indexes
  @indexes
end

#list ⇒ Array<PublicSuffixService::Rule::*> (readonly)

Gets the list of rules.

Returns:

  • (Array<PublicSuffixService::Rule::*>)

# File 'lib/public_suffix_service/rule_list.rb', line 49

def list
  @list
end

Class Method Details

.clear ⇒ self

Sets the default rule list to nil.

Returns:

  • (self)


# File 'lib/public_suffix_service/rule_list.rb', line 230

def clear
  self.default = nil
  self
end

.default ⇒ PublicSuffixService::RuleList

Gets the default rule list. If none is set, initializes a new PublicSuffixService::RuleList by parsing the content of default_definition.



# File 'lib/public_suffix_service/rule_list.rb', line 213

def default
  @@default ||= parse(default_definition)
end
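
The `||=` in .default means the definition is parsed only on first access, and again only after the default has been cleared. A minimal sketch of that lazy memoization, using a hypothetical `Registry` class and a load counter in place of real parsing:

```ruby
# Hypothetical stand-in illustrating the memoization pattern used by
# .default, .default=, .clear and .reload. Not the library's class.
class Registry
  @@default = nil
  @@loads = 0

  # Returns the cached value, loading it on first access.
  def self.default
    @@default ||= load_default
  end

  def self.default=(value)
    @@default = value
  end

  # Drops the cached value so the next .default call loads again.
  def self.clear
    self.default = nil
    self
  end

  def self.reload
    clear.default
  end

  def self.loads
    @@loads
  end

  # Stands in for parsing the definition file.
  def self.load_default
    @@loads += 1
    ["com", "it"]
  end
end

first = Registry.default
second = Registry.default
```

Both calls return the same object, and a later `Registry.reload` forces a fresh load.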

.default=(value) ⇒ PublicSuffixService::RuleList

Sets the default rule list to value.

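Parameters:

  • value (PublicSuffixService::RuleList)

    The new default rule list.

Returns:

  • (PublicSuffixService::RuleList)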

# File 'lib/public_suffix_service/rule_list.rb', line 223

def default=(value)
  @@default = value
end

.default_definition ⇒ File

Gets the default definition list. It can be any IO stream, including a File, or a plain String. The object must respond to #each_line.

Returns:

  • (File)


# File 'lib/public_suffix_service/rule_list.rb', line 249

def default_definition
  File.new(File.join(File.dirname(__FILE__), "definitions.txt"))
end
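
Because the only requirement is #each_line, a String and an IO-like object can be used interchangeably as a definition source. A small sketch of that duck typing, where `count_lines` is a hypothetical helper introduced only for illustration:

```ruby
require "stringio"

# Counts the non-blank lines of any source that responds to #each_line.
# Hypothetical helper, not part of the library.
def count_lines(source)
  count = 0
  source.each_line { |line| count += 1 unless line.strip.empty? }
  count
end

text = "com\n\nit\n"
from_string = count_lines(text)               # String#each_line
from_io     = count_lines(StringIO.new(text)) # StringIO#each_line
```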

.parse(input) ⇒ PublicSuffixService::RuleList

Parses the given input, treating the content as a Public Suffix List.

See publicsuffix.org/format/ for more details about the input format.

Parameters:

  • input (String)

    The rule list to parse.

Returns:

  • (PublicSuffixService::RuleList)

# File 'lib/public_suffix_service/rule_list.rb', line 261

def parse(input)
  new do |list|
    input.each_line do |line|
      line.strip!

      # strip blank lines
      if line.empty?
        next
      # strip comments
      elsif line =~ %r{^//}
        next
      # append rule
      else
        list.add(Rule.factory(line), false)
      end
    end
  end
end
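
The line filtering in .parse can be tried in isolation. The sketch below assumes nothing from the library: `rule_lines` is a hypothetical helper that returns plain strings instead of Rule objects, and the sample text is illustrative.

```ruby
# Collects the rule lines from Public Suffix List text, skipping blank
# lines and `//` comments. Hypothetical helper, not the library's parser.
def rule_lines(input)
  rules = []
  input.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?("//")
    rules << line
  end
  rules
end

# Illustrative sample, not the real definitions file.
sample = <<~LIST
  // Italy
  it

  *.uk
  !bl.uk
LIST

parsed = rule_lines(sample)
```

Only the three rule lines survive; the comment and the blank line are dropped.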

.reload ⇒ PublicSuffixService::RuleList

Resets the default rule list and reinitializes it by parsing the content of default_definition.



# File 'lib/public_suffix_service/rule_list.rb', line 239

def reload
  self.clear.default
end

Instance Method Details

#==(other) ⇒ Boolean Also known as: eql?

Checks whether two lists are equal.

List one is equal to list two if two is an instance of PublicSuffixService::RuleList and both lists contain equal PublicSuffixService::Rule::* objects in the same order.

Parameters:

  • other (Object)

    The object to compare with.

Returns:

  • (Boolean)


# File 'lib/public_suffix_service/rule_list.rb', line 98

def ==(other)
  return false unless other.is_a?(RuleList)
  self.equal?(other) ||
  self.list == other.list
end

#add(rule, index = true) ⇒ self Also known as: <<

Adds the given object to the list and optionally refreshes the rule index.

Parameters:

  • rule (PublicSuffixService::Rule::*)

    The rule to add to the list.

  • index (Boolean) (defaults to: true)

    Set to true to recreate the rule index after the rule has been added to the list.

Returns:

  • (self)

# File 'lib/public_suffix_service/rule_list.rb', line 130

def add(rule, index = true)
  @list << rule
  create_index! if index == true
  self
end

#clear ⇒ self

Removes all elements.

Returns:

  • (self)


# File 'lib/public_suffix_service/rule_list.rb', line 155

def clear
  @list.clear
  self
end

#create_index! ⇒ Object

Creates a naive index for @list: a hash that records where the elements of @list are, keyed by the first element of each rule's PublicSuffixService::Rule::Base#labels.

For instance, if @list[4] and @list[5] are the only elements of the list where Rule#labels.first is 'us', then @indexes['us'] #=> [4, 5]. That way, #select can avoid matching every single rule against the candidate domain.



# File 'lib/public_suffix_service/rule_list.rb', line 77

def create_index!
  @list.map { |l| l.labels.first }.each_with_index do |elm, inx|
    if !@indexes.has_key?(elm)
      @indexes[elm] = [inx]
    else
      @indexes[elm] << inx
    end
  end
end
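
The index is just a mapping from a first label to positions in the list. A minimal sketch follows, in which `build_index` is a hypothetical helper and each inner array stands in for a rule's #labels. Note that the method shown above appends to the existing @indexes hash without emptying it first, so calling it repeatedly appears to record the same position more than once; the sketch starts from an empty hash on every rebuild.

```ruby
# Builds a first-label index from scratch: label => positions in the list.
# `labels_list` stands in for each rule's #labels array (an assumption
# made for illustration; no library objects are involved).
def build_index(labels_list)
  index = {}
  labels_list.each_with_index do |labels, position|
    (index[labels.first] ||= []) << position
  end
  index
end

labels_list = [%w[com], %w[us], %w[us ca], %w[it]]
index = build_index(labels_list)
rebuilt = build_index(labels_list) # rebuilding yields the same result
```

With this input, the two entries that start with `us` share one key and the other labels map to a single position each.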

#each(*args, &block) ⇒ Object

Iterates over each rule in the list.



# File 'lib/public_suffix_service/rule_list.rb', line 106

def each(*args, &block)
  @list.each(*args, &block)
end

#empty? ⇒ Boolean

Checks whether the list is empty.

Returns:

  • (Boolean)


# File 'lib/public_suffix_service/rule_list.rb', line 148

def empty?
  @list.empty?
end

#find(domain) ⇒ PublicSuffixService::Rule::*?

Returns the most appropriate rule for domain.

From the Public Suffix List documentation:

  • If a hostname matches more than one rule in the file, the longest matching rule (the one with the most levels) will be used.

  • An exclamation mark (!) at the start of a rule marks an exception to a previous wildcard rule. An exception rule takes priority over any other matching rule.

Algorithm description

  • Match domain against all rules and take note of the matching ones.

  • If no rules match, the prevailing rule is “*”.

  • If more than one rule matches, the prevailing rule is the one which is an exception rule.

  • If there is no matching exception rule, the prevailing rule is the one with the most labels.

  • If the prevailing rule is an exception rule, modify it by removing the leftmost label.

  • The public suffix is the set of labels from the domain which directly match the labels of the prevailing rule (joined by dots).

  • The registered domain is the public suffix plus one additional label.

Parameters:

  • domain (String, #to_s)

    The domain name.

Returns:

  • (PublicSuffixService::Rule::*, nil)

# File 'lib/public_suffix_service/rule_list.rb', line 184

def find(domain)
  rules = select(domain)
  rules.select { |r|   r.type == :exception }.first ||
  rules.inject { |t,r| t.length > r.length ? t : r }
end
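
The choice of the prevailing rule can be sketched without the library. `MatchedRule` and `prevailing` below are hypothetical names, the rules are assumed to have already matched the domain, and `max_by` is used in place of the `inject` shown above, so ties between equally long rules may resolve differently.

```ruby
# Hypothetical stand-in for a rule that already matched a domain.
MatchedRule = Struct.new(:name, :type, :length)

# Picks the prevailing rule: an exception rule wins, otherwise the rule
# with the most labels. Returns nil when nothing matched.
def prevailing(rules)
  rules.find { |r| r.type == :exception } ||
    rules.max_by(&:length)
end

normal    = MatchedRule.new("uk", :normal, 1)
wildcard  = MatchedRule.new("*.uk", :wildcard, 2)
exception = MatchedRule.new("!bl.uk", :exception, 2)

longest      = prevailing([normal, wildcard])
with_exempt  = prevailing([normal, wildcard, exception])
nothing      = prevailing([])
```

Without an exception rule the longer wildcard rule prevails; once an exception rule is among the matches, it takes priority regardless of length.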

#select(domain) ⇒ Array<PublicSuffixService::Rule::*>

Selects all the rules matching the given domain.

Uses @indexes to try only the rules that share the domain's first label, which speeds things up when RuleList#find is called often.

Parameters:

  • domain (String, #to_s)

    The domain name.

Returns:

  • (Array<PublicSuffixService::Rule::*>)

201
# File 'lib/public_suffix_service/rule_list.rb', line 198

def select(domain)
  indices = (@indexes[Domain.domain_to_labels(domain).first] || [])
  @list.values_at(*indices).select { |rule| rule.match?(domain) }
end
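
The index lookup followed by a match filter can be sketched as below. Everything here is an assumption made for illustration: `SketchRule`, its simplified suffix-based `match?`, the `candidates` helper, and the choice to order labels from the top-level label downwards.

```ruby
# Hypothetical stand-in rule with a simplified suffix match.
SketchRule = Struct.new(:value) do
  # Labels ordered from the top-level label downwards (an assumption).
  def labels
    value.split(".").reverse
  end

  def match?(domain)
    domain == value || domain.end_with?(".#{value}")
  end
end

rules = %w[com it co.uk uk].map { |v| SketchRule.new(v) }

# First label => positions of the rules that start with it.
index = Hash.new { |hash, key| hash[key] = [] }
rules.each_with_index { |rule, i| index[rule.labels.first] << i }

# Looks up candidate positions by first label, then keeps real matches.
def candidates(rules, index, domain)
  first_label = domain.split(".").reverse.first
  positions = index.fetch(first_label, [])
  rules.values_at(*positions).select { |rule| rule.match?(domain) }
end

matches = candidates(rules, index, "example.co.uk").map(&:value)
misses  = candidates(rules, index, "example.org")
```

Only the two rules filed under `uk` are tested against `example.co.uk`; a domain whose first label is not in the index is answered without testing any rule.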

#size ⇒ Integer Also known as: length

Gets the number of elements in the list.

Returns:

  • (Integer)


# File 'lib/public_suffix_service/rule_list.rb', line 140

def size
  @list.size
end

#to_a ⇒ Array<PublicSuffixService::Rule::*>

Gets the list as an array.

Returns:

  • (Array<PublicSuffixService::Rule::*>)

# File 'lib/public_suffix_service/rule_list.rb', line 113

def to_a
  @list
end