Class: PublicSuffixService::RuleList
- Inherits:
-
Object
- Object
- PublicSuffixService::RuleList
- Includes:
- Enumerable
- Defined in:
- lib/public_suffix_service/rule_list.rb
Overview
A RuleList is a collection of one or more Rule objects.
Given a RuleList, you can add or remove rules, iterate over all items in the list, or search for the first rule that matches a specific domain name.
# Create a new list
list = PublicSuffixService::RuleList.new
# Push two rules to the list
list << PublicSuffixService::Rule.factory("it")
list << PublicSuffixService::Rule.factory("com")
# Get the size of the list
list.size
# => 2
# Search for the rule matching given domain
list.find("example.com")
# => #<PublicSuffixService::Rule::Normal>
list.find("example.org")
# => nil
You can create as many RuleList instances as you want. The RuleList.default rule list is used to tokenize and validate a domain.
RuleList implements the Enumerable module.
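As a minimal sketch of what including Enumerable provides, the following uses a hypothetical stand-in class (TinyList is not part of the library) that defines only #each and gets the other collection methods from the module. Note that RuleList defines its own #select(domain), so on a real RuleList that name refers to the domain lookup rather than to Enumerable#select.

```ruby
# Hypothetical stand-in for RuleList; not the library class.
class TinyList
  include Enumerable

  def initialize
    @list = []
  end

  # Appends an item and returns self so calls can be chained.
  def <<(item)
    @list << item
    self
  end

  # Enumerable only requires #each; the other methods build on it.
  def each(&block)
    @list.each(&block)
  end
end

tiny = TinyList.new
tiny << "it" << "com" << "co.uk"

sorted  = tiny.sort                         # Enumerable#sort
lengths = tiny.map { |value| value.length } # Enumerable#map
has_com = tiny.include?("com")              # Enumerable#include?
```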
Constant Summary
- @@default = nil
Instance Attribute Summary
-
#indexes ⇒ Array
readonly
Gets the naive index: a hash whose keys are the first label of every rule and whose values are arrays of integers (the positions of those rules in @list).
-
#list ⇒ Array<PublicSuffixService::Rule::*>
readonly
Gets the list of rules.
Class Method Summary
-
.clear ⇒ self
Sets the default rule list to nil.
-
.default ⇒ PublicSuffixService::RuleList
Gets the default rule list.
-
.default=(value) ⇒ PublicSuffixService::RuleList
Sets the default rule list to value.
-
.default_definition ⇒ File
Gets the default definition list.
-
.parse(input) ⇒ Array<PublicSuffixService::Rule::*>
Parses the given input, treating the content as a Public Suffix List.
-
.reload ⇒ PublicSuffixService::RuleList
Resets the default rule list and reinitializes it by parsing the content of RuleList.default_definition.
Instance Method Summary
-
#==(other) ⇒ Boolean
(also: #eql?)
Checks whether two lists are equal.
-
#add(rule, index = true) ⇒ self
(also: #<<)
Adds the given object to the list and optionally refreshes the rule index.
-
#clear ⇒ self
Removes all elements.
-
#create_index! ⇒ Object
Creates a naive index for @list.
-
#each(*args, &block) ⇒ Object
Iterates each rule in the list.
-
#empty? ⇒ Boolean
Checks whether the list is empty.
-
#find(domain) ⇒ PublicSuffixService::Rule::*?
Returns the most appropriate rule for domain.
-
#initialize {|self| ... } ⇒ RuleList
constructor
Initializes an empty RuleList.
-
#select(domain) ⇒ Array<PublicSuffixService::Rule::*>
Selects all the rules matching given domain.
-
#size ⇒ Integer
(also: #length)
Gets the number of elements in the list.
-
#to_a ⇒ Array<PublicSuffixService::Rule::*>
Gets the list as array.
Constructor Details
#initialize {|self| ... } ⇒ RuleList
Initializes an empty PublicSuffixService::RuleList.
# File 'lib/public_suffix_service/rule_list.rb', line 63
def initialize(&block)
  @list = []
  @indexes = {}
  yield(self) if block_given?
  create_index!
end
Instance Attribute Details
#indexes ⇒ Array (readonly)
Gets the naive index: a hash whose keys are the first label of every rule and whose values are arrays of integers (the positions of those rules in @list).
# File 'lib/public_suffix_service/rule_list.rb', line 55
def indexes
  @indexes
end
#list ⇒ Array<PublicSuffixService::Rule::*> (readonly)
Gets the list of rules.
# File 'lib/public_suffix_service/rule_list.rb', line 49
def list
  @list
end
Class Method Details
.clear ⇒ self
Sets the default rule list to nil.
# File 'lib/public_suffix_service/rule_list.rb', line 230
def clear
  self.default = nil
  self
end
.default ⇒ PublicSuffixService::RuleList
Gets the default rule list. If required, initializes a new PublicSuffixService::RuleList by parsing the content of default_definition.
# File 'lib/public_suffix_service/rule_list.rb', line 213
def default
  @@default ||= parse(default_definition)
end
.default=(value) ⇒ PublicSuffixService::RuleList
Sets the default rule list to value.
# File 'lib/public_suffix_service/rule_list.rb', line 223
def default=(value)
  @@default = value
end
.default_definition ⇒ File
Gets the default definition list. Can be any IOStream, including a File or a simple String. The object must respond to #each_line.
# File 'lib/public_suffix_service/rule_list.rb', line 249
def default_definition
  File.new(File.join(File.dirname(__FILE__), "definitions.txt"))
end
.parse(input) ⇒ Array<PublicSuffixService::Rule::*>
Parses the given input, treating the content as a Public Suffix List.
See publicsuffix.org/format/ for more details about the input format.
# File 'lib/public_suffix_service/rule_list.rb', line 261
def parse(input)
  new do |list|
    input.each_line do |line|
      line.strip!

      # strip blank lines
      if line.empty?
        next
      # strip comments
      elsif line =~ %r{^//}
        next
      # append rule
      else
        list.add(Rule.factory(line), false)
      end
    end
  end
end
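The line filtering done by .parse can be tried without the library. The sketch below is an illustration only: parse_rule_lines and the sample text are hypothetical, and the method collects plain rule strings instead of building Rule objects through Rule.factory. It accepts anything that responds to #each_line, which includes a String.

```ruby
# Sketch only: collects rule strings instead of building Rule objects.
def parse_rule_lines(input)
  rules = []
  input.each_line do |line|
    line = line.strip
    next if line.empty?            # skip blank lines
    next if line.start_with?("//") # skip comment lines
    rules << line
  end
  rules
end

# Hypothetical sample in Public Suffix List format.
sample = <<~PSL
  // Example entries

  com
  *.ck
  !www.ck
PSL

parsed = parse_rule_lines(sample)
```

With this sample, parsed holds the three rule strings in file order; the comment and the blank line are dropped.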
.reload ⇒ PublicSuffixService::RuleList
Resets the default rule list and reinitializes it by parsing the content of default_definition.
# File 'lib/public_suffix_service/rule_list.rb', line 239
def reload
  self.clear.default
end
Instance Method Details
#==(other) ⇒ Boolean Also known as: eql?
Checks whether two lists are equal.
RuleList one is equal to two, if two is an instance of PublicSuffixService::RuleList and each PublicSuffixService::Rule::* in list one is available in list two, in the same order.
# File 'lib/public_suffix_service/rule_list.rb', line 98
def ==(other)
  return false unless other.is_a?(RuleList)
  self.equal?(other) ||
  self.list == other.list
end
#add(rule, index = true) ⇒ self Also known as: <<
Adds the given object to the list and optionally refreshes the rule index.
# File 'lib/public_suffix_service/rule_list.rb', line 130
def add(rule, index = true)
  @list << rule
  create_index! if index == true
  self
end
#clear ⇒ self
Removes all elements.
# File 'lib/public_suffix_service/rule_list.rb', line 155
def clear
  @list.clear
  self
end
#create_index! ⇒ Object
Creates a naive index for @list: a hash that records where the elements of @list are, keyed by the first element of each rule's PublicSuffixService::Rule::Base#labels.
For instance, if @list[5] and @list[4] are the only elements of the list where Rule#labels.first is 'us', then @indexes['us'] #=> [5,4]. That way, select can avoid matching every single rule against the candidate domain.
# File 'lib/public_suffix_service/rule_list.rb', line 77
def create_index!
  @list.map { |l| l.labels.first }.each_with_index do |elm, inx|
    if !@indexes.has_key?(elm)
      @indexes[elm] = [inx]
    else
      @indexes[elm] << inx
    end
  end
end
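To make the shape of the index concrete, here is a small sketch on plain strings. It assumes that a rule's labels are its dot-separated parts in reverse order, so that the first label is the top-level one; first_label and build_index are hypothetical helpers, not library methods.

```ruby
# Assumption: labels are the dot-separated parts in reverse order,
# e.g. "co.uk" has labels ["uk", "co"], so its first label is "uk".
def first_label(rule)
  rule.split(".").reverse.first
end

# Maps each first label to the positions of the rules that start with it.
def build_index(rules)
  index = {}
  rules.each_with_index do |rule, position|
    (index[first_label(rule)] ||= []) << position
  end
  index
end

rules = ["com", "uk", "co.uk", "us", "ak.us"]
index = build_index(rules)
```

Here index["uk"] lists positions 1 and 2, so a lookup for a domain ending in "uk" only has to consider those two rules.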
#each(*args, &block) ⇒ Object
Iterates each rule in the list.
# File 'lib/public_suffix_service/rule_list.rb', line 106
def each(*args, &block)
  @list.each(*args, &block)
end
#empty? ⇒ Boolean
Checks whether the list is empty.
# File 'lib/public_suffix_service/rule_list.rb', line 148
def empty?
  @list.empty?
end
#find(domain) ⇒ PublicSuffixService::Rule::*?
Returns the most appropriate rule for domain.
From the Public Suffix List documentation:
-
If a hostname matches more than one rule in the file, the longest matching rule (the one with the most levels) will be used.
-
An exclamation mark (!) at the start of a rule marks an exception to a previous wildcard rule. An exception rule takes priority over any other matching rule.
Algorithm description
-
Match domain against all rules and take note of the matching ones.
-
If no rules match, the prevailing rule is “*”.
-
If more than one rule matches, the prevailing rule is the one which is an exception rule.
-
If there is no matching exception rule, the prevailing rule is the one with the most labels.
-
If the prevailing rule is an exception rule, modify it by removing the leftmost label.
-
The public suffix is the set of labels from the domain which directly match the labels of the prevailing rule (joined by dots).
-
The registered domain is the public suffix plus one additional label.
# File 'lib/public_suffix_service/rule_list.rb', line 184
def find(domain)
  rules = select(domain)
  rules.select { |r|   r.type == :exception }.first ||
  rules.inject { |t,r| t.length > r.length ? t : r }
end
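The selection step of the algorithm (an exception rule first, otherwise the rule with the most labels) can be sketched on its own. The struct and the prevailing_rule helper below are hypothetical stand-ins, and the length values are illustrative rather than taken from the library. As in the overview example, the sketch returns nil when nothing matches instead of falling back to "*".

```ruby
# Hypothetical stand-in for a rule; the real Rule classes are not used here.
sketch_rule = Struct.new(:value, :type, :length)

# Exception rules win; otherwise the rule with the most labels prevails.
def prevailing_rule(matches)
  matches.find { |rule| rule.type == :exception } ||
    matches.inject { |best, rule| best.length > rule.length ? best : rule }
end

uk_matches = [
  sketch_rule.new("uk", :normal, 1),
  sketch_rule.new("co.uk", :normal, 2)
]
ck_matches = [
  sketch_rule.new("*.ck", :wildcard, 2),
  sketch_rule.new("!www.ck", :exception, 2)
]

longest   = prevailing_rule(uk_matches) # the rule with the most labels
exception = prevailing_rule(ck_matches) # the exception rule
nothing   = prevailing_rule([])         # no matching rules
```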
#select(domain) ⇒ Array<PublicSuffixService::Rule::*>
Selects all the rules matching given domain.
Uses @indexes to try only the rules that share the domain's first label, which speeds things up when find('foo') is called frequently.
# File 'lib/public_suffix_service/rule_list.rb', line 198
def select(domain)
  indices = (@indexes[Domain.domain_to_labels(domain).first] || [])
  @list.values_at(*indices).select { |rule| rule.match?(domain) }
end
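The index-then-filter lookup can be sketched with plain strings. Everything here is an assumption for illustration: labels, simple_match? and select_rules are hypothetical helpers, labels are taken to be the reversed dot-separated parts, and the matching is a simplified prefix comparison that ignores wildcard and exception rules.

```ruby
# Assumption: labels are the dot-separated parts in reverse order.
def labels(name)
  name.split(".").reverse
end

# Simplified matching: the rule's labels must be a prefix of the domain's.
def simple_match?(rule, domain)
  rule_labels = labels(rule)
  labels(domain).first(rule_labels.size) == rule_labels
end

# Looks up candidate positions by first label, then filters by matching.
def select_rules(rules, index, domain)
  positions = index[labels(domain).first] || []
  rules.values_at(*positions).select { |rule| simple_match?(rule, domain) }
end

rules = ["com", "uk", "co.uk", "us", "ak.us"]
index = { "com" => [0], "uk" => [1, 2], "us" => [3, 4] }

uk_rules  = select_rules(rules, index, "example.co.uk")
com_rules = select_rules(rules, index, "example.com")
org_rules = select_rules(rules, index, "example.org")
```

For "example.org" the index has no entry, so no rule is compared at all and the result is an empty array.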
#size ⇒ Integer Also known as: length
Gets the number of elements in the list.
# File 'lib/public_suffix_service/rule_list.rb', line 140
def size
  @list.size
end
#to_a ⇒ Array<PublicSuffixService::Rule::*>
Gets the list as array.
# File 'lib/public_suffix_service/rule_list.rb', line 113
def to_a
  @list
end