Class: PublicSuffix::List
- Inherits: Object (Object > PublicSuffix::List)
- Includes: Enumerable
- Defined in: lib/public_suffix/list.rb
Overview
A List is a collection of one or more Rule objects.
Given a List, you can add or remove rules, iterate over all items in the list, or search for the first rule that matches a specific domain name.
# Create a new list
list = PublicSuffix::List.new
# Push two rules to the list
list << PublicSuffix::Rule.factory("it")
list << PublicSuffix::Rule.factory("com")
# Get the size of the list
list.size
# => 2
# Search for the rule matching given domain
list.find("example.com")
# => #<PublicSuffix::Rule::Normal>
list.find("example.org")
# => nil
You can create as many lists as you want. The List.default rule list is used to tokenize and validate a domain.
List implements the Enumerable module.
Constant Summary
- @@default = nil
Instance Attribute Summary
-
#indexes ⇒ Hash
readonly
Gets the naive index, a hash whose keys are the first label of every rule and whose values are arrays of integers (the positions of those rules in @rules).
-
#rules ⇒ Array<PublicSuffix::Rule::*>
readonly
Gets the array of rules.
Class Method Summary
-
.clear ⇒ self
Sets the default rule list to nil.
-
.default ⇒ PublicSuffix::List
Gets the default rule list.
-
.default=(value) ⇒ PublicSuffix::List
Sets the default rule list to value.
-
.default_definition ⇒ File
Gets the default definition list.
-
.parse(input) ⇒ PublicSuffix::List
Parses the given input, treating the content as a Public Suffix List.
-
.reload ⇒ PublicSuffix::List
Resets the default rule list and reinitializes it by parsing the content of List.default_definition.
Instance Method Summary
-
#==(other) ⇒ Boolean
(also: #eql?)
Checks whether two lists are equal.
-
#add(rule, index = true) ⇒ self
(also: #<<)
Adds the given object to the list and optionally refreshes the rule index.
-
#clear ⇒ self
Removes all elements.
-
#create_index! ⇒ Object
Creates a naive index for @rules.
-
#each(*args, &block) ⇒ Object
Iterates each rule in the list.
-
#empty? ⇒ Boolean
Checks whether the list is empty.
-
#find(domain) ⇒ PublicSuffix::Rule::*?
Returns the most appropriate rule for domain.
-
#initialize {|self| ... } ⇒ List
constructor
Initializes an empty List.
-
#select(domain) ⇒ Array<PublicSuffix::Rule::*>
Selects all the rules matching given domain.
-
#size ⇒ Integer
(also: #length)
Gets the number of elements in the list.
-
#to_a ⇒ Array<PublicSuffix::Rule::*>
Gets the list as array.
Constructor Details
#initialize {|self| ... } ⇒ List
Initializes an empty PublicSuffix::List.
# File 'lib/public_suffix/list.rb', line 63
def initialize(&block)
  @rules = []
  @indexes = {}
  yield(self) if block_given?
  create_index!
end
Instance Attribute Details
#indexes ⇒ Hash (readonly)
Gets the naive index, a hash whose keys are the first label of every rule and whose values are arrays of integers (the positions of those rules in @rules).
# File 'lib/public_suffix/list.rb', line 55
def indexes
  @indexes
end
#rules ⇒ Array<PublicSuffix::Rule::*> (readonly)
Gets the array of rules.
# File 'lib/public_suffix/list.rb', line 49
def rules
  @rules
end
Class Method Details
.clear ⇒ self
Sets the default rule list to nil.
# File 'lib/public_suffix/list.rb', line 229
def clear
  self.default = nil
  self
end
.default ⇒ PublicSuffix::List
Gets the default rule list. Initializes a new PublicSuffix::List parsing the content of default_definition, if required.
# File 'lib/public_suffix/list.rb', line 212
def default
  @@default ||= parse(default_definition)
end
.default=(value) ⇒ PublicSuffix::List
Sets the default rule list to value.
# File 'lib/public_suffix/list.rb', line 222
def default=(value)
  @@default = value
end
.default_definition ⇒ File
Gets the default definition list. Can be any IOStream, including a File or a simple String. The object must respond to #each_line.
# File 'lib/public_suffix/list.rb', line 248
def default_definition
  File.new(File.join(File.dirname(__FILE__), "definitions.txt"), "r:utf-8")
end
.parse(input) ⇒ PublicSuffix::List
Parses the given input, treating the content as a Public Suffix List. Blank lines and lines starting with // are skipped; every other line becomes a rule.
See publicsuffix.org/format/ for more details about input format.
# File 'lib/public_suffix/list.rb', line 260
def parse(input)
  new do |list|
    input.each_line do |line|
      line.strip!

      # strip blank lines
      if line.empty?
        next
      # strip comments
      elsif line =~ %r{^//}
        next
      # append rule
      else
        list.add(Rule.factory(line), false)
      end
    end
  end
end
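The filtering step in the listing above can be sketched on its own, without loading the library. This is a minimal illustration, not the library's code: the helper name rule_lines and the sample input are hypothetical, and the real method passes each surviving line to Rule.factory and adds it to a new List instead of returning strings.

```ruby
# Hypothetical helper (not part of PublicSuffix): returns the raw rule
# strings that survive the same blank-line and comment filters.
def rule_lines(input)
  input.each_line.map(&:strip).reject do |line|
    line.empty? || line.start_with?("//")
  end
end

# Made-up input in the Public Suffix List format.
sample = <<~LIST
  // a comment line

  com
  *.example
  !city.example
LIST

rule_lines(sample)
# => ["com", "*.example", "!city.example"]
```

Any object that responds to #each_line, such as a File or a StringIO, could be filtered the same way, provided each_line returns an enumerator when called without a block.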
.reload ⇒ PublicSuffix::List
Resets the default rule list and reinitializes it by parsing the content of default_definition.
# File 'lib/public_suffix/list.rb', line 238
def reload
  self.clear.default
end
Instance Method Details
#==(other) ⇒ Boolean Also known as: eql?
Checks whether two lists are equal.
List one is equal to two if two is an instance of PublicSuffix::List and each PublicSuffix::Rule::* in list one is available in list two, in the same order.
# File 'lib/public_suffix/list.rb', line 97
def ==(other)
  return false unless other.is_a?(List)
  self.equal?(other) ||
  self.rules == other.rules
end
#add(rule, index = true) ⇒ self Also known as: <<
Adds the given object to the list and optionally refreshes the rule index.
# File 'lib/public_suffix/list.rb', line 129
def add(rule, index = true)
  @rules << rule
  create_index! if index == true
  self
end
#clear ⇒ self
Removes all elements.
# File 'lib/public_suffix/list.rb', line 154
def clear
  @rules.clear
  self
end
#create_index! ⇒ Object
Creates a naive index for @rules. The index is a hash that maps the first element of Rule::Base#labels to the positions in @rules of the rules that start with that label.
For instance, if @rules[5] and @rules[4] are the only elements of the list where Rule#labels.first is 'us', then @indexes['us'] #=> [5,4]. That way, select can avoid matching every single rule against the candidate domain.
# File 'lib/public_suffix/list.rb', line 77
def create_index!
  @rules.map { |l| l.labels.first }.each_with_index do |elm, inx|
    if !@indexes.has_key?(elm)
      @indexes[elm] = [inx]
    else
      @indexes[elm] << inx
    end
  end
end
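To make the shape of the index concrete, here is a small standalone sketch. It assumes that labels are stored with the top-level label first, and it represents each rule only by its label array; the method name build_index and the data are hypothetical and not part of the library.

```ruby
# Hypothetical helper: maps each first label to the positions of the
# rules that start with it. It builds a fresh hash on every call.
def build_index(rules)
  index = {}
  rules.each_with_index do |labels, position|
    (index[labels.first] ||= []) << position
  end
  index
end

# Each rule is reduced to its labels (assumed order: top-level label first).
rules = [%w[com], %w[us], %w[uk co], %w[us ak], %w[uk]]

index = build_index(rules)
index["us"]  # => [1, 3]
index["uk"]  # => [2, 4]
index["org"] # => nil
```

Unlike the listing above, which writes into the existing @indexes hash, this sketch starts from an empty hash each time, so calling it repeatedly gives the same result.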
#each(*args, &block) ⇒ Object
Iterates each rule in the list.
# File 'lib/public_suffix/list.rb', line 105
def each(*args, &block)
  @rules.each(*args, &block)
end
#empty? ⇒ Boolean
Checks whether the list is empty.
# File 'lib/public_suffix/list.rb', line 147
def empty?
  @rules.empty?
end
#find(domain) ⇒ PublicSuffix::Rule::*?
Returns the most appropriate rule for domain.
From the Public Suffix List documentation:
- If a hostname matches more than one rule in the file, the longest matching rule (the one with the most levels) will be used.
- An exclamation mark (!) at the start of a rule marks an exception to a previous wildcard rule. An exception rule takes priority over any other matching rule.
Algorithm description
- Match domain against all rules and take note of the matching ones.
- If no rules match, the prevailing rule is “*”.
- If more than one rule matches, the prevailing rule is the one which is an exception rule.
- If there is no matching exception rule, the prevailing rule is the one with the most labels.
- If the prevailing rule is an exception rule, modify it by removing the leftmost label.
- The public suffix is the set of labels from the domain which directly match the labels of the prevailing rule (joined by dots).
- The registered domain is the public suffix plus one additional label.
# File 'lib/public_suffix/list.rb', line 183
def find(domain)
  rules = select(domain)
  rules.select { |r| r.type == :exception }.first ||
  rules.inject { |t,r| t.length > r.length ? t : r }
end
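The selection step (exception rule first, otherwise the rule with the most labels) can be tried in isolation with stand-in objects. This is a rough sketch under stated assumptions: the Struct, the method name prevailing, the rule names, and the :normal and :wildcard type symbols are hypothetical; only the :exception check and the length comparison come from the listing above.

```ruby
# Hypothetical helper: picks the prevailing rule among rules that already
# matched a domain. An exception rule wins outright; otherwise the rule
# with the most labels wins. Returns nil when nothing matched.
def prevailing(matches)
  matches.find { |r| r.type == :exception } ||
    matches.inject { |best, r| best.length > r.length ? best : r }
end

# Stand-in for a rule: a display name, a type, and a label count.
rule = Struct.new(:name, :type, :length)

matches = [
  rule.new("example",       :normal,    1),
  rule.new("*.example",     :wildcard,  2),
  rule.new("!city.example", :exception, 2),
]

prevailing(matches).name          # => "!city.example"
prevailing(matches.first(2)).name # => "*.example"
prevailing([])                    # => nil
```

Like the listing, the sketch returns nil when no rule matches rather than falling back to the “*” rule described in the algorithm.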
#select(domain) ⇒ Array<PublicSuffix::Rule::*>
Selects all the rules matching given domain.
Uses @indexes to try only the rules that share the same first label, which speeds up repeated calls such as List#find('foo').
# File 'lib/public_suffix/list.rb', line 197
def select(domain)
  indices = (@indexes[Domain.domain_to_labels(domain).first] || [])
  @rules.values_at(*indices).select { |rule| rule.match?(domain) }
end
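As an illustration of how the index narrows the search, the sketch below looks up candidate positions by the domain's top-level label and only then runs a match test. Everything here is hypothetical: rules are plain label arrays (top-level label first), the index is written out by hand, and the prefix comparison is a simplified stand-in for Rule#match?, which this page does not show.

```ruby
# Hypothetical helper: returns the rules whose labels are a prefix of the
# domain's reversed labels, checking only rules listed in the index under
# the domain's top-level label.
def candidates(rules, index, domain)
  labels = domain.split(".").reverse
  positions = index[labels.first] || []
  rules.values_at(*positions).select do |rule|
    labels.first(rule.size) == rule
  end
end

rules = [%w[com], %w[uk], %w[uk co], %w[us]]
index = { "com" => [0], "uk" => [1, 2], "us" => [3] }

candidates(rules, index, "example.co.uk") # => [["uk"], ["uk", "co"]]
candidates(rules, index, "example.org")   # => []
```

For "example.co.uk" only the two rules indexed under "uk" are compared; the "com" and "us" rules are never touched.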
#size ⇒ Integer Also known as: length
Gets the number of elements in the list.
# File 'lib/public_suffix/list.rb', line 139
def size
  @rules.size
end
#to_a ⇒ Array<PublicSuffix::Rule::*>
Gets the list as array.
# File 'lib/public_suffix/list.rb', line 112
def to_a
  @rules
end