Class: Hamster::Set
- Inherits:
-
Object
- Object
- Hamster::Set
- Includes:
- Enumerable, Immutable
- Defined in:
- lib/hamster/set.rb
Overview
‘Hamster::Set` is a collection of unordered values with no duplicates. Testing whether an object is present in the `Set` can be done in constant time. `Set` is also `Enumerable`, so you can iterate over the members of the set with #each, transform them with #map, filter them with #select, and so on. Some of the `Enumerable` methods are overridden to return Hamster collections.
Like the ‘Set` class in Ruby’s standard library, which we will call RubySet, ‘Hamster::Set` defines equivalency of objects using `#hash` and `#eql?`. No two objects with the same `#hash` code, and which are also `#eql?`, can coexist in the same `Set`. If one is already in the `Set`, attempts to add another one will have no effect.
‘Set`s have no natural ordering and cannot be compared using `#<=>`. However, they define #<, #>, #<=, and #>= as shorthand for #proper_subset?, #proper_superset?, #subset?, and #superset? respectively.
The basic set-theoretic operations #union, #intersection, #difference, and #exclusion work with any ‘Enumerable` object.
A ‘Set` can be created in either of the following ways:
Hamster::Set.new([1, 2, 3]) # any Enumerable can be used to initialize
Hamster::Set['A', 'B', 'C', 'D']
The latter 2 forms of initialization can be used with your own, custom subclasses of ‘Hamster::Set`.
Unlike RubySet, all methods which you might expect to “modify” a ‘Hamster::Set` actually return a new set and leave the existing one unchanged.
Class Method Summary collapse
-
.[](*items) ⇒ Set
Create a new ‘Set` populated with the given items.
-
.alloc(trie = EmptyTrie) ⇒ Set
“Raw” allocation of a new ‘Set`.
-
.empty ⇒ Set
Return an empty ‘Set`.
Instance Method Summary collapse
-
#add(item) ⇒ Set
(also: #<<)
Return a new ‘Set` with `item` added.
-
#add?(item) ⇒ Set, false
If ‘item` is not a member of this `Set`, return a new `Set` with `item` added.
-
#clear ⇒ Set
Return an empty ‘Set` instance, of the same class as this one.
-
#delete(item) ⇒ Set
Return a new ‘Set` with `item` removed.
-
#delete?(item) ⇒ Set, false
If ‘item` is a member of this `Set`, return a new `Set` with `item` removed.
-
#difference(other) ⇒ Set
(also: #subtract, #-)
Return a new ‘Set` with all the items in `other` removed.
-
#disjoint?(other) ⇒ Boolean
Return ‘true` if this `Set` and `other` do not share any items.
-
#each {|item| ... } ⇒ self, Enumerator
Call the block once for each item in this ‘Set`.
-
#empty? ⇒ Boolean
Return ‘true` if this `Set` contains no items.
-
#eql?(other) ⇒ Boolean
(also: #==)
Return true if ‘other` has the same type and contents as this `Set`.
-
#exclusion(other) ⇒ Set
(also: #^)
Return a new ‘Set` which contains all the items which are members of this `Set` or of `other`, but not both.
-
#first ⇒ Object
Return a member of this ‘Set`.
-
#flatten ⇒ Set
Recursively insert the contents of any nested ‘Set`s into this `Set`, and remove them.
-
#hash ⇒ Integer
See ‘Object#hash`.
-
#include?(object) ⇒ Boolean
(also: #member?)
Return ‘true` if the given item is present in this `Set`.
-
#initialize(items = []) ⇒ Set
constructor
A new instance of Set.
-
#intersect?(other) ⇒ Boolean
Return ‘true` if this `Set` and `other` have at least one item in common.
-
#intersection(other) ⇒ Set
(also: #&)
Return a new ‘Set` which contains all the items which are members of both this `Set` and `other`.
-
#map {|item| ... } ⇒ Set
(also: #collect)
Call the block once for each item in this ‘Set`.
- #marshal_dump ⇒ Object
- #marshal_load(dictionary) ⇒ Object
-
#proper_subset?(other) ⇒ Boolean
(also: #<)
Returns ‘true` if `other` contains all the items in this `Set`, plus at least one item which is not in this set.
-
#proper_superset?(other) ⇒ Boolean
(also: #>)
Returns ‘true` if this `Set` contains all the items in `other`, plus at least one item which is not in `other`.
-
#reverse_each {|item| ... } ⇒ self
Call the block once for each item in this ‘Set`.
-
#sample ⇒ Object
Return a randomly chosen item from this ‘Set`.
-
#select {|item| ... } ⇒ Set
(also: #find_all, #keep_if)
Return a new ‘Set` with all the items for which the block returns true.
-
#size ⇒ Integer
(also: #length)
Return the number of items in this ‘Set`.
-
#sort {|a, b| ... } ⇒ SortedSet
Return a SortedSet which contains the same items as this ‘Set`, ordered by the given comparator block.
-
#sort_by {|item| ... } ⇒ SortedSet
Return a SortedSet which contains the same items as this ‘Set`, ordered by mapping each item through the provided block to obtain sort keys, and then sorting the keys.
-
#subset?(other) ⇒ Boolean
(also: #<=)
Return ‘true` if all items in this `Set` are also in `other`.
-
#superset?(other) ⇒ Boolean
(also: #>=)
Return ‘true` if all items in `other` are also in this `Set`.
-
#to_set ⇒ self
Return ‘self`.
-
#union(other) ⇒ Set
(also: #|, #+, #merge)
Return a new ‘Set` which contains all the members of both this `Set` and `other`.
Methods included from Enumerable
#<=>, #compact, #each_index, #grep, #grep_v, #group_by, #inspect, #join, #partition, #pretty_print, #product, #reject, #sum
Methods included from Enumerable
Methods included from Immutable
Constructor Details
Class Method Details
.[](*items) ⇒ Set
Create a new ‘Set` populated with the given items.
57 58 59 |
# File 'lib/hamster/set.rb', line 57 def [](*items) items.empty? ? empty : new(items) end |
.alloc(trie = EmptyTrie) ⇒ Set
“Raw” allocation of a new ‘Set`. Used internally to create a new instance quickly after obtaining a modified Trie.
74 75 76 |
# File 'lib/hamster/set.rb', line 74 def alloc(trie = EmptyTrie) allocate.tap { |s| s.instance_variable_set(:@trie, trie) } end |
.empty ⇒ Set
Return an empty ‘Set`. If used on a subclass, returns an empty instance of that class.
65 66 67 |
# File 'lib/hamster/set.rb', line 65 def empty @empty ||= self.new end |
Instance Method Details
#add(item) ⇒ Set Also known as: <<
Return a new ‘Set` with `item` added. If `item` is already in the set, return `self`.
106 107 108 |
# File 'lib/hamster/set.rb', line 106 def add(item) include?(item) ? self : self.class.alloc(@trie.put(item, nil)) end |
#add?(item) ⇒ Set, false
If ‘item` is not a member of this `Set`, return a new `Set` with `item` added. Otherwise, return `false`.
120 121 122 |
# File 'lib/hamster/set.rb', line 120 def add?(item) !include?(item) && add(item) end |
#clear ⇒ Set
Return an empty ‘Set` instance, of the same class as this one. Useful if you have multiple subclasses of `Set` and want to treat them polymorphically.
505 506 507 |
# File 'lib/hamster/set.rb', line 505 def clear self.class.empty end |
#delete(item) ⇒ Set
Return a new ‘Set` with `item` removed. If `item` is not a member of the set, return `self`.
133 134 135 136 |
# File 'lib/hamster/set.rb', line 133 def delete(item) trie = @trie.delete(item) new_trie(trie) end |
#delete?(item) ⇒ Set, false
If ‘item` is a member of this `Set`, return a new `Set` with `item` removed. Otherwise, return `false`.
147 148 149 |
# File 'lib/hamster/set.rb', line 147 def delete?(item) include?(item) && delete(item) end |
#difference(other) ⇒ Set Also known as: subtract, -
Return a new ‘Set` with all the items in `other` removed. `other` can be any `Enumerable` object.
347 348 349 350 351 352 353 354 |
# File 'lib/hamster/set.rb', line 347 def difference(other) trie = if (@trie.size <= other.size) && (other.is_a?(Hamster::Set) || (defined?(::Set) && other.is_a?(::Set))) @trie.select { |key, _| !other.include?(key) } else @trie.bulk_delete(other) end new_trie(trie) end |
#disjoint?(other) ⇒ Boolean
Return ‘true` if this `Set` and `other` do not share any items.
449 450 451 452 453 454 455 456 457 458 459 460 |
# File 'lib/hamster/set.rb', line 449 def disjoint?(other) if other.size <= size other.each { |item| return false if include?(item) } else # See comment on #subset? if other.size >= 150 && @trie.size >= 190 && !(other.is_a?(Hamster::Set) || other.is_a?(::Set)) other = ::Set.new(other) end each { |item| return false if other.include?(item) } end true end |
#each {|item| ... } ⇒ self, Enumerator
Call the block once for each item in this ‘Set`. No specific iteration order is guaranteed, but the order will be stable for any particular `Set`. If no block is given, an `Enumerator` is returned instead.
164 165 166 167 168 |
# File 'lib/hamster/set.rb', line 164 def each return to_enum if not block_given? @trie.each { |key, _| yield(key) } self end |
#empty? ⇒ Boolean
Return ‘true` if this `Set` contains no items.
86 87 88 |
# File 'lib/hamster/set.rb', line 86 def empty? @trie.empty? end |
#eql?(other) ⇒ Boolean Also known as: ==
Return true if ‘other` has the same type and contents as this `Set`.
513 514 515 516 517 518 519 520 521 522 |
# File 'lib/hamster/set.rb', line 513 def eql?(other) return true if other.equal?(self) return false if not instance_of?(other.class) other_trie = other.instance_variable_get(:@trie) return false if @trie.size != other_trie.size @trie.each do |key, _| return false if !other_trie.key?(key) end true end |
#exclusion(other) ⇒ Set Also known as: ^
Return a new ‘Set` which contains all the items which are members of this `Set` or of `other`, but not both. `other` can be any `Enumerable` object.
366 367 368 |
# File 'lib/hamster/set.rb', line 366 def exclusion(other) ((self | other) - (self & other)) end |
#first ⇒ Object
Return a member of this ‘Set`. The member chosen will be the first one which would be yielded by #each. If the set is empty, return `nil`.
243 244 245 |
# File 'lib/hamster/set.rb', line 243 def first (entry = @trie.at(0)) && entry[0] end |
#flatten ⇒ Set
Recursively insert the contents of any nested ‘Set`s into this `Set`, and remove them.
481 482 483 484 485 486 |
# File 'lib/hamster/set.rb', line 481 def flatten reduce(self.class.empty) do |set, item| next set.union(item.flatten) if item.is_a?(Set) set.add(item) end end |
#hash ⇒ Integer
See ‘Object#hash`.
527 528 529 |
# File 'lib/hamster/set.rb', line 527 def hash reduce(0) { |hash, item| (hash << 5) - hash + item.hash } end |
#include?(object) ⇒ Boolean Also known as: member?
Return ‘true` if the given item is present in this `Set`. More precisely, return `true` if an object with the same `#hash` code, and which is also `#eql?` to the given object is present.
231 232 233 |
# File 'lib/hamster/set.rb', line 231 def include?(object) @trie.key?(object) end |
#intersect?(other) ⇒ Boolean
Return ‘true` if this `Set` and `other` have at least one item in common.
469 470 471 |
# File 'lib/hamster/set.rb', line 469 def intersect?(other) !disjoint?(other) end |
#intersection(other) ⇒ Set Also known as: &
Return a new ‘Set` which contains all the items which are members of both this `Set` and `other`. `other` can be any `Enumerable` object.
324 325 326 327 328 329 330 331 332 333 334 335 336 |
# File 'lib/hamster/set.rb', line 324 def intersection(other) if other.size < @trie.size if other.is_a?(Hamster::Set) trie = other.instance_variable_get(:@trie).select { |key, _| include?(key) } else trie = Trie.new(0) other.each { |obj| trie.put!(obj, nil) if include?(obj) } end else trie = @trie.select { |key, _| other.include?(key) } end new_trie(trie) end |
#map {|item| ... } ⇒ Set Also known as: collect
Call the block once for each item in this ‘Set`. All the values returned from the block will be gathered into a new `Set`. If no block is given, an `Enumerator` is returned instead.
214 215 216 217 218 |
# File 'lib/hamster/set.rb', line 214 def map return enum_for(:map) if not block_given? return self if empty? self.class.new(super) end |
#marshal_dump ⇒ Object
542 543 544 545 546 547 548 |
# File 'lib/hamster/set.rb', line 542 def marshal_dump output = {} each do |key| output[key] = nil end output end |
#marshal_load(dictionary) ⇒ Object
551 552 553 554 555 |
# File 'lib/hamster/set.rb', line 551 def marshal_load(dictionary) @trie = dictionary.reduce(EmptyTrie) do |trie, key_value| trie.put(key_value.first, nil) end end |
#proper_subset?(other) ⇒ Boolean Also known as: <
Returns ‘true` if `other` contains all the items in this `Set`, plus at least one item which is not in this set.
418 419 420 421 422 423 424 425 |
# File 'lib/hamster/set.rb', line 418 def proper_subset?(other) return false if other.size <= size # See comments above if other.size >= 150 && @trie.size >= 190 && !(other.is_a?(Hamster::Set) || other.is_a?(::Set)) other = ::Set.new(other) end all? { |item| other.include?(item) } end |
#proper_superset?(other) ⇒ Boolean Also known as: >
Returns ‘true` if this `Set` contains all the items in `other`, plus at least one item which is not in `other`.
437 438 439 |
# File 'lib/hamster/set.rb', line 437 def proper_superset?(other) other.proper_subset?(self) end |
#reverse_each {|item| ... } ⇒ self
Call the block once for each item in this ‘Set`. Iteration order will be the opposite of #each. If no block is given, an `Enumerator` is returned instead.
183 184 185 186 187 |
# File 'lib/hamster/set.rb', line 183 def reverse_each return enum_for(:reverse_each) if not block_given? @trie.reverse_each { |key, _| yield(key) } self end |
#sample ⇒ Object
Return a randomly chosen item from this ‘Set`. If the set is empty, return `nil`.
497 498 499 |
# File 'lib/hamster/set.rb', line 497 def sample empty? ? nil : @trie.at(rand(size))[0] end |
#select {|item| ... } ⇒ Set Also known as: find_all, keep_if
Return a new ‘Set` with all the items for which the block returns true.
196 197 198 199 200 |
# File 'lib/hamster/set.rb', line 196 def select return enum_for(:select) unless block_given? trie = @trie.select { |key, _| yield(key) } new_trie(trie) end |
#size ⇒ Integer Also known as: length
Return the number of items in this ‘Set`.
92 93 94 |
# File 'lib/hamster/set.rb', line 92 def size @trie.size end |
#sort {|a, b| ... } ⇒ SortedSet
Return a Hamster::SortedSet which contains the same items as this ‘Set`, ordered by the given comparator block.
261 262 263 |
# File 'lib/hamster/set.rb', line 261 def sort(&comparator) SortedSet.new(self.to_a, &comparator) end |
#sort_by {|item| ... } ⇒ SortedSet
Return a Hamster::SortedSet which contains the same items as this ‘Set`, ordered by mapping each item through the provided block to obtain sort keys, and then sorting the keys.
279 280 281 |
# File 'lib/hamster/set.rb', line 279 def sort_by(&mapper) SortedSet.new(self.to_a, &mapper) end |
#subset?(other) ⇒ Boolean Also known as: <=
Return ‘true` if all items in this `Set` are also in `other`.
378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 |
# File 'lib/hamster/set.rb', line 378 def subset?(other) return false if other.size < size # This method has the potential to be very slow if 'other' is a large Array, so to avoid that, # we convert those Arrays to Sets before checking presence of items # Time to convert Array -> Set is linear in array.size # Time to check for presence of all items in an Array is proportional to set.size * array.size # Note that both sides of that equation have array.size -- hence those terms cancel out, # and the break-even point is solely dependent on the size of this collection # After doing some benchmarking to estimate the constants, it appears break-even is at ~190 items # We also check other.size, to avoid the more expensive #is_a? checks in cases where it doesn't matter # if other.size >= 150 && @trie.size >= 190 && !(other.is_a?(Hamster::Set) || other.is_a?(::Set)) other = ::Set.new(other) end all? { |item| other.include?(item) } end |
#superset?(other) ⇒ Boolean Also known as: >=
Return ‘true` if all items in `other` are also in this `Set`.
404 405 406 |
# File 'lib/hamster/set.rb', line 404 def superset?(other) other.subset?(self) end |
#to_set ⇒ self
Return ‘self`.
537 538 539 |
# File 'lib/hamster/set.rb', line 537 def to_set self end |
#union(other) ⇒ Set Also known as: |, +, merge
Return a new ‘Set` which contains all the members of both this `Set` and `other`. `other` can be any `Enumerable` object.
291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 |
# File 'lib/hamster/set.rb', line 291 def union(other) if other.is_a?(Hamster::Set) if other.size > size small_set_pairs = @trie large_set_trie = other.instance_variable_get(:@trie) else small_set_pairs = other.instance_variable_get(:@trie) large_set_trie = @trie end else if other.respond_to?(:lazy) small_set_pairs = other.lazy.map { |e| [e, nil] } else small_set_pairs = other.map { |e| [e, nil] } end large_set_trie = @trie end trie = large_set_trie.bulk_put(small_set_pairs) new_trie(trie) end |