Class: Hamster::Set
Overview
Hamster::Set
is a collection of unordered values with no duplicates. Testing whether
an object is present in the Set
can be done in constant time. Set
is also Enumerable
, so you can
iterate over the members of the set with #each, transform them with #map, filter
them with #select, and so on. Some of the Enumerable
methods are overridden to
return Hamster collections.
Like the Set
class in Ruby's standard library, which we will call RubySet,
Hamster::Set
defines equivalency of objects using #hash
and #eql?
. No two
objects with the same #hash
code, and which are also #eql?
, can coexist in the
same Set
. If one is already in the Set
, attempts to add another one will have
no effect.
Set
s have no natural ordering and cannot be compared using #<=>
. However, they
define #<, #>, #<=, and #>= as shorthand for #proper_subset?,
#proper_superset?, #subset?, and #superset? respectively.
The basic set-theoretic operations #union, #intersection, #difference, and
#exclusion work with any Enumerable
object.
A Set
can be created in either of the following ways:
Hamster::Set.new([1, 2, 3]) # any Enumerable can be used to initialize
Hamster::Set['A', 'B', 'C', 'D']
The latter 2 forms of initialization can be used with your own, custom subclasses
of Hamster::Set
.
Unlike RubySet, all methods which you might expect to "modify" a Hamster::Set
actually return a new set and leave the existing one unchanged.
Class Method Summary collapse
-
.[](*items) ⇒ Set
Create a new
Set
populated with the given items. -
.empty ⇒ Set
Return an empty
Set
.
Instance Method Summary collapse
-
#add(item) ⇒ Set
(also: #<<)
Return a new
Set
withitem
added. -
#add?(item) ⇒ Set, false
If
item
is not a member of thisSet
, return a newSet
withitem
added. -
#clear ⇒ Set
Return an empty
Set
instance, of the same class as this one. -
#delete(item) ⇒ Set
Return a new
Set
withitem
removed. -
#delete?(item) ⇒ Set, false
If
item
is a member of thisSet
, return a newSet
withitem
removed. -
#difference(other) ⇒ Set
(also: #subtract, #-)
Return a new
Set
with all the items inother
removed. -
#disjoint?(other) ⇒ Boolean
Return
true
if thisSet
andother
do not share any items. -
#each {|item| ... } ⇒ self, Enumerator
Call the block once for each item in this
Set
. -
#empty? ⇒ Boolean
Return
true
if thisSet
contains no items. -
#eql?(other) ⇒ Boolean
(also: #==)
Return true if
other
has the same type and contents as thisSet
. -
#exclusion(other) ⇒ Set
(also: #^)
Return a new
Set
which contains all the items which are members of thisSet
or ofother
, but not both. -
#first ⇒ Object
Return a member of this
Set
. -
#flatten ⇒ Set
Recursively insert the contents of any nested
Set
s into thisSet
, and remove them. -
#hash ⇒ Integer
See
Object#hash
. -
#include?(object) ⇒ Boolean
(also: #member?)
Return
true
if the given item is present in thisSet
. -
#initialize(items = []) ⇒ Set
constructor
A new instance of Set.
-
#intersect?(other) ⇒ Boolean
Return
true
if thisSet
andother
have at least one item in common. -
#intersection(other) ⇒ Set
(also: #&)
Return a new
Set
which contains all the items which are members of both thisSet
andother
. -
#map {|item| ... } ⇒ Set
(also: #collect)
Call the block once for each item in this
Set
. -
#proper_subset?(other) ⇒ Boolean
(also: #<)
Returns
true
ifother
contains all the items in thisSet
, plus at least one item which is not in this set. -
#proper_superset?(other) ⇒ Boolean
(also: #>)
Returns
true
if thisSet
contains all the items inother
, plus at least one item which is not inother
. -
#reverse_each {|item| ... } ⇒ self
Call the block once for each item in this
Set
. -
#sample ⇒ Object
Return a randomly chosen item from this
Set
. -
#select {|item| ... } ⇒ Set
(also: #find_all, #keep_if)
Return a new
Set
with all the items for which the block returns true. -
#size ⇒ Integer
(also: #length)
Return the number of items in this
Set
. -
#sort {|a, b| ... } ⇒ SortedSet
Return a SortedSet which contains the same items as this
Set
, ordered by the given comparator block. -
#sort_by {|item| ... } ⇒ SortedSet
Return a SortedSet which contains the same items as this
Set
, ordered by mapping each item through the provided block to obtain sort keys, and then sorting the keys. -
#subset?(other) ⇒ Boolean
(also: #<=)
Return
true
if all items in thisSet
are also inother
. -
#superset?(other) ⇒ Boolean
(also: #>=)
Return
true
if all items inother
are also in thisSet
. -
#to_set ⇒ self
Return
self
. -
#union(other) ⇒ Set
(also: #|, #+, #merge)
Return a new
Set
which contains all the members of both thisSet
andother
.
Methods included from Enumerable
#<=>, #compact, #each_index, #grep, #grep_v, #group_by, #inspect, #join, #partition, #product, #reject, #sum
Methods included from Enumerable
Constructor Details
#initialize(items = []) ⇒ Set
Returns a new instance of Set.
79 80 81 82 |
# File 'lib/hamster/set.rb', line 79 def initialize(items=[]) @trie = Trie.new(0) items.each { |item| @trie.put!(item, nil) } end |
Class Method Details
.[](*items) ⇒ Set
Create a new Set
populated with the given items.
57 58 59 |
# File 'lib/hamster/set.rb', line 57 def [](*items) items.empty? ? empty : new(items) end |
.empty ⇒ Set
Return an empty Set
. If used on a subclass, returns an empty instance
of that class.
65 66 67 |
# File 'lib/hamster/set.rb', line 65 def empty @empty ||= self.new end |
Instance Method Details
#add(item) ⇒ Set Also known as: <<
Return a new Set
with item
added. If item
is already in the set,
return self
.
106 107 108 |
# File 'lib/hamster/set.rb', line 106 def add(item) include?(item) ? self : self.class.alloc(@trie.put(item, nil)) end |
#add?(item) ⇒ Set, false
If item
is not a member of this Set
, return a new Set
with item
added.
Otherwise, return false
.
120 121 122 |
# File 'lib/hamster/set.rb', line 120 def add?(item) !include?(item) && add(item) end |
#clear ⇒ Set
Return an empty Set
instance, of the same class as this one. Useful if you
have multiple subclasses of Set
and want to treat them polymorphically.
505 506 507 |
# File 'lib/hamster/set.rb', line 505 def clear self.class.empty end |
#delete(item) ⇒ Set
Return a new Set
with item
removed. If item
is not a member of the set,
return self
.
133 134 135 136 |
# File 'lib/hamster/set.rb', line 133 def delete(item) trie = @trie.delete(item) new_trie(trie) end |
#delete?(item) ⇒ Set, false
If item
is a member of this Set
, return a new Set
with item
removed.
Otherwise, return false
.
147 148 149 |
# File 'lib/hamster/set.rb', line 147 def delete?(item) include?(item) && delete(item) end |
#difference(other) ⇒ Set Also known as: subtract, -
Return a new Set
with all the items in other
removed. other
can be
any Enumerable
object.
347 348 349 350 351 352 353 354 |
# File 'lib/hamster/set.rb', line 347 def difference(other) trie = if (@trie.size <= other.size) && (other.is_a?(Hamster::Set) || (defined?(::Set) && other.is_a?(::Set))) @trie.select { |key, _| !other.include?(key) } else @trie.bulk_delete(other) end new_trie(trie) end |
#disjoint?(other) ⇒ Boolean
Return true
if this Set
and other
do not share any items.
449 450 451 452 453 454 455 456 457 458 459 460 |
# File 'lib/hamster/set.rb', line 449 def disjoint?(other) if other.size <= size other.each { |item| return false if include?(item) } else # See comment on #subset? if other.size >= 150 && @trie.size >= 190 && !(other.is_a?(Hamster::Set) || other.is_a?(::Set)) other = ::Set.new(other) end each { |item| return false if other.include?(item) } end true end |
#each {|item| ... } ⇒ self, Enumerator
Call the block once for each item in this Set
. No specific iteration order
is guaranteed, but the order will be stable for any particular Set
. If
no block is given, an Enumerator
is returned instead.
164 165 166 167 168 |
# File 'lib/hamster/set.rb', line 164 def each return to_enum if not block_given? @trie.each { |key, _| yield(key) } self end |
#empty? ⇒ Boolean
Return true
if this Set
contains no items.
86 87 88 |
# File 'lib/hamster/set.rb', line 86 def empty? @trie.empty? end |
#eql?(other) ⇒ Boolean Also known as: ==
Return true if other
has the same type and contents as this Set
.
513 514 515 516 517 518 519 520 521 522 |
# File 'lib/hamster/set.rb', line 513 def eql?(other) return true if other.equal?(self) return false if not instance_of?(other.class) other_trie = other.instance_variable_get(:@trie) return false if @trie.size != other_trie.size @trie.each do |key, _| return false if !other_trie.key?(key) end true end |
#exclusion(other) ⇒ Set Also known as: ^
Return a new Set
which contains all the items which are members of this
Set
or of other
, but not both. other
can be any Enumerable
object.
366 367 368 |
# File 'lib/hamster/set.rb', line 366 def exclusion(other) ((self | other) - (self & other)) end |
#first ⇒ Object
Return a member of this Set
. The member chosen will be the first one which
would be yielded by #each. If the set is empty, return nil
.
243 244 245 |
# File 'lib/hamster/set.rb', line 243 def first (entry = @trie.at(0)) && entry[0] end |
#flatten ⇒ Set
Recursively insert the contents of any nested Set
s into this Set
, and
remove them.
481 482 483 484 485 486 |
# File 'lib/hamster/set.rb', line 481 def flatten reduce(self.class.empty) do |set, item| next set.union(item.flatten) if item.is_a?(Set) set.add(item) end end |
#hash ⇒ Integer
See Object#hash
.
527 528 529 |
# File 'lib/hamster/set.rb', line 527 def hash reduce(0) { |hash, item| (hash << 5) - hash + item.hash } end |
#include?(object) ⇒ Boolean Also known as: member?
Return true
if the given item is present in this Set
. More precisely,
return true
if an object with the same #hash
code, and which is also #eql?
to the given object is present.
231 232 233 |
# File 'lib/hamster/set.rb', line 231 def include?(object) @trie.key?(object) end |
#intersect?(other) ⇒ Boolean
Return true
if this Set
and other
have at least one item in common.
469 470 471 |
# File 'lib/hamster/set.rb', line 469 def intersect?(other) !disjoint?(other) end |
#intersection(other) ⇒ Set Also known as: &
Return a new Set
which contains all the items which are members of both
this Set
and other
. other
can be any Enumerable
object.
324 325 326 327 328 329 330 331 332 333 334 335 336 |
# File 'lib/hamster/set.rb', line 324 def intersection(other) if other.size < @trie.size if other.is_a?(Hamster::Set) trie = other.instance_variable_get(:@trie).select { |key, _| include?(key) } else trie = Trie.new(0) other.each { |obj| trie.put!(obj, nil) if include?(obj) } end else trie = @trie.select { |key, _| other.include?(key) } end new_trie(trie) end |
#map {|item| ... } ⇒ Set Also known as: collect
Call the block once for each item in this Set
. All the values returned
from the block will be gathered into a new Set
. If no block is given,
an Enumerator
is returned instead.
214 215 216 217 218 |
# File 'lib/hamster/set.rb', line 214 def map return enum_for(:map) if not block_given? return self if empty? self.class.new(super) end |
#proper_subset?(other) ⇒ Boolean Also known as: <
Returns true
if other
contains all the items in this Set
, plus at least
one item which is not in this set.
418 419 420 421 422 423 424 425 |
# File 'lib/hamster/set.rb', line 418 def proper_subset?(other) return false if other.size <= size # See comments above if other.size >= 150 && @trie.size >= 190 && !(other.is_a?(Hamster::Set) || other.is_a?(::Set)) other = ::Set.new(other) end all? { |item| other.include?(item) } end |
#proper_superset?(other) ⇒ Boolean Also known as: >
Returns true
if this Set
contains all the items in other
, plus at least
one item which is not in other
.
437 438 439 |
# File 'lib/hamster/set.rb', line 437 def proper_superset?(other) other.proper_subset?(self) end |
#reverse_each {|item| ... } ⇒ self
Call the block once for each item in this Set
. Iteration order will be
the opposite of #each. If no block is given, an Enumerator
is
returned instead.
183 184 185 186 187 |
# File 'lib/hamster/set.rb', line 183 def reverse_each return enum_for(:reverse_each) if not block_given? @trie.reverse_each { |key, _| yield(key) } self end |
#sample ⇒ Object
Return a randomly chosen item from this Set
. If the set is empty, return nil
.
497 498 499 |
# File 'lib/hamster/set.rb', line 497 def sample empty? ? nil : @trie.at(rand(size))[0] end |
#select {|item| ... } ⇒ Set Also known as: find_all, keep_if
Return a new Set
with all the items for which the block returns true.
196 197 198 199 200 |
# File 'lib/hamster/set.rb', line 196 def select return enum_for(:select) unless block_given? trie = @trie.select { |key, _| yield(key) } new_trie(trie) end |
#size ⇒ Integer Also known as: length
Return the number of items in this Set
.
92 93 94 |
# File 'lib/hamster/set.rb', line 92 def size @trie.size end |
#sort {|a, b| ... } ⇒ SortedSet
Return a Hamster::SortedSet which contains the same items as this Set
, ordered by
the given comparator block.
261 262 263 |
# File 'lib/hamster/set.rb', line 261 def sort(&comparator) SortedSet.new(self.to_a, &comparator) end |
#sort_by {|item| ... } ⇒ SortedSet
Return a Hamster::SortedSet which contains the same items as this Set
, ordered
by mapping each item through the provided block to obtain sort keys, and
then sorting the keys.
279 280 281 |
# File 'lib/hamster/set.rb', line 279 def sort_by(&mapper) SortedSet.new(self.to_a, &mapper) end |
#subset?(other) ⇒ Boolean Also known as: <=
Return true
if all items in this Set
are also in other
.
378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 |
# File 'lib/hamster/set.rb', line 378 def subset?(other) return false if other.size < size # This method has the potential to be very slow if 'other' is a large Array, so to avoid that, # we convert those Arrays to Sets before checking presence of items # Time to convert Array -> Set is linear in array.size # Time to check for presence of all items in an Array is proportional to set.size * array.size # Note that both sides of that equation have array.size -- hence those terms cancel out, # and the break-even point is solely dependent on the size of this collection # After doing some benchmarking to estimate the constants, it appears break-even is at ~190 items # We also check other.size, to avoid the more expensive #is_a? checks in cases where it doesn't matter # if other.size >= 150 && @trie.size >= 190 && !(other.is_a?(Hamster::Set) || other.is_a?(::Set)) other = ::Set.new(other) end all? { |item| other.include?(item) } end |
#superset?(other) ⇒ Boolean Also known as: >=
Return true
if all items in other
are also in this Set
.
404 405 406 |
# File 'lib/hamster/set.rb', line 404 def superset?(other) other.subset?(self) end |
#to_set ⇒ self
Return self
.
537 538 539 |
# File 'lib/hamster/set.rb', line 537 def to_set self end |
#union(other) ⇒ Set Also known as: |, +, merge
Return a new Set
which contains all the members of both this Set
and other
.
other
can be any Enumerable
object.
291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 |
# File 'lib/hamster/set.rb', line 291 def union(other) if other.is_a?(Hamster::Set) if other.size > size small_set_pairs = @trie large_set_trie = other.instance_variable_get(:@trie) else small_set_pairs = other.instance_variable_get(:@trie) large_set_trie = @trie end else if other.respond_to?(:lazy) small_set_pairs = other.lazy.map { |e| [e, nil] } else small_set_pairs = other.map { |e| [e, nil] } end large_set_trie = @trie end trie = large_set_trie.bulk_put(small_set_pairs) new_trie(trie) end |