Weighted quick-union algorithm with path compression
Union-find is an algorithm built on a disjoint-set data structure. It lets us efficiently connect any items of a given list and efficiently check whether two items of this list are connected, either directly or through any number of intermediate items.
Possible applications where we might want to find out whether two items are connected to each other are:
- Social networks
- Computers in a network
- Web pages on the Internet
- Transistors in a computer chip
- Pixels in a digital photo
- Metallic sites in a composite system
This is a Ruby implementation of a modified version of Robert Sedgewick's and Kevin Wayne's weighted quick-union algorithm with path compression. Credit goes to these two authors of the book Algorithms and to the many computer scientists who have contributed to this algorithm over the past decades.
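To make the idea concrete, here is a minimal sketch of weighted quick-union with path compression (the path-halving variant). It is not the gem's source code: the class name SketchUnionFind and its internals are hypothetical and chosen for illustration only.

```ruby
# Illustrative sketch only. SketchUnionFind is a hypothetical name,
# not the class shipped by this gem.
class SketchUnionFind
  attr_reader :count # number of disjoint components

  def initialize(items)
    @parent = {}
    @size = {}
    items.each do |item|
      @parent[item] = item # every item starts as its own root
      @size[item] = 1
    end
    @count = @parent.size
  end

  # Follow parent links up to the root, shortening the path on the way.
  def find(item)
    raise ArgumentError, "unknown item: #{item.inspect}" unless @parent.key?(item)

    until @parent[item] == item
      @parent[item] = @parent[@parent[item]] # path compression (halving)
      item = @parent[item]
    end
    item
  end

  def connected?(a, b)
    find(a) == find(b)
  end

  def union(a, b)
    root_a = find(a)
    root_b = find(b)
    return if root_a == root_b # already in the same component

    # Weighting: attach the smaller tree beneath the larger one.
    root_a, root_b = root_b, root_a if @size[root_a] < @size[root_b]
    @parent[root_b] = root_a
    @size[root_a] += @size[root_b]
    @count -= 1
  end
end
```

Weighting keeps the trees shallow, and path compression flattens them further each time a root is looked up.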
Installation
Add this line to your application's Gemfile:
gem 'union_find'
And then execute:
$ bundle
Or install it yourself as:
$ gem install union_find
Usage
Create a new instance of UnionFind and pass in a Set of items:
require 'set'
people = Set.new ['Grandfather', 'Father', 'Daughter', 'Single Person']
union_find = UnionFind::UnionFind.new(people)
A Set is used instead of an Array because Sets filter out duplicate entries without having to call an expensive #uniq! method. If your data comes in the form of an Array, you can convert it to a Set like so:
require 'set'
array = ['Grandfather', 'Father', 'Daughter', 'Single Person']
set = array.to_set
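As a small illustration of the deduplication mentioned above, the following sketch uses only Ruby's standard Set class. The sample data is made up for this example.

```ruby
require 'set'

array = ['Father', 'Daughter', 'Father'] # 'Father' appears twice
set = array.to_set                       # duplicates collapse on conversion

set.size           # 2
set.add('Father')  # adding an existing item leaves the set unchanged
set.size           # still 2
```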
Add more items on the fly:
union_find.add('Grandmother')
Connect items (in any order):
union_find.union('Grandfather', 'Grandmother')
union_find.union('Grandfather', 'Father')
union_find.union('Father', 'Daughter')
Check whether two items are connected (in any order):
union_find.connected?('Grandfather', 'Daughter')
=> true
union_find.connected?('Daughter', 'Father')
=> true
union_find.connected?('Grandfather', 'Single Person')
=> false
Check how many isolated components there are. In this example, there are 2, namely the family (Grandfather - Grandmother - Father - Daughter) and the Single Person:
union_find.count_isolated_components
=> 2
Performance
Initializing the data structure with n items takes linear time: θ(n).
Afterwards, the union() and connected?() operations take logarithmic time in the worst case: O(log n).
The count_isolated_components() operation takes constant time: θ(1).
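The logarithmic bound comes from the weighting rule: a node's depth grows only when its tree is attached beneath one at least as large, so the size of its tree at least doubles each time. The sketch below is a standalone experiment, not part of the gem. The helper name max_depth_after_pairwise_unions is hypothetical, and path compression is left out on purpose to isolate the effect of weighting. It repeatedly merges equal-sized trees, which is the worst case for weighting, and reports the deepest node.

```ruby
# Hypothetical helper for illustration: weighted quick-union WITHOUT
# path compression, driven by worst-case (equal-size) merges.
def max_depth_after_pairwise_unions(n)
  parent = (0...n).to_a     # each index starts as its own root
  size = Array.new(n, 1)

  root_of = lambda do |i|
    i = parent[i] until parent[i] == i
    i
  end

  step = 1
  while step < n
    # Merge the tree containing i with the tree containing i + step.
    (0...n).step(2 * step) do |i|
      j = i + step
      next if j >= n

      a = root_of.call(i)
      b = root_of.call(j)
      a, b = b, a if size[a] < size[b] # smaller tree goes under larger
      parent[b] = a
      size[a] += size[b]
    end
    step *= 2
  end

  # Depth of a node = number of links between it and its root.
  (0...n).map do |i|
    depth = 0
    until parent[i] == i
      i = parent[i]
      depth += 1
    end
    depth
  end.max
end

# With weighting, the result should not exceed log2(1024) = 10.
puts max_depth_after_pairwise_unions(1024)
```

Path compression, as used by this gem's algorithm, flattens the trees further on every lookup, so lookups in practice tend to be cheaper than this worst case.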
Contributing
- Fork it ( https://github.com/[my-github-username]/union_find/fork )
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request