Class: DHeap

Inherits:
Object
  • Object
show all
Defined in:
lib/d_heap.rb,
lib/d_heap/version.rb,
ext/d_heap/d_heap.c

Overview

A fast d-ary heap implementation for ruby, useful in priority queues and graph algorithms.

The d-ary heap data structure is a generalization of the binary heap, in which the nodes have d children instead of 2. This allows for “decrease priority” operations to be performed more quickly with the tradeoff of slower delete minimum. Additionally, d-ary heaps can have better memory cache behavior than binary heaps, allowing them to pop more quickly in practice despite slower worst-case time complexity.

Although d can be configured when creating the heap, it’s usually best to keep the default value of 4, because d=4 gives the smallest coefficient for (d + 1) log n / log d result. As always, use benchmarks for your particular use-case.

Examples:

Basic push, peek, and pop

# create some example objects to place in our heap
Task = Struct.new(:id, :time) do
  def to_f; time.to_f end
end
t1 = Task.new(1, Time.now + 5*60)
t2 = Task.new(2, Time.now + 50)
t3 = Task.new(3, Time.now + 60)
t4 = Task.new(4, Time.now +  5)

# create the heap
require "d_heap"
heap = DHeap.new

# push with an explicit score (which might be extrinsic to the value)
heap.push t1, t1.to_f

# the score will be implicitly cast with Float, so any object with #to_f
heap.push t2, t2

# if the object has an intrinsic score via #to_f, "<<" is the simplest API
heap << t3 << t4

# pop returns the lowest scored item, and removes it from the heap
heap.pop    # => #<struct Task id=4, time=2021-01-17 17:02:22.5574 -0500>
heap.pop    # => #<struct Task id=2, time=2021-01-17 17:03:07.5574 -0500>

# peek returns the lowest scored item, without removing it from the heap
heap.peek   # => #<struct Task id=3, time=2021-01-17 17:03:17.5574 -0500>
heap.pop    # => #<struct Task id=3, time=2021-01-17 17:03:17.5574 -0500>

# pop_lte handles the common "h.pop if h.peek_score < max" pattern
heap.pop_lte(Time.now + 65) # => nil

# the heap size can be inspected with size and empty?
heap.empty? # => false
heap.size   # => 1
heap.pop    # => #<struct Task id=1, time=2021-01-17 17:07:17.5574 -0500>
heap.empty? # => true
heap.size   # => 0

# popping from an empty heap returns nil
heap.pop    # => nil

Direct Known Subclasses

Map

Defined Under Namespace

Classes: Map

Constant Summary collapse

VERSION =
"0.7.0"
MAX_D =

This is based on INT_MAX. But it is very very unlikely you will want a large value for d. The tradeoff is that higher d values give faster push and slower pop. If you expect pushes and pops to be balanced, then just stick with the default. If you expect more pushes than pops, it might be worthwhile to increase d.

INT2NUM(DHEAP_MAX_D)
DEFAULT_D =

d=4 uses the fewest comparisons for (worst case) insert + delete-min.

INT2NUM(DHEAP_DEFAULT_D)
DEFAULT_CAPA =

The default heap capacity. The heap grows automatically as necessary, so you shouldn’t need to worry about this.

INT2NUM(DHEAP_DEFAULT_CAPA)

Instance Method Summary collapse

Constructor Details

#initialize(d: DEFAULT_D, capacity: DEFAULT_CAPA) ⇒ DHeap

Initialize a d-ary min-heap.

Parameters:

  • d (Integer) (defaults to: DEFAULT_D)

    Number of children for each parent node. Higher values generally speed up push but slow down pop. If all pushes are popped, the default is probably best.

  • capacity (Integer) (defaults to: DEFAULT_CAPA)

    initial capacity of the heap.



85
86
87
# File 'lib/d_heap.rb', line 85

def initialize(d: DEFAULT_D, capacity: DEFAULT_CAPA) # rubocop:disable Naming/MethodParameterName
  __init_without_kw__(d, capacity, false)
end

Instance Method Details

#clearself

Clears all values from the heap, leaving it empty.

Returns:

  • (self)


968
969
970
971
972
973
974
975
976
977
978
979
# File 'ext/d_heap/d_heap.c', line 968

static VALUE
dheap_clear(VALUE self)
{
    dheap_t *heap = get_dheap_struct_unfrozen(self);
    if (!DHEAP_EMPTY_P(heap)) {
        heap->size = 0;
#ifdef DHEAP_MAP
        if (DHEAPMAP_P(heap)) rb_hash_clear(heap->indexes);
#endif
    }
    return self;
}

#dInteger

Returns the maximum number of children per parent.

Returns:

  • (Integer)

    the maximum number of children per parent



495
496
497
498
499
500
# File 'ext/d_heap/d_heap.c', line 495

static VALUE
dheap_attr_d(VALUE self)
{
    dheap_t *heap = get_dheap_struct(self);
    return INT2FIX(heap->d);
}

#each_pop(with_scores: false) {|value, score| ... } ⇒ Enumerator?

Consumes the heap by popping each minumum value until it is empty.

If you want to iterate over the heap without consuming it, you will need to first call #dup

Parameters:

  • with_score (Boolean)

    if scores shoul also be yielded

Yield Parameters:

  • value (Object)

    each value that would be popped

  • score (Numeric)

    each value’s score, if with_scores is true

Returns:

  • (Enumerator)

    if no block is given

  • (nil)

    if a block is given



101
102
103
104
105
106
107
108
109
# File 'lib/d_heap.rb', line 101

def each_pop(with_scores: false)
  return to_enum(__method__, with_scores: with_scores) unless block_given?
  if with_scores
    yield(*pop_with_score) until empty?
  else
    yield pop until empty?
  end
  nil
end

#empty?Boolean

Returns if the heap is empty.

Returns:

  • (Boolean)

    if the heap is empty



485
486
487
488
489
490
# File 'ext/d_heap/d_heap.c', line 485

static VALUE
dheap_empty_p(VALUE self)
{
    dheap_t *heap = get_dheap_struct(self);
    return DHEAP_EMPTY_P(heap) ? Qtrue : Qfalse;
}

#peeknil, Object Also known as: first

Returns the next value on the heap to be popped without popping it.

Time complexity: O(1) (worst-case)

Returns:

  • (nil, Object)

    the next value to be popped without popping it.

See Also:



716
717
718
719
720
721
722
# File 'ext/d_heap/d_heap.c', line 716

static VALUE
dheap_peek(VALUE self)
{
    dheap_t *heap = get_dheap_struct(self);
    if (DHEAP_EMPTY_P(heap)) return Qnil;
    return PEEK_VALUE(heap);
}

#peek_scorenil, Numeric

Returns the next score on the heap, without the value and without popping it.

Time complexity: O(1) (worst-case)

Returns:

  • (nil, Numeric)

    the next score, if there is one

See Also:



699
700
701
702
703
704
705
# File 'ext/d_heap/d_heap.c', line 699

static VALUE
dheap_peek_score(VALUE self)
{
    dheap_t *heap = get_dheap_struct(self);
    if (DHEAP_EMPTY_P(heap)) return Qnil;
    return SCORE2NUM(PEEK_SCORE(heap));
}

#peek_with_scorenil, Array<(Object, Numeric)>

Returns the next value on the heap, and its score, without popping it

Time complexity: O(1) (worst-case)

Returns:

  • (nil, Array<(Object, Numeric)>)

    the next value and its score

See Also:



682
683
684
685
686
687
# File 'ext/d_heap/d_heap.c', line 682

static VALUE
dheap_peek_with_score(VALUE self)
{
    dheap_t *heap = get_dheap_struct(self);
    return PEEK_WITH_SCORE(heap);
}

#pop_all_below(max_score, receiver = []) ⇒ Object Also known as: pop_all_lt

Pops all value with score less than max score.

Time complexity: O(m * d log n / log d), m = number popped

Parameters:

  • max_score (Integer, #to_f)

    the maximum score to be popped

  • receiver (Array, #<<)

    object onto which the values will be pushed, in order by score.

Returns:

  • (Object)

    the object onto which the values were pushed

See Also:

  • #pop
  • #pop_lt


935
936
937
938
939
940
941
942
943
944
# File 'ext/d_heap/d_heap.c', line 935

static VALUE
dheap_pop_all_below(int argc, VALUE *argv, VALUE self)
{
    dheap_t *heap      = get_dheap_struct_unfrozen(self);
    SCORE    max_score = (argc) ? VAL2SCORE(argv[0]) : 0.0;
    VALUE    array     = (argc == 1) ? rb_ary_new() : argv[1];
    rb_check_arity(argc, 1, 2);
    DHEAP_DISPATCH_STMT(heap, POP_ALL_BELOW, max_score, array);
    return array;
}

#sizeInteger Also known as: length, count

Returns the number of elements in the heap.

Returns:

  • (Integer)

    the number of elements in the heap



473
474
475
476
477
478
# File 'ext/d_heap/d_heap.c', line 473

static VALUE
dheap_size(VALUE self)
{
    dheap_t *heap = get_dheap_struct(self);
    return ULONG2NUM(heap->size);
}

#to_aObject

******************************************************************

DHeap, misc methods

******************************************************************



952
953
954
955
956
957
958
959
960
961
# File 'ext/d_heap/d_heap.c', line 952

static VALUE
dheap_to_a(VALUE self)
{
    dheap_t *heap  = get_dheap_struct(self);
    VALUE    array = rb_ary_new_capa(heap->size);
    for (size_t i = 0; i < heap->size; i++) {
        rb_ary_push(array, DHEAP_ENTRY_ARY(heap, i));
    }
    return array;
}