Class: Bmg::Summarizer

Inherits:
Object
Defined in:
lib/bmg/summarizer.rb,
lib/bmg/summarizer/avg.rb,
lib/bmg/summarizer/max.rb,
lib/bmg/summarizer/min.rb,
lib/bmg/summarizer/sum.rb,
lib/bmg/summarizer/last.rb,
lib/bmg/summarizer/count.rb,
lib/bmg/summarizer/first.rb,
lib/bmg/summarizer/concat.rb,
lib/bmg/summarizer/stddev.rb,
lib/bmg/summarizer/by_proc.rb,
lib/bmg/summarizer/collect.rb,
lib/bmg/summarizer/distinct.rb,
lib/bmg/summarizer/multiple.rb,
lib/bmg/summarizer/value_by.rb,
lib/bmg/summarizer/variance.rb,
lib/bmg/summarizer/bucketize.rb,
lib/bmg/summarizer/percentile.rb,
lib/bmg/summarizer/positional.rb,
lib/bmg/summarizer/distinct_count.rb

Overview

Summarizer.

This class provides a basis for implementing aggregation operators.

Aggregation operators are made available through factory methods on the Summarizer class itself:

Summarizer.count
Summarizer.sum(:qty)
Summarizer.sum{|t| t[:qty] * t[:price] }

Once built, summarizers can be used either in black-box or white-box modes.

relation = ...
agg = Summarizer.sum(:qty)

# Black box mode:
result = agg.summarize(relation)

# White box mode:
memo = agg.least
relation.each do |tuple|
  memo = agg.happens(memo, tuple)
end
result = agg.finalize(memo)
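
The two modes can be compared with a small stand-in class that runs without the bmg gem. This is only a sketch: `MiniSum` is a hypothetical name, not Bmg's `Sum`, and its `least` of 0 is an assumption. It implements the `least` / `happens` / `finalize` protocol described above and shows that both modes give the same result.

```ruby
# MiniSum is a hypothetical stand-in, NOT Bmg::Summarizer::Sum.
class MiniSum
  def initialize(attr)
    @attr = attr
  end

  # Value used as the starting memo (assumed to be 0 for a sum).
  def least
    0
  end

  # Folds one tuple into the running memo.
  def happens(memo, tuple)
    memo + tuple[@attr]
  end

  # A plain sum needs no post-processing.
  def finalize(memo)
    memo
  end

  # Black-box mode: the same three steps, driven by inject.
  def summarize(enum)
    finalize(enum.inject(least) { |m, t| happens(m, t) })
  end
end

relation = [{ qty: 2 }, { qty: 5 }, { qty: 10 }]
agg = MiniSum.new(:qty)

# Black-box mode
black_box = agg.summarize(relation)

# White-box mode
memo = agg.least
relation.each { |tuple| memo = agg.happens(memo, tuple) }
white_box = agg.finalize(memo)

puts black_box == white_box # prints "true"
```

White-box mode is useful when the caller already iterates over the tuples for another purpose and wants to update several memos in the same pass.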

Defined Under Namespace

Classes: Avg, Bucketize, ByProc, Collect, Concat, Count, Distinct, DistinctCount, First, Last, Max, Min, Multiple, Percentile, Positional, Stddev, Sum, ValueBy, Variance


Constructor Details

#initialize(*args, &block) ⇒ Summarizer

Creates a Summarizer instance.

Private method; please use the factory methods instead.



# File 'lib/bmg/summarizer.rb', line 40

def initialize(*args, &block)
  @options = default_options
  args.push(block) if block
  args.each do |arg|
    case arg
    when Symbol, Proc then @functor = arg
    when Hash         then @options = @options.merge(arg)
    else
      raise ArgumentError, "Unexpected `#{arg}`"
    end
  end
end
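
As a rough illustration of how the constructor sorts its arguments, the sketch below reproduces the same `case` statement in a standalone helper. `classify_args` is a hypothetical name, the empty Hash stands in for `default_options` (whose content is not shown on this page), and the `:distinct` option key is made up for the example.

```ruby
# Hypothetical helper mirroring the argument handling of #initialize.
def classify_args(*args, &block)
  functor = nil
  options = {} # stand-in for default_options (assumption)
  args.push(block) if block
  args.each do |arg|
    case arg
    when Symbol, Proc then functor = arg
    when Hash         then options = options.merge(arg)
    else
      raise ArgumentError, "Unexpected `#{arg}`"
    end
  end
  [functor, options]
end

by_symbol = classify_args(:qty)
with_opts = classify_args(:qty, { distinct: true }) # :distinct is a made-up key
by_block  = classify_args { |t| t[:qty] * t[:price] }

p by_symbol # prints [:qty, {}]
```

A block is treated exactly like a Proc passed as the last positional argument, so when several Symbols or Procs are given, the last one wins as the functor.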

Instance Attribute Details

#functor ⇒ Object (readonly)

Returns the underlying functor, either a Symbol or a Proc.

Returns:

  • the underlying functor, either a Symbol or a Proc



# File 'lib/bmg/summarizer.rb', line 35

def functor
  @functor
end

#options ⇒ Object (readonly)

Returns aggregation options as a Hash.

Returns:

  • Aggregation options as a Hash



# File 'lib/bmg/summarizer.rb', line 32

def options
  @options
end

Class Method Details

.avg(*args, &bl) ⇒ Object

Factors an average summarizer



# File 'lib/bmg/summarizer/avg.rb', line 31

def self.avg(*args, &bl)
  Avg.new(*args, &bl)
end

.bucketize(*args, &bl) ⇒ Object

Factors a bucketize summarizer



# File 'lib/bmg/summarizer/bucketize.rb', line 77

def self.bucketize(*args, &bl)
  Bucketize.new(*args, &bl)
end

.by_proc(least = nil, proc = nil, &bl) ⇒ Object

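Factors a by_proc summarizer, which aggregates through a user-supplied Proc or block. If a Proc is given as the first argument, it is taken as the proc and least defaults to nil.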



# File 'lib/bmg/summarizer/by_proc.rb', line 35

def self.by_proc(least = nil, proc = nil, &bl)
  least, proc = nil, least if least.is_a?(Proc)
  ByProc.new(least, proc || bl)
end
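
The argument shuffling in `by_proc` can be isolated as follows. This is a sketch only: `normalize_by_proc_args` is a hypothetical helper that returns what would be handed to `ByProc.new`, and the procs are opaque placeholders because the signature ByProc expects from its proc is not shown on this page.

```ruby
# Hypothetical helper reproducing only the argument handling of .by_proc.
def normalize_by_proc_args(least = nil, proc = nil, &bl)
  # A Proc in the first slot is really the proc, so shift it over.
  least, proc = nil, least if least.is_a?(Proc)
  [least, proc || bl]
end

fn = ->(*xs) { xs } # placeholder; the real call signature is not documented here

both       = normalize_by_proc_args(0, fn)         # least and proc given explicitly
proc_only  = normalize_by_proc_args(fn)            # proc moved out of the least slot
with_block = normalize_by_proc_args(0) { |*xs| xs } # least plus a block

p proc_only[0] # prints nil
```

An explicit proc takes precedence over a block when both are supplied, since the block is only used as a fallback.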

.collect(*args, &bl) ⇒ Object

Factors a collect summarizer



# File 'lib/bmg/summarizer/collect.rb', line 26

def self.collect(*args, &bl)
  Collect.new(*args, &bl)
end

.concat(*args, &bl) ⇒ Object

Factors a concatenation summarizer



# File 'lib/bmg/summarizer/concat.rb', line 37

def self.concat(*args, &bl)
  Concat.new(*args, &bl)
end

.count(*args, &bl) ⇒ Object

Factors a count summarizer



# File 'lib/bmg/summarizer/count.rb', line 26

def self.count(*args, &bl)
  Count.new(*args, &bl)
end

.distinct(*args, &bl) ⇒ Object

Factors a distinct summarizer



# File 'lib/bmg/summarizer/distinct.rb', line 31

def self.distinct(*args, &bl)
  Distinct.new(*args, &bl)
end

.distinct_count(*args, &bl) ⇒ Object

Factors a distinct count summarizer



# File 'lib/bmg/summarizer/distinct_count.rb', line 31

def self.distinct_count(*args, &bl)
  DistinctCount.new(*args, &bl)
end

.first(*args, &bl) ⇒ Object

Factors a first summarizer



# File 'lib/bmg/summarizer/first.rb', line 20

def self.first(*args, &bl)
  First.new(*args, &bl)
end

.last(*args, &bl) ⇒ Object

Factors a last summarizer



# File 'lib/bmg/summarizer/last.rb', line 20

def self.last(*args, &bl)
  Last.new(*args, &bl)
end

.max(*args, &bl) ⇒ Object

Factors a max summarizer



# File 'lib/bmg/summarizer/max.rb', line 26

def self.max(*args, &bl)
  Max.new(*args, &bl)
end

.median(*args, &bl) ⇒ Object



# File 'lib/bmg/summarizer/percentile.rb', line 66

def self.median(*args, &bl)
  Percentile.new(*(args + [50]), &bl)
end

.median_cont(*args, &bl) ⇒ Object



# File 'lib/bmg/summarizer/percentile.rb', line 70

def self.median_cont(*args, &bl)
  Percentile.new(*(args + [50, {:variant => :continuous}]), &bl)
end

.median_disc(*args, &bl) ⇒ Object



# File 'lib/bmg/summarizer/percentile.rb', line 74

def self.median_disc(*args, &bl)
  Percentile.new(*(args + [50, {:variant => :discrete}]), &bl)
end

.min(*args, &bl) ⇒ Object

Factors a min summarizer



# File 'lib/bmg/summarizer/min.rb', line 26

def self.min(*args, &bl)
  Min.new(*args, &bl)
end

.multiple(defs) ⇒ Object

Factors a multiple summarizer from the given definitions



# File 'lib/bmg/summarizer/multiple.rb', line 41

def self.multiple(defs)
  Multiple.new(defs)
end

.percentile(*args, &bl) ⇒ Object

Factors a percentile summarizer



# File 'lib/bmg/summarizer/percentile.rb', line 54

def self.percentile(*args, &bl)
  Percentile.new(*args, &bl)
end

.percentile_cont(*args, &bl) ⇒ Object



# File 'lib/bmg/summarizer/percentile.rb', line 58

def self.percentile_cont(*args, &bl)
  Percentile.new(*(args + [{:variant => :continuous}]), &bl)
end

.percentile_disc(*args, &bl) ⇒ Object



# File 'lib/bmg/summarizer/percentile.rb', line 62

def self.percentile_disc(*args, &bl)
  Percentile.new(*(args + [{:variant => :discrete}]), &bl)
end
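
The percentile and median factories all delegate to `Percentile.new` and differ only in what they append to the caller's arguments. The sketch below makes those argument lists visible. The `*_args` helpers are hypothetical, and the `(:qty, 90)` call shape is an assumption inferred from `median` appending 50 after the caller's arguments.

```ruby
# Hypothetical helpers: each returns the argument list that the matching
# factory forwards to Percentile.new (any block is forwarded unchanged).
def percentile_cont_args(*args)
  args + [{ :variant => :continuous }]
end

def median_args(*args)
  args + [50]
end

def median_disc_args(*args)
  args + [50, { :variant => :discrete }]
end

pc = percentile_cont_args(:qty, 90) # assumed call shape: attribute, then rank
m  = median_args(:qty)
md = median_disc_args(:qty)

p m # prints [:qty, 50]
```

Seen this way, `median` is `percentile` with the rank fixed at 50, and the `_cont` / `_disc` suffixes only select the `:variant` option.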

.stddev(*args, &bl) ⇒ Object

Factors a standard deviation summarizer



# File 'lib/bmg/summarizer/stddev.rb', line 21

def self.stddev(*args, &bl)
  Stddev.new(*args, &bl)
end

.sum(*args, &bl) ⇒ Object

Factors a sum summarizer



# File 'lib/bmg/summarizer/sum.rb', line 26

def self.sum(*args, &bl)
  Sum.new(*args, &bl)
end

.summarization(defs) ⇒ Object

Converts some summarization definitions to a Hash of summarizers.



# File 'lib/bmg/summarizer.rb', line 55

def self.summarization(defs)
  Hash[defs.map{|k,v|
    summarizer = case v
    when Summarizer then v
    when Symbol     then Summarizer.send(v, k)
    when Proc       then Summarizer.by_proc(&v)
    else
      raise ArgumentError, "Unexpected summarizer #{k} => #{v}"
    end
    [ k, summarizer ]
  }]
end
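
One consequence of the code above is easy to miss: when a definition's value is a Symbol, the key doubles as the attribute name, so `{ qty: :sum }` amounts to `Summarizer.sum(:qty)`. The sketch below replays the same dispatch on stand-ins so it runs without the gem. The `Mini` module and its `Built` struct are hypothetical, and the arguments of the example lambda are placeholders.

```ruby
# Hypothetical stand-ins; Mini is NOT part of Bmg.
module Mini
  Built = Struct.new(:kind, :arg)

  def self.sum(attr)
    Built.new(:sum, attr)
  end

  def self.max(attr)
    Built.new(:max, attr)
  end

  def self.by_proc(&bl)
    Built.new(:by_proc, bl)
  end

  # Same three-way dispatch as Summarizer.summarization.
  def self.summarization(defs)
    Hash[defs.map { |k, v|
      summarizer = case v
                   when Built  then v          # already built: kept as-is
                   when Symbol then send(v, k) # factory name; key is the attribute
                   when Proc   then by_proc(&v)
                   else
                     raise ArgumentError, "Unexpected summarizer #{k} => #{v}"
                   end
      [k, summarizer]
    }]
  end
end

result = Mini.summarization(
  qty:   :sum,
  price: Mini.max(:price),
  total: ->(*xs) { xs }
)

p result[:qty].to_a # prints [:sum, :qty]
```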

.value_by(*args, &bl) ⇒ Object

Factors a value_by summarizer



# File 'lib/bmg/summarizer/value_by.rb', line 57

def self.value_by(*args, &bl)
  ValueBy.new(*args, &bl)
end

.variance(*args, &bl) ⇒ Object

Factors a variance summarizer



# File 'lib/bmg/summarizer/variance.rb', line 37

def self.variance(*args, &bl)
  Variance.new(*args, &bl)
end

Instance Method Details

#finalize(memo) ⇒ Object

This method finalizes an aggregation.

Argument memo is either least or the result of aggregating through happens. The default implementation simply returns memo. The method is intended to be overridden for complex aggregations that need stateful information, such as `avg`.

Parameters:

  • memo (Object)

    the current aggregation value

Returns:

  • (Object)

    the aggregation value, as finalized



# File 'lib/bmg/summarizer.rb', line 120

def finalize(memo)
  memo
end
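
To see why an average needs `finalize`, consider a memo that carries both a running sum and a count, with the division postponed to the very end. The class below is a hypothetical sketch of that idea; Bmg's actual `Avg` implementation is not shown on this page and may be organized differently. Returning nil for an empty input is also an assumption.

```ruby
# MiniAvg is a hypothetical illustration, NOT Bmg::Summarizer::Avg.
class MiniAvg
  def initialize(attr)
    @attr = attr
  end

  # Stateful memo: [running sum, number of tuples seen].
  def least
    [0.0, 0]
  end

  def happens(memo, tuple)
    sum, count = memo
    [sum + tuple[@attr], count + 1]
  end

  # Turns the [sum, count] pair into the final average.
  def finalize(memo)
    sum, count = memo
    count.zero? ? nil : sum / count
  end

  def summarize(enum)
    finalize(enum.inject(least) { |m, t| happens(m, t) })
  end
end

avg = MiniAvg.new(:qty).summarize([{ qty: 2 }, { qty: 4 }, { qty: 9 }])
puts avg # prints 5.0
```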

#happens(memo, tuple) ⇒ Object

This method is called on each aggregated tuple and must return an updated memo value. It can be seen as the block typically given to Enumerable#inject.

The default implementation collects the pre-value on the tuple and delegates to _happens.

Parameters:

  • memo

    the current aggregation value

  • tuple

    the current iterated tuple

Returns:

  • updated memo value



# File 'lib/bmg/summarizer.rb', line 97

def happens(memo, tuple)
  value = extract_value(tuple)
  _happens(memo, value)
end
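
The split between extracting a value and folding it into the memo can be sketched as below. Note the assumptions: `extract_value` is not documented on this page, so its behavior here (a Symbol functor reads that attribute, a Proc functor is called with the tuple) is a guess based on the `sum(:qty)` and `sum{|t| ... }` examples in the overview. `MiniMax` is a hypothetical class.

```ruby
# MiniMax is a hypothetical illustration, NOT Bmg::Summarizer::Max.
class MiniMax
  def initialize(functor = nil, &block)
    @functor = functor || block
  end

  def least
    nil
  end

  # Same shape as the default #happens: extract, then delegate.
  def happens(memo, tuple)
    value = extract_value(tuple)
    _happens(memo, value)
  end

  private

  # ASSUMPTION: how the value is extracted is not shown in the source page.
  def extract_value(tuple)
    case @functor
    when Symbol then tuple[@functor]
    when Proc   then @functor.call(tuple)
    end
  end

  # The only part that is specific to "max".
  def _happens(memo, value)
    (memo.nil? || value > memo) ? value : memo
  end
end

tuples = [{ qty: 2, price: 3.0 }, { qty: 5, price: 1.0 }]

by_symbol = MiniMax.new(:qty)
by_block  = MiniMax.new { |t| t[:qty] * t[:price] }

max_qty   = tuples.inject(by_symbol.least) { |m, t| by_symbol.happens(m, t) }
max_total = tuples.inject(by_block.least)  { |m, t| by_block.happens(m, t) }
```

With this split, a subclass only has to define `_happens` on plain values, and the same code works whether the summarizer was built from an attribute name or from a block.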

#least ⇒ Object

Returns the least value, which is the one to use on an empty set.

This method is intended to be overridden by subclasses; the default implementation returns nil.

Returns:

  • the least value for this summarizer



# File 'lib/bmg/summarizer.rb', line 83

def least
  nil
end

#summarize(enum) ⇒ Object

Summarizes an enumeration of tuples.

Parameters:

  • enum

    an enumerable of tuples

Returns:

  • the computed summarization value



# File 'lib/bmg/summarizer.rb', line 128

def summarize(enum)
  finalize(enum.inject(least){|m,t| happens(m, t) })
end
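
Because `inject` is seeded with `least`, an empty enumeration never reaches `happens` and the result is simply `finalize(least)`. The snippet below illustrates this with plain Ruby; the seed of 0 and the `:qty` attribute are assumptions chosen for the example.

```ruby
least = 0 # assumed seed for a sum-like aggregation

# Same shape as #summarize, with finalize left out (identity here).
sum_qty = ->(tuples) { tuples.inject(least) { |memo, t| memo + t[:qty] } }

empty_result = sum_qty.call([])
full_result  = sum_qty.call([{ qty: 3 }, { qty: 4 }])

p [empty_result, full_result] # prints [0, 7]
```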

#to_summarizer_name ⇒ Object

Returns the canonical summarizer name



# File 'lib/bmg/summarizer.rb', line 133

def to_summarizer_name
  self.class.name
    .gsub(/[a-z][A-Z]/){|x| x.split('').join('_') }
    .downcase[/::([a-z_]+)$/, 1]
    .to_sym
end
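
The transformation can be tried on plain strings. The helper below is hypothetical (`summarizer_name_for` does not exist in Bmg) and applies the same chain to a class name given as a String: an underscore is inserted at each lower-to-upper case boundary, the result is downcased, and the last `::` segment is kept.

```ruby
# Hypothetical helper applying the same chain as #to_summarizer_name.
def summarizer_name_for(class_name)
  class_name
    .gsub(/[a-z][A-Z]/) { |x| x.split('').join('_') } # "tC" becomes "t_C"
    .downcase[/::([a-z_]+)$/, 1]                      # keep the last segment
    .to_sym
end

p summarizer_name_for("Bmg::Summarizer::DistinctCount") # prints :distinct_count
p summarizer_name_for("Bmg::Summarizer::ByProc")        # prints :by_proc
p summarizer_name_for("Bmg::Summarizer::Sum")           # prints :sum
```

The resulting symbols match the names of the factory methods listed under Class Method Details.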