Class: Airbrake::TDigest Private
- Inherits:
-
Object
- Object
- Airbrake::TDigest
- Defined in:
- lib/airbrake-ruby/tdigest.rb
Overview
This class is part of a private API. You should avoid using this class if possible, as it may be removed or be changed in the future.
Ruby implementation of Ted Dunning’s t-digest data structure.
This implementation is imported from github.com/castle/tdigest with custom modifications. Huge thanks to Castle for the implementation :beer:
The difference is that we pack with Big Endian (unlike Native Endian in Castle’s version). Our backend does not permit little endian.
rubocop:disable Metrics/ClassLength
Defined Under Namespace
Classes: Centroid
Constant Summary collapse
- VERBOSE_ENCODING =
This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.
1
- SMALL_ENCODING =
This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.
2
Instance Attribute Summary collapse
- #centroids ⇒ Object private
- #size ⇒ Object readonly private
Class Method Summary collapse
-
.from_bytes(bytes) ⇒ Object
private
rubocop:disable Metrics/PerceivedComplexity, Metrics/MethodLength rubocop:disable Metrics/CyclomaticComplexity, Metrics/AbcSize.
-
.from_json(array) ⇒ Object
private
rubocop:enable Metrics/PerceivedComplexity, Metrics/MethodLength rubocop:enable Metrics/CyclomaticComplexity, Metrics/AbcSize.
Instance Method Summary collapse
- #+(other) ⇒ Object private
- #as_bytes ⇒ Object private
-
#as_json(_ = nil) ⇒ Object
private
rubocop:enable Metrics/AbcSize.
-
#as_small_bytes ⇒ Object
private
rubocop:disable Metrics/AbcSize.
- #bound_mean(x) ⇒ Object private
- #bound_mean_cumn(cumn) ⇒ Object private
- #compress! ⇒ Object private
- #compression ⇒ Object private
- #find_nearest(x) ⇒ Object private
-
#initialize(delta = 0.01, k = 25, cx = 1.1) ⇒ TDigest
constructor
private
A new instance of TDigest.
- #merge!(other) ⇒ Object private
-
#p_rank(x) ⇒ Object
private
rubocop:disable Metrics/PerceivedComplexity, Metrics/AbcSize rubocop:disable Metrics/CyclomaticComplexity.
-
#percentile(p) ⇒ Object
private
rubocop:disable Metrics/PerceivedComplexity, Metrics/CyclomaticComplexity rubocop:disable Metrics/AbcSize.
-
#push(x, n = 1) ⇒ Object
private
rubocop:enable Metrics/PerceivedComplexity, Metrics/CyclomaticComplexity rubocop:enable Metrics/AbcSize.
- #push_centroid(c) ⇒ Object private
- #reset! ⇒ Object private
- #to_a ⇒ Object private
Constructor Details
#initialize(delta = 0.01, k = 25, cx = 1.1) ⇒ TDigest
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Returns a new instance of TDigest.
43 44 45 46 47 48 49 50 |
# File 'lib/airbrake-ruby/tdigest.rb', line 43 def initialize(delta = 0.01, k = 25, cx = 1.1) @delta = delta @k = k @cx = cx @centroids = RBTree.new @size = 0 @last_cumulate = 0 end |
Instance Attribute Details
#centroids ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
40 41 42 |
# File 'lib/airbrake-ruby/tdigest.rb', line 40 def centroids @centroids end |
#size ⇒ Object (readonly)
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
41 42 43 |
# File 'lib/airbrake-ruby/tdigest.rb', line 41 def size @size end |
Class Method Details
.from_bytes(bytes) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
rubocop:disable Metrics/PerceivedComplexity, Metrics/MethodLength rubocop:disable Metrics/CyclomaticComplexity, Metrics/AbcSize
249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 |
# File 'lib/airbrake-ruby/tdigest.rb', line 249 def self.from_bytes(bytes) format, compression, size = bytes.unpack('NGN') tdigest = new(1 / compression) start_idx = 16 # after header case format when VERBOSE_ENCODING array = bytes[start_idx..-1].unpack("G#{size}N#{size}") means, counts = array.each_slice(size).to_a if array.any? when SMALL_ENCODING means = bytes[start_idx..(start_idx + (4 * size))].unpack("g#{size}") # Decode delta encoding of means x = 0 means.map! do |m| m += x x = m m end counts_bytes = bytes[(start_idx + (4 * size))..-1].unpack('C*') counts = [] # Decode variable length integer bytes size.times do v = counts_bytes.shift z = 0x7f & v shift = 7 while (v & 0x80) != 0 raise 'Shift too large in decode' if shift > 28 v = counts_bytes.shift || 0 z += (v & 0x7f) << shift shift += 7 end counts << z end # This shouldn't happen raise 'Mismatch' unless counts.size == means.size else raise 'Unknown compression format' end means.zip(counts).each { |val| tdigest.push(val[0], val[1]) } if means && counts tdigest end |
.from_json(array) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
rubocop:enable Metrics/PerceivedComplexity, Metrics/MethodLength rubocop:enable Metrics/CyclomaticComplexity, Metrics/AbcSize
296 297 298 299 300 301 |
# File 'lib/airbrake-ruby/tdigest.rb', line 296 def self.from_json(array) tdigest = new # Handle both string and symbol keys array.each { |a| tdigest.push(a['m'] || a[:m], a['n'] || a[:n]) } tdigest end |
Instance Method Details
#+(other) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
52 53 54 55 56 57 58 |
# File 'lib/airbrake-ruby/tdigest.rb', line 52 def +(other) # Uses delta, k and cx from the caller t = self.class.new(@delta, @k, @cx) data = centroids.values + other.centroids.values t.push_centroid(data.delete_at(rand(data.length))) while data.any? t end |
#as_bytes ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
60 61 62 63 64 65 66 67 |
# File 'lib/airbrake-ruby/tdigest.rb', line 60 def as_bytes # compression as defined by Java implementation size = @centroids.size output = [VERBOSE_ENCODING, compression, size] output += @centroids.each_value.map(&:mean) output += @centroids.each_value.map(&:n) output.pack("NGNG#{size}N#{size}") end |
#as_json(_ = nil) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
rubocop:enable Metrics/AbcSize
99 100 101 |
# File 'lib/airbrake-ruby/tdigest.rb', line 99 def as_json(_ = nil) @centroids.each_value.map(&:as_json) end |
#as_small_bytes ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
rubocop:disable Metrics/AbcSize
70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 |
# File 'lib/airbrake-ruby/tdigest.rb', line 70 def as_small_bytes size = @centroids.size output = [self.class::SMALL_ENCODING, compression, size] x = 0 # delta encoding allows saving 4-bytes floats mean_arr = @centroids.each_value.map do |c| val = c.mean - x x = c.mean val end output += mean_arr # Variable length encoding of numbers c_arr = @centroids.each_value.with_object([]) do |c, arr| k = 0 n = c.n while n < 0 || n > 0x7f b = 0x80 | (0x7f & n) arr << b n = n >> 7 k += 1 raise 'Unreasonable large number' if k > 6 end arr << n end output += c_arr output.pack("NGNg#{size}C#{size}") end |
#bound_mean(x) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
103 104 105 106 107 |
# File 'lib/airbrake-ruby/tdigest.rb', line 103 def bound_mean(x) upper = @centroids.upper_bound(x) lower = @centroids.lower_bound(x) [lower[1], upper[1]] end |
#bound_mean_cumn(cumn) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 |
# File 'lib/airbrake-ruby/tdigest.rb', line 109 def bound_mean_cumn(cumn) last_c = nil bounds = [] @centroids.each_value do |v| if v.mean_cumn == cumn bounds << v break elsif v.mean_cumn > cumn bounds << last_c bounds << v break else last_c = v end end # If still no results, pick lagging value if any bounds << last_c if bounds.empty? && !last_c.nil? bounds end |
#compress! ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
130 131 132 133 134 135 136 |
# File 'lib/airbrake-ruby/tdigest.rb', line 130 def compress! points = to_a reset! push_centroid(points.shuffle) _cumulate(exact: true, force: true) nil end |
#compression ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
138 139 140 |
# File 'lib/airbrake-ruby/tdigest.rb', line 138 def compression 1 / @delta end |
#find_nearest(x) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
142 143 144 145 146 147 148 149 150 151 152 153 154 155 |
# File 'lib/airbrake-ruby/tdigest.rb', line 142 def find_nearest(x) return if size == 0 upper_key, upper = @centroids.upper_bound(x) lower_key, lower = @centroids.lower_bound(x) return lower unless upper_key return upper unless lower_key if (lower_key - x).abs < (upper_key - x).abs lower else upper end end |
#merge!(other) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
157 158 159 160 |
# File 'lib/airbrake-ruby/tdigest.rb', line 157 def merge!(other) push_centroid(other.centroids.values.shuffle) self end |
#p_rank(x) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
rubocop:disable Metrics/PerceivedComplexity, Metrics/AbcSize rubocop:disable Metrics/CyclomaticComplexity
164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 |
# File 'lib/airbrake-ruby/tdigest.rb', line 164 def p_rank(x) is_array = x.is_a? Array x = [x] unless is_array min = @centroids.first max = @centroids.last x.map! do |item| if size == 0 nil elsif item < min[1].mean 0.0 elsif item > max[1].mean 1.0 else _cumulate(exact: true) bound = bound_mean(item) lower, upper = bound mean_cumn = lower.mean_cumn if lower != upper mean_cumn += (item - lower.mean) * (upper.mean_cumn - lower.mean_cumn) \ / (upper.mean - lower.mean) end mean_cumn / @size end end is_array ? x : x.first end |
#percentile(p) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
rubocop:disable Metrics/PerceivedComplexity, Metrics/CyclomaticComplexity rubocop:disable Metrics/AbcSize
197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 |
# File 'lib/airbrake-ruby/tdigest.rb', line 197 def percentile(p) is_array = p.is_a? Array p = [p] unless is_array p.map! do |item| unless (0..1).cover?(item) raise ArgumentError, "p should be in [0,1], got #{item}" end if size == 0 nil else _cumulate(exact: true) h = @size * item lower, upper = bound_mean_cumn(h) if lower.nil? && upper.nil? nil elsif upper == lower || lower.nil? || upper.nil? (lower || upper).mean elsif h == lower.mean_cumn lower.mean else upper.mean end end end is_array ? p : p.first end |
#push(x, n = 1) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
rubocop:enable Metrics/PerceivedComplexity, Metrics/CyclomaticComplexity rubocop:enable Metrics/AbcSize
227 228 229 230 |
# File 'lib/airbrake-ruby/tdigest.rb', line 227 def push(x, n = 1) x = [x] unless x.is_a? Array x.each { |value| _digest(value, n) } end |
#push_centroid(c) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
232 233 234 235 |
# File 'lib/airbrake-ruby/tdigest.rb', line 232 def push_centroid(c) c = [c] unless c.is_a? Array c.each { |centroid| _digest(centroid.mean, centroid.n) } end |
#reset! ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
237 238 239 240 241 |
# File 'lib/airbrake-ruby/tdigest.rb', line 237 def reset! @centroids.clear @size = 0 @last_cumulate = 0 end |
#to_a ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
243 244 245 |
# File 'lib/airbrake-ruby/tdigest.rb', line 243 def to_a @centroids.each_value.to_a end |