Class: Dat::Analysis

Inherits:
Object
Defined in:
lib/dat/analysis.rb

Overview

Public: Analyze the findings of an Experiment

Implementors will typically subclass this class and override `#read`, `#count`, and `#cook` to suit the environment where `dat-science` is being used.

Example:

class AnalyzeThis < Dat::Analysis
  # Read a result out of our redis stash
  def read
    RedisHandle.rpop "scienceness.#{experiment_name}.results"
  end

  # Query our redis stash to see how many new results are pending
  def count
    RedisHandle.llen("scienceness.#{experiment_name}.results")
  end

  # Deserialize a JSON-encoded result from redis
  def cook(raw_result)
    return nil unless raw_result
    JSON.parse raw_result
  end
end

Defined Under Namespace

Classes: Library, Matcher, Registry, Result, Tally


Constructor Details

#initialize(experiment_name) ⇒ Analysis

Public: Create a new Dat::Analysis object. Will load any matcher and wrapper classes for this experiment if `#path` is non-nil.

experiment_name - The String naming the experiment to analyze.

Examples

analyzer = Dat::Analysis.new('bcrypt-passwords')
=> #<Dat::Analysis:...>


# File 'lib/dat/analysis.rb', line 55

def initialize(experiment_name)
  @experiment_name = experiment_name
  @wrappers = []

  load_classes unless path.nil? rescue nil
end

Instance Attribute Details

#current ⇒ Object (readonly) Also known as: result

Public: Returns the current science mismatch result



# File 'lib/dat/analysis.rb', line 33

def current
  @current
end

#experiment_name ⇒ Object (readonly)

Public: Returns the name of the experiment



# File 'lib/dat/analysis.rb', line 30

def experiment_name
  @experiment_name
end

#path ⇒ Object

Public: Gets/Sets the base path for loading matcher and wrapper classes.

Note that the base path will be appended with the experiment name
before searching for wrappers and matchers.


# File 'lib/dat/analysis.rb', line 44

def path
  @path
end

#raw ⇒ Object (readonly)

Public: Returns a raw (“un-cooked”) version of the current science mismatch result



# File 'lib/dat/analysis.rb', line 39

def raw
  @raw
end

Instance Method Details

#add(klass) ⇒ Object

Public: Add a matcher or wrapper class to this analyzer.

klass - a subclass of either Dat::Analysis::Matcher or Dat::Analysis::Result to be registered with this analyzer.

Returns the list of known matchers and wrappers for this analyzer.



# File 'lib/dat/analysis.rb', line 342

def add(klass)
  klass.add_to_analyzer(self)
end

#add_matcher(matcher_class) ⇒ Object

Internal: Add a matcher class to this analyzer’s registry. (Intended to be called only by Dat::Analysis::Matcher and subclasses)



# File 'lib/dat/analysis.rb', line 428

def add_matcher(matcher_class)
  puts "Loading matcher class [#{matcher_class}]"
  registry.add matcher_class
end

#add_wrapper(wrapper_class) ⇒ Object

Internal: Add a wrapper class to this analyzer’s registry. (Intended to be called only by Dat::Analysis::Result and its subclasses)



# File 'lib/dat/analysis.rb', line 435

def add_wrapper(wrapper_class)
  puts "Loading results wrapper class [#{wrapper_class}]"
  registry.add wrapper_class
end

#analyze ⇒ Object

Public: Fetch and summarize pending science mismatch results until an unrecognized result is found. Outputs summaries to STDOUT. May modify the current mismatch result.

Returns nil. Leaves current mismatch result set to first unknown result, if one is found.



# File 'lib/dat/analysis.rb', line 79

def analyze
  track do
    while true
      unless more?
        fetch # clear current result
        return summarize_unknown_result
      end

      fetch
      break if unknown?
      summarize
      count_as_seen identify
    end

    print "\n"
    summarize_unknown_result
  end
end

#cook(raw_result) ⇒ Object

Public: process a raw science mismatch result to make it usable in analysis. This is typically overridden by subclasses to do any sort of unmarshalling or deserialization required.

raw_result - a raw science mismatch result, typically as returned by `#read`

Returns a “cooked” science mismatch result.



# File 'lib/dat/analysis.rb', line 69

def cook(raw_result)
  raw_result
end

#count_as_seen(obj) ⇒ Object

Internal: Increment count for an object in an ongoing tally.

obj - an Object for which we are recording occurrence counts

Returns updated tally count for obj.



# File 'lib/dat/analysis.rb', line 391

def count_as_seen(obj)
  tally.count(obj.class.name || obj.class.inspect)
end
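
The `obj.class.name || obj.class.inspect` fallback matters because anonymous classes have no name. The sketch below illustrates this with a plain Hash standing in for the tally. `counts`, `PermissionMismatch`, and the anonymous class are hypothetical names used only for illustration; the internals of Dat::Analysis::Tally are not shown on this page.

```ruby
# Hypothetical stand-in for Dat::Analysis::Tally: a Hash defaulting to 0.
counts = Hash.new(0)

# Hypothetical matcher-like classes, one named and one anonymous.
class PermissionMismatch; end
anonymous = Class.new

[PermissionMismatch.new, PermissionMismatch.new, anonymous.new].each do |obj|
  # Same key expression as #count_as_seen: Class#name is nil for
  # anonymous classes, so inspect supplies a usable label instead.
  key = obj.class.name || obj.class.inspect
  counts[key] += 1
end

counts.each { |key, n| puts "#{key}: #{n}" }
```

The named class is tallied under its name, while the anonymous class is tallied under its `inspect` string.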

#experiment_files ⇒ Object

Internal: which class files are candidates for loading matchers and wrappers for this experiment?

Returns: sorted Array of paths to ruby files which may contain declarations of matcher and wrapper classes for this experiment.



# File 'lib/dat/analysis.rb', line 422

def experiment_files
  Dir[File.join(path, experiment_name, '*.rb')].sort
end
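
To make the lookup concrete, the following sketch builds a temporary directory laid out as `<path>/<experiment_name>/*.rb` and runs the same glob as the listing above. The directory and file names are hypothetical, chosen only for illustration.

```ruby
require 'tmpdir'
require 'fileutils'

found = Dir.mktmpdir do |base|
  # base plays the role of #path; 'bcrypt-passwords' is the experiment name.
  dir = File.join(base, 'bcrypt-passwords')
  FileUtils.mkdir_p(dir)

  # Two Ruby files and one non-Ruby file (hypothetical names).
  %w[zeta_matcher.rb alpha_wrapper.rb notes.txt].each do |name|
    File.write(File.join(dir, name), "# placeholder\n")
  end

  # Same expression as #experiment_files.
  files = Dir[File.join(base, 'bcrypt-passwords', '*.rb')].sort
  files.map { |f| File.basename(f) }
end

puts found.inspect
```

Only the `.rb` files directly inside the experiment directory are candidates, and they are loaded in sorted order.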

#fetch ⇒ Object

Public: retrieve a new science mismatch result, as returned by `#read`.

Returns nil if no new science mismatch results are available. Returns a cooked and wrapped science mismatch result if available. Raises NoMethodError if `#read` is not defined on this class.



# File 'lib/dat/analysis.rb', line 146

def fetch
  @identified = nil
  @raw = read
  @current = raw ? prepare(raw) : nil
end
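
The difference between `raw` and `current` after a fetch can be sketched without the library. `SketchAnalysis` below is a hypothetical stand-in, not a Dat::Analysis subclass: it reads from an in-memory Array, cooks with JSON, and omits the wrapping step.

```ruby
require 'json'

# Hypothetical stand-in mirroring the read/cook/fetch flow shown above.
class SketchAnalysis
  attr_reader :raw, :current

  def initialize(queue)
    @queue = queue
  end

  # Pop the next raw result; nil when the queue is empty.
  def read
    @queue.shift
  end

  # Deserialize a JSON-encoded raw result.
  def cook(raw_result)
    return nil unless raw_result
    JSON.parse(raw_result)
  end

  # Same shape as #fetch, minus wrapping.
  def fetch
    @raw = read
    @current = raw ? cook(raw) : nil
  end
end

analyzer = SketchAnalysis.new(['{"experiment":"bcrypt-passwords"}'])
first = analyzer.fetch
first_raw = analyzer.raw
second = analyzer.fetch

puts first_raw.class
puts first.class
puts second.inspect
```

After the first fetch, `raw` holds the String that was read and `current` holds the parsed Hash. Once the queue is empty, both are nil.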

#identify ⇒ Object

Public: Find a matcher which can identify the current science mismatch result.

Returns nil if current result is nil. Returns matcher class if a single matcher can identify current result. Returns false if no matcher can identify the current result. Raises RuntimeError if multiple matchers can identify the current result.



# File 'lib/dat/analysis.rb', line 191

def identify
  return @identified if @identified

  results = registry.identify(current)
  if results.size > 1
    report_multiple_matchers(results)
  end

  @identified = results.first
end
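
The three outcomes (no match, one match, several matches) can be sketched as follows. This is a simplified stand-in, not the library's implementation: the matcher classes are hypothetical, and the assumption that a matcher is built with a result and answers `#match?` is not confirmed by this page. In this sketch, the no-match case returns nil.

```ruby
# Hypothetical matchers: each wraps a result and answers #match?.
class TimeoutMatcher
  def initialize(result)
    @result = result
  end

  def match?
    @result['error'] == 'timeout'
  end
end

class AnyErrorMatcher
  def initialize(result)
    @result = result
  end

  def match?
    !@result['error'].nil?
  end
end

# Simplified identify: collect matching matchers, refuse ambiguity.
def sketch_identify(result, matcher_classes)
  hits = matcher_classes.map { |klass| klass.new(result) }.select(&:match?)
  raise 'Result cannot be uniquely identified.' if hits.size > 1
  hits.first
end

none = sketch_identify({ 'error' => nil }, [TimeoutMatcher, AnyErrorMatcher])
one  = sketch_identify({ 'error' => 'timeout' }, [TimeoutMatcher])

ambiguous = begin
  sketch_identify({ 'error' => 'timeout' }, [TimeoutMatcher, AnyErrorMatcher])
rescue RuntimeError => e
  e.message
end

puts none.inspect
puts one.class
puts ambiguous
```

Overlapping matchers are treated as an error rather than resolved by priority, so each matcher should be written to recognize a distinct kind of mismatch.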

#key_sort(keys) ⇒ Object



# File 'lib/dat/analysis.rb', line 317

def key_sort(keys)
  str_keys = keys.map {|k| k.to_s }
  (preferred_fields & str_keys) + (str_keys - preferred_fields)
end
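
Neither `#key_sort` nor `#preferred_fields` carries a doc comment, so here is a short worked example of the ordering they produce. The two methods are copied from the listings on this page as top-level methods; the input keys are hypothetical.

```ruby
def preferred_fields
  %w(id name title owner description login username)
end

def key_sort(keys)
  str_keys = keys.map { |k| k.to_s }
  # Preferred fields first (in preferred order), then the rest in input order.
  (preferred_fields & str_keys) + (str_keys - preferred_fields)
end

sorted = key_sort([:zeta, :name, :id, :alpha])
puts sorted.inspect
```

Keys are converted to Strings, preferred fields come first in the order given by `#preferred_fields`, and the remaining keys keep their original relative order rather than being sorted alphabetically.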

#library ⇒ Object

Internal: handle to the library, used for collecting newly discovered matcher and wrapper classes.

Returns: handle to the library class.



# File 'lib/dat/analysis.rb', line 406

def library
  Dat::Analysis::Library
end

#load_classes ⇒ Object

Public: Load matcher and wrapper classes from the library for our experiment.

Returns: a list of loaded matcher and wrapper classes.



# File 'lib/dat/analysis.rb', line 349

def load_classes
  new_classes = library.select_classes do
    experiment_files.each { |file| load file }
  end

  new_classes.map {|klass| add klass }
end

#matchers ⇒ Object

Public: Which matcher classes are known?

Returns: list of Dat::Analysis::Matcher classes known to this analyzer.



# File 'lib/dat/analysis.rb', line 325

def matchers
  registry.matchers
end

#more? ⇒ Boolean

Public: Are additional science mismatch results available?

Returns true if more results can be fetched. Returns false if no more results can be fetched.

Returns:

  • (Boolean)


# File 'lib/dat/analysis.rb', line 137

def more?
  count != 0
end

#preferred_fields ⇒ Object



# File 'lib/dat/analysis.rb', line 313

def preferred_fields
  %w(id name title owner description login username)
end

#prepare(raw_result) ⇒ Object

Internal: cook and wrap a raw science mismatch result.

raw_result - an unmodified result, typically as returned by `#read`

Returns the science mismatch result processed by `#cook` and then by `#wrap`.



# File 'lib/dat/analysis.rb', line 230

def prepare(raw_result)
  wrap(cook(raw_result))
end

#readable ⇒ Object

Internal: Return the default readable representation of the current science mismatch result. This method is typically overridden by subclasses or defined in matchers which wish to customize the readable representation of a science mismatch result. This implementation is provided as a default.

Returns a string containing a readable representation of the current science mismatch result.



# File 'lib/dat/analysis.rb', line 269

def readable
  synopsis = []

  synopsis << "Experiment %-20s first: %10s @ %s" % [
    "[#{current['experiment']}]", current['first'], current['timestamp']
  ]
  synopsis << "Duration:  control (%6.2f) | candidate (%6.2f)" % [
    current['control']['duration'], current['candidate']['duration']
  ]

  synopsis << ""

  if current['control']['exception']
    synopsis << "Control raised exception:\n\t#{current['control']['exception'].inspect}"
  else
    synopsis << "Control value:   [#{current['control']['value']}]"
  end

  if current['candidate']['exception']
    synopsis << "Candidate raised exception:\n\t#{current['candidate']['exception'].inspect}"
  else
    synopsis << "Candidate value: [#{current['candidate']['value']}]"
  end

  synopsis << ""

  remaining = current.keys - ['control', 'candidate', 'experiment', 'first', 'timestamp']
  remaining.sort.each do |key|
    if current[key].respond_to?(:keys)
      # do ordered sorting of hash keys
      subkeys = key_sort(current[key].keys)
      synopsis << "\t%15s => {" % [ key ]
      subkeys.each do |subkey|
        synopsis << "\t%15s       %15s => %-20s" % [ '', subkey, current[key][subkey].inspect ]
      end
      synopsis << "\t%15s    }" % [ '' ]
    else
      synopsis << "\t%15s => %-20s" % [ key, current[key] ]
    end
  end

  synopsis.join "\n"
end
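
The default implementation expects the cooked result to be Hash-like with string keys. The sketch below shows one such result and applies the same format strings as the listing; every field value is made up for illustration, and the `user` key is a hypothetical example of an extra field.

```ruby
# Hypothetical cooked result with the keys #readable looks up.
result = {
  'experiment' => 'bcrypt-passwords',
  'first'      => 'control',
  'timestamp'  => '2013-04-20T00:00:00Z',
  'control'    => { 'duration' => 1.5,  'value' => true },
  'candidate'  => { 'duration' => 2.25, 'value' => false },
  'user'       => { 'login' => 'octocat', 'id' => 1 }
}

# Header and duration lines, formatted as in #readable.
header = "Experiment %-20s first: %10s @ %s" % [
  "[#{result['experiment']}]", result['first'], result['timestamp']
]
durations = "Duration:  control (%6.2f) | candidate (%6.2f)" % [
  result['control']['duration'], result['candidate']['duration']
]

# Everything else is printed afterwards as "key => value" lines.
extra = result.keys - %w[control candidate experiment first timestamp]

puts header
puts durations
puts extra.inspect
```

Any key outside the five well-known ones (here only `user`) ends up in the trailing section, and nested hashes have their subkeys ordered by `#key_sort`.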

#registry ⇒ Object

Internal: registry of wrapper and matcher classes known to this analyzer.

Returns a (cached between calls) handle to our registry instance.



# File 'lib/dat/analysis.rb', line 413

def registry
  @registry ||= Dat::Analysis::Registry.new
end

#report_multiple_matchers(dupes) ⇒ Object

Internal: Output failure message about duplicate matchers for a science mismatch result.

dupes - Array of Dat::Analysis::Matcher instances, initialized with a result

Raises RuntimeError.



# File 'lib/dat/analysis.rb', line 208

def report_multiple_matchers(dupes)
  puts "\n\nMultiple matchers identified result:"
  puts

  dupes.each_with_index do |matcher, i|
    print " #{i+1}. "
    if matcher.respond_to?(:readable)
      puts matcher.readable
    else
      puts readable
    end
  end

  puts
  raise "Result cannot be uniquely identified."
end

#skip(&block) ⇒ Object

Public: skip pending mismatch results not satisfying the provided block. May modify current mismatch result.

&block - block accepting a prepared mismatch result and returning true or false.

Examples:

skip do |result|
  result.user.staff?
end

skip do |result|
  result['group']['id'] > 100 && result['url'] =~ %r{/admin}
end

skip do |result|
  # `1.hour.ago` requires ActiveSupport; compare integers on both sides
  result['timestamp'].to_i > 1.hour.ago.to_i
end

Returns nil if no satisfying results are found; the current result will be nil. Returns the count of remaining results if a satisfying result is found, and leaves the current result set to the first result for which the block returns a truthy value.

Raises:

  • (ArgumentError)


# File 'lib/dat/analysis.rb', line 121

def skip(&block)
  raise ArgumentError, "a block is required" unless block_given?

  while more?
    fetch
    return count if yield(current)
  end

  # clear current result since nothing of interest was found.
  @current = @identified = nil
end
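
The return values can be seen in a small stand-in that keeps the same loop but reads from an in-memory Array. `SketchSkipper` and its sample results are hypothetical and exist only for illustration; wrapping and identification are left out.

```ruby
# Hypothetical stand-in with the same skip loop as the listing above.
class SketchSkipper
  attr_reader :current

  def initialize(results)
    @pending = results.dup
  end

  def count
    @pending.size
  end

  def more?
    count != 0
  end

  def fetch
    @current = @pending.shift
  end

  def skip
    raise ArgumentError, 'a block is required' unless block_given?

    while more?
      fetch
      return count if yield(current)
    end

    # Nothing of interest was found, so clear the current result.
    @current = nil
  end
end

skipper = SketchSkipper.new([{ 'id' => 1 }, { 'id' => 250 }, { 'id' => 3 }])

remaining = skipper.skip { |r| r['id'] > 100 }
stopped_at = skipper.current

exhausted = skipper.skip { |r| r['id'] > 1000 }

puts remaining
puts stopped_at.inspect
puts exhausted.inspect
```

Results fetched before the match are consumed, not put back. Note also that when the match is the last pending result, the returned count is 0, which is still truthy in Ruby, so a nil check distinguishes "found" from "not found".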

#summarize ⇒ Object

Public: Print a readable summary for the current science mismatch result to STDOUT.

Returns nil.



# File 'lib/dat/analysis.rb', line 170

def summarize
  puts summary
end

#summarize_unknown_result ⇒ Object

Internal: Print to STDOUT a readable summary of the current (unknown) science mismatch result, as well as a summary of the tally of identified science mismatch results analyzed to this point.

Returns nil if there are no pending science mismatch results. Returns the number of pending science mismatch results.



# File 'lib/dat/analysis.rb', line 363

def summarize_unknown_result
  tally.summarize
  if current
    puts "\nFirst unidentifiable result:\n\n"
    summarize
  else
    puts "\nNo unidentifiable results found. \\m/\n"
  end

  more? ? count : nil
end

#summary ⇒ Object

Public: Return a readable representation of the current science mismatch result. This will utilize the `#readable` methods declared on a matcher which identifies the current result.

Returns a string containing a readable representation of the current science mismatch result. Returns nil if there is no current result.



# File 'lib/dat/analysis.rb', line 159

def summary
  return nil unless current
  recognizer = identify
  return readable unless recognizer && recognizer.respond_to?(:readable)
  recognizer.readable
end

#tally ⇒ Object

Internal: The current Tally instance. Cached between calls to `#track`.

Returns the current Tally instance object.



# File 'lib/dat/analysis.rb', line 398

def tally
  @tally ||= Tally.new
end

#track(&block) ⇒ Object

Internal: keep a tally of analyzed science mismatch results.

&block: block which will presumably call `#count_as_seen` to update tallies of identified science mismatch results.

Returns: value returned by &block.



# File 'lib/dat/analysis.rb', line 381

def track(&block)
  @tally = Tally.new
  yield
end

#unknown? ⇒ Boolean

Public: Is the current science mismatch result unidentifiable?

Returns nil if current result is nil. Returns true if no matcher can identify current result. Returns false if a single matcher can identify the current result. Raises RuntimeError if multiple matchers can identify the current result.

Returns:

  • (Boolean)


# File 'lib/dat/analysis.rb', line 180

def unknown?
  return nil if current.nil?
  !identify
end

#wrap(cooked_result) ⇒ Object

Internal: wrap a “cooked” science mismatch result with any known wrapper methods

cooked_result - a “cooked” mismatch result, as returned by `#cook`

Returns the cooked science mismatch result, which will now respond to any instance methods found on our known wrapper classes



# File 'lib/dat/analysis.rb', line 240

def wrap(cooked_result)
  cooked_result.extend Dat::Analysis::Result::DefaultMethods

  if !wrappers.empty?
    cooked_result.send(:instance_variable_set, '@analyzer', self)

    class << cooked_result
      define_method(:method_missing) do |meth, *args|
        found = nil
        @analyzer.wrappers.each do |wrapper|
          next unless wrapper.public_instance_methods.detect {|m| m.to_s == meth.to_s }
          found = wrapper.new(self).send(meth, *args)
          break
        end
        found
      end
    end
  end

  cooked_result
end
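
The delegation above can be sketched in isolation. The version below is a simplification under stated assumptions: `UserWrapper` and `wrap_with` are hypothetical names, the wrapper list is passed in directly instead of coming from the registry, and `Dat::Analysis::Result::DefaultMethods` is not mixed in.

```ruby
# Hypothetical wrapper: built with a result, adds a convenience reader.
class UserWrapper
  def initialize(result)
    @result = result
  end

  def user_login
    @result['user']['login']
  end
end

# Simplified version of the method_missing hook installed by #wrap:
# calls the result does not understand are tried on each wrapper class.
def wrap_with(result, wrappers)
  result.define_singleton_method(:method_missing) do |meth, *args|
    wrapper = wrappers.detect { |w| w.public_instance_methods.include?(meth) }
    # As in the listing, an unmatched method yields nil instead of raising.
    wrapper ? wrapper.new(self).send(meth, *args) : nil
  end
  result
end

result = wrap_with({ 'user' => { 'login' => 'octocat' } }, [UserWrapper])

login   = result.user_login
missing = result.no_such_method

puts login
puts missing.inspect
```

The result is still the original Hash, so `result['user']` keeps working. One consequence of this design is that a misspelled method name returns nil silently rather than raising NoMethodError.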

#wrappers ⇒ Object

Public: Which wrapper classes are known?

Returns: list of Dat::Analysis::Result classes known to this analyzer.



# File 'lib/dat/analysis.rb', line 332

def wrappers
  registry.wrappers
end