Class: Stamina::Abbadingo::RandomSample
- Inherits: Object
- Defined in: lib/stamina-induction/stamina/abbadingo/random_sample.rb
Overview
Generates a random Sample using the Abbadingo protocol.
Defined Under Namespace
Classes: StringEnumerator
Class Method Summary
- .execute(classifier, max_length = classifier.depth + 3) ⇒ Object
  Generates Sample instances from nb strings randomly sampled with a uniform distribution over all strings up to max_length symbols long.
Class Method Details
.execute(classifier, max_length = classifier.depth + 3) ⇒ Object
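Generates Sample instances from nb strings randomly sampled with a uniform distribution over all strings up to max_length symbols long. As the source shows, the method returns a two-element array: the training Sample followed by the test Sample.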
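To make "a uniform distribution over all strings" concrete, here is a minimal sketch. It assumes a binary alphabet ("0" and "1", as in Abbadingo-style benchmarks) and assumes the description refers to strings of length at most max_length. The helpers `binary_string_at` and `uniform_binary_string` are hypothetical names for illustration; they are not part of Stamina and do not reproduce how StringEnumerator works internally.

```ruby
# Hypothetical helper (not part of Stamina): maps an index to a binary
# string, enumerating strings by increasing length, then by binary value.
# Index 0 is the empty string, 1 is "0", 2 is "1", 3 is "00", and so on.
def binary_string_at(index)
  length = 0
  # Skip over the 2**length strings of each shorter length.
  while index >= 2**length
    index -= 2**length
    length += 1
  end
  length.zero? ? "" : index.to_s(2).rjust(length, "0")
end

# Hypothetical helper: draws one string uniformly among all binary
# strings whose length is at most max_length.
def uniform_binary_string(max_length, rng = Random.new)
  total = 2**(max_length + 1) - 1 # 1 + 2 + 4 + ... + 2**max_length
  binary_string_at(rng.rand(total))
end
```

Because every index is equally likely, longer strings dominate the draw simply because there are more of them: with max_length = 2, four of the seven candidates have length 2.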
    # File 'lib/stamina-induction/stamina/abbadingo/random_sample.rb', line 111

    def self.execute(classifier, max_length = classifier.depth + 3)
      enum = StringEnumerator.new(max_length)

      # We generate 1800 strings for the test set plus n^2/2 strings for
      # the training set. If there are no enough strings available, we generate
      # the maximum we can
      seen = {}
      nb = Math.min(1800 + (classifier.state_count**2), enum.max)

      # Let's go now
      enum.each do |s|
        seen[s] = true
        seen.size < nb
      end

      # Make them
      strings = seen.keys.collect{|s| InputString.new(s, classifier.accepts?(s))}
      pos, neg = strings.partition{|s| s.positive?}

      # Split them, 1800 in test and the rest in training set
      if (pos.size > 900) && (neg.size > 900)
        pos_test, pos_training = pos[0...900], pos[900..-1]
        neg_test, neg_training = neg[0...900], neg[900..-1]
      else
        pos_test, pos_training = pos.partition{|s| Kernel.rand < 0.5}
        neg_test, neg_training = neg.partition{|s| Kernel.rand < 0.5}
      end

      flusher = lambda{|x,y| Kernel.rand < 0.5 ? 1 : -1}
      training = (pos_training + neg_training).sort &flusher
      test = (pos_test + neg_test).sort &flusher
      [Sample.new(training), Sample.new(test)]
    end
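The sizing and splitting steps of the listing can be tried in isolation with plain arrays. The sketch below is a stand-in, not the library's code: `requested_size` and `split_samples` are hypothetical names, `quota` plays the role of the 900 strings per class reserved for the test set, and `Array#shuffle` is swapped in for the listing's sort with a random comparator. The size formula follows the code (state_count squared), not the comment's n^2/2.

```ruby
# Hypothetical helper: how many distinct strings to collect, capped by
# the number of strings that are actually available.
def requested_size(state_count, available, test_size = 1800)
  [test_size + state_count**2, available].min
end

# Hypothetical helper mirroring the split step. Returns [training, test].
def split_samples(pos, neg, quota, rng = Random.new)
  if pos.size > quota && neg.size > quota
    # Enough of both classes: the first `quota` of each go to the test set.
    pos_test, pos_training = pos[0...quota], pos[quota..-1]
    neg_test, neg_training = neg[0...quota], neg[quota..-1]
  else
    # Otherwise each string lands in the test set with probability 1/2.
    pos_test, pos_training = pos.partition { rng.rand < 0.5 }
    neg_test, neg_training = neg.partition { rng.rand < 0.5 }
  end
  # Mix positive and negative strings (shuffle replaces the random sort).
  training = (pos_training + neg_training).shuffle(random: rng)
  test     = (pos_test + neg_test).shuffle(random: rng)
  [training, test]
end
```

For example, `split_samples(%w[p1 p2 p3], %w[n1 n2 n3], 2)` puts two positive and two negative strings in the test set and leaves one of each for training. With a quota of 3 or more, the same call falls back to the random 50/50 partition.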