Class: Stamina::Abbadingo::RandomSample::StringEnumerator
- Inherits:
-
Object
- Object
- Stamina::Abbadingo::RandomSample::StringEnumerator
- Includes:
- Enumerable
- Defined in:
- lib/stamina-induction/stamina/abbadingo/random_sample.rb
Overview
Implements an enumerator for binary strings whose length lies between 0 and max_length (passed at construction).
The enumerator guarantees that strings are sampled with an uniform distribution among all available. As the number of strings of a given length is an exponential function, this means that you’ve got 50% change of having a string of length max_length, 25% of max_length - 1, 12.5% of max_length - 2 and so on.
How to use it?
# create for strings between 0 and 10 symbols, inclusive
enum = Stamina::Abbadingo::StringEnumerator.new(10)
# this is how to generate strings while a predicate is true
enum.each do |s|
# s is an array of binary integer symbols (0 or 1)
# true for continuing, false otherwise
(true || false)
end
# this is how to generate a fixed number of strings
(1..1000).collect{ enum.one }
How does it work? Well, the distribution of strings is as follows:
length [n]b_strings [c]umul log2(n) log2(c) log2(c).floor
(2**n) 2**(n+1)-1
0 1 1 0.0000000000 0.000000 0
1 2 3 1.0000000000 1.584963 1
2 4 7 2.0000000000 2.807355 2
3 8 15 3.0000000000 3.906891 3
4 16 31 4.0000000000 4.954196 4
5 32 63 5.0000000000 5.977280 5
where cumul is the total number of string upto length symbols.
Therefore, the idea is to see each string has an identifier, say x, between 1 and 2**(max_length+1)-1 (see max).
* The length of the _x_th string is log2(x).floor (see length_for)
* The string itself is the binary decomposition of x, up to length_for(x)
symbols (see string_for)
As those identifiers naturally respect the exponential distribution, sampling the strings is the same as taking string_for(x) for random x upto max.
Instance Attribute Summary collapse
-
#max_length ⇒ Object
readonly
Maximal length of a string.
Instance Method Summary collapse
-
#each ⇒ Object
Yields the block with a random string, until the block return false or nil.
-
#initialize(max_length = 16) ⇒ StringEnumerator
constructor
A new instance of StringEnumerator.
-
#length_for(x) ⇒ Object
Returns the length of the string whose identifier is x (> 0).
-
#max ⇒ Object
Returns the maximum identifier, which is also the number of strings up to max_length symbols.
-
#one ⇒ Object
Generates a string at random.
-
#string_for(x) ⇒ Object
Returns the binary string whose identifier is x (> 0).
Constructor Details
#initialize(max_length = 16) ⇒ StringEnumerator
Returns a new instance of StringEnumerator.
61 62 63 |
# File 'lib/stamina-induction/stamina/abbadingo/random_sample.rb', line 61 def initialize(max_length = 16) @max_length = max_length end |
Instance Attribute Details
#max_length ⇒ Object (readonly)
Maximal length of a string
59 60 61 |
# File 'lib/stamina-induction/stamina/abbadingo/random_sample.rb', line 59 def max_length @max_length end |
Instance Method Details
#each ⇒ Object
Yields the block with a random string, until the block return false or nil.
99 100 101 102 103 |
# File 'lib/stamina-induction/stamina/abbadingo/random_sample.rb', line 99 def each begin cont = yield(one) end while cont end |
#length_for(x) ⇒ Object
Returns the length of the string whose identifier is x (> 0)
68 69 70 |
# File 'lib/stamina-induction/stamina/abbadingo/random_sample.rb', line 68 def length_for(x) Math.log2(x).floor end |
#max ⇒ Object
Returns the maximum identifier, which is also the number of strings up to max_length symbols
84 85 86 |
# File 'lib/stamina-induction/stamina/abbadingo/random_sample.rb', line 84 def max @max ||= 2 ** (max_length+1) - 1 end |
#one ⇒ Object
Generates a string at random
91 92 93 |
# File 'lib/stamina-induction/stamina/abbadingo/random_sample.rb', line 91 def one string_for(1+Kernel.rand(max)) end |
#string_for(x) ⇒ Object
Returns the binary string whose identifier is x (> 0)
75 76 77 78 |
# File 'lib/stamina-induction/stamina/abbadingo/random_sample.rb', line 75 def string_for(x) length = length_for(x) (0..length-1).collect{|i| ((x >> i) % 2).to_s} end |