Class: Chars::CharSet

Inherits:
Set
  • Object
Defined in:
lib/chars/char_set.rb

Constructor Details

#initialize(*arguments) ⇒ CharSet

Creates a new CharSet object.

Parameters:

  • arguments (Array<String, Integer, Enumerable>)

    The chars for the CharSet.

Raises:

  • (TypeError)

    One of the arguments was not a String, Integer or Enumerable.



# File 'lib/chars/char_set.rb', line 17

def initialize(*arguments)
  super()

  @chars = Hash.new do |hash,key|
    hash[key] = if key > 0xff
                  key.chr(Encoding::UTF_8)
                else
                  key.chr(Encoding::ASCII_8BIT)
                end
  end

  arguments.each do |subset|
    case subset
    when String, Integer
      self << subset
    when Enumerable
      subset.each { |char| self << char }
    else
      raise(TypeError,"arguments must be a String, Integer or Enumerable")
    end
  end
end
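
The Hash default block used by the constructor can be tried on its own. The following is a minimal sketch using only core Ruby; the variable name byte_to_char is illustrative and is not part of the library.

```ruby
# A Hash whose default block lazily converts an Integer into a
# one-character String and caches the result under that key.
byte_to_char = Hash.new do |hash, key|
  hash[key] = if key > 0xff
                key.chr(Encoding::UTF_8)      # multi-byte code points
              else
                key.chr(Encoding::ASCII_8BIT) # single raw bytes
              end
end

ascii = byte_to_char[0x41]   # the byte for "A"
wide  = byte_to_char[0x3bb]  # the code point for the Greek letter lambda

puts ascii.encoding  # the single byte is tagged as binary
puts wide.encoding   # the wide code point is tagged as UTF-8
```

Because the block assigns into the Hash, each value is computed once and then served from the cache on later lookups.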

Class Method Details

.[](*arguments) ⇒ CharSet

Creates a new Chars::CharSet.

Parameters:

  • arguments (Array<String, Integer, Enumerable>)

    The chars for the CharSet.

Returns:

  • (CharSet)

    The new character set.

See Also:

Since:

  • 0.2.1



# File 'lib/chars/char_set.rb', line 63

def self.[](*arguments)
  new(*arguments)
end

Instance Method Details

#<<(other) ⇒ CharSet

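Adds the characters of a String, or a single byte, to the set.

Parameters:

  • other (String, Integer)

    The characters, or the single byte, to add to the set.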

Returns:

  • (CharSet)

    The modified character set.

Raises:

  • (TypeError)

    The argument was not a String or Integer.

Since:

  • 0.2.1



# File 'lib/chars/char_set.rb', line 81

def <<(other)
  case other
  when String
    other.each_char do |char|
      byte = char.ord

      @chars[byte] = char
      super(byte)
    end

    return self
  when Integer
    super(other)
  else
    raise(TypeError,"can only append Strings and Integers")
  end
end
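
The dispatch on argument type can be reproduced on a plain Set subclass. This is a hedged sketch: TinyCharSet is a hypothetical stand-in written for illustration, and it omits the character cache that the real method maintains.

```ruby
require 'set'

# TinyCharSet is a hypothetical stand-in used only for this sketch.
class TinyCharSet < Set
  # Stores the code point of every character in a String, or a single Integer.
  def <<(other)
    case other
    when String
      other.each_char { |char| super(char.ord) }
      self
    when Integer
      super(other)
    else
      raise(TypeError, "can only append Strings and Integers")
    end
  end
end

set = TinyCharSet.new
set << "ab" << 0x63          # both branches return the set, so calls chain

begin
  set << :d                  # a Symbol is neither a String nor an Integer
rescue TypeError => error
  rejected = error.message
end
```

Returning the set from both branches is what allows several appends to be chained in one expression.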

#===(other) ⇒ Boolean Also known as: =~

Tests whether every character of a given String, or every element of a given Enumerable, is included in the Chars::CharSet.

Examples:

Chars.alpha === "hello"
# => true

Parameters:

  • other (String, Enumerable)

    The String, or the collection of characters and bytes, to test.

Returns:

  • (Boolean)

    Specifies whether all of the bytes within the given string are included in the Chars::CharSet.



# File 'lib/chars/char_set.rb', line 684

def ===(other)
  case other
  when String
    other.each_char.all? { |char| include_char?(char) }
  when Enumerable
    other.all? do |element|
      case element
      when String
        include_char?(element)
      when Integer
        include_byte?(element)
      end
    end
  else
    false
  end
end
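
Defining === matters mostly because Ruby's case statement calls it on each when clause. The sketch below illustrates that with a hypothetical LetterMatcher class; it handles only Strings and is not the library's implementation.

```ruby
require 'set'

# LetterMatcher is a hypothetical stand-in, not the library's class.
class LetterMatcher
  def initialize(chars)
    @chars = Set.new(chars)
  end

  # True only when other is a String whose characters are all in the set.
  def ===(other)
    other.is_a?(String) && other.each_char.all? { |char| @chars.include?(char) }
  end
end

lower = LetterMatcher.new('a'..'z')

classify = lambda do |input|
  case input
  when lower then :lowercase   # Ruby evaluates lower === input here
  else            :other
  end
end

results = ["hello", "Hello", 42].map(&classify)
```

A non-String input falls through to the else branch, which mirrors the false result of the method above for unsupported types.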

#chars ⇒ Array<String>

The characters within the Chars::CharSet.

Returns:

  • (Array<String>)

    The characters within the Chars::CharSet.



# File 'lib/chars/char_set.rb', line 129

def chars
  map { |byte| @chars[byte] }
end

#each_char {|char| ... } ⇒ Enumerator

Iterates over every character within the Chars::CharSet.

Yields:

  • (char)

    If a block is given, it will be passed each character in the Chars::CharSet.

Yield Parameters:

  • char (String)

    A character from the Chars::CharSet.

Returns:

  • (Enumerator)

    If no block is given, an enumerator object will be returned.



# File 'lib/chars/char_set.rb', line 146

def each_char
  return enum_for(__method__) unless block_given?

  each { |byte| yield @chars[byte] }
end
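
The enum_for(__method__) guard is what lets the same method serve both the block form and the Enumerator form. A minimal sketch, assuming a hypothetical ByteBag container in place of the real class:

```ruby
# ByteBag is a hypothetical container used only for this sketch.
class ByteBag
  def initialize(*bytes)
    @bytes = bytes
  end

  def each(&block)
    @bytes.each(&block)
  end

  # Yields each byte as a one-character String, or returns an
  # Enumerator over those Strings when called without a block.
  def each_char
    return enum_for(__method__) unless block_given?

    each { |byte| yield byte.chr }
  end
end

bag = ByteBag.new(0x68, 0x69)

collected = []
bag.each_char { |char| collected << char }    # block form

indexed = bag.each_char.with_index.to_a       # Enumerator form, chained
```

Without a block the caller gets an Enumerator, so methods such as with_index, map, or first can be chained onto the call.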

#each_random_byte(n, **kwargs) {|byte| ... } ⇒ Enumerator

Pass random bytes to a given block.

Parameters:

  • n (Integer)

    Specifies how many times to pass a random byte to the block.

  • kwargs (Hash{Symbol => Object})

    Additional keyword arguments.

Options Hash (**kwargs):

  • :random (Random, SecureRandom)

    The random number generator to use.

Yields:

  • (byte)

    The block will receive the random bytes.

Yield Parameters:

  • byte (Integer)

    A random byte from the Chars::CharSet.

Returns:

  • (Enumerator)

    If no block is given, an enumerator object will be returned.



# File 'lib/chars/char_set.rb', line 236

def each_random_byte(n,**kwargs,&block)
  return enum_for(__method__,n,**kwargs) unless block_given?

  n.times do
    yield random_byte(**kwargs)
  end
  return nil
end

#each_random_char(n, **kwargs) {|char| ... } ⇒ Enumerator

Pass random characters to a given block.

Parameters:

  • n (Integer)

    Specifies how many times to pass a random character to the block.

  • kwargs (Hash{Symbol => Object})

    Additional keyword arguments.

Options Hash (**kwargs):

  • :random (Random, SecureRandom)

    The random number generator to use.

Yields:

  • (char)

    The block will receive the random characters.

Yield Parameters:

  • char (String)

    A random character from the Chars::CharSet.

Returns:

  • (Enumerator)

    If no block is given, an enumerator object will be returned.



# File 'lib/chars/char_set.rb', line 266

def each_random_char(n,**kwargs,&block)
  return enum_for(__method__,n,**kwargs) unless block_given?

  each_random_byte(n,**kwargs) do |byte|
    yield @chars[byte]
  end
end

#each_string_of_length(length) {|string| ... } ⇒ Enumerator

Enumerates every possible string of the given length that can be built from the characters of the Chars::CharSet.

Parameters:

  • length (Range, Array, Integer)

    The desired length(s) of each string.

Yields:

  • (string)

    The given block will be passed each sequential string.

Yield Parameters:

  • string (String)

    A string of the given length made up of characters from the Chars::CharSet.

Returns:

  • (Enumerator)

    If no block is given, an Enumerator will be returned.

Since:

  • 0.3.0



# File 'lib/chars/char_set.rb', line 623

def each_string_of_length(length,&block)
  return enum_for(__method__,length) unless block

  case length
  when Range, Array
    length.each do |len|
      StringEnumerator.new(self,len).each(&block)
    end
  else
    StringEnumerator.new(self,length).each(&block)
  end
end
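
To get a feel for what such an enumeration produces, here is a hedged sketch that swaps in Array#repeated_permutation for the library's StringEnumerator, whose implementation is not shown in this section. The helper name strings_of is hypothetical, and unlike the method above it builds the whole list eagerly.

```ruby
# Builds every string of the given length (or lengths) from the given
# characters. strings_of is a hypothetical helper for illustration only.
def strings_of(chars, length)
  lengths = case length
            when Range, Array then length.to_a
            else                   [length]
            end

  lengths.flat_map do |len|
    # Every ordered selection of len characters, with repetition allowed.
    chars.repeated_permutation(len).map(&:join)
  end
end

pairs = strings_of(%w[a b], 2)
mixed = strings_of(%w[a b], 1..2)
```

The number of strings is the set size raised to the length, so even small character sets grow quickly; an Enumerator that yields one string at a time avoids holding them all in memory.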

#each_substring(data, **kwargs) ⇒ Enumerator

Enumerates over all substrings of the given string that are at least the minimum length and are made up of characters from the Chars::CharSet.

Parameters:

  • data (String)

    The data to find sub-strings within.

Options Hash (**kwargs):

  • :min_length (Integer)

    The minimum length of sub-strings found within the given data.

Returns:

  • (Enumerator)

    If no block is given, an Enumerator object will be returned.

See Also:

Since:

  • 0.3.0



# File 'lib/chars/char_set.rb', line 519

def each_substring(data,**kwargs)
  return enum_for(__method__,data,**kwargs) unless block_given?

  each_substring_with_index(data,**kwargs) do |substring,index|
    yield substring
  end
end

#each_substring_with_index(data, min_length: 4) {|match, index| ... } ⇒ Enumerator

Enumerates over all substrings of the given string, together with their indices, that are at least the minimum length and are made up of characters from the Chars::CharSet.

Parameters:

  • data (String)

    The data to find sub-strings within.

  • min_length (Integer) (defaults to: 4)

    The minimum length of sub-strings found within the given data.

Yields:

  • (match, index)

    The given block will be passed every matched sub-string and its index.

Yield Parameters:

  • match (String)

    A sub-string containing the characters from the Chars::CharSet.

  • index (Integer)

    The index the sub-string was found at.

Returns:

  • (Enumerator)

    If no block is given, an Enumerator object will be returned.

Since:

  • 0.3.0



# File 'lib/chars/char_set.rb', line 433

def each_substring_with_index(data, min_length: 4)
  unless block_given?
    return enum_for(__method__,data, min_length: min_length)
  end

  return if data.size < min_length

  index = 0

  match_start = nil
  match_end   = nil

  while index < data.size
    unless match_start
      if self.include_char?(data[index])
        match_start = index
      end
    else
      unless self.include_char?(data[index])
        match_end    = index
        match_length = (match_end - match_start)

        if match_length >= min_length
          match = data[match_start,match_length]

          yield match, match_start
        end

        match_start = match_end = nil
      end
    end

    index += 1
  end

  # yield the remaining match, if it is long enough
  if match_start
    match_length = (data.size - match_start)

    if match_length >= min_length
      yield data[match_start,match_length], match_start
    end
  end
end
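
The scan is essentially what the Unix strings utility does: track where a run of allowed characters starts and emit it once it ends. The following is a compact sketch of the same idea using only a plain Set; each_run is a hypothetical helper, not part of the library. In this sketch a run that reaches the end of the data is held to the same minimum length as any other run.

```ruby
require 'set'

# each_run is a hypothetical helper, not part of the library.
# Yields every run of allowed characters that is at least min_length
# long, together with the index at which the run starts.
def each_run(data, allowed, min_length: 4)
  start = nil

  data.each_char.with_index do |char, index|
    if allowed.include?(char)
      start ||= index
    elsif start
      length = index - start
      yield data[start, length], start if length >= min_length
      start = nil
    end
  end

  # A run that reaches the end of the data gets the same length check.
  if start && (data.size - start) >= min_length
    yield data[start, data.size - start], start
  end
end

allowed = Set.new(('a'..'z').to_a + ('A'..'Z').to_a)
data    = "\x00\x01hello\x02ab\x03World"

runs = []
each_run(data, allowed) { |match, index| runs << [match, index] }
# runs now holds [["hello", 2], ["World", 11]]; "ab" is too short
```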

#include_char?(char) ⇒ Boolean

Determines if a character is contained within the Chars::CharSet.

Parameters:

  • char (String)

    The character to search for.

Returns:

  • (Boolean)

    Specifies whether the character is contained within the Chars::CharSet.



# File 'lib/chars/char_set.rb', line 115

def include_char?(char)
  unless char.empty?
    @chars.has_value?(char) || include_byte?(char.ord)
  else
    false
  end
end

#initialize_copy(other) ⇒ Object

Initializes the copy of another Chars::CharSet object.

Parameters:

  • other (CharSet)

    The Chars::CharSet being copied.



# File 'lib/chars/char_set.rb', line 46

def initialize_copy(other)
  @chars = other.instance_variable_get('@chars').dup
end

#inspect ⇒ String

Inspects the Chars::CharSet.

Returns:

  • (String)

    The inspected Chars::CharSet.



# File 'lib/chars/char_set.rb', line 710

def inspect
  "#<#{self.class.name}: {" + map { |byte|
    case byte
    when (0x07..0x0d), (0x20..0x7e)
      @chars[byte].dump
    when 0x00
      # sly hack to make char-sets more friendly
      # to us C programmers
      '"\0"'
    else
      sprintf("0x%02x",byte)
    end
  }.join(', ') + "}>"
end

#map_chars {|char| ... } ⇒ Array<String>

Maps the characters of the Chars::CharSet.

Yields:

  • (char)

    The given block will be used to transform the characters within the Chars::CharSet.

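Yield Parameters:

  • char (String)

    The character to transform.

Returns:

  • (Array<String>)

    The mapped characters of the Chars::CharSet.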



# File 'lib/chars/char_set.rb', line 182

def map_chars(&block)
  each_char.map(&block)
end

#random_byte(random: Random) ⇒ Integer

Returns a random byte from the Chars::CharSet.

Parameters:

  • random (Random, SecureRandom) (defaults to: Random)

    The random number generator to use.

Returns:

  • (Integer)

    A random byte value.



# File 'lib/chars/char_set.rb', line 195

def random_byte(random: Random)
  self.entries[random.rand(self.length)]
end
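
Accepting the generator as a keyword argument makes the selection reproducible when a seeded Random instance is passed in. A minimal sketch of the same indexing trick on a plain Array; pick_byte and the byte range are illustrative assumptions, not library names.

```ruby
bytes = (0x61..0x7a).to_a   # the 26 bytes for 'a'..'z'

# Same indexing trick as above: pick a random position in the list.
pick_byte = lambda do |random: Random|
  bytes[random.rand(bytes.length)]
end

default_pick = pick_byte.call   # falls back to the Random class itself

# Two generators created with the same seed produce the same picks.
rng_a = Random.new(42)
rng_b = Random.new(42)
run_a = Array.new(8) { pick_byte.call(random: rng_a) }
run_b = Array.new(8) { pick_byte.call(random: rng_b) }
```

Any object that responds to rand with an Integer argument can be passed, which is why both the Random class and a Random instance work here.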

#random_bytes(length, random: Random) ⇒ Array<Integer>

Creates an Array of random bytes from the Chars::CharSet.

Parameters:

  • length (Integer, Array, Range)

    The length of the Array of random bytes.

  • random (Random, SecureRandom) (defaults to: Random)

    The random number generator to use.

Returns:

  • (Array<Integer>)

    The randomly selected bytes.



# File 'lib/chars/char_set.rb', line 286

def random_bytes(length, random: Random)
  case length
  when Array
    Array.new(length.sample(random: random)) do
      random_byte(random: random)
    end
  when Range
    Array.new(random.rand(length)) do
      random_byte(random: random)
    end
  else
    Array.new(length) { random_byte(random: random) }
  end
end
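
The three branches differ only in how the final length is chosen. The sketch below isolates that decision in a hypothetical pick_length helper, which is not a library method.

```ruby
# pick_length is a hypothetical helper that isolates the case statement.
def pick_length(length, random: Random)
  case length
  when Array then length.sample(random: random)  # one of the listed lengths
  when Range then random.rand(length)            # any length inside the range
  else            length                         # a fixed length
  end
end

rng = Random.new(1)

fixed      = pick_length(5, random: rng)
from_array = pick_length([2, 4, 8], random: rng)
from_range = pick_length(3..6, random: rng)
```

Passing an Array or Range is a convenient way to generate output of varying length, for example when producing test data.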

#random_char(**kwargs) ⇒ String

Returns a random character from the Chars::CharSet.

Parameters:

  • kwargs (Hash{Symbol => Object})

    Additional keyword arguments.

Options Hash (**kwargs):

  • :random (Random, SecureRandom)

    The random number generator to use.

Returns:

  • (String)

    A random char value.



# File 'lib/chars/char_set.rb', line 211

def random_char(**kwargs)
  @chars[random_byte(**kwargs)]
end

#random_chars(length, **kwargs) ⇒ Array<String>

Creates an Array of random characters from the Chars::CharSet.

Parameters:

  • length (Integer, Array, Range)

    The length of the Array of random characters.

  • kwargs (Hash{Symbol => Object})

    Additional keyword arguments.

Options Hash (**kwargs):

  • :random (Random, SecureRandom)

    The random number generator to use.

Returns:

  • (Array<String>)

    The randomly selected characters.



# File 'lib/chars/char_set.rb', line 341

def random_chars(length,**kwargs)
  random_bytes(length,**kwargs).map { |byte| @chars[byte] }
end

#random_distinct_bytes(length, random: Random) ⇒ Array<Integer>

Creates an Array of random non-repeating bytes from the Chars::CharSet.

Parameters:

  • length (Integer, Array, Range)

    The length of the Array of random non-repeating bytes.

  • random (Random, SecureRandom) (defaults to: Random)

    The random number generator to use.

Returns:

  • (Array<Integer>)

    The randomly selected non-repeating bytes.



# File 'lib/chars/char_set.rb', line 313

def random_distinct_bytes(length, random: Random)
  shuffled_bytes = bytes.shuffle(random: random)

  case length
  when Array
    shuffled_bytes[0,length.sample(random: random)]
  when Range
    shuffled_bytes[0,random.rand(length)]
  else
    shuffled_bytes[0,length]
  end
end
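
Shuffling once and taking a prefix is what guarantees that no byte repeats. A short sketch on a plain Array, with an illustrative byte range and seed:

```ruby
bytes = (0x61..0x7a).to_a            # the 26 bytes for 'a'..'z'
rng   = Random.new(7)

# Shuffling first and slicing second guarantees that no byte repeats.
distinct = bytes.shuffle(random: rng)[0, 5]

# Asking for more elements than exist returns only what is available.
capped = bytes.shuffle(random: rng)[0, 100]
```

One consequence of this approach is that the result can be shorter than the requested length when the request exceeds the number of elements available.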

#random_distinct_chars(length, **kwargs) ⇒ Array<String>

Creates an Array of random non-repeating characters from the Chars::CharSet.

Parameters:

  • length (Integer, Array, Range)

    The length of the Array of random non-repeating characters.

  • kwargs (Hash{Symbol => Object})

    Additional keyword arguments.

Options Hash (**kwargs):

  • :random (Random, SecureRandom)

    The random number generator to use.

Returns:

  • (Array<String>)

    The randomly selected non-repeating characters.



# File 'lib/chars/char_set.rb', line 383

def random_distinct_chars(length,**kwargs)
  random_distinct_bytes(length,**kwargs).map { |byte| @chars[byte] }
end

#random_distinct_string(length, **kwargs) ⇒ String

Creates a String containing randomly selected non-repeating characters from the Chars::CharSet.

Parameters:

  • length (Integer, Array, Range)

    The length of the String of random non-repeating characters.

  • kwargs (Hash{Symbol => Object})

    Additional keyword arguments.

Options Hash (**kwargs):

  • :random (Random, SecureRandom)

    The random number generator to use.

Returns:

  • (String)

    The String of randomly selected non-repeating characters.

See Also:



# File 'lib/chars/char_set.rb', line 405

def random_distinct_string(length,**kwargs)
  random_distinct_chars(length,**kwargs).join
end

#random_string(length, **kwargs) ⇒ String

Creates a String containing randomly selected characters from the Chars::CharSet.

Parameters:

  • length (Integer, Array, Range)

    The length of the String of random characters.

  • kwargs (Hash{Symbol => Object})

    Additional keyword arguments.

Options Hash (**kwargs):

  • :random (Random, SecureRandom)

    The random number generator to use.

Returns:

  • (String)

    The String of randomly selected characters.

See Also:



# File 'lib/chars/char_set.rb', line 363

def random_string(length,**kwargs)
  random_chars(length,**kwargs).join
end

#select_chars {|char| ... } ⇒ Array<String>

Selects characters from the Chars::CharSet.

Yields:

  • (char)

    If a block is given, it will be used to select the characters from the Chars::CharSet.

Yield Parameters:

  • char (String)

    The character to select or reject.

Returns:

  • (Array<String>)

    The selected characters from the Chars::CharSet.



# File 'lib/chars/char_set.rb', line 165

def select_chars(&block)
  each_char.select(&block)
end

#strings_in(data, options = {}) {|match, (index)| ... } ⇒ Array, Hash

Finds sub-strings within given data that are made of characters within the Chars::CharSet.

Parameters:

  • data (String)

    The data to find sub-strings within.

  • options (Hash) (defaults to: {})

    Additional options.

Options Hash (options):

  • :length (Integer) — default: 4

    The minimum length of sub-strings found within the given data.

  • :offsets (Boolean) — default: false

    Specifies whether to return a Hash of offsets and matched sub-strings within the data, or to just return the matched sub-strings themselves.

Yields:

  • (match, (index))

    The given block will be passed every matched sub-string, and the optional index.

Yield Parameters:

  • match (String)

    A sub-string containing the characters from the Chars::CharSet.

  • index (Integer)

    The index the sub-string was found at.

Returns:

  • (Array, Hash)

    If no block is given, an Array or Hash of sub-strings is returned.



# File 'lib/chars/char_set.rb', line 586

def strings_in(data,options={},&block)
  kwargs = {min_length: options.fetch(:length,4)}

  unless block
    if options[:offsets]
      return Hash[substrings_with_indexes(data,**kwargs)]
    else
      return substrings(data,**kwargs)
    end
  end

  case block.arity
  when 2
    each_substring_with_index(data,**kwargs,&block)
  else
    each_substring(data,**kwargs,&block)
  end
end
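
The choice between the two enumerators depends on how many parameters the caller's block declares. The sketch below shows the same arity check in isolation; each_match and the sample pairs are hypothetical and stand in for the results of a scan.

```ruby
# each_match is a hypothetical helper; pairs is an Array of
# [substring, index] tuples such as a scan of the data would produce.
def each_match(pairs, &block)
  case block.arity
  when 2
    pairs.each { |match, index| block.call(match, index) }
  else
    pairs.each { |match, _index| block.call(match) }
  end
end

pairs = [["hello", 2], ["World", 11]]

only_matches = []
each_match(pairs) { |match| only_matches << match }

with_offsets = {}
each_match(pairs) { |match, index| with_offsets[index] = match }
```

A block that declares two parameters receives the index as well, while any other block receives only the matched sub-string.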

#strings_of_length(length) ⇒ Enumerator

Returns an Enumerator that enumerates through every possible string belonging to the Chars::CharSet and of the given length.

Parameters:

  • length (Range, Array, Integer)

    The desired length(s) of each string.

Returns:

  • (Enumerator)

See Also:

  • #each_string_of_length


# File 'lib/chars/char_set.rb', line 647

def strings_of_length(length)
  each_string_of_length(length)
end

#substrings(data, **kwargs) ⇒ Array<String>

Returns an Array of all substrings of the given string that are at least the minimum length and are made up of characters from the Chars::CharSet.

Parameters:

  • data (String)

    The data to find sub-strings within.

Options Hash (**kwargs):

  • :min_length (Integer)

    The minimum length of sub-strings found within the given data.

Returns:

  • (Array<String>)

    The array of substrings within the given data.

See Also:

Since:

  • 0.3.0



# File 'lib/chars/char_set.rb', line 547

def substrings(data,**kwargs)
  each_substring(data,**kwargs).to_a
end

#substrings_with_indexes(data, **kwargs) ⇒ Array<(String, Integer)>

Returns an Array of all substrings of the given string, together with their indices, that are at least the minimum length and are made up of characters from the Chars::CharSet.

Parameters:

  • data (String)

    The data to find sub-strings within.

Options Hash (**kwargs):

  • :min_length (Integer)

    The minimum length of sub-strings found within the given data.

Returns:

  • (Array<(String, Integer)>)

    The array of substrings and their indices within the given data.

See Also:

Since:

  • 0.3.0



# File 'lib/chars/char_set.rb', line 495

def substrings_with_indexes(data,**kwargs)
  each_substring_with_index(data,**kwargs).to_a
end

#|(set) ⇒ CharSet Also known as: +

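Creates a new CharSet object that is the union of the Chars::CharSet and another set of characters.

Parameters:

  • set (CharSet, String, Integer, Enumerable)

    The other set of characters. Anything other than a CharSet is first converted with CharSet.new.

Returns:

  • (CharSet)

    The new Chars::CharSet containing the characters of both sets.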



# File 'lib/chars/char_set.rb', line 661

def |(set)
  set = CharSet.new(set) unless set.kind_of?(CharSet)

  return super(set)
end