Class: String

Inherits:
Object
Defined in:
lib/openc3/core_ext/string.rb,
lib/openc3/io/json_rpc.rb

Overview

OpenC3 specific additions to the Ruby String class

Constant Summary

NON_ASCII_PRINTABLE =
/[^\x21-\x7e\s]/
NON_UTF8_PRINTABLE =
/[\x00-\x08\x0E-\x1F\x7F]/
OUTSIDE_LATIN_RANGE =

Matches characters outside the Latin range (U+0000-U+00FF), as well as C1 control characters (U+0080-U+009F). The accepted range covers Basic Latin (U+0000-U+007F) and the printable part of the Latin-1 Supplement (U+00A0-U+00FF), which includes common characters such as µ (U+00B5), ° (U+00B0), and ñ (U+00F1).

/[^\u0000-\u007F\u00A0-\u00FF]/
PRINTABLE_RANGE =

The printable range of ASCII characters

32..126
NON_PRINTABLE_REGEX =

Regular expression to identify a character that is not in the printable range

/[^\s!-~]/
FLOAT_CHECK_REGEX =

Regular expression to identify a String as a floating point number

/\A\s*[-+]?\d*\.\d+\s*\z/
SCIENTIFIC_CHECK_REGEX =

Regular expression to identify a String as a floating point number in scientific notation

/\A\s*[-+]?(\d+((\.\d+)?)|(\.\d+))[eE][-+]?\d+\s*\z/
INT_CHECK_REGEX =

Regular expression to identify a String as an integer

/\A\s*[-+]?\d+\s*\z/
HEX_CHECK_REGEX =

Regular expression to identify a String as an integer in hexadecimal format

/\A\s*0[xX][\dabcdefABCDEF]+\s*\z/
ARRAY_CHECK_REGEX =

Regular expression to identify a String as an Array of numbers

/\A\s*\[.*\]\s*\z/
OBJECT_CHECK_REGEX =

Regular expression to identify a String containing object notation

/\A\s*\{.*\}\s*\z/
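
As a rough illustration of how the numeric check patterns above partition input, the following sketch copies four of them into local constants and classifies a few strings. The names FLOAT_RE, SCI_RE, INT_RE, HEX_RE and the helper classify are hypothetical and exist only for this example; they are not part of OpenC3.

```ruby
# Local copies of the patterns documented above (hypothetical names).
FLOAT_RE = /\A\s*[-+]?\d*\.\d+\s*\z/
SCI_RE   = /\A\s*[-+]?(\d+((\.\d+)?)|(\.\d+))[eE][-+]?\d+\s*\z/
INT_RE   = /\A\s*[-+]?\d+\s*\z/
HEX_RE   = /\A\s*0[xX][\dabcdefABCDEF]+\s*\z/

# Hypothetical helper: reports which pattern, if any, accepts the text.
def classify(text)
  return :float if FLOAT_RE.match?(text) || SCI_RE.match?(text)
  return :int   if INT_RE.match?(text)
  return :hex   if HEX_RE.match?(text)

  :other
end

samples = ['1.5', '.5', '1e3', ' -42 ', '0x1A', '5.', '-0x1A']
samples.each { |sample| puts "#{sample.inspect} => #{classify(sample)}" }
```

Two details stand out: a trailing dot with no digits ('5.') is accepted by none of the patterns, and the hexadecimal pattern does not allow a sign, so '-0x1A' is also rejected.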


Instance Method Details

#as_json(_options = nil) ⇒ Object



# File 'lib/openc3/io/json_rpc.rb', line 61

def as_json(_options = nil)
  # Try to interpret the string as UTF-8
  # This handles both:
  # 1. Unicode text in ASCII-8BIT strings (e.g., "µA" for micro-Ampères from config files)
  # 2. Binary data from hex_to_byte_string (e.g., \xDE\xAD\xBE\xEF) which will fail valid_encoding?
  as_utf8 = self.dup.force_encoding('UTF-8')
  if as_utf8.valid_encoding?
    # Valid UTF-8 - check for non-printable control characters
    if as_utf8 =~ NON_UTF8_PRINTABLE
      return self.to_json_raw_object
    end
    # Check if all characters are in the expected Latin range (U+0000-U+00FF)
    # This prevents binary data that happens to be valid UTF-8 from being treated as text
    # For example, \xDE\xAD decodes to U+07AD (Thaana script) which should be treated as binary
    # Also reject C1 control characters (U+0080-U+009F) which are non-printable
    if as_utf8 =~ OUTSIDE_LATIN_RANGE
      return self.to_json_raw_object
    end
    return as_utf8
  else
    # Invalid UTF-8 means this is truly binary data, encode as raw object
    return self.to_json_raw_object
  end
end
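
The decision sequence in the listing above can be traced with a small standalone sketch. It assumes local copies of the two patterns and a hypothetical helper json_kind that returns :text or :binary instead of building the JSON value, so it can run without OpenC3 loaded.

```ruby
# Local copies of NON_UTF8_PRINTABLE and OUTSIDE_LATIN_RANGE (hypothetical names).
CONTROL_RE       = /[\x00-\x08\x0E-\x1F\x7F]/
OUTSIDE_LATIN_RE = /[^\u0000-\u007F\u00A0-\u00FF]/

# Hypothetical helper mirroring the branch order of the listing above.
def json_kind(bytes)
  text = bytes.dup.force_encoding('UTF-8')
  return :binary unless text.valid_encoding?       # not UTF-8 at all
  return :binary if text.match?(CONTROL_RE)        # control characters
  return :binary if text.match?(OUTSIDE_LATIN_RE)  # non-Latin or C1 control

  :text
end

puts json_kind("\xC2\xB5A".b)         # UTF-8 bytes for "µA"
puts json_kind("\xDE\xAD\xBE\xEF".b)  # not valid UTF-8
puts json_kind("\xDE\xAD".b)          # valid UTF-8, but decodes to U+07AD
puts json_kind("OK\x01".b)            # contains a control character
```

Only the first sample is classified as text; the other three take the raw-object path for the reasons given in the comments.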

#class_name_to_filename(include_extension = true) ⇒ String

Converts a String representing a class (e.g. “MyGreatClass”) to a Ruby filename which implements the class (e.g. “my_great_class.rb”).

Parameters:

  • include_extension (Boolean) (defaults to: true)

    Whether to add the '.rb' extension

Returns:

  • (String)

    Filename which implements the class name



# File 'lib/openc3/core_ext/string.rb', line 293

def class_name_to_filename(include_extension = true)
  string = self.split("::")[-1] # Remove any namespacing
  filename = ''
  length = string.length
  length.times do |index|
    filename << '_' if index != 0 and string[index..index] == string[index..index].upcase
    filename << string[index..index].downcase
  end
  filename << '.rb' if include_extension
  filename
end
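
To make the conversion rule concrete, here is a standalone sketch of the same character-by-character logic. The function name to_filename_sketch is hypothetical; it takes the class name as an argument rather than extending String.

```ruby
# Hypothetical standalone version of the conversion shown above.
def to_filename_sketch(class_name, include_extension = true)
  base = class_name.split('::')[-1] # drop any namespace
  filename = +''
  base.each_char.with_index do |char, index|
    # Any character equal to its own upcase (other than the first) starts a new word.
    filename << '_' if index != 0 && char == char.upcase
    filename << char.downcase
  end
  filename << '.rb' if include_extension
  filename
end

puts to_filename_sketch('MyGreatClass')        # my_great_class.rb
puts to_filename_sketch('OpenC3::JsonRpc')     # json_rpc.rb
puts to_filename_sketch('MyGreatClass', false) # my_great_class
puts to_filename_sketch('HTTPServer')          # h_t_t_p_server.rb
```

Because the test is "equal to its own upcase", runs of capitals are split letter by letter, and digits are also preceded by an underscore ('Class2' becomes 'class_2.rb').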

#comment_erb ⇒ Object



# File 'lib/openc3/core_ext/string.rb', line 395

def comment_erb
  output = self.lines.collect! do |line|
    # If we have a commented out line that starts with #
    # but not followed by % (allows for disabling ERB comments),
    # which contains an ERB statement (<% ...)
    # then comment out the ERB statement (<%# ...).
    # We explicitly don't comment out trailing ERB statements
    # as that is not typical and is difficult to regex
    if line =~ /^\s*#[^%]*<%/
      line.gsub!('<%', '<%#')
    end
    line
  end
  return output.join("")
end
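
A short sketch may help show which lines the pattern above rewrites. The helper name comment_erb_sketch and the sample configuration text are made up for illustration; the regular expression is the one from the listing.

```ruby
# Hypothetical standalone version: rewrites ERB tags only on '#'-commented lines.
def comment_erb_sketch(text)
  text.lines.map do |line|
    line.match?(/^\s*#[^%]*<%/) ? line.gsub('<%', '<%#') : line
  end.join
end

config = <<~'CONFIG'
  # <%= render "disabled.txt" %>
  TARGET <%= name %>
  #% <%= kept_as_is %>
CONFIG

puts comment_erb_sketch(config)
```

Only the first line changes (its tag becomes `<%#= ... %>`). The second line is not a comment, and on the third line the `%` right after `#` prevents the pattern from matching.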

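#convert_to_value ⇒ Object

Converts the String into a Float, Integer, Array, or object, depending on what the String represents. It handles floating point numbers in both fixed and scientific notation, integers in decimal or hexadecimal notation, the special values INFINITY, -INFINITY, and NAN (case-insensitive), and Arrays or object notation (parsed with YAML.safe_load). If the String cannot be converted into any of these, the original String is returned.

Returns:

  • (Object)

    The converted value, or the original String if no conversion applies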



# File 'lib/openc3/core_ext/string.rb', line 225

def convert_to_value
  return_value = self
  begin
    upcase_self = self.upcase
    if upcase_self == 'INFINITY'.freeze
      return_value = Float::INFINITY
    elsif upcase_self == '-INFINITY'.freeze
      return_value = -Float::INFINITY
    elsif upcase_self == 'NAN'.freeze
      return_value = Float::NAN
    elsif self.is_float?
      # Floating Point in normal or scientific notation
      return_value = self.to_f
    elsif self.is_int?
      # Integer
      return_value = self.to_i
    elsif self.is_hex?
      # Hex
      return_value = Integer(self)
    elsif self.is_array? or self.is_object?
      # Array or Object
      return_value = YAML.safe_load(self)
    end
  rescue Exception
    # Something went wrong so just return the string as is
  end
  return return_value
end
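
The branches above each rely on a different standard conversion call. The following sketch exercises those calls directly on sample strings (the variable names are illustrative only), which also shows why the hex branch uses Integer() rather than to_i.

```ruby
require 'yaml'

fixed      = '3.14'.to_f       # fixed-point float
scientific = '1e3'.to_f        # scientific notation
decimal    = ' 42 '.to_i       # surrounding whitespace is tolerated
hex        = Integer('0x1A')   # Integer() understands the 0x prefix
hex_to_i   = '0x1A'.to_i       # to_i stops at the 'x'
array      = YAML.safe_load('[1, 2.5, "three"]')
object     = YAML.safe_load('{"limit": 10}')

p fixed, scientific, decimal, hex, hex_to_i, array, object
```

Since '0x1A'.to_i yields 0, checking is_hex? separately and converting with Integer() is what lets hexadecimal strings keep their value.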

#filename_to_class_name ⇒ String

Converts a String representing a filename (e.g. “my_great_class.rb”) to a Ruby class name (e.g. “MyGreatClass”).

Returns:

  • (String)

    Class name associated with the filename



# File 'lib/openc3/core_ext/string.rb', line 309

def filename_to_class_name
  filename = File.basename(self)
  class_name = ''
  length = filename.length
  upcase_next = true
  length.times do |index|
    break if filename[index..index] == '.'

    if filename[index..index] == '_'
      upcase_next = true
    elsif upcase_next
      class_name << filename[index..index].upcase
      upcase_next = false
    else
      class_name << filename[index..index].downcase
    end
  end
  class_name
end
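
For typical ASCII filenames, the loop above behaves like the condensed sketch below. The helper name to_class_name_sketch is hypothetical, and the equivalence is an assumption that holds for the sample inputs shown rather than a guarantee for every possible string.

```ruby
# Hypothetical condensed version: keep the basename up to the first '.',
# then capitalize each underscore-separated part.
def to_class_name_sketch(path)
  stem = File.basename(path).split('.').first.to_s
  stem.split('_').map(&:capitalize).join
end

puts to_class_name_sketch('my_great_class.rb')        # MyGreatClass
puts to_class_name_sketch('lib/openc3/io/json_rpc.rb') # JsonRpc
puts to_class_name_sketch('archive.tar.gz')           # Archive
puts to_class_name_sketch('MYFILE.rb')                # Myfile
```

Note that everything after the first dot is discarded, and letters that do not start a word are lowercased, so an all-caps filename does not round-trip.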

#formatted(word_size = 1, words_per_line = 16, word_separator = ' ', indent = 0, show_address = true, address_separator = ': ', show_ascii = true, ascii_separator = '  ', unprintable_character = ' ', line_separator = "\n") ⇒ Object

Displays a String containing binary data in a human readable format by converting each byte to the hex representation.

Parameters:

  • word_size (Integer) (defaults to: 1)

    How many bytes compose a word. Words are grouped together without spaces in between

  • words_per_line (Integer) (defaults to: 16)

    The number of words to display on a single formatted line

  • word_separator (String) (defaults to: ' ')

    The string to place between words

  • indent (Integer) (defaults to: 0)

    The amount of spaces to put in front of each formatted line

  • show_address (Boolean) (defaults to: true)

    Whether to show the hex address of the first byte in the formatted output

  • address_separator (String) (defaults to: ': ')

    The string to put after the hex address. Only used if show_address is true.

  • show_ascii (Boolean) (defaults to: true)

    Whether to interpret the binary data as ASCII characters and display the printable characters to the right of the formatted line

  • ascii_separator (String) (defaults to: '  ')

    The string to put between the formatted line and the ASCII characters. Only used if show_ascii is true.

  • unprintable_character (String) (defaults to: ' ')

    The string to output when data in the binary String does not result in a printable ASCII character. Only used if show_ascii is true.

  • line_separator (String) (defaults to: "\n")

    The string used to end a line. Normally newline.



# File 'lib/openc3/core_ext/string.rb', line 65

def formatted(
  word_size = 1,
  words_per_line = 16,
  word_separator = ' ',
  indent = 0,
  show_address = true,
  address_separator = ': ',
  show_ascii = true,
  ascii_separator = '  ',
  unprintable_character = ' ',
  line_separator = "\n"
)
  string = ''
  byte_offset = 0
  bytes_per_line = word_size * words_per_line
  indent_string = ' ' * indent
  ascii_line = ''

  self.each_byte do |byte|
    if byte_offset % bytes_per_line == 0
      # Create the indentation at the beginning of each line
      string << indent_string

      # Add the address if requested
      string << sprintf("%08X%s", byte_offset, address_separator) if show_address
    end

    # Add the byte
    string << sprintf("%02X", byte)

    # Create the ASCII representation if requested
    if show_ascii
      if PRINTABLE_RANGE.include?(byte)
        ascii_line << [byte].pack('C')
      else
        ascii_line << unprintable_character
      end
    end

    # Move to next byte
    byte_offset += 1

    # If we're at the end of the line we output the ascii if requested
    if byte_offset % bytes_per_line == 0
      if show_ascii
        string << "#{ascii_separator}#{ascii_line}"
        ascii_line = ''
      end
      string << line_separator

    # If we're at a word junction then output the word_separator
    elsif (byte_offset % word_size == 0) and byte_offset != self.length
      string << word_separator
    end
  end

  # We're done printing all the bytes. Now check to see if we ended in the
  # middle of a line. If so we have to print out the final ASCII if
  # requested.
  if byte_offset % bytes_per_line != 0
    if show_ascii
      num_word_separators = ((byte_offset % bytes_per_line) - 1) / word_size
      existing_length = (num_word_separators * word_separator.length) + ((byte_offset % bytes_per_line) * 2)
      full_line_length = (bytes_per_line * 2) + ((words_per_line - 1) * word_separator.length)
      filler = ' ' * (full_line_length - existing_length)
      ascii_filler = ' ' * (bytes_per_line - ascii_line.length)
      string << "#{filler}#{ascii_separator}#{ascii_line}#{ascii_filler}"
      ascii_line = ''
    end
    string << line_separator
  end
  string
end
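
To visualize the layout this method produces with its default arguments, here is a reduced sketch. The function hex_dump_sketch is hypothetical and hard-codes one-byte words, 16 bytes per line, a ': ' address separator, and a two-space ASCII separator; it is meant to mirror the default output, not to replace the configurable method.

```ruby
# Hypothetical reduced hex dump with the default layout hard-coded.
def hex_dump_sketch(data)
  bytes_per_line = 16
  hex_width = (bytes_per_line * 2) + (bytes_per_line - 1) # digits plus separators
  output = +''
  data.bytes.each_slice(bytes_per_line).with_index do |slice, line_index|
    hex = slice.map { |byte| format('%02X', byte) }.join(' ')
    ascii = slice.map { |byte| (32..126).cover?(byte) ? byte.chr : ' ' }.join
    output << format('%08X: ', line_index * bytes_per_line)
    # Pad short final lines so the ASCII column stays aligned.
    output << hex.ljust(hex_width) << '  ' << ascii.ljust(bytes_per_line) << "\n"
  end
  output
end

print hex_dump_sketch('Hello')
```

For the five-byte input, the single output line starts with `00000000: 48 65 6C 6C 6F`, followed by padding up to the full hex width, the separator, and `Hello` padded to 16 characters.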

#hex_to_byte_string ⇒ String

Converts the String representing a hexadecimal number (e.g. “0xABCD”) to a binary String with the same data (e.g. “\xAB\xCD”).

Returns:

  • (String)

    Binary byte string



# File 'lib/openc3/core_ext/string.rb', line 258

def hex_to_byte_string
  string = self.dup

  # Remove leading 0x or 0X
  if string[0..1] == '0x' or string[0..1] == '0X'
    string = string[2..-1]
  end

  length = string.length
  length += 1 unless (length % 2) == 0

  array = []
  (length / 2).times do
    # Grab last two characters
    if string.length >= 2
      last_two_characters = string[-2..-1]
      string = string[0..-3]
    else
      last_two_characters = string[0..0]
      string = ''
    end

    int_value = Integer('0x' + last_two_characters)

    array.unshift(int_value)
  end

  array.pack("C*")
end
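
The listing above consumes the string from the right, two characters at a time, so an odd number of digits leaves a lone high digit for the first byte. A compact way to get the same bytes for valid hex input is sketched below using Array#pack; the helper name hex_to_bytes_sketch is hypothetical. Unlike the listing, which calls Integer() and therefore raises on non-hex characters, this sketch does not validate its input.

```ruby
# Hypothetical compact variant: strip the prefix, left-pad to an even
# number of digits, then pack pairs of hex digits into bytes.
def hex_to_bytes_sketch(hex)
  digits = hex.sub(/\A0[xX]/, '')
  digits = "0#{digits}" if digits.length.odd?
  [digits].pack('H*')
end

p hex_to_bytes_sketch('0xABCD').bytes # [171, 205]
p hex_to_bytes_sketch('ABC').bytes    # [10, 188]
```

Left-padding matters here: packing 'ABC' directly with 'H*' would pad on the right and produce 0xAB 0xC0 instead of 0x0A 0xBC.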

#is_array? ⇒ Boolean

Returns whether the String represents an Array.

Returns:

  • (Boolean)

    Whether the String represents an Array



# File 'lib/openc3/core_ext/string.rb', line 206

def is_array?
  if ARRAY_CHECK_REGEX.match?(self) then true else false end
end

#is_float? ⇒ Boolean

Returns whether the String represents a floating point number.

Returns:

  • (Boolean)

    Whether the String represents a floating point number



# File 'lib/openc3/core_ext/string.rb', line 191

def is_float?
  if self =~ FLOAT_CHECK_REGEX or self =~ SCIENTIFIC_CHECK_REGEX then true else false end
end

#is_hex? ⇒ Boolean

Returns whether the String represents a hexadecimal number.

Returns:

  • (Boolean)

    Whether the String represents a hexadecimal number



# File 'lib/openc3/core_ext/string.rb', line 201

def is_hex?
  if HEX_CHECK_REGEX.match?(self) then true else false end
end

#is_int? ⇒ Boolean

Returns whether the String represents an integer.

Returns:

  • (Boolean)

    Whether the String represents an integer



# File 'lib/openc3/core_ext/string.rb', line 196

def is_int?
  if INT_CHECK_REGEX.match?(self) then true else false end
end

#is_object? ⇒ Boolean

Returns whether the String represents an Object.

Returns:

  • (Boolean)

    Whether the String represents an Object



# File 'lib/openc3/core_ext/string.rb', line 211

def is_object?
  if OBJECT_CHECK_REGEX.match?(self) then true else false end
end

#is_printable? ⇒ Boolean

Returns whether the string contains only printable characters.

Returns:

  • (Boolean)

    Whether the string contains only printable characters



# File 'lib/openc3/core_ext/string.rb', line 216

def is_printable?
  if NON_PRINTABLE_REGEX.match?(self) then false else true end
end

#num_lines ⇒ Integer

Returns the number of lines in the string (as split by the newline character).

Returns:

  • (Integer)

    The number of lines in the string (as split by the newline character)



# File 'lib/openc3/core_ext/string.rb', line 169

def num_lines
  value = self.count("\n")
  value += 1 if self[-1..-1] and self[-1..-1] != "\n"
  value
end
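
The counting rule is easiest to see at the edges: a final line counts whether or not it ends with a newline, and an empty string has zero lines. The sketch below restates the rule as a hypothetical standalone function count_lines_sketch.

```ruby
# Hypothetical standalone version of the line-counting rule above.
def count_lines_sketch(text)
  count = text.count("\n")
  # An unterminated final line still counts as a line.
  count += 1 unless text.empty? || text.end_with?("\n")
  count
end

['', 'a', "a\n", "a\nb", "a\n\n"].each do |sample|
  puts "#{sample.inspect} => #{count_lines_sketch(sample)}"
end
```

The five samples yield 0, 1, 1, 2, and 2 lines respectively.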

#quote_if_necessary(quote_char = '"') ⇒ String

Adds quotes if the string contains whitespace

Parameters:

  • quote_char (String) (defaults to: '"')

    The quote character to add if necessary

Returns:

  • (String)

    quoted string if necessary



# File 'lib/openc3/core_ext/string.rb', line 361

def quote_if_necessary(quote_char = '"')
  if /\s/.match?(self)
    return quote_char + self + quote_char
  else
    return self
  end
end

#remove_line(line_number, separator = $/) ⇒ String

Uses the String each_line method to iterate through the lines and removes the line specified.

Parameters:

  • line_number (Integer)

    The line to remove from the string (1 based)

  • separator (String) (defaults to: $/)

    The record separator to pass to #each_line ($/ by default is the newline character)

Returns:

  • (String)

    A new string with the line removed



# File 'lib/openc3/core_ext/string.rb', line 157

def remove_line(line_number, separator = $/)
  new_string = ''
  index = 1
  self.each_line(separator) do |line|
    new_string << line unless index == line_number
    index += 1
  end
  new_string
end
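
As an illustration of the 1-based numbering and the separator argument, the following sketch expresses the same idea with enumerators. The name remove_line_sketch is hypothetical.

```ruby
# Hypothetical standalone version: drop the record whose 1-based index matches.
def remove_line_sketch(text, line_number, separator = $/)
  text.each_line(separator)
      .reject.with_index(1) { |_line, index| index == line_number }
      .join
end

p remove_line_sketch("a\nb\nc\n", 2)   # "a\nc\n"
p remove_line_sketch("a\nb\nc\n", 5)   # unchanged, no such line
p remove_line_sketch('a,b,c', 1, ',')  # "b,c"
```

A line number outside the string simply leaves the text unchanged, and a custom separator changes what counts as a "line".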

#remove_quotes ⇒ Object

Removes quotes from the given string if present.

"'quoted string'".remove_quotes #=> "quoted string"


# File 'lib/openc3/core_ext/string.rb', line 177

def remove_quotes
  return self if self.length < 2

  first_char = self[0]
  return self if (first_char != '"') && (first_char != "'")

  last_char = self[-1]
  return self if first_char != last_char

  return self[1..-2]
end
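
The guards in the listing mean quotes are stripped only when the first and last characters are the same quote character. A standalone restatement, using the hypothetical name strip_matching_quotes, makes the edge cases easy to try:

```ruby
# Hypothetical standalone version of the guard logic above.
def strip_matching_quotes(text)
  return text if text.length < 2

  first = text[0]
  return text unless ['"', "'"].include?(first) && text[-1] == first

  text[1..-2]
end

p strip_matching_quotes(%q('quoted string')) # "quoted string"
p strip_matching_quotes(%q("mixed'))         # unchanged, quotes differ
p strip_matching_quotes('"')                 # unchanged, too short
p strip_matching_quotes('""')                # ""
```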

#simple_formatted ⇒ Object

Displays a String containing binary data in a human readable format by converting each byte to its hex representation. The output is a single string of hex digits with no separators, addresses, or ASCII column.



# File 'lib/openc3/core_ext/string.rb', line 142

def simple_formatted
  string = ''
  self.each_byte do |byte|
    string << sprintf("%02X", byte)
  end
  string
end

#to_class ⇒ Class

Converts a String representing a class (e.g. “MyGreatClass”) to the actual class that has been required and is present in the Ruby runtime.

Returns:

  • (Class)

    The class named by the String, or nil if a non-namespaced name cannot be resolved



# File 'lib/openc3/core_ext/string.rb', line 333

def to_class
  klass = nil
  split_self = self.split('::')
  if split_self.length > 1
    split_self.each do |class_name|
      if klass
        klass = klass.const_get(class_name)
      else
        klass = Object.const_get(class_name)
      end
    end
  else
    begin
      klass = OpenC3.const_get(self)
    rescue
      begin
        klass = Object.const_get(self)
      rescue
      end
    end
  end
  klass
end
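
The namespaced branch above walks the "::"-separated parts with const_get, starting from Object. The sketch below shows that walk on standard-library constants. The helper resolve_constant_sketch is hypothetical, skips the OpenC3 lookup (the OpenC3 module is not assumed to be loaded here), and returns nil for any unknown name, whereas the listing lets the NameError propagate for namespaced names.

```ruby
# Hypothetical helper: resolve "A::B::C" one constant at a time.
def resolve_constant_sketch(name)
  name.split('::').inject(Object) { |scope, part| scope.const_get(part) }
rescue NameError
  nil
end

p resolve_constant_sketch('File::Stat')          # File::Stat
p resolve_constant_sketch('String')              # String
p resolve_constant_sketch('NoSuchClassAnywhere') # nil
```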

#to_utf8 ⇒ String

Converts a string to UTF-8 and returns a new string. Assumes the string is Windows-1252 encoded if it is marked ASCII-8BIT and is not valid UTF-8.

Returns:

  • (String)

    UTF-8 encoded string



# File 'lib/openc3/core_ext/string.rb', line 373

def to_utf8
  self.dup.to_utf8!
end

#to_utf8! ⇒ String

Converts a string to UTF-8 in place. Assumes the string is Windows-1252 encoded if it is marked ASCII-8BIT and is not valid UTF-8.

Returns:

  • (String)

    UTF-8 encoded string



# File 'lib/openc3/core_ext/string.rb', line 381

def to_utf8!
  if self.encoding == Encoding::ASCII_8BIT
    if self.force_encoding('UTF-8').valid_encoding?
      return self
    else
      # Note: this will replace any characters without a valid conversion with space (shouldn't be possible from Windows-1252)
      return self.force_encoding('Windows-1252').encode!('UTF-8', invalid: :replace, undef: :replace, replace: ' ')
    end
  else
    # Note: this will replace any characters without a valid conversion with space
    self.encode!('UTF-8', invalid: :replace, undef: :replace, replace: ' ')
  end
end
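
The two ASCII-8BIT paths can be compared with a non-mutating sketch. The helper to_utf8_sketch is hypothetical and works on a copy; the sample bytes are the Windows-1252 and UTF-8 encodings of "µA".

```ruby
# Hypothetical non-mutating version of the conversion above.
def to_utf8_sketch(input)
  text = input.dup
  if text.encoding == Encoding::ASCII_8BIT
    # Bytes that already form valid UTF-8 are only relabelled.
    return text if text.force_encoding('UTF-8').valid_encoding?

    # Otherwise assume Windows-1252 and transcode.
    text.force_encoding('Windows-1252')
  end
  text.encode!('UTF-8', invalid: :replace, undef: :replace, replace: ' ')
end

from_cp1252 = to_utf8_sketch("\xB5A".b)     # single byte 0xB5 is µ in Windows-1252
from_utf8   = to_utf8_sketch("\xC2\xB5A".b) # two-byte UTF-8 sequence for µ
puts from_cp1252 == from_utf8               # true
puts from_cp1252.encoding                   # UTF-8
```

Both inputs end up as the same UTF-8 string, which is the point of trying the UTF-8 interpretation first and falling back to Windows-1252 only when it fails.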