Class: Mustermann::StringScanner

Inherits:
Object
Defined in:
lib/mustermann/string_scanner.rb

Overview

Note:

This structure is not thread-safe; do not scan on the same StringScanner instance concurrently. Even if it were thread-safe, scanning concurrently would probably lead to unwanted behaviour.

Class inspired by Ruby’s StringScanner to scan an input string using multiple patterns.

Examples:

require 'mustermann/string_scanner'
scanner = Mustermann::StringScanner.new("here is our example string")

scanner.scan("here") # => "here"
scanner.getch        # => " "

if scanner.scan(":verb our")
  scanner.scan(:noun, capture: :word)
  scanner[:verb]  # => "is"
  scanner[:noun]  # => "example"
end

scanner.rest # => "string"
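
For comparison, here is a minimal sketch of a similar walk-through using the standard library's StringScanner (from strscan), which this class is modelled on. It is not Mustermann code: a named group in a regular expression stands in for the :verb placeholder, and the helper name scan_with_stdlib is made up for illustration.

```ruby
require 'strscan'

# Hypothetical helper: walks the overview's example string with the
# standard library's StringScanner instead of Mustermann::StringScanner.
def scan_with_stdlib(input)
  scanner = StringScanner.new(input)
  steps   = {}

  steps[:first] = scanner.scan(/here/) # matched text, or nil
  steps[:char]  = scanner.getch        # next single character

  # A named group plays the role of the ":verb" placeholder.
  steps[:verb] = scanner["verb"] if scanner.scan(/(?<verb>\w+) our/)

  steps[:rest] = scanner.rest          # everything not yet consumed
  steps
end
```

With the standard library the caller writes the regular expressions by hand; the class documented here accepts Mustermann pattern strings instead.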

Defined Under Namespace

Classes: ScanResult

Constant Summary

ScanError = Class.new(::ScanError)

Exception raised if scan/unscan operation cannot be performed.

Constructor Details

#initialize(string = "", **pattern_options) ⇒ StringScanner

Returns a new instance of StringScanner.

Examples:

with different default type

require 'mustermann/string_scanner'
scanner = Mustermann::StringScanner.new("foo/bar/baz", type: :shell)
scanner.scan('*')     # => "foo"
scanner.scan('**/*')  # => "/bar/baz"

Parameters:

  • string (String) (defaults to: "")

    the string to scan

  • pattern_options (Hash)

    default options used for #scan



# File 'lib/mustermann/string_scanner.rb', line 132

def initialize(string = "", **pattern_options)
  @pattern_options = pattern_options
  @string          = String(string).dup
  reset
end
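
The constructor coerces its argument with String() and then copies it, so later appends via #<< do not modify the caller's object. A minimal sketch of just that step, in plain Ruby; scanner_buffer is a hypothetical name and not part of Mustermann:

```ruby
# Hypothetical stand-in for the constructor's input handling:
# coerce to String, then copy so later appends leave the caller's
# object untouched.
def scanner_buffer(string = "")
  String(string).dup
end
```

Because of the copy, passing a frozen string is fine: the duplicate is not frozen and can be appended to.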

Instance Attribute Details

#params ⇒ Hash (readonly)

Params from all previous matches from #scan and #scan_until, but not from #check and #check_until. Changes can be reverted with #unscan and it can be completely cleared via #reset.

Returns:

  • (Hash)

    current params



# File 'lib/mustermann/string_scanner.rb', line 117

def params
  @params
end

#pattern_options ⇒ Hash (readonly)

Returns default pattern options used for #scan and similar methods.

Returns:

  • (Hash)

    default pattern options used for #scan and similar methods



# File 'lib/mustermann/string_scanner.rb', line 110

def pattern_options
  @pattern_options
end

#position ⇒ Integer Also known as: pos

Returns current scan position on the input string.

Returns:

  • (Integer)

    current scan position on the input string



# File 'lib/mustermann/string_scanner.rb', line 120

def position
  @position
end

Class Method Details

.cache_size ⇒ Integer

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns number of cached patterns.

Returns:

  • (Integer)

    number of cached patterns



# File 'lib/mustermann/string_scanner.rb', line 44

def self.cache_size
  PATTERN_CACHE.size
end

.clear_cache ⇒ Object

Patterns created by #scan will be globally cached, since we assume that there is a finite number of different patterns used and that they are more likely to be reused than not. This method allows clearing the cache.

See Also:

  • PatternCache


# File 'lib/mustermann/string_scanner.rb', line 37

def self.clear_cache
  PATTERN_CACHE.clear
end
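
The idea of a global cache that is filled on first use and can be cleared on demand can be sketched in plain Ruby. This is only an illustration under stated assumptions: SketchCache, its Hash storage, and the use of Regexp as the compiled object are all hypothetical, and Mustermann's own PATTERN_CACHE is a different object.

```ruby
# Hypothetical process-wide cache, keyed by pattern source and options.
# It only illustrates the reuse-then-clear idea; it is not Mustermann's
# PATTERN_CACHE.
module SketchCache
  CACHE = {}

  # Returns the cached compiled pattern, compiling it on first use.
  def self.fetch(source, **options)
    flags = options[:ignore_case] ? Regexp::IGNORECASE : 0
    CACHE[[source, options]] ||= Regexp.new(source, flags)
  end

  # Number of cached patterns.
  def self.size
    CACHE.size
  end

  # Drops all cached patterns.
  def self.clear
    CACHE.clear
  end
end
```

The same source with different options is cached separately, which is why the key includes the options.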

Instance Method Details

#<<(string) ⇒ Mustermann::StringScanner

Appends the given string to the string being scanned.

Examples:

require 'mustermann/string_scanner'
scanner = Mustermann::StringScanner.new
scanner << "foo"
scanner.scan(/.+/) # => "foo"

Parameters:

  • string (String)

    will be appended

Returns:

  • (Mustermann::StringScanner)

    the scanner itself



# File 'lib/mustermann/string_scanner.rb', line 235

def <<(string)
  @string << string
  self
end

#[](key) ⇒ Object

Shorthand for accessing #params. Accepts symbols as keys.



# File 'lib/mustermann/string_scanner.rb', line 269

def [](key)
  params[key.to_s]
end
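
As the listing shows, the key is converted with #to_s before the lookup, which implies that params are stored under String keys. A minimal sketch of that lookup in isolation; ParamLookup is a hypothetical class used only for illustration:

```ruby
# Hypothetical wrapper mirroring the lookup above: values live under
# String keys, and the requested key is converted with #to_s first.
class ParamLookup
  def initialize(params)
    @params = params
  end

  # Accepts a Symbol or a String and returns the stored value, or nil.
  def [](key)
    @params[key.to_s]
  end
end
```

Both lookup[:verb] and lookup["verb"] therefore reach the same entry.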

#beginning_of_line? ⇒ true, false

Returns whether or not the current position is at the start of a line.

Returns:

  • (true, false)

    whether or not the current position is at the start of a line



# File 'lib/mustermann/string_scanner.rb', line 246

def beginning_of_line?
  @position == 0 or @string[@position - 1] == "\n"
end
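
The same test can be tried on a plain string and an index. A minimal sketch; line_start? is a hypothetical helper that takes the string and position as arguments instead of reading instance variables:

```ruby
# Hypothetical helper: true at index 0 or directly after a newline.
def line_start?(string, position)
  position == 0 || string[position - 1] == "\n"
end
```

For "ab\ncd", positions 0 and 3 count as line starts, while positions 1 and 5 do not.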

#check(pattern, **options) ⇒ Mustermann::StringScanner::ScanResult?

Checks if the given pattern matches any substring starting at the current position.

Does not affect #position or #params.

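Returns:

  • (Mustermann::StringScanner::ScanResult, nil)

    the match result, or nil if the pattern does not match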



# File 'lib/mustermann/string_scanner.rb', line 195

def check(pattern, **options)
  params, length = create_pattern(pattern, **options).peek_params(rest)
  ScanResult.new(self, @position, length, params) if params
end
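
The difference between checking and scanning can be seen with the standard library's StringScanner, whose check and scan methods treat the position in the same way. This sketch uses strscan rather than Mustermann, and check_then_scan is a made-up helper:

```ruby
require 'strscan'

# Hypothetical helper: records the position after a check and after a
# scan of the same regular expression.
def check_then_scan(input, regexp)
  scanner = StringScanner.new(input)

  checked         = scanner.check(regexp) # looks ahead only
  pos_after_check = scanner.pos

  scanned         = scanner.scan(regexp)  # consumes the match
  pos_after_scan  = scanner.pos

  { checked: checked, pos_after_check: pos_after_check,
    scanned: scanned, pos_after_scan: pos_after_scan }
end
```

Both calls report the same match; only the scan moves the position.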

#check_until(pattern, **options) ⇒ Mustermann::StringScanner::ScanResult?

Checks if the given pattern matches any substring starting at any position after the current position.

Does not affect #position or #params.

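Returns:

  • (Mustermann::StringScanner::ScanResult, nil)

    the match result, or nil if the pattern does not match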



# File 'lib/mustermann/string_scanner.rb', line 206

def check_until(pattern, **options)
  check_until_with_prefix(pattern, **options).first
end

#eos? ⇒ true, false

Returns whether or not the end of the string has been reached.

Returns:

  • (true, false)

    whether or not the end of the string has been reached



# File 'lib/mustermann/string_scanner.rb', line 241

def eos?
  @position >= @string.size
end

#getch ⇒ Mustermann::StringScanner::ScanResult?

Reads a single character and advances the #position by one.

Returns:

  • (Mustermann::StringScanner::ScanResult, nil)

    the character read, or nil at the end of the string



# File 'lib/mustermann/string_scanner.rb', line 221

def getch
  track_result ScanResult.new(self, @position, 1) unless eos?
end

#peek(length = 1) ⇒ String

Peeks at a number of still unscanned characters without advancing the #position.

Parameters:

  • length (Integer) (defaults to: 1)

    how many characters to look at

Returns:

  • (String)

    the substring



# File 'lib/mustermann/string_scanner.rb', line 264

def peek(length = 1)
  @string[@position, length]
end
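
Since the listing relies on String#[] with a start and a length, its behaviour near the end of the input follows from that method. A minimal sketch on a plain string; peek_at is a hypothetical helper:

```ruby
# Hypothetical helper: same slicing as the listing above, with the
# string and position passed in explicitly.
def peek_at(string, position, length = 1)
  string[position, length]
end
```

Asking for more characters than remain yields a shorter string, and peeking exactly at the end yields an empty string.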

#reset ⇒ Mustermann::StringScanner

Resets the #position to the start and clears all #params.

Returns:

  • (Mustermann::StringScanner)

    the scanner itself



# File 'lib/mustermann/string_scanner.rb', line 140

def reset
  @position = 0
  @params   = {}
  @history  = []
  self
end

#rest ⇒ String

Returns outstanding string not yet matched, empty string at end of input string.

Returns:

  • (String)

    outstanding string not yet matched, empty string at end of input string



# File 'lib/mustermann/string_scanner.rb', line 251

def rest
  @string[@position..-1] || ""
end

#rest_size ⇒ Integer

Returns number of characters remaining to be scanned.

Returns:

  • (Integer)

    number of characters remaining to be scanned



# File 'lib/mustermann/string_scanner.rb', line 256

def rest_size
  @position > size ? 0 : size - @position
end
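
Both #rest and #rest_size guard against a position beyond the end of the string. A minimal sketch of the two expressions on plain values; remaining and remaining_size are hypothetical helper names:

```ruby
# Hypothetical helper: the unscanned tail, or "" when the position is
# past the end (where the range slice returns nil).
def remaining(string, position)
  string[position..-1] || ""
end

# Hypothetical helper: the number of unscanned characters, never negative.
def remaining_size(string, position)
  position > string.size ? 0 : string.size - position
end
```

The `|| ""` fallback matters only for positions past the end, where String#[] with a range returns nil.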

#scan(pattern, **options) ⇒ Mustermann::StringScanner::ScanResult?

Checks if the given pattern matches any substring starting at the current position.

If it does, it advances the current #position to the end of the substring and merges any params parsed from the substring into #params.

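Returns:

  • (Mustermann::StringScanner::ScanResult, nil)

    the match result, or nil if the pattern does not match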



# File 'lib/mustermann/string_scanner.rb', line 161

def scan(pattern, **options)
  track_result check(pattern, **options)
end

#scan_until(pattern, **options) ⇒ Mustermann::StringScanner::ScanResult?

Checks if the given pattern matches any substring starting at any position after the current position.

If it does, it advances the current #position to the end of the substring and merges any params parsed from the substring into #params.

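Returns:

  • (Mustermann::StringScanner::ScanResult, nil)

    the match result, or nil if the pattern does not match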



# File 'lib/mustermann/string_scanner.rb', line 172

def scan_until(pattern, **options)
  result, prefix = check_until_with_prefix(pattern, **options)
  track_result(prefix, result)
end
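
The "until" variants search forward from the current position instead of requiring a match right at it. The standard library's StringScanner has methods of the same names, which this sketch uses to illustrate the idea; it is not Mustermann code, and until_demo is a made-up helper:

```ruby
require 'strscan'

# Hypothetical helper: compares check_until and scan_until on the
# standard library's scanner.
def until_demo(input, regexp)
  scanner = StringScanner.new(input)

  checked         = scanner.check_until(regexp) # position unchanged
  pos_after_check = scanner.pos

  scanned         = scanner.scan_until(regexp)  # skips ahead past the match
  pos_after_scan  = scanner.pos

  { checked: checked, pos_after_check: pos_after_check,
    scanned: scanned, pos_after_scan: pos_after_scan }
end
```

In the standard library the returned text includes the skipped prefix; the listing above similarly tracks the prefix together with the result.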

#size ⇒ Integer

Returns size of the input string.

Returns:

  • (Integer)

    size of the input string



# File 'lib/mustermann/string_scanner.rb', line 286

def size
  @string.size
end

#terminate ⇒ Mustermann::StringScanner

Moves the position to the end of the input string.

Returns:

  • (Mustermann::StringScanner)

    the scanner itself



# File 'lib/mustermann/string_scanner.rb', line 149

def terminate
  track_result ScanResult.new(self, @position, size - @position)
  self
end

#to_h ⇒ Hash

Params from all previous matches from #scan and #scan_until, but not from #check and #check_until. Changes can be reverted with #unscan and it can be completely cleared via #reset.

Returns:

  • (Hash)

    current params



# File 'lib/mustermann/string_scanner.rb', line 274

def to_h
  params.dup
end

#to_s ⇒ String

Returns the input string.

Returns:

  • (String)

    the input string



# File 'lib/mustermann/string_scanner.rb', line 281

def to_s
  @string.dup
end

#unscan ⇒ Mustermann::StringScanner

Reverts the last operation that advanced the position.

Operations advancing the position: #terminate, #scan, #scan_until, #getch.

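Returns:

  • (Mustermann::StringScanner)

    the scanner itself

Raises:

  • (Mustermann::StringScanner::ScanError)

    if there is no previous operation to revert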



# File 'lib/mustermann/string_scanner.rb', line 181

def unscan
  raise ScanError, 'unscan failed: previous match record not exist' if @history.empty?
  previous = @history[0..-2]
  reset
  previous.each { |r| track_result(*r) }
  self
end
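
The listing reverts an operation by resetting and replaying every recorded step except the last. A likely reason is that merged params cannot simply be un-merged, whereas a replay rebuilds both position and params consistently. A minimal sketch of that reset-and-replay technique; ReplayScanner, advance, and Error are hypothetical names, and each step is reduced to a length plus a params hash:

```ruby
# Hypothetical, simplified scanner that keeps a history of steps and
# undoes the latest one by replaying all earlier steps from scratch.
class ReplayScanner
  Error = Class.new(StandardError)

  attr_reader :position, :params

  def initialize
    reset
  end

  # Clears position, params and history.
  def reset
    @position = 0
    @params   = {}
    @history  = []
    self
  end

  # Records a step, then applies it.
  def advance(length, params = {})
    @history << [length, params]
    @position += length
    @params.merge!(params)
    self
  end

  # Drops the last step by replaying all earlier ones after a reset.
  def unscan
    raise Error, "nothing to unscan" if @history.empty?
    previous = @history[0..-2]
    reset
    previous.each { |length, params| advance(length, params) }
    self
  end
end
```

Replaying costs time proportional to the number of earlier steps, but it keeps the undo logic independent of what each step changed.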