Class: ANTLR3::StringStream

Inherits:
Object
Includes:
CharacterStream
Defined in:
lib/antlr3/streams.rb

Overview

A StringStream's purpose is to wrap the basic, naked text input of a recognition system. Like all other stream types, it provides serial navigation of the input; a recognizer can step forward and backward through the stream's symbols as it requires. StringStream and its subclasses are the main way to feed text input into an ANTLR Lexer for token processing.

The stream's symbols of interest, of course, are character values. Thus, the #peek method returns the integer character value at look-ahead position k, and the #look method returns the same character as a String. A StringStream also tracks various pieces of information, such as the line and column numbers at the current position.
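The difference between the two return types can be seen without loading the runtime at all. The sketch below uses two hypothetical helper methods (`peek_like` and `look_like` are illustrative names, not library API) that read one position of a plain Ruby String in both ways.

```ruby
# Hypothetical helpers for illustration; they are not part of ANTLR3.
# peek_like mirrors the integer view, look_like mirrors the String view.
def peek_like( text, position )
  text.codepoints.to_a[ position ]   # Integer character value
end

def look_like( text, position )
  text[ position ]                   # one-character String
end

p peek_like( "ab", 0 )   # prints 97
p look_like( "ab", 0 )   # prints "a"
```

A lexer rule that compares against character codes would want the integer form, while code that builds up token text would more naturally use the String form.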

Note About Text Encoding

This version of the runtime library primarily targets Ruby 1.8, which does not have strong built-in support for multi-byte character encodings. Thus, characters are assumed to be represented by a single byte – an integer between 0 and 255. Ruby 1.9 does provide built-in encoding support for multi-byte characters, but currently this library does not provide any streams to handle non-ASCII encodings. However, encoding-savvy recognition code is a future development goal for this project.
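As a small illustration of why the single-byte assumption matters, the sketch below compares byte and codepoint counts for a string that contains one non-ASCII character. It relies only on core String methods available in Ruby 1.9 and later; `byte_and_codepoint_counts` is a made-up helper name.

```ruby
# Hypothetical helper: reports how many bytes and how many codepoints a
# string has. The two differ as soon as a multi-byte character appears.
def byte_and_codepoint_counts( text )
  [ text.bytes.to_a.length, text.codepoints.to_a.length ]
end

ascii    = "ne"
accented = "n\u00E9"   # "n" followed by e-acute, written as an escape

p byte_and_codepoint_counts( ascii )      # prints [2, 2]
p byte_and_codepoint_counts( accented )   # prints [3, 2]
```

Code that indexes input by byte would see three symbols in the second string, whereas code that indexes by codepoint would see two.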

Constant Summary

NEWLINE =
?\n.ord

Constants included from Constants

Constants::BUILT_IN_TOKEN_NAMES, Constants::DEFAULT, Constants::DOWN, Constants::EOF, Constants::EOF_TOKEN, Constants::EOR_TOKEN_TYPE, Constants::HIDDEN, Constants::INVALID, Constants::INVALID_TOKEN, Constants::MEMO_RULE_FAILED, Constants::MEMO_RULE_UNKNOWN, Constants::MIN_TOKEN_TYPE, Constants::SKIP_TOKEN, Constants::UP

Constructor Details

#initialize(data, options = {}) ⇒ StringStream


# File 'lib/antlr3/streams.rb', line 397

def initialize( data, options = {} )      # for 1.9
  @string   = data.to_s.encode( Encoding::UTF_8 ).freeze
  @data     = @string.codepoints.to_a.freeze
  @position = options.fetch :position, 0
  @line     = options.fetch :line, 1
  @column   = options.fetch :column, 0
  @markers  = []
  @name   ||= options[ :file ] || options[ :name ] # || '(string)'
  mark
end

Instance Attribute Details

#column ⇒ Object (readonly)

the current character position within the current line, indexed upward from 0

# File 'lib/antlr3/streams.rb', line 378

def column
  @column
end

#data ⇒ Object (readonly)

the content wrapped by the stream, held as a frozen Array of integer codepoints (the original text is available through #string)

# File 'lib/antlr3/streams.rb', line 385

def data
  @data
end

#line ⇒ Object (readonly)

the current line number of the input, indexed upward from 1

# File 'lib/antlr3/streams.rb', line 375

def line
  @line
end

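#name ⇒ Object Also known as: source_name

the name associated with the stream – usually a file name. The "(string)" fallback is commented out in the constructor listed above, so the name stays nil unless a :file or :name option is given or @name was already set.

# File 'lib/antlr3/streams.rb', line 382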

def name
  @name
end

#position ⇒ Object (readonly) Also known as: index, character_index

current integer character index of the stream

# File 'lib/antlr3/streams.rb', line 372

def position
  @position
end

#string ⇒ Object (readonly)

the text wrapped by the stream, as a frozen UTF-8 String

# File 'lib/antlr3/streams.rb', line 386

def string
  @string
end

Instance Method Details

#<<(k) ⇒ Object

operator style look-behind


# File 'lib/antlr3/streams.rb', line 521

def <<( k )
  look( -k )   # delegate to #look (aliased as >>); calling self << -k here would recurse forever
end

#[](start, *args) ⇒ Object

identical to String#[]


# File 'lib/antlr3/streams.rb', line 659

def []( start, *args )
  @string[ start, *args ]
end

#beginning_of_line? ⇒ Boolean

Returns true if the stream appears to be at the beginning of a new line. This is an extra utility method for use inside lexer actions if needed.

# File 'lib/antlr3/streams.rb', line 534

def beginning_of_line?
  @position.zero? or @data[ @position - 1 ] == NEWLINE
end
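The check above can be tried outside the runtime with a stand-in that takes the codepoint Array and the cursor position as explicit arguments. `line_start?` is a hypothetical name used only for this sketch.

```ruby
# Hypothetical stand-in for the check above. data is an Array of integer
# codepoints and position is the cursor index into that Array.
def line_start?( data, position )
  position.zero? || data[ position - 1 ] == "\n".ord
end

data = "a\nb".codepoints.to_a
p line_start?( data, 0 )   # prints true  (start of input)
p line_start?( data, 1 )   # prints false (previous character is "a")
p line_start?( data, 2 )   # prints true  (previous character is a newline)
```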

#beginning_of_string? ⇒ Boolean Also known as: bof?

Returns true if the stream appears to be at the beginning of a stream (position = 0). This is an extra utility method for use inside lexer actions if needed.

# File 'lib/antlr3/streams.rb', line 558

def beginning_of_string?
  @position == 0
end

#consume ⇒ Object

advance the stream by one character; returns the integer value of the character consumed, or EOF if the stream is already exhausted

# File 'lib/antlr3/streams.rb', line 478

def consume
  c = @data[ @position ] || EOF
  if @position < @data.length
    @column += 1
    if c == NEWLINE
      @line += 1
      @column = 0
    end
    @position += 1
  end
  return( c )
end
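The line and column bookkeeping in #consume is easiest to follow on a small input. The class below is a minimal stand-in that copies only the logic listed above; `ConsumeSketch` is a hypothetical name, and the EOF value of -1 is an assumption made for this sketch.

```ruby
# ConsumeSketch is a hypothetical stand-in, not ANTLR3::StringStream.
class ConsumeSketch
  NEWLINE = "\n".ord
  EOF     = -1   # assumed sentinel value for this sketch

  attr_reader :position, :line, :column

  def initialize( text )
    @data = text.codepoints.to_a
    @position, @line, @column = 0, 1, 0
  end

  # Same steps as the listing above: bump the column, reset it and bump
  # the line after a newline, then move the cursor forward.
  def consume
    c = @data[ @position ] || EOF
    if @position < @data.length
      @column += 1
      if c == NEWLINE
        @line += 1
        @column = 0
      end
      @position += 1
    end
    c
  end
end

stream = ConsumeSketch.new( "a\nb" )
3.times { stream.consume }
puts "#{ stream.line }:#{ stream.column }"   # prints 2:1
p stream.consume                             # prints -1, position stays at 3
```

Consuming past the end leaves position, line, and column untouched, which is why the final call can be repeated safely.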

#end_of_line? ⇒ Boolean

Returns true if the character at the current position is a newline, i.e. the stream appears to be at the end of a line. This is an extra utility method for use inside lexer actions if needed.

# File 'lib/antlr3/streams.rb', line 542

def end_of_line?
  @data[ @position ] == NEWLINE #if @position < @data.length
end

#end_of_string? ⇒ Boolean Also known as: eof?

Returns true if the stream has been exhausted. This is an extra utility method for use inside lexer actions if needed.

# File 'lib/antlr3/streams.rb', line 550

def end_of_string?
  @position >= @data.length
end

#inspect(before_chars = 6, after_chars = 10) ⇒ Object

customized object inspection that shows:

  • the stream class

  • the stream's location in index / line:column format

  • before_chars characters before the cursor (6 characters by default)

  • after_chars characters after the cursor (10 characters by default)


# File 'lib/antlr3/streams.rb', line 638

def inspect( before_chars = 6, after_chars = 10 )
  before = through( -before_chars ).inspect
  @position - before_chars > 0 and before.insert( 0, '... ' )
  
  after = through( after_chars ).inspect
  @position + after_chars + 1 < @data.length and after << ' ...'
  
  location = "#@position / line #@line:#@column"
  "#<#{ self.class }: #{ before } | #{ after } @ #{ location }>"
end

#last_marker ⇒ Object

the last marker value created by a call to #mark

# File 'lib/antlr3/streams.rb', line 597

def last_marker
  @markers.length - 1
end

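#look(k = 1) ⇒ Object Also known as: >>

return the character at look-ahead distance k as a String, with k interpreted as in #peek; returns nil if the resulting index falls outside the string

# File 'lib/antlr3/streams.rb', line 411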

def look( k = 1 )               # for 1.9
  k == 0 and return nil
  k += 1 if k < 0
  
  index = @position + k - 1
  index < 0 and return nil
  
  @string[ index ]
end

#mark ⇒ Object

record the current stream location parameters in the stream's marker table and return an integer-valued bookmark that may be used to restore the stream's position with the #rewind method. This method is used to implement backtracking.

# File 'lib/antlr3/streams.rb', line 570

def mark
  state = [ @position, @line, @column ].freeze
  @markers << state
  return @markers.length - 1
end

#mark_depth ⇒ Object

the total number of markers currently in existence

# File 'lib/antlr3/streams.rb', line 590

def mark_depth
  @markers.length
end

#peek(k = 1) ⇒ Object

return the character at look-ahead distance k as an integer. k = 1 represents the current character. k greater than 1 represents upcoming characters. A negative value of k returns previous characters consumed, where k = -1 is the last character consumed. k = 0 has undefined behavior and returns nil


# File 'lib/antlr3/streams.rb', line 497

def peek( k = 1 )
  k == 0 and return nil
  k += 1 if k < 0
  index = @position + k - 1
  index < 0 and return nil
  @data[ index ] or EOF
end
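The meaning of positive, negative, and zero k is easier to see with concrete values. The sketch below reproduces the index arithmetic from the listing above as a free-standing method; `peek_sketch` is a hypothetical name, and the EOF value of -1 is an assumption passed in as a default argument.

```ruby
# Hypothetical stand-in for #peek. data is an Array of codepoints,
# position is the cursor, and eof is the assumed end-of-input sentinel.
def peek_sketch( data, position, k = 1, eof = -1 )
  return nil if k == 0          # k = 0 is not a meaningful distance
  k += 1 if k < 0               # shift so that k = -1 is the last consumed character
  index = position + k - 1
  return nil if index < 0       # nothing exists before the start of input
  data[ index ] || eof
end

data = "abc".codepoints.to_a    # cursor at 1 means "a" was consumed, "b" is current
p peek_sketch( data, 1, 1 )     # prints 98  ("b", the current character)
p peek_sketch( data, 1, 2 )     # prints 99  ("c", one character ahead)
p peek_sketch( data, 1, -1 )    # prints 97  ("a", the last character consumed)
p peek_sketch( data, 1, -2 )    # prints nil (before the start of input)
p peek_sketch( data, 1, 3 )     # prints -1  (past the end of input)
```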

#release(marker = @markers.length - 1) ⇒ Object

let go of the bookmark data for the marker and all marker values created after the marker.


# File 'lib/antlr3/streams.rb', line 605

def release( marker = @markers.length - 1 )
  marker.between?( 1, @markers.length - 1 ) or return
  @markers.pop( @markers.length - marker )
  return self
end

#reset ⇒ Object

rewinds the stream back to the start and clears out any existing marker entries

# File 'lib/antlr3/streams.rb', line 467

def reset
  initial_location = @markers.first
  @position, @line, @column = initial_location
  @markers.clear
  @markers << initial_location
  return self
end

#rewind(marker = @markers.length - 1, release = true) ⇒ Object

restore the stream to an earlier location recorded by #mark. If no marker value is provided, the last marker generated by #mark will be used.


# File 'lib/antlr3/streams.rb', line 580

def rewind( marker = @markers.length - 1, release = true )
  ( marker >= 0 and location = @markers[ marker ] ) or return( self )
  @position, @line, @column = location
  release( marker ) if release
  return self
end
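How #mark, #rewind, and #release cooperate during backtracking can be sketched with a stand-in that keeps only the marker table. `MarkSketch` and its `advance` helper are hypothetical; the second parameter of `rewind` is renamed to `drop` here purely to avoid sharing a name with the `release` method.

```ruby
# MarkSketch is a hypothetical stand-in, not ANTLR3::StringStream.
class MarkSketch
  attr_reader :position

  def initialize
    @position, @line, @column = 0, 1, 0
    @markers = []
    mark                        # slot 0 records the initial location
  end

  # Hypothetical helper that moves the cursor without reading any input.
  def advance( count )
    @position += count
    @column   += count
    self
  end

  def mark
    @markers << [ @position, @line, @column ].freeze
    @markers.length - 1
  end

  def rewind( marker = @markers.length - 1, drop = true )
    location = ( marker >= 0 ) ? @markers[ marker ] : nil
    return self unless location
    @position, @line, @column = location
    release( marker ) if drop
    self
  end

  def release( marker = @markers.length - 1 )
    return unless marker.between?( 1, @markers.length - 1 )
    @markers.pop( @markers.length - marker )
    self
  end

  def mark_depth
    @markers.length
  end
end

stream = MarkSketch.new
stream.advance( 3 )
bookmark = stream.mark      # bookmark is 1; slot 0 holds the start
stream.advance( 4 )         # speculative progress, position is now 7
stream.rewind( bookmark )   # back to position 3, bookmark released
p stream.position           # prints 3
p stream.mark_depth         # prints 1
```

Because marker 0 falls outside the range accepted by `release`, the initial location survives every rewind, which is what allows #reset to restore it later.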

#seek(index) ⇒ Object

jump to the absolute position value given by index. note: if index is before the current position, the line and column attributes of the stream will probably be incorrect

# File 'lib/antlr3/streams.rb', line 616

def seek( index )
  index = index.bound( 0, @data.length )  # ensures index is within the stream's range
  if index > @position
    skipped = through( index - @position )
    if lc = skipped.count( "\n" ) and lc.zero?
      @column += skipped.length
    else
      @line += lc
      @column = skipped.length - skipped.rindex( "\n" ) - 1
    end
  end
  @position = index
  return nil
end
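The forward and backward cases of #seek behave differently, which the sketch below tries to show. `SeekSketch` is a hypothetical stand-in; it uses core `Comparable#clamp` (Ruby 2.4 and later) in place of the runtime's `bound` helper, on the assumption that `bound` simply restricts the index to the given range as the comment in the listing suggests.

```ruby
# SeekSketch is a hypothetical stand-in, not ANTLR3::StringStream.
class SeekSketch
  attr_reader :position, :line, :column

  def initialize( text )
    @string = text
    @position, @line, @column = 0, 1, 0
  end

  def seek( index )
    index = index.clamp( 0, @string.length )   # assumed equivalent of bound
    if index > @position
      skipped  = @string[ @position, index - @position ]
      newlines = skipped.count( "\n" )
      if newlines.zero?
        @column += skipped.length
      else
        @line  += newlines
        # characters that follow the last newline in the skipped text
        @column = skipped.length - skipped.rindex( "\n" ) - 1
      end
    end
    @position = index
    nil
  end
end

stream = SeekSketch.new( "ab\ncde\nf" )
stream.seek( 5 )
puts "#{ stream.position } #{ stream.line }:#{ stream.column }"   # prints 5 2:2
stream.seek( 1 )                                                  # backward jump
puts "#{ stream.position } #{ stream.line }:#{ stream.column }"   # prints 1 2:2
```

After the backward jump the position is correct but the line and column still describe index 5, which is the inaccuracy the note above warns about.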

#size ⇒ Object Also known as: length

# File 'lib/antlr3/streams.rb', line 458

def size
  @data.length
end

#substring(start, stop) ⇒ Object

return the string slice between positions start and stop, inclusive of both ends

# File 'lib/antlr3/streams.rb', line 652

def substring( start, stop )
  @string[ start, stop - start + 1 ]
end
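Since the slice length is computed as stop - start + 1, both end positions are included. A free-standing version of the same expression (`substring_sketch` is a hypothetical name) makes this easy to check:

```ruby
# Hypothetical stand-in for #substring operating on a plain String.
def substring_sketch( string, start, stop )
  string[ start, stop - start + 1 ]   # length covers start through stop
end

p substring_sketch( "hello world", 0, 4 )   # prints "hello"
p substring_sketch( "hello world", 6, 6 )   # prints "w"
```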

#through(k) ⇒ Object

return a substring around the stream cursor at a distance k. If k >= 0, return the next k characters; if k < 0, return the previous |k| characters.

# File 'lib/antlr3/streams.rb', line 510

def through( k )
  if k >= 0 then @string[ @position, k ] else
    start = ( @position + k ).at_least( 0 ) # start cannot be negative or index will wrap around
    @string[ start ... @position ]
  end
end
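The two directions of #through can be tried with a stand-in that takes the string and cursor explicitly. `through_sketch` is a hypothetical name, and `[ ..., 0 ].max` is used in place of the runtime's `at_least` helper on the assumption that `at_least( 0 )` only prevents a negative start index.

```ruby
# Hypothetical stand-in for #through operating on a plain String.
def through_sketch( string, position, k )
  if k >= 0
    string[ position, k ]                 # the next k characters
  else
    start = [ position + k, 0 ].max       # never reach before the start of input
    string[ start...position ]            # the previous |k| characters
  end
end

text = "hello world"
p through_sketch( text, 5, 3 )     # prints " wo"
p through_sketch( text, 5, -3 )    # prints "llo"
p through_sketch( text, 2, -10 )   # prints "he" (clipped at the start)
```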