Class: Rley::RGN::Tokenizer

Inherits:
Object
Defined in:
lib/rley/rgn/tokenizer.rb

Overview

A tokenizer for the Rley notation language. Responsibility: break input into a sequence of token objects. The tokenizer should recognize:

  • Identifiers
  • Number literals, including single digits
  • String literals (quote-delimited)
  • Delimiters, e.g. parentheses '(' and ')'
  • Separators, e.g. comma

Constant Summary

PATT_KEY =
/[a-zA-Z_][a-zA-Z_0-9]*:/.freeze
PATT_INTEGER =
/\d+/.freeze
PATT_NEWLINE =
/(?:\r\n)|\r|\n/.freeze
PATT_STRING_START =
/"|'/.freeze
PATT_SYMBOL =
/[^?*+,:(){}\s]+/.freeze
PATT_WHITESPACE =
/[ \t\f]+/.freeze
Lexeme2name =

Maps each special lexeme of one or two characters to its token name.

{
  '(' => 'LEFT_PAREN',
  ')' => 'RIGHT_PAREN',
  '{' => 'LEFT_BRACE',
  '}' => 'RIGHT_BRACE',
  ',' => 'COMMA',
  '+' => 'PLUS',
  '?' => 'QUESTION_MARK',
  '*' => 'STAR',
  '..' => 'ELLIPSIS'
}.freeze
@@keywords =

All the Rley notation keywords implemented so far.

%w[
  match_closest repeat
].to_h { |x| [x, x] }
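
To make the constants above more concrete, here is a minimal sketch of how such patterns can drive a `StringScanner`. The module `RgnSketch`, the method `classify`, and the labels `KEY`, `INTEGER`, `SYMBOL` and `UNKNOWN` are hypothetical names chosen for illustration; the order in which the patterns are tried is also an assumption, not necessarily the order used by `Rley::RGN::Tokenizer`.

```ruby
require 'strscan'

# Hypothetical illustration module; not part of Rley.
module RgnSketch
  # Patterns and lexeme table copied from the constant summary above.
  PATT_KEY = /[a-zA-Z_][a-zA-Z_0-9]*:/.freeze
  PATT_INTEGER = /\d+/.freeze
  PATT_SYMBOL = /[^?*+,:(){}\s]+/.freeze
  PATT_WHITESPACE = /[ \t\f]+/.freeze
  LEXEME2NAME = {
    '(' => 'LEFT_PAREN',
    ')' => 'RIGHT_PAREN',
    '{' => 'LEFT_BRACE',
    '}' => 'RIGHT_BRACE',
    ',' => 'COMMA',
    '+' => 'PLUS',
    '?' => 'QUESTION_MARK',
    '*' => 'STAR',
    '..' => 'ELLIPSIS'
  }.freeze

  module_function

  # Return [label, lexeme] pairs for the given text.
  # Assumed matching order: key, integer, symbol, then single characters.
  def classify(text)
    scanner = StringScanner.new(text)
    result = []
    until scanner.eos?
      next if scanner.skip(PATT_WHITESPACE)

      if (lexeme = scanner.scan(PATT_KEY))
        result << ['KEY', lexeme]
      elsif (lexeme = scanner.scan(PATT_INTEGER))
        result << ['INTEGER', lexeme]
      elsif (lexeme = scanner.scan(PATT_SYMBOL))
        # In this sketch '..' also lands here, since PATT_SYMBOL accepts dots.
        result << ['SYMBOL', lexeme]
      else
        # Consume exactly one character so the loop always makes progress.
        lexeme = scanner.getch
        result << [LEXEME2NAME.fetch(lexeme, 'UNKNOWN'), lexeme]
      end
    end
    result
  end
end

pairs = RgnSketch.classify('repeat: 2 (digit+)')
pairs.each { |label, lexeme| puts "#{label} #{lexeme}" }
```

Trying `PATT_INTEGER` before `PATT_SYMBOL` matters in this sketch, because `PATT_SYMBOL` also accepts digits and would otherwise swallow every number.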


Constructor Details

#initialize(source = nil) ⇒ Tokenizer

Constructor. Initialize a tokenizer for RGN input.

Parameters:

  • source (String) (defaults to: nil)

    RGN text to tokenize.



# File 'lib/rley/rgn/tokenizer.rb', line 53

def initialize(source = nil)
  reset
  input = source || ''
  @scanner = StringScanner.new(input)
end

Instance Attribute Details

#line_start ⇒ Integer (readonly)

Returns the position of the most recent start of line in the input.

Returns:

  • (Integer)

    Position of last start of line in the input



# File 'lib/rley/rgn/tokenizer.rb', line 31

def line_start
  @line_start
end

#lineno ⇒ Integer (readonly)

Returns the current line number.

Returns:

  • (Integer)

    The current line number



# File 'lib/rley/rgn/tokenizer.rb', line 28

def lineno
  @lineno
end

#scanner ⇒ StringScanner (readonly)

Returns the low-level input scanner.

Returns:

  • (StringScanner)

    Low-level input scanner



# File 'lib/rley/rgn/tokenizer.rb', line 25

def scanner
  @scanner
end

Instance Method Details

#start_with(source) ⇒ Object

Reset the tokenizer and make the given text the current input.

Parameters:

  • source (String)

    RGN text to tokenize.



# File 'lib/rley/rgn/tokenizer.rb', line 61

def start_with(source)
  reset
  @scanner.string = source
end
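
The constructor and `start_with` both lean on standard `StringScanner` behavior: a `nil` source is replaced by an empty string, and assigning a new string to the scanner rewinds it. The sketch below shows that behavior with a bare `StringScanner`, without Rley itself; the variable names are illustrative only.

```ruby
require 'strscan'

# Same fallback as in the constructor: a nil source becomes an empty string.
source = nil
scanner = StringScanner.new(source || '')
empty_at_start = scanner.eos?

# Consume part of a first input.
scanner.string = 'first input'
scanner.scan(/\w+/)
pos_after_scan = scanner.pos

# Assigning a new string resets the scan position to the beginning.
scanner.string = 'second input'
pos_after_swap = scanner.pos

puts empty_at_start
puts pos_after_scan
puts pos_after_swap
```

Because the scan position is rewound automatically, `start_with` presumably only needs `reset` to restore the tokenizer's own bookkeeping, such as the line counters.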

#tokens ⇒ Array<Rley::Lexical::Token>

Scan the source and return an array of tokens.

Returns:

  • (Array<Rley::Lexical::Token>)



# File 'lib/rley/rgn/tokenizer.rb', line 68

def tokens
  tok_sequence = []
  until @scanner.eos?
    token = _next_token
    tok_sequence << token unless token.nil?
  end

  tok_sequence
end
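
The loop in `#tokens` skips `nil` results, which suggests that `_next_token` may return `nil` when it only consumes non-token input. The following is a minimal, self-contained sketch of that pattern, not Rley's implementation: `TokenizerSketch`, `TokenSketch` and the body of `next_token` are hypothetical, and the way `lineno` and `line_start` are updated is an assumption based on their descriptions above.

```ruby
require 'strscan'

# Hypothetical stand-in for the tokenizer; not part of Rley.
class TokenizerSketch
  # Simplified token: the text plus its (assumed) line and column.
  TokenSketch = Struct.new(:lexeme, :line, :column)

  attr_reader :scanner, :lineno, :line_start

  def initialize(source = nil)
    @lineno = 1
    @line_start = 0
    @scanner = StringScanner.new(source || '')
  end

  # Same loop shape as the documented #tokens method.
  def tokens
    tok_sequence = []
    until @scanner.eos?
      token = next_token
      tok_sequence << token unless token.nil?
    end

    tok_sequence
  end

  private

  # Assumed behavior: newlines and blanks yield nil, anything else a token.
  def next_token
    if @scanner.skip(/(?:\r\n)|\r|\n/)
      @lineno += 1
      @line_start = @scanner.pos
      return nil
    end
    return nil if @scanner.skip(/[ \t\f]+/)

    column = @scanner.pos - @line_start + 1
    # Fall back to a single character so the scanner always advances.
    lexeme = @scanner.scan(/\S+/) || @scanner.getch
    TokenSketch.new(lexeme, @lineno, column)
  end
end

sketch = TokenizerSketch.new("a b\n  c")
result = sketch.tokens
result.each { |tok| puts "#{tok.lexeme} at #{tok.line}:#{tok.column}" }
```

Keeping `line_start` as an absolute scanner position lets a column be derived by subtraction, which is one plausible reason for exposing it alongside `lineno`.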