Class: Ast::Tokeniser

Inherits: Object
Defined in:
lib/ast_ast/tokeniser.rb

Defined Under Namespace

Classes: Rule

Class Method Summary

Class Method Details

.missing(&block) ⇒ Object

Defines a block to run when no match is found. As with .token, the block should return a token instance. The block is passed only a single character at a time.

Examples:


missing do |i|
  Ast::Token.new(i, i)
end
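
If no missing block has been defined, .tokenise simply discards any character that no rule matches (the scanner still advances past it).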


# File 'lib/ast_ast/tokeniser.rb', line 127

def self.missing(&block)
  @missing ||= block
end

.rule(name, regex, &block) ⇒ Object

Creates a new Rule and adds it to the @rules list.
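
Examples:

An illustrative pair of rules, written inside a subclass of Ast::Tokeniser (the names and patterns here are hypothetical, not from the library itself):


rule :word, /[a-z]+/

rule :number, /[0-9]+/ do |i|
  i.to_i
end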

Parameters:

  • name (Symbol)
  • regex (Regexp)


# File 'lib/ast_ast/tokeniser.rb', line 69

def self.rule(name, regex, &block)
  @rules ||= []
  # make rules with same name overwrite first rule
  @rules.delete_if {|i| i.name == name}
  
  # Create default block which just returns a value
  block ||= Proc.new {|i| i}
  # Make sure to return a token
  proc = Proc.new {|_i| 
    block_result = block.call(_i)
    if block_result.is_a? Array
      r = []
      block_result.each do |j|
        r << Ast::Token.new(name, j)
      end
      r
    else
      Ast::Token.new(name, block_result) 
    end
  }
  @rules << Rule.new(name, regex, proc)
end

.rules ⇒ Array

Returns Rules that have been defined.

Returns:

  • (Array)

    Rules that have been defined.



# File 'lib/ast_ast/tokeniser.rb', line 95

def self.rules; @rules; end

.token(regex, &block) ⇒ Object

Creates a new token rule; that is, the block should return an Ast::Token instance.

Examples:


keywords = ['def', 'next', 'while', 'end']

token /[a-z]+/ do |i|
  if keywords.include?(i)
    Ast::Token.new(:keyword, i)
  else
    Ast::Token.new(:word, i)
  end
end

Parameters:

  • regex (Regexp)


# File 'lib/ast_ast/tokeniser.rb', line 112

def self.token(regex, &block)
  @rules ||= []
  @rules << Rule.new(nil, regex, block)
end

.tokenise(input) ⇒ Tokens

Takes the input string and scans it using the rules that have been defined.
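
Examples:

A minimal sketch of a tokeniser subclass and a call to .tokenise (the class name and rules here are hypothetical):


class MyTokeniser < Ast::Tokeniser
  rule :word,  /[a-z]+/
  rule :space, /\s+/
end

MyTokeniser.tokenise("hello world")
# => a Tokens collection holding Ast::Token instances for "hello", " " and "world"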

Parameters:

  • input (String)

    string to scan.

Returns:

  • (Tokens)


# File 'lib/ast_ast/tokeniser.rb', line 138

def self.tokenise(input)
  @rules ||= []
  @scanner = StringScanner.new(input)
  
  result = Tokens.new
  until @scanner.eos?
    m = false # keep track of matches
    @rules.each do |i|
      a = @scanner.scan(i.regex)
      unless a.nil?
        m = true # match happened
        ran = i.run(a)
        # split array into separate tokens, *not* values
        if ran.is_a? Array
          #ran.each {|a| result << [i.name, a]}
          ran.each {|a| result << a }
        else
          #result << [i.name, ran]
          result << ran
        end
      end
    end
    unless m # if no match happened
      # obviously no rule matches this so invoke missing if it exists
      ch = @scanner.getch # this advances pointer as well
      if @missing
        result << @missing.call(ch)
      end
    end
  end
  result
end