Class: SyntaxTree::Parser

Inherits:
Ripper
  • Object
Defined in:
lib/syntax_tree/parser.rb

Overview

Parser is a subclass of Ripper that subscribes to the stream of tokens and nodes emitted during parsing and builds up a syntax tree from them.
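
As a rough illustration of what subscribing to Ripper's event stream looks like, here is a minimal sketch that uses only the standard library. `EventCollector` is a hypothetical class, not part of SyntaxTree; it overrides one scanner event (`on_comment`) and one parser event (`on_def`), much as a Ripper subclass would override many more.

```ruby
require "ripper"

# Hypothetical class for illustration only; it is not part of SyntaxTree.
class EventCollector < Ripper
  attr_reader :comments, :method_names

  def initialize(source, *)
    super
    @comments = []
    @method_names = []
  end

  private

  # Scanner event: fired once for every comment token in the source.
  def on_comment(value)
    @comments << value.chomp
    value
  end

  # Parser event: fired once a `def` has been parsed. The first argument is
  # whatever the identifier handler returned, which by default is the
  # identifier text itself.
  def on_def(name, *_rest)
    @method_names << name
    nil
  end
end

collector = EventCollector.new("# greet\ndef hello; end\n")
collector.parse
```

After `parse` returns, `collector.comments` holds the comment text and `collector.method_names` holds the names of the methods that were defined.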

Defined Under Namespace

Classes: MultiByteString, ParseError, PinVisitor, Semicolon, SingleByteString, TokenList

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(source) ⇒ Parser

Returns a new instance of Parser.



# File 'lib/syntax_tree/parser.rb', line 116

def initialize(source, *)
  super

  # We keep the source around so that we can refer back to it when we're
  # generating the AST. Sometimes it's easier to just reference the source
  # string when you want to check if it contains a certain character, for
  # example.
  @source = source

  # This is the full set of comments that have been found by the parser.
  # It's a running list. At the end of every block of statements, that block
  # will go in and attempt to grab any comments that are on their own line
  # and turn them into regular statements. So at the end of parsing the only
  # comments left in here will be comments on lines that also contain code.
  @comments = []

  # This is the current embdoc (comments that start with =begin and end with
  # =end). Since they can't be nested, there's no need for a stack here, as
  # there can only be one active. These end up getting dumped into the
  # comments list before getting picked up by the statements that surround
  # them.
  @embdoc = nil

  # This is an optional node that can be present if the __END__ keyword is
  # used in the file. In that case, this will represent the content after
  # that keyword.
  @__end__ = nil

  # Heredocs can actually be nested together if you're using interpolation,
  # so this is a stack of heredoc nodes that are currently being created.
  # When we get to the token that finishes off a heredoc node, we pop the
  # top one off. If there are others surrounding it, then the body events
  # will now be added to the correct nodes.
  @heredocs = []

  # This is a running list of tokens that have fired. It's useful mostly for
  # maintaining location information. For example, if you're inside the
  # handler of a def event, then in order to determine where the AST node
  # started, you need to look backward in the tokens to find a def keyword.
  # Most of the time, when a parser event consumes one of these tokens, it
  # will be deleted from the list. So ideally, this list stays pretty short
  # over the course of parsing a source string.
  @tokens = TokenList.new

  # Here we're going to build up a list of SingleByteString or
  # MultiByteString objects. Each one represents a line in the source. They
  # are used by the `char_pos` method to determine where we are in the
  # source string.
  @line_counts = []
  last_index = 0

  @source.each_line do |line|
    @line_counts << if line.size == line.bytesize
      SingleByteString.new(last_index)
    else
      MultiByteString.new(last_index, line)
    end

    last_index += line.size
  end

  # Make sure line counts is filled out with the first and last line at
  # minimum so that it has something to compare against if the parser is in
  # a lineno=2 state for an empty file.
  @line_counts << SingleByteString.new(0) if @line_counts.empty?
  @line_counts << SingleByteString.new(last_index)
end
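
The line-offset bookkeeping at the end of the constructor can be sketched without the library. The snippet below mirrors the `each_line` loop using plain arrays in place of `SingleByteString` and `MultiByteString`. The `char_offset` helper is a hypothetical stand-in for the kind of lookup `char_pos` needs, assuming the parser reports columns in bytes; the real classes may be implemented differently.

```ruby
# Hypothetical sample source; "\u00e9" is one character but two bytes in UTF-8.
source = "a = 1\nb = \"\u00e9\"\nc\n"

line_starts = [] # character offset at which each line begins
multibyte = []   # whether each line contains any multi-byte characters
last_index = 0

source.each_line do |line|
  line_starts << last_index
  # A line is single-byte only when its character and byte counts agree.
  multibyte << (line.size != line.bytesize)
  last_index += line.size
end

# Trailing entry marking the end of the source, as the constructor adds.
line_starts << last_index

# Hypothetical helper: convert a 1-based line number and a byte column into
# a character offset into the whole source string.
def char_offset(source, line_starts, lineno, byte_column)
  line = source.lines[lineno - 1]
  line_starts[lineno - 1] + line.byteslice(0, byte_column).size
end

# Byte column 7 on line 2 is the closing quote that follows the two-byte
# character.
offset = char_offset(source, line_starts, 2, 7)
```

Tracking which lines are multi-byte lets single-byte lines use the byte column directly as a character column, so the conversion is only paid for where it is needed.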

Instance Attribute Details

#comments ⇒ Object (readonly)

Array[ Comment | EmbDoc ]

The list of comments that have been found while parsing the source.

# File 'lib/syntax_tree/parser.rb', line 114

def comments
  @comments
end

#line_counts ⇒ Object (readonly)

Array[ SingleByteString | MultiByteString ]

The list of objects that represent the start of each line in character offsets.

# File 'lib/syntax_tree/parser.rb', line 105

def line_counts
  @line_counts
end

#source ⇒ Object (readonly)

String

The source being parsed.

# File 'lib/syntax_tree/parser.rb', line 101

def source
  @source
end

#tokens ⇒ Object (readonly)

Array[ untyped ]

A running list of tokens that have been found in the source. This list changes a lot as certain nodes will “consume” these tokens to determine their bounds.

# File 'lib/syntax_tree/parser.rb', line 110

def tokens
  @tokens
end
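
To make the idea of "consuming" tokens concrete, here is a small self-contained sketch. `Token` and `consume_keyword` are hypothetical names invented for this illustration and are not SyntaxTree's API; the sketch only shows the general pattern of searching backward for a keyword, removing it from the running list, and using its position to work out where a node begins and ends.

```ruby
# Hypothetical token record: a type, the source text, and a start offset.
Token = Struct.new(:type, :value, :offset)

# Tokens that a scanner might have recorded for "def hello; end".
tokens = [
  Token.new(:kw, "def", 0),
  Token.new(:ident, "hello", 4),
  Token.new(:kw, "end", 11)
]

# Hypothetical helper: find the most recent keyword with the given text,
# delete it from the list, and return it.
def consume_keyword(tokens, value)
  index = tokens.rindex { |token| token.type == :kw && token.value == value }
  raise ArgumentError, "no #{value} keyword found" unless index

  tokens.delete_at(index)
end

start_token = consume_keyword(tokens, "def")
end_token = consume_keyword(tokens, "end")

# The node spans from the start of `def` to the end of `end`.
bounds = start_token.offset...(end_token.offset + end_token.value.size)
```

Because each keyword is deleted as it is consumed, only the identifier token remains in `tokens` afterwards, which is why the running list tends to stay short.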