Class: SyntaxTree

Inherits:
Ripper
  • Object
Defined in:
lib/syntax_tree.rb,
lib/syntax_tree/cli.rb,
lib/syntax_tree/version.rb

Defined Under Namespace

Modules: AssignFormatting, CLI, ContainsAssignment, HashKeyFormatter, Parentheses, Quotes, RemoveBreaks

Classes: ARef, ARefField, Alias, ArgBlock, ArgParen, ArgStar, Args, ArgsForward, ArrayLiteral, AryPtn, Assign, Assoc, AssocSplat, BEGINBlock, Backref, Backtick, BareAssocHash, Begin, Binary, BlockArg, BlockFormatter, BlockVar, BodyStmt, BraceBlock, Break, CHAR, CVar, Call, CallOperatorFormatter, Case, ClassDeclaration, Comma, Command, CommandCall, Comment, ConditionalFormatter, ConditionalModFormatter, Const, ConstPathField, ConstPathRef, ConstRef, Def, DefEndless, Defined, Defs, DoBlock, Dot2, Dot3, DotFormatter, DynaSymbol, ENDBlock, Else, Elsif, EmbDoc, EmbExprBeg, EmbExprEnd, EmbVar, EndContent, Ensure, ExcessedComma, FCall, Field, FloatLiteral, FlowControlFormatter, FndPtn, For, Formatter, GVar, HashLiteral, Heredoc, HeredocBeg, HshPtn, IVar, Ident, If, IfMod, IfOp, Imaginary, In, Int, Kw, KwRestParam, LBrace, LBracket, LParen, Label, LabelEnd, Lambda, Location, LoopFormatter, MAssign, MLHS, MLHSParen, MRHS, MethodAddBlock, ModuleDeclaration, MultiByteString, Next, Not, Op, OpAssign, Params, Paren, ParseError, Period, Program, QSymbols, QSymbolsBeg, QWords, QWordsBeg, RAssign, RBrace, RBracket, RParen, RationalLiteral, Redo, RegexpBeg, RegexpContent, RegexpEnd, RegexpLiteral, Rescue, RescueEx, RescueMod, RestParam, Retry, Return, Return0, SClass, SingleByteString, Statements, StringConcat, StringContent, StringDVar, StringEmbExpr, StringLiteral, Super, SymBeg, SymbolContent, SymbolLiteral, Symbols, SymbolsBeg, TLamBeg, TLambda, TStringBeg, TStringContent, TStringEnd, TopConstField, TopConstRef, Unary, Undef, Unless, UnlessMod, Until, UntilMod, VCall, VarAlias, VarField, VarRef, VoidStmt, When, While, WhileMod, Word, Words, WordsBeg, XString, XStringLiteral, Yield, Yield0, ZSuper

Constant Summary

VERSION = "1.1.1"

Constructor Details

#initialize(source) ⇒ SyntaxTree

Returns a new instance of SyntaxTree.



# File 'lib/syntax_tree.rb', line 207

def initialize(source, *)
  super

  # We keep the source around so that we can refer back to it when we're
  # generating the AST. Sometimes it's easier to just reference the source
  # string when you want to check if it contains a certain character, for
  # example.
  @source = source

  # Similarly, we keep the lines of the source string around to be able to
  # check if certain lines contain certain characters. For example, we'll use
  # this to generate the content that goes after the __END__ keyword. Or we'll
  # use this to check if a comment has other content on its line.
  @lines = source.split(/\r?\n/)

  # This is the full set of comments that have been found by the parser. It's
  # a running list. At the end of every block of statements, the block will go
  # in and attempt to grab any comments that are on their own line and turn
  # them into regular statements. So at the end of parsing the only comments
  # left in here will be comments on lines that also contain code.
  @comments = []

  # This is the current embdoc (comments that start with =begin and end with
  # =end). Since they can't be nested, there's no need for a stack here, as
  # there can only be one active. These end up getting dumped into the
  # comments list before getting picked up by the statements that surround
  # them.
  @embdoc = nil

  # This is an optional node that can be present if the __END__ keyword is
  # used in the file. In that case, this will represent the content after that
  # keyword.
  @__end__ = nil

  # Heredocs can actually be nested together if you're using interpolation, so
  # this is a stack of heredoc nodes that are currently being created. When we
  # get to the token that finishes off a heredoc node, we pop the top
  # one off. If there are others surrounding it, then the body events will now
  # be added to the correct nodes.
  @heredocs = []

  # This is a running list of tokens that have fired. It's useful
  # mostly for maintaining location information. For example, if you're inside
  # the handler of a def event, then in order to determine where the AST node
  # started, you need to look backward in the tokens to find a def
  # keyword. Most of the time, when a parser event consumes one of these
  # tokens, it will be deleted from the list. So ideally, this list stays
  # pretty short over the course of parsing a source string.
  @tokens = []

  # Here we're going to build up a list of SingleByteString or MultiByteString
  # objects. They're each going to represent a line in the source. They are
  # used by the `char_pos` method to determine where we are in the source
  # string.
  @line_counts = []
  last_index = 0

  @source.lines.each do |line|
    if line.size == line.bytesize
      @line_counts << SingleByteString.new(last_index)
    else
      @line_counts << MultiByteString.new(last_index, line)
    end

    last_index += line.size
  end

  # Make sure line counts is filled out with the first and last line at
  # minimum so that it has something to compare against if the parser is in a
  # lineno=2 state for an empty file.
  @line_counts << SingleByteString.new(0) if @line_counts.empty?
  @line_counts << SingleByteString.new(last_index)
end
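
The line-offset bookkeeping at the end of the constructor can be tried in isolation. The following is a minimal sketch, not the library's code: `line_starts` is a hypothetical helper, and plain hashes stand in for the SingleByteString and MultiByteString objects, whose internals are not shown on this page.

```ruby
# Hypothetical helper (not part of SyntaxTree): mirrors the loop in
# #initialize but records plain hashes instead of SingleByteString and
# MultiByteString objects.
def line_starts(source)
  starts = []
  last_index = 0

  source.lines.each do |line|
    # A line contains multibyte characters when its character count and its
    # byte count differ.
    multibyte = line.size != line.bytesize
    starts << { start: last_index, multibyte: multibyte }

    # Offsets are tracked in characters, not bytes.
    last_index += line.size
  end

  # Same padding as the constructor: an empty source still gets a first
  # entry, and there is always a trailing entry for the end of the source.
  starts << { start: 0, multibyte: false } if starts.empty?
  starts << { start: last_index, multibyte: false }
  starts
end

line_starts("a = 1\nb = \"\u00e9\"\n").each { |entry| p entry }
```

Each entry records the character offset at which a line begins, plus one final entry marking the end of the source.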

Instance Attribute Details

#comments ⇒ Object (readonly)

Array[ Comment | EmbDoc ]

The list of comments that have been found while parsing the source.



# File 'lib/syntax_tree.rb', line 205

def comments
  @comments
end

#line_counts ⇒ Object (readonly)

Array[ SingleByteString | MultiByteString ]

The list of objects that represent the start of each line in character offsets.



# File 'lib/syntax_tree.rb', line 196

def line_counts
  @line_counts
end

#lines ⇒ Object (readonly)

Array[ String ]

The list of lines in the source.



# File 'lib/syntax_tree.rb', line 192

def lines
  @lines
end

#source ⇒ Object (readonly)

String

The source being parsed.



# File 'lib/syntax_tree.rb', line 189

def source
  @source
end

#tokens ⇒ Object (readonly)

Array[ untyped ]

A running list of tokens that have been found in the source. This list changes a lot as certain nodes will “consume” these tokens to determine their bounds.



# File 'lib/syntax_tree.rb', line 201

def tokens
  @tokens
end

Class Method Details

.format(source) ⇒ Object

Parses the given source and returns the formatted result as a single string.



# File 'lib/syntax_tree.rb', line 287

def self.format(source)
  output = []

  formatter = Formatter.new(source, output)
  parse(source).format(formatter)

  formatter.flush
  output.join
end

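.parse(source) ⇒ Object

Parses the given source and returns the result of the parse, or nil if the parser reported an error.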



# File 'lib/syntax_tree.rb', line 281

def self.parse(source)
  parser = new(source)
  response = parser.parse
  response unless parser.error?
end
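
The nil-on-error pattern in `.parse` can be reproduced with the standard library alone. This is a minimal sketch, assuming `Ripper::SexpBuilder` as a stand-in for SyntaxTree itself; `parse_or_nil` is a hypothetical name and the value it returns is an S-expression array, not a SyntaxTree node.

```ruby
require "ripper"

# Hypothetical stand-in for SyntaxTree.parse: run the parser, then discard
# the result if the parser recorded a syntax error.
def parse_or_nil(source)
  parser = Ripper::SexpBuilder.new(source)
  response = parser.parse
  response unless parser.error?
end

p parse_or_nil("1 + 2").nil?
p parse_or_nil("foo(").nil?
```

Because `.format` as listed above calls `parse(source).format(formatter)` without checking for nil, callers that may pass invalid source would likely want to check the result of `.parse` first.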

.read(filepath) ⇒ Object

Returns the source from the given filepath taking into account any potential magic encoding comments.



# File 'lib/syntax_tree.rb', line 299

def self.read(filepath)
  encoding =
    File.open(filepath, "r") do |file|
      header = file.readline
      header += file.readline if header.start_with?("#!")
      Ripper.new(header).tap(&:parse).encoding
    end

  File.read(filepath, encoding: encoding)
end
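
The encoding detection in `.read` relies only on Ripper, so it can be sketched without SyntaxTree. In the example below, `declared_encoding` and `encoding_demo` are hypothetical helpers, and the temporary file contents are made up for illustration.

```ruby
require "ripper"
require "tempfile"

# Hypothetical helper mirroring the first half of SyntaxTree.read: parse only
# the header line (two lines if the file starts with a shebang) and ask
# Ripper which encoding it settled on.
def declared_encoding(filepath)
  File.open(filepath, "r") do |file|
    header = file.readline
    header += file.readline if header.start_with?("#!")
    Ripper.new(header).tap(&:parse).encoding
  end
end

# Writes a small script whose magic comment sits on the line after the
# shebang, then reports the encoding detected for it.
def encoding_demo
  Tempfile.create(["demo", ".rb"]) do |file|
    file.write("#!/usr/bin/env ruby\n# encoding: Shift_JIS\nputs 1\n")
    file.flush
    declared_encoding(file.path)
  end
end

p encoding_demo
```

Note that `IO#readline` raises EOFError at end of file, so as written this approach assumes a non-empty file, and a second line whenever the first is a shebang.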