Class: SyntaxTree

Inherits: Ripper

Inheritance chain: Object, Ripper, SyntaxTree

Defined in: lib/syntax_tree.rb,
lib/syntax_tree/cli.rb,
lib/syntax_tree/version.rb

Defined Under Namespace

Modules: AssignFormatting, CLI, ContainsAssignment, HashKeyFormatter, Parentheses, Quotes, RemoveBreaks

Classes: ARef, ARefField, Alias, ArgBlock, ArgParen, ArgStar, Args, ArgsForward, ArrayLiteral, AryPtn, Assign, Assoc, AssocSplat, BEGINBlock, Backref, Backtick, BareAssocHash, Begin, Binary, BlockArg, BlockFormatter, BlockVar, BodyStmt, BraceBlock, Break, CHAR, CVar, Call, CallOperatorFormatter, Case, ClassDeclaration, Comma, Command, CommandCall, Comment, ConditionalFormatter, ConditionalModFormatter, Const, ConstPathField, ConstPathRef, ConstRef, Def, DefEndless, Defined, Defs, DoBlock, Dot2, Dot3, DotFormatter, DynaSymbol, ENDBlock, Else, Elsif, EmbDoc, EmbExprBeg, EmbExprEnd, EmbVar, EndContent, Ensure, ExcessedComma, FCall, Field, FloatLiteral, FlowControlFormatter, FndPtn, For, Formatter, GVar, HashLiteral, Heredoc, HeredocBeg, HshPtn, IVar, Ident, If, IfMod, IfOp, Imaginary, In, Int, Kw, KwRestParam, LBrace, LBracket, LParen, Label, LabelEnd, Lambda, Location, LoopFormatter, MAssign, MLHS, MLHSParen, MRHS, MethodAddBlock, ModuleDeclaration, MultiByteString, Next, Not, Op, OpAssign, Params, Paren, ParseError, Period, Program, QSymbols, QSymbolsBeg, QWords, QWordsBeg, RAssign, RBrace, RBracket, RParen, RationalLiteral, Redo, RegexpBeg, RegexpContent, RegexpEnd, RegexpLiteral, Rescue, RescueEx, RescueMod, RestParam, Retry, Return, Return0, SClass, SingleByteString, Statements, StringConcat, StringContent, StringDVar, StringEmbExpr, StringLiteral, Super, SymBeg, SymbolContent, SymbolLiteral, Symbols, SymbolsBeg, TLamBeg, TLambda, TStringBeg, TStringContent, TStringEnd, TopConstField, TopConstRef, Unary, Undef, Unless, UnlessMod, Until, UntilMod, VCall, VarAlias, VarField, VarRef, VoidStmt, When, While, WhileMod, Word, Words, WordsBeg, XString, XStringLiteral, Yield, Yield0, ZSuper

Constant Summary

VERSION = "1.1.1"

Instance Attribute Summary

#comments ⇒ Object (readonly)
Array[ Comment | EmbDoc ]: the list of comments that have been found while parsing the source.

#line_counts ⇒ Object (readonly)
Array[ SingleByteString | MultiByteString ]: the list of objects that represent the start of each line in character offsets.

#lines ⇒ Object (readonly)
Array[ String ]: the list of lines in the source.

#source ⇒ Object (readonly)
String: the source being parsed.

#tokens ⇒ Object (readonly)
Array[ untyped ]: a running list of tokens that have been found in the source.

Class Method Summary

.format(source) ⇒ Object
.parse(source) ⇒ Object
.read(filepath) ⇒ Object

Returns the source from the given filepath taking into account any potential magic encoding comments.

Instance Method Summary

#initialize(source) ⇒ SyntaxTree (constructor)

A new instance of SyntaxTree.

Constructor Details

#initialize(source) ⇒ `SyntaxTree`

Returns a new instance of SyntaxTree.

# File 'lib/syntax_tree.rb', line 207

def initialize(source, *)
  super

  # We keep the source around so that we can refer back to it when we're
  # generating the AST. Sometimes it's easier to just reference the source
  # string when you want to check if it contains a certain character, for
  # example.
  @source = source

  # Similarly, we keep the lines of the source string around to be able to
  # check if certain lines contain certain characters. For example, we'll use
  # this to generate the content that goes after the __END__ keyword. Or we'll
  # use this to check if a comment has other content on its line.
  @lines = source.split(/\r?\n/)

  # This is the full set of comments that have been found by the parser. It's
  # a running list. At the end of every block of statements, the parser goes
  # in and attempts to grab any comments that are on their own line and turn
  # them into regular statements. So at the end of parsing the only comments
  # left in here will be comments on lines that also contain code.
  @comments = []

  # This is the current embdoc (comments that start with =begin and end with
  # =end). Since they can't be nested, there's no need for a stack here, as
  # there can only be one active. These end up getting dumped into the
  # comments list before getting picked up by the statements that surround
  # them.
  @embdoc = nil

  # This is an optional node that can be present if the __END__ keyword is
  # used in the file. In that case, this will represent the content after that
  # keyword.
  @__end__ = nil

  # Heredocs can actually be nested together if you're using interpolation, so
  # this is a stack of heredoc nodes that are currently being created. When we
  # get to the token that finishes off a heredoc node, we pop the top
  # one off. If there are others surrounding it, then the body events will now
  # be added to the correct nodes.
  @heredocs = []

  # This is a running list of tokens that have fired. It's useful
  # mostly for maintaining location information. For example, if you're inside
  # the handler of a def event, then in order to determine where the AST node
  # started, you need to look backward in the tokens to find a def
  # keyword. Most of the time, when a parser event consumes one of these
  # tokens, the token is deleted from the list. So ideally, this list stays
  # pretty short over the course of parsing a source string.
  @tokens = []

  # Here we're going to build up a list of SingleByteString or MultiByteString
  # objects. They're each going to represent a line in the source. They are
  # used by the `char_pos` method to determine where we are in the source
  # string.
  @line_counts = []
  last_index = 0

  @source.lines.each do |line|
    if line.size == line.bytesize
      @line_counts << SingleByteString.new(last_index)
    else
      @line_counts << MultiByteString.new(last_index, line)
    end

    last_index += line.size
  end

  # Make sure line counts is filled out with the first and last line at
  # minimum so that it has something to compare against if the parser is in a
  # lineno=2 state for an empty file.
  @line_counts << SingleByteString.new(0) if @line_counts.empty?
  @line_counts << SingleByteString.new(last_index)
end
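
The offset bookkeeping at the end of the constructor can be tried out in isolation. The following is only a sketch of that loop: `LineStart` and `line_starts` are hypothetical names standing in for the SingleByteString and MultiByteString entries, whose internals are not shown on this page.

```ruby
# Hypothetical stand-in for the entries stored in @line_counts.
LineStart = Struct.new(:offset, :multibyte)

def line_starts(source)
  starts = []
  last_index = 0

  source.lines.each do |line|
    # A line is multibyte when its character count differs from its byte count.
    starts << LineStart.new(last_index, line.size != line.bytesize)
    last_index += line.size
  end

  # Mirror the constructor: always keep a first entry and a trailing entry.
  starts << LineStart.new(0, false) if starts.empty?
  starts << LineStart.new(last_index, false)
  starts
end

starts = line_starts("a = 1\nb = \"\u00E9\"\n")
offsets = starts.map(&:offset)       # character offset where each line begins
multibyte = starts.map(&:multibyte)  # whether each line holds multibyte characters
```

Note that the offsets are counted in characters, not bytes, which is why the multibyte lines need to be tracked separately.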

Instance Attribute Details

#comments ⇒ `Object` (readonly)

Array[ Comment | EmbDoc ]: the list of comments that have been found while parsing the source.

# File 'lib/syntax_tree.rb', line 205

def comments
  @comments
end
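
The constructor comments note that the source lines are used to check whether a comment has other content on its line, and that comments sharing a line with code are the ones left in this list. As a rough illustration of that check, and not the library's implementation, the sketch below uses `Ripper.lex` from the standard library. The helper name `trailing_comments` is hypothetical.

```ruby
require "ripper"

# Hypothetical helper: returns the comments that share a line with code.
def trailing_comments(source)
  lines = source.split(/\r?\n/)

  Ripper.lex(source).filter_map do |(lineno, column), type, value, _state|
    next unless type == :on_comment

    # Ripper reports columns as byte offsets, so slice by bytes. If anything
    # other than whitespace precedes the comment, it shares a line with code.
    leading = lines[lineno - 1].byteslice(0, column)
    value.chomp unless leading.strip.empty?
  end
end

found = trailing_comments("# own line\nfoo = 1 # trailing\n")
```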

#line_counts ⇒ `Object` (readonly)

Array[ SingleByteString | MultiByteString ]: the list of objects that represent the start of each line in character offsets.

# File 'lib/syntax_tree.rb', line 196

def line_counts
  @line_counts
end

#lines ⇒ `Object` (readonly)

Array[ String ]: the list of lines in the source.

# File 'lib/syntax_tree.rb', line 192

def lines
  @lines
end

#source ⇒ `Object` (readonly)

String: the source being parsed.

# File 'lib/syntax_tree.rb', line 189

def source
  @source
end

#tokens ⇒ `Object` (readonly)

Array[ untyped ]: a running list of tokens that have been found in the source. This list changes a lot as certain nodes will “consume” these tokens to determine their bounds.

# File 'lib/syntax_tree.rb', line 201

def tokens
  @tokens
end
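
To make the idea of “consuming” a token concrete, here is a minimal sketch under stated assumptions: `Token` and `consume_token` are hypothetical names, and the real token objects and lookup logic are not shown on this page. It looks backward through the list for the most recent matching token and deletes it, which is how the list stays short.

```ruby
# Hypothetical token shape; the library's own token classes differ.
Token = Struct.new(:type, :value, :offset)

# Finds the most recent matching token, removes it from the list, and returns it.
def consume_token(tokens, type, value)
  index = tokens.rindex { |token| token.type == type && token.value == value }
  raise ArgumentError, "no #{type} token with value #{value.inspect}" if index.nil?

  tokens.delete_at(index)
end

tokens = [
  Token.new(:kw, "def", 0),
  Token.new(:ident, "foo", 4),
  Token.new(:kw, "end", 9)
]

# A handler for a def node would use the keyword's offset as the node's start.
def_keyword = consume_token(tokens, :kw, "def")
```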

Class Method Details

.format(source) ⇒ `Object`

# File 'lib/syntax_tree.rb', line 287

def self.format(source)
  output = []

  formatter = Formatter.new(source, output)
  parse(source).format(formatter)

  formatter.flush
  output.join
end

.parse(source) ⇒ `Object`

# File 'lib/syntax_tree.rb', line 281

def self.parse(source)
  parser = new(source)
  response = parser.parse
  response unless parser.error?
end
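
The parse-then-check-`error?` pattern above comes from Ripper itself, so it can be sketched with the standard library alone. The example below substitutes `Ripper::SexpBuilder` for SyntaxTree, so the successful return value is an s-expression rather than a SyntaxTree node. `parse_or_nil` is a hypothetical name.

```ruby
require "ripper"

# Hypothetical helper: same control flow as above, but built on
# Ripper::SexpBuilder instead of SyntaxTree.
def parse_or_nil(source)
  parser = Ripper::SexpBuilder.new(source)
  response = parser.parse

  # Return the result only when the parser reported no syntax error.
  response unless parser.error?
end

valid = parse_or_nil("1 + 1")
invalid = parse_or_nil("def foo(")
```

Callers of this pattern need to handle a nil result, since a syntax error is signalled by the return value and not by an exception.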

.read(filepath) ⇒ `Object`

Returns the source from the given filepath taking into account any potential magic encoding comments.

# File 'lib/syntax_tree.rb', line 299

def self.read(filepath)
  encoding =
    File.open(filepath, "r") do |file|
      header = file.readline
      header += file.readline if header.start_with?("#!")
      Ripper.new(header).tap(&:parse).encoding
    end

  File.read(filepath, encoding: encoding)
end
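
The encoding detection step can be exercised on its own. This sketch applies the same `Ripper.new(header).tap(&:parse).encoding` call to header strings directly, without opening a file. `detect_source_encoding` is a hypothetical name.

```ruby
require "ripper"

# Hypothetical helper: parses only the header and asks Ripper which
# source encoding it settled on.
def detect_source_encoding(header)
  Ripper.new(header).tap(&:parse).encoding
end

# A magic comment on the first line.
plain = detect_source_encoding("# encoding: iso-8859-1\n")

# A magic comment on the second line, after a shebang. This is why the
# method above reads a second line when the first starts with "#!".
shebang = detect_source_encoding("#!/usr/bin/env ruby\n# encoding: iso-8859-1\n")
```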
