Module: Parslet

Included in: Context, Expression, Parser, Transform, Transform

Defined in: lib/parslet.rb,
lib/parslet/cause.rb,
lib/parslet/source.rb,
lib/parslet/graphviz.rb,
lib/parslet/error_reporter/tree.rb,
lib/parslet/error_reporter/deepest.rb,
lib/parslet/error_reporter/contextual.rb more...

Overview

A simple parser generator library. Typical usage would look like this:

require 'parslet'

class MyParser < Parslet::Parser
  rule(:a) { str('a').repeat }
  root(:a)        
end

pp MyParser.new.parse('aaaa')   # => 'aaaa'@0
pp MyParser.new.parse('bbbb')   # => Parslet::Atoms::ParseFailed: 
                                #    Don't know what to do with bbbb at line 1 char 1.

The simple DSL allows you to define grammars in PEG style. This kind of grammar construction does away with the ambiguities that usually come with parsers; instead, it allows you to construct grammars that are easier to debug, since less magic is involved.

Parslet is typically used in stages:

1. Parsing the input string; this yields an intermediary tree, see Parslet.any, Parslet.match, Parslet.str, Parslet::ClassMethods#rule and Parslet::ClassMethods#root.
2. Transformation of the tree into something useful to you, see Parslet::Transform, Parslet.simple, Parslet.sequence and Parslet.subtree.

The first stage is traditionally intermingled with the second stage; output from the second stage is usually called the ‘Abstract Syntax Tree’ or AST.

The stages are completely decoupled; you can change your grammar around and use the second stage to isolate the rest of your code from the changes you’ve effected.

When things go wrong

A parse that fails will raise ParseFailed. This exception contains all the details of what went wrong, including a detailed error trace that can be printed out as an ASCII tree (see Cause).

Defined Under Namespace

Modules: Accelerator, Atoms, ClassMethods, ErrorReporter, Graphable

Classes: Cause, Context, DelayedMatchConstructor, Expression, GraphvizVisitor, ParseFailed, Parser, Pattern, Position, Scope, Slice, Source, Transform

Class Method Summary

.any ⇒ Parslet::Atoms::Re

Returns an atom matching any character.
.dynamic(&block) ⇒ Object

Designates a piece of the parser as being dynamic.
.exp(str) ⇒ Parslet::Atoms::Base

A special kind of atom that allows embedding whole treetop expressions into parslet construction.
.included(base) ⇒ Object

Extends classes that include Parslet with the module ClassMethods.
.infix_expression(element, *operations, &reducer) ⇒ Object

Returns a parslet atom that parses infix expressions.
.match(str) ⇒ Parslet::Atoms::Re

Returns an atom matching a character class.
.scope(&block) ⇒ Object

Introduces a new capture scope.
.sequence(symbol) ⇒ Object

Returns a placeholder for a tree transformation that will only match a sequence of elements.
.simple(symbol) ⇒ Object

Returns a placeholder for a tree transformation that will only match simple elements.
.str(str) ⇒ Parslet::Atoms::Str

Returns an atom matching the given string.
.subtree(symbol) ⇒ Object

Returns a placeholder for tree transformation patterns that will match any kind of subtree.

Class Method Details

.any ⇒ `Parslet::Atoms::Re`

Returns an atom matching any character. It acts like the ‘.’ (dot) character in regular expressions.

any.parse('a')    # => 'a'

Returns:

(Parslet::Atoms::Re) —

a parslet atom

# File 'lib/parslet.rb', line 168

def any
  Atoms::Re.new('.')
end

.dynamic(&block) ⇒ `Object`

Designates a piece of the parser as being dynamic. Dynamic parsers can either return a parser at runtime, which will be applied on the input, or return a result from a parse.

Dynamic parse pieces are never cached and can introduce performance abnormalities; use sparingly, where other constructs fail.

Example:

# Parses either 'a' or 'b', depending on the weather
dynamic { rand() < 0.5 ? str('a') : str('b') }

# File 'lib/parslet.rb', line 199

def dynamic(&block)
  Parslet::Atoms::Dynamic.new(block)
end

.exp(str) ⇒ `Parslet::Atoms::Base`

A special kind of atom that allows embedding whole treetop expressions into parslet construction.

# the same as str('a') >> str('b').maybe
exp(%Q("a" "b"?))

Parameters:

str (String) —

a treetop expression

Returns:

(Parslet::Atoms::Base) —

the corresponding parslet parser

# File 'lib/parslet.rb', line 253

def exp(str)
  Parslet::Expression.new(str).to_parslet
end

.included(base) ⇒ `Object`

Extends classes that include Parslet with the module ClassMethods.

# File 'lib/parslet.rb', line 52

def self.included(base)
  base.extend(ClassMethods)
end

.infix_expression(element, *operations, &reducer) ⇒ `Object`

Returns a parslet atom that parses infix expressions. Operations are specified as a list of <atom, precedence, associativity> tuples, where atom is simply the parslet atom that matches an operator, precedence is a number and associativity is either :left or :right.

Higher precedence indicates that the operation should bind tighter than other operations with lower precedence. In common algebra, ‘+’ has lower precedence than ‘*’. So you would have a precedence of 1 for ‘+’ and a precedence of 2 for ‘*’. Only the order relation between these two counts, so any number would work.

Associativity decides which interpretation to take for strings that are ambiguous, like ‘1 + 2 + 3’. If ‘+’ is specified as left associative, the expression would be interpreted as ‘(1 + 2) + 3’. If right associativity is chosen, it would be interpreted as ‘1 + (2 + 3)’. Note that the hash trees in the output reflect that choice as well.

An optional block can be provided in order to manipulate the generated tree. The block will be called on each operator and passed 3 arguments: the left operand, the operator, and the right operand.

Examples:

infix_expression(integer, [add_op, 1, :left])
# would parse things like '1 + 2'

infix_expression(integer, [add_op, 1, :left]) { |l,o,r| { :plus => [l, r] } }
# would parse '1 + 2 + 3' as:
# { :plus => [{ :plus => [1, 2] }, 3] }

Parameters:

element (Parslet::Atoms::Base) —

elements that take the NUMBER position in the expression
operations (Array<(Parslet::Atoms::Base, Integer, {:left, :right})>)

.match(str) ⇒ `Parslet::Atoms::Re`

Returns an atom matching a character class. All regular expressions can be used, as long as they match only a single character at a time.

match('[ab]')     # will match either 'a' or 'b'
match('[\n\s]')   # will match newlines and spaces

There is also another (convenience) form of this method:

match['a-z']      # synonymous to match('[a-z]')
match['\n']       # synonymous to match('[\n]')

Returns a parslet atom.

Parameters:

str (String) —

character class to match (regexp syntax)

Returns:

(Parslet::Atoms::Re) —

a parslet atom

# File 'lib/parslet.rb', line 142

def match(str=nil)
  return DelayedMatchConstructor.new unless str
  
  return Atoms::Re.new(str)
end

.scope(&block) ⇒ `Object`

Introduces a new capture scope. This means that all old captures stay accessible, but new values stored will only be available during the block given and the old values will be restored after the block.

Example:

# :a will be available until the end of the block. Afterwards, 
# :a from the outer scope will be available again, if such a thing 
# exists. 
scope { str('a').capture(:a) }

# File 'lib/parslet.rb', line 183

def scope(&block)
  Parslet::Atoms::Scope.new(block)
end

.sequence(symbol) ⇒ `Object`

Returns a placeholder for a tree transformation that will only match a sequence of elements. The symbol you specify will be the key for the matched sequence in the returned dictionary.

# This would match a body element that contains several declarations.
{ :body => sequence(:declarations) }

The above example would match :body => ['a', 'b'], but not :body => 'a'.

see Transform

# File 'lib/parslet.rb', line 270

def sequence(symbol)
  Pattern::SequenceBind.new(symbol)
end

.simple(symbol) ⇒ `Object`

Returns a placeholder for a tree transformation that will only match simple elements. This matches everything that #sequence doesn’t match.

# Matches a single header. 
{ :header => simple(:header) }

see Transform

# File 'lib/parslet.rb', line 284

def simple(symbol)
  Pattern::SimpleBind.new(symbol)
end

.str(str) ⇒ `Parslet::Atoms::Str`

Returns an atom matching the str given:

str('class')      # will match 'class'

Parameters:

str (String) —

string to match verbatim

Returns:

(Parslet::Atoms::Str) —

a parslet atom

# File 'lib/parslet.rb', line 156

def str(str)
  Atoms::Str.new(str)
end

.subtree(symbol) ⇒ `Object`

Returns a placeholder for tree transformation patterns that will match any kind of subtree.

{ :expression => subtree(:exp) }

# File 'lib/parslet.rb', line 294

def subtree(symbol)
  Pattern::SubtreeBind.new(symbol)
end
