Class: Ripper

Inherits:
Object
  • Object
show all
Defined in:
lib/ripper.rb,
lib/ripper/core.rb,
lib/ripper/sexp.rb,
lib/ripper/lexer.rb,
lib/ripper/filter.rb,
ripper.c

Overview

Ripper is a Ruby script parser.

You can get information from the parser with event-based style. Information such as abstract syntax trees or simple lexical analysis of the Ruby program.

Usage

Ripper provides an easy interface for parsing your program into a symbolic expression tree (or S-expression).

Understanding the output of the parser may come as a challenge, it’s recommended you use PP to format the output for legibility.

require 'ripper'
require 'pp'

pp Ripper.sexp('def hello(world) "Hello, #{world}!"; end')
  #=> [:program,
       [[:def,
         [:@ident, "hello", [1, 4]],
         [:paren,
          [:params, [[:@ident, "world", [1, 10]]], nil, nil, nil, nil, nil, nil]],
         [:bodystmt,
          [[:string_literal,
            [:string_content,
             [:@tstring_content, "Hello, ", [1, 18]],
             [:string_embexpr, [[:var_ref, [:@ident, "world", [1, 27]]]]],
             [:@tstring_content, "!", [1, 33]]]]],
          nil,
          nil,
          nil]]]]

You can see in the example above, the expression starts with :program.

From here, a method definition at :def, followed by the method’s identifier :@ident. After the method’s identifier comes the parentheses :paren and the method parameters under :params.

Next is the method body, starting at :bodystmt (stmt meaning statement), which contains the full definition of the method.

In our case, we’re simply returning a String, so next we have the :string_literal expression.

Within our :string_literal you’ll notice two @tstring_content, this is the literal part for Hello, and !. Between the two @tstring_content statements is a :string_embexpr, where embexpr is an embedded expression. Our expression consists of a local variable, or var_ref, with the identifier (@ident) of world.

Resources

Requirements

  • ruby 1.9 (support CVS HEAD only)

  • bison 1.28 or later (Other yaccs do not work)

License

Ruby License.

                                              Minero Aoki
                                      aamine@loveruby.net
                                    http://i.loveruby.net

Direct Known Subclasses

Lexer, SexpBuilder, SexpBuilderPP

Defined Under Namespace

Classes: Filter, Lexer, SexpBuilder, SexpBuilderPP, TokenPattern

Constant Summary collapse

PARSER_EVENTS =

This array contains name of parser events.

PARSER_EVENT_TABLE.keys
SCANNER_EVENTS =

This array contains name of scanner events.

SCANNER_EVENT_TABLE.keys
EVENTS =

This array contains name of all ripper events.

PARSER_EVENTS + SCANNER_EVENTS
Version =
rb_usascii_str_new2(RIPPER_VERSION)

Class Method Summary collapse

Instance Method Summary collapse

Constructor Details

#new(src, filename = "(ripper)", lineno = 1) ⇒ Object

Create a new Ripper object. src must be a String, an IO, or an Object which has #gets method.

This method does not starts parsing. See also Ripper#parse and Ripper.parse.



16830
16831
16832
16833
16834
16835
16836
16837
16838
16839
16840
16841
16842
16843
16844
16845
16846
16847
16848
16849
16850
16851
16852
16853
16854
16855
16856
16857
16858
16859
16860
# File 'ripper.c', line 16830

static VALUE
ripper_initialize(int argc, VALUE *argv, VALUE self)
{
    struct parser_params *parser;
    VALUE src, fname, lineno;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, parser);
    rb_scan_args(argc, argv, "12", &src, &fname, &lineno);
    if (RB_TYPE_P(src, T_FILE)) {
        parser->parser_lex_gets = ripper_lex_get_generic;
    }
    else {
        StringValue(src);
        parser->parser_lex_gets = lex_get_str;
    }
    parser->parser_lex_input = src;
    parser->eofp = Qfalse;
    if (NIL_P(fname)) {
        fname = STR_NEW2("(ripper)");
    }
    else {
        StringValue(fname);
    }
    parser_initialize(parser);

    parser->parser_ruby_sourcefile_string = fname;
    parser->parser_ruby_sourcefile = RSTRING_PTR(fname);
    parser->parser_ruby_sourceline = NIL_P(lineno) ? 0 : NUM2INT(lineno) - 1;

    return Qnil;
}

Class Method Details

.lex(src, filename = '-', lineno = 1) ⇒ Object

Tokenizes the Ruby program and returns an Array of an Array, which is formatted like [[lineno, column], type, token].

require 'ripper'
require 'pp'

pp Ripper.lex("def m(a) nil end")
  #=> [[[1,  0], :on_kw,     "def"],
       [[1,  3], :on_sp,     " "  ],
       [[1,  4], :on_ident,  "m"  ],
       [[1,  5], :on_lparen, "("  ],
       [[1,  6], :on_ident,  "a"  ],
       [[1,  7], :on_rparen, ")"  ],
       [[1,  8], :on_sp,     " "  ],
       [[1,  9], :on_kw,     "nil"],
       [[1, 12], :on_sp,     " "  ],
       [[1, 13], :on_kw,     "end"]]


42
43
44
# File 'lib/ripper/lexer.rb', line 42

def Ripper.lex(src, filename = '-', lineno = 1)
  Lexer.new(src, filename, lineno).lex
end

.parse(src, filename = '(ripper)', lineno = 1) ⇒ Object

Parses the given Ruby program read from src. src must be a String or an IO or a object with a #gets method.



17
18
19
# File 'lib/ripper/core.rb', line 17

def Ripper.parse(src, filename = '(ripper)', lineno = 1)
  new(src, filename, lineno).parse
end

.sexp(src, filename = '-', lineno = 1) ⇒ Object

EXPERIMENTAL

Parses src and create S-exp tree. Returns more readable tree rather than Ripper.sexp_raw. This method is mainly for developer use.

require 'ripper'
require 'pp'

pp Ripper.sexp("def m(a) nil end")
  #=> [:program,
       [[:def,
        [:@ident, "m", [1, 4]],
        [:paren, [:params, [[:@ident, "a", [1, 6]]], nil, nil, nil, nil]],
        [:bodystmt, [[:var_ref, [:@kw, "nil", [1, 9]]]], nil, nil, nil]]]]


30
31
32
# File 'lib/ripper/sexp.rb', line 30

def Ripper.sexp(src, filename = '-', lineno = 1)
  SexpBuilderPP.new(src, filename, lineno).parse
end

.sexp_raw(src, filename = '-', lineno = 1) ⇒ Object

EXPERIMENTAL

Parses src and create S-exp tree. This method is mainly for developer use.

require 'ripper'
require 'pp'

pp Ripper.sexp_raw("def m(a) nil end")
  #=> [:program,
       [:stmts_add,
        [:stmts_new],
        [:def,
         [:@ident, "m", [1, 4]],
         [:paren, [:params, [[:@ident, "a", [1, 6]]], nil, nil, nil]],
         [:bodystmt,
          [:stmts_add, [:stmts_new], [:var_ref, [:@kw, "nil", [1, 9]]]],
          nil,
          nil,
          nil]]]]


54
55
56
# File 'lib/ripper/sexp.rb', line 54

def Ripper.sexp_raw(src, filename = '-', lineno = 1)
  SexpBuilder.new(src, filename, lineno).parse
end

.slice(src, pattern, n = 0) ⇒ Object

EXPERIMENTAL

Parses src and return a string which was matched to pattern. pattern should be described as Regexp.

require 'ripper'

p Ripper.slice('def m(a) nil end', 'ident')                   #=> "m"
p Ripper.slice('def m(a) nil end', '[ident lparen rparen]+')  #=> "m(a)"
p Ripper.slice("<<EOS\nstring\nEOS",
               'heredoc_beg nl $(tstring_content*) heredoc_end', 1)
    #=> "string\n"


84
85
86
87
88
89
# File 'lib/ripper/lexer.rb', line 84

def Ripper.slice(src, pattern, n = 0)
  if m = token_match(src, pattern)
  then m.string(n)
  else nil
  end
end

.token_match(src, pattern) ⇒ Object

:nodoc:



91
92
93
# File 'lib/ripper/lexer.rb', line 91

def Ripper.token_match(src, pattern)   #:nodoc:
  TokenPattern.compile(pattern).match(src)
end

.tokenize(src, filename = '-', lineno = 1) ⇒ Object

Tokenizes the Ruby program and returns an Array of String.

p Ripper.tokenize("def m(a) nil end")
   # => ["def", " ", "m", "(", "a", ")", " ", "nil", " ", "end"]


20
21
22
# File 'lib/ripper/lexer.rb', line 20

def Ripper.tokenize(src, filename = '-', lineno = 1)
  Lexer.new(src, filename, lineno).tokenize
end

Instance Method Details

#columnInteger

Return column number of current parsing line. This number starts from 0.

Returns:

  • (Integer)


16923
16924
16925
16926
16927
16928
16929
16930
16931
16932
16933
16934
16935
16936
# File 'ripper.c', line 16923

static VALUE
ripper_column(VALUE self)
{
    struct parser_params *parser;
    long col;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, parser);
    if (!ripper_initialized_p(parser)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (NIL_P(parser->parsing_thread)) return Qnil;
    col = parser->tokp - parser->parser_lex_pbeg;
    return LONG2NUM(col);
}

#encodingEncoding

Return encoding of the source.

Returns:

  • (Encoding)


16427
16428
16429
16430
16431
16432
16433
16434
# File 'ripper.c', line 16427

VALUE
rb_parser_encoding(VALUE vparser)
{
    struct parser_params *parser;

    TypedData_Get_Struct(vparser, struct parser_params, &parser_data_type, parser);
    return rb_enc_from_encoding(current_enc);
}

#end_seen?Boolean

Return true if parsed source ended by _END_.

Returns:

  • (Boolean)

Returns:

  • (Boolean)


16412
16413
16414
16415
16416
16417
16418
16419
# File 'ripper.c', line 16412

VALUE
rb_parser_end_seen_p(VALUE vparser)
{
    struct parser_params *parser;

    TypedData_Get_Struct(vparser, struct parser_params, &parser_data_type, parser);
    return ruby__end__seen ? Qtrue : Qfalse;
}

#filenameString

Return current parsing filename.

Returns:

  • (String)


16944
16945
16946
16947
16948
16949
16950
16951
16952
16953
16954
# File 'ripper.c', line 16944

static VALUE
ripper_filename(VALUE self)
{
    struct parser_params *parser;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, parser);
    if (!ripper_initialized_p(parser)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    return parser->parser_ruby_sourcefile_string;
}

#linenoInteger

Return line number of current parsing line. This number starts from 1.

Returns:

  • (Integer)


16963
16964
16965
16966
16967
16968
16969
16970
16971
16972
16973
16974
# File 'ripper.c', line 16963

static VALUE
ripper_lineno(VALUE self)
{
    struct parser_params *parser;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, parser);
    if (!ripper_initialized_p(parser)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (NIL_P(parser->parsing_thread)) return Qnil;
    return INT2NUM(parser->parser_ruby_sourceline);
}

#parseObject

Start parsing and returns the value of the root action.



16895
16896
16897
16898
16899
16900
16901
16902
16903
16904
16905
16906
16907
16908
16909
16910
16911
16912
16913
16914
# File 'ripper.c', line 16895

static VALUE
ripper_parse(VALUE self)
{
    struct parser_params *parser;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, parser);
    if (!ripper_initialized_p(parser)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (!NIL_P(parser->parsing_thread)) {
        if (parser->parsing_thread == rb_thread_current())
            rb_raise(rb_eArgError, "Ripper#parse is not reentrant");
        else
            rb_raise(rb_eArgError, "Ripper#parse is not multithread-safe");
    }
    parser->parsing_thread = rb_thread_current();
    rb_ensure(ripper_parse0, self, ripper_ensure, self);

    return parser->result;
}

#yydebugBoolean

Get yydebug.

Returns:

  • (Boolean)


16442
16443
16444
16445
16446
16447
16448
16449
# File 'ripper.c', line 16442

VALUE
rb_parser_get_yydebug(VALUE self)
{
    struct parser_params *parser;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, parser);
    return yydebug ? Qtrue : Qfalse;
}

#yydebug=(flag) ⇒ Object

Set yydebug.



16457
16458
16459
16460
16461
16462
16463
16464
16465
# File 'ripper.c', line 16457

VALUE
rb_parser_set_yydebug(VALUE self, VALUE flag)
{
    struct parser_params *parser;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, parser);
    yydebug = RTEST(flag);
    return flag;
}