Class: Tailor::Lexer

Inherits:
Ripper::Lexer
  • Object
Includes:
LogSwitch::Mixin, CompositeObservable, LexerConstants
Defined in:
lib/tailor/lexer.rb,
lib/tailor/lexer/token.rb

Overview

Provides the main file parsing for tailor. For every lexer event it encounters, it calls the corresponding notifier method. Notifier methods are provided by CompositeObservable.
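The event-to-notifier pattern described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not tailor's implementation: it subclasses plain Ripper rather than Ripper::Lexer, and CommaSpy, add_comma_observer and comma_positions are hypothetical names standing in for what CompositeObservable generates.

```ruby
require 'ripper'

# Hypothetical stand-in for a lexer that publishes comma events.
class CommaSpy < Ripper
  def initialize(source)
    super(source)
    @comma_observers = []
  end

  # Hypothetical registration method; tailor's CompositeObservable
  # provides its own equivalents.
  def add_comma_observer(&block)
    @comma_observers << block
  end

  # Ripper calls this scanner event handler for each comma it finds.
  def on_comma(token)
    @comma_observers.each { |observer| observer.call(lineno, column) }
    super
  end
end

# Collects the [line, column] position of every comma in the source.
def comma_positions(source)
  positions = []
  spy = CommaSpy.new(source)
  spy.add_comma_observer { |line, col| positions << [line, col] }
  spy.parse
  positions
end

p comma_positions("foo(1, 2)\n")
```

The handler forwards to super so that Ripper's default handling of the token still runs after observers have been told about it.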

Defined Under Namespace

Classes: Token

Constant Summary

Constants included from LexerConstants

Tailor::LexerConstants::CONTINUATION_KEYWORDS, Tailor::LexerConstants::KEYWORDS_AND_MODIFIERS, Tailor::LexerConstants::KEYWORDS_TO_INDENT, Tailor::LexerConstants::LOOP_KEYWORDS, Tailor::LexerConstants::MODIFIERS, Tailor::LexerConstants::MULTILINE_OPERATORS

Instance Method Summary

Methods included from CompositeObservable

define_observer

Constructor Details

#initialize(file) ⇒ Lexer

Returns a new instance of Lexer.

Parameters:

  • file (String)

    The string to lex, or name of the file to read and analyze.



# File 'lib/tailor/lexer.rb', line 21

def initialize(file)
  # File.exists? was removed in Ruby 3.2; File.read also closes the handle.
  @original_file_text = if File.exist? file
    @file_name = file
    File.read(@file_name)
  else
    @file_name = "<notafile>"
    file
  end

  @file_text = ensure_trailing_newline(@original_file_text)
  super @file_text
  @added_newline = @file_text != @original_file_text
end

Instance Method Details

#count_trailing_newlines(text) ⇒ Fixnum

Counts the number of newlines at the end of the file.

Parameters:

  • text (String)

    The file’s text.

Returns:

  • (Fixnum)

    The number of \n characters at the end of the file.



# File 'lib/tailor/lexer.rb', line 523

def count_trailing_newlines(text)
  if text.end_with? "\n"
    count = 0

    text.reverse.each_char do |c|
      if c == "\n"
        count += 1
      else
        break
      end
    end

    count
  else
    0
  end
end
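For comparison, the same count can be obtained with a single regular expression. This is only a sketch of an equivalent approach, and trailing_newline_count is a hypothetical name, not part of tailor.

```ruby
# Hypothetical helper; mirrors #count_trailing_newlines using a regex.
def trailing_newline_count(text)
  # \n*\z always matches (possibly as the empty string) at the end of
  # the text, so the slice is never nil.
  text[/\n*\z/].length
end

p trailing_newline_count("puts 1\n\n")
```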

#current_line_of_text ⇒ String

The current line of text being examined.

Returns:

  • (String)

    The current line of text.



# File 'lib/tailor/lexer.rb', line 515

def current_line_of_text
  @file_text.split("\n").at(lineno - 1) || ''
end
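The fallback to an empty string matters because String#split drops trailing empty fields, so a blank final line has no entry in the resulting array. A minimal sketch, using a hypothetical free-standing line_of_text that takes the text and a 1-based line number as arguments:

```ruby
# Hypothetical free-standing version of #current_line_of_text.
def line_of_text(file_text, lineno)
  # split("\n") discards trailing empty strings, so a blank last line
  # yields nil from #at; fall back to '' in that case.
  file_text.split("\n").at(lineno - 1) || ''
end

p line_of_text("a = 1\nb = 2\n", 2)
p line_of_text("a = 1\n\n", 2)
```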

#ensure_trailing_newline(file_text) ⇒ String

Adds a newline to the end of the text if one doesn’t exist. Without doing this, Ripper won’t trigger a newline event for the last line of the file, which is required for some rulers to do their thing.

Parameters:

  • file_text (String)

    The text to check.

Returns:

  • (String)

    The file text with a newline at the end.



# File 'lib/tailor/lexer.rb', line 547

def ensure_trailing_newline(file_text)
  count_trailing_newlines(file_text) > 0 ? file_text : (file_text + "\n")
end
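The missing newline event can be observed directly with Ripper.lex. The sketch below is illustrative only, and nl_event? is a hypothetical helper name.

```ruby
require 'ripper'

# Hypothetical helper: true if lexing the source produces an :on_nl event.
def nl_event?(source)
  Ripper.lex(source).any? { |token| token[1] == :on_nl }
end

p nl_event?("x = 1")    # source without a trailing newline
p nl_event?("x = 1\n")  # same source with the newline appended
```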

#lex ⇒ Object

This kicks off the process of parsing the file and publishing events as the events are discovered.



# File 'lib/tailor/lexer.rb', line 37

def lex
  file_beg_changed
  notify_file_beg_observers(@file_name)

  super

  file_end_changed
  notify_file_end_observers(count_trailing_newlines(@original_file_text))
end

#on___end__(token) ⇒ Object

Called when the lexer matches __END__.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 499

def on___end__(token)
  log "__END__: '#{token}'"
  super(token)
end

#on_backref(token) ⇒ Object



# File 'lib/tailor/lexer.rb', line 47

def on_backref(token)
  log "BACKREF: '#{token}'"
  super(token)
end

#on_backtick(token) ⇒ Object

Called when the lexer matches the first ` in a `...` statement (the second matches :on_tstring_end; this may or may not be a Ruby bug).

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 56

def on_backtick(token)
  log "BACKTICK: '#{token}'"
  super(token)
end
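The asymmetry between the opening and closing backtick shows up in the raw event stream. A small sketch using Ripper.lex directly, with scanner_events as a hypothetical helper name:

```ruby
require 'ripper'

# Hypothetical helper: the scanner event names Ripper reports, in order.
def scanner_events(source)
  Ripper.lex(source).map { |token| token[1] }
end

p scanner_events("`ls`")
```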

#on_CHAR(token) ⇒ Object

Called when the lexer matches CHAR.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 507

def on_CHAR(token)
  log "CHAR: '#{token}'"
  super(token)
end

#on_comma(token) ⇒ Object

Called when the lexer matches a comma.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 64

def on_comma(token)
  log "COMMA: #{token}"
  log "Line length: #{current_line_of_text.length}"

  comma_changed
  notify_comma_observers(current_line_of_text, lineno, column)

  super(token)
end

#on_comment(token) ⇒ Object

Called when the lexer matches a #. The token includes the # as well as the content after it.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 78

def on_comment(token)
  log "COMMENT: '#{token}'"

  l_token = Tailor::Lexer::Token.new(token)
  lexed_line = LexedLine.new(super, lineno)
  comment_changed
  notify_comment_observers(l_token, lexed_line, @file_text, lineno, column)

  super(token)
end
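To see what a comment token contains, the sketch below pulls the :on_comment tokens out of Ripper.lex. The helper name comment_tokens is hypothetical.

```ruby
require 'ripper'

# Hypothetical helper: the text of every comment token in the source.
def comment_tokens(source)
  Ripper.lex(source)
        .select { |token| token[1] == :on_comment }
        .map { |token| token[2] }
end

p comment_tokens("x = 1 # note\n")
```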

#on_const(token) ⇒ Object

Called when the lexer matches a constant (including class names, of course).

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 93

def on_const(token)
  log "CONST: '#{token}'"

  l_token = Tailor::Lexer::Token.new(token)
  lexed_line = LexedLine.new(super, lineno)
  const_changed
  notify_const_observers(l_token, lexed_line, lineno, column)

  super(token)
end

#on_cvar(token) ⇒ Object

Called when the lexer matches a class variable.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 107

def on_cvar(token)
  log "CVAR: '#{token}'"
  super(token)
end

#on_embdoc(token) ⇒ Object

Called when the lexer matches the content inside a =begin/=end.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 115

def on_embdoc(token)
  log "EMBDOC: '#{token}'"
  super(token)
end

#on_embdoc_beg(token) ⇒ Object

Called when the lexer matches =begin.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 123

def on_embdoc_beg(token)
  log "EMBDOC_BEG: '#{token}'"
  super(token)
end

#on_embdoc_end(token) ⇒ Object

Called when the lexer matches =end.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 131

def on_embdoc_end(token)
  log "EMBDOC_END: '#{token}'"
  super(token)
end
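The three embedded-document events (#on_embdoc_beg, #on_embdoc, #on_embdoc_end) fire in sequence for a =begin/=end block. A minimal sketch with Ripper.lex, where embdoc_events is a hypothetical helper name:

```ruby
require 'ripper'

# Hypothetical helper: event names produced for a =begin/=end block.
def embdoc_events(source)
  Ripper.lex(source).map { |token| token[1] }
end

p embdoc_events("=begin\nnotes\n=end\n")
```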

#on_embexpr_beg(token) ⇒ Object

Called when the lexer matches a #{.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 139

def on_embexpr_beg(token)
  log "EMBEXPR_BEG: '#{token}'"
  embexpr_beg_changed
  notify_embexpr_beg_observers
  super(token)
end

#on_embexpr_end(token) ⇒ Object

Called when the lexer matches the } that closes a #{. Note that as of MRI 1.9.3-p125, this never gets called. Logged as a bug and fixed, but not yet released: bugs.ruby-lang.org/issues/6211.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 151

def on_embexpr_end(token)
  log "EMBEXPR_END: '#{token}'"
  embexpr_end_changed
  notify_embexpr_end_observers
  super(token)
end
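The following sketch shows the events around an interpolation, assuming a Ruby release that includes the fix mentioned above. The helper name interpolation_events is hypothetical.

```ruby
require 'ripper'

# Hypothetical helper: event names for a string containing interpolation.
def interpolation_events(source)
  Ripper.lex(source).map { |token| token[1] }
end

# Single quotes keep the #{...} literal so Ripper sees it unevaluated.
p interpolation_events('"a#{b}c"')
```

Note that the braces of the interpolation are reported through the embexpr events rather than as :on_lbrace and :on_rbrace.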

#on_embvar(token) ⇒ Object



# File 'lib/tailor/lexer.rb', line 158

def on_embvar(token)
  log "EMBVAR: '#{token}'"
  super(token)
end

#on_float(token) ⇒ Object

Called when the lexer matches a Float.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 166

def on_float(token)
  log "FLOAT: '#{token}'"
  super(token)
end

#on_gvar(token) ⇒ Object

Called when the lexer matches a global variable.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 174

def on_gvar(token)
  log "GVAR: '#{token}'"
  super(token)
end

#on_heredoc_beg(token) ⇒ Object

Called when the lexer matches the beginning of a heredoc.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 182

def on_heredoc_beg(token)
  log "HEREDOC_BEG: '#{token}'"
  super(token)
end

#on_heredoc_end(token) ⇒ Object

Called when the lexer matches the end of a heredoc.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 190

def on_heredoc_end(token)
  log "HEREDOC_END: '#{token}'"
  super(token)
end

#on_ident(token) ⇒ Object

Called when the lexer matches an identifier (method name, variable, the text part of a Symbol, etc.).

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 199

def on_ident(token)
  log "IDENT: '#{token}'"
  l_token = Tailor::Lexer::Token.new(token)
  lexed_line = LexedLine.new(super, lineno)
  ident_changed
  notify_ident_observers(l_token, lexed_line, lineno, column)
  super(token)
end

#on_ignored_nl(token) ⇒ Object

Called when the lexer matches a Ruby ignored newline. Ignored newlines occur when a newline is encountered, but the statement that was expressed on that line was not completed on that line.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 213

def on_ignored_nl(token)
  log "IGNORED_NL"

  current_line = LexedLine.new(super, lineno)
  ignored_nl_changed
  notify_ignored_nl_observers(current_line, lineno, column)

  super(token)
end
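The difference between an ignored newline and a regular one can be seen with a statement that spans two lines. This is a sketch using Ripper.lex directly, and newline_event_counts is a hypothetical helper name.

```ruby
require 'ripper'

# Hypothetical helper: how many of each newline event the source produces.
def newline_event_counts(source)
  events = Ripper.lex(source).map { |token| token[1] }
  { ignored_nl: events.count(:on_ignored_nl), nl: events.count(:on_nl) }
end

# The array literal is still open after the comma, so the first newline
# does not end the statement; the second one does.
p newline_event_counts("[1,\n 2]\n")
```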

#on_int(token) ⇒ Object

Called when the lexer matches an Integer.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 226

def on_int(token)
  log "INT: '#{token}'"
  super(token)
end

#on_ivar(token) ⇒ Object

Called when the lexer matches an instance variable.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 234

def on_ivar(token)
  log "IVAR: '#{token}'"
  super(token)
end

#on_kw(token) ⇒ Object

Called when the lexer matches a Ruby keyword.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 242

def on_kw(token)
  log "KW: #{token}"
  current_line = LexedLine.new(super, lineno)

  l_token = Tailor::Lexer::Token.new(token,
    {
      loop_with_do: current_line.loop_with_do?,
      full_line_of_text: current_line_of_text
    }
  )

  kw_changed
  notify_kw_observers(l_token, current_line, lineno, column)

  super(token)
end

#on_label(token) ⇒ Object

Called when the lexer matches a label (the first part in a non-rocket style Hash).

Example:

one: 1     # Matches one:

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 266

def on_label(token)
  log "LABEL: '#{token}'"
  super(token)
end
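To confirm that the label token includes the trailing colon, the sketch below extracts :on_label tokens with Ripper.lex. The helper name label_tokens is hypothetical.

```ruby
require 'ripper'

# Hypothetical helper: the text of every label token in the source.
def label_tokens(source)
  Ripper.lex(source)
        .select { |token| token[1] == :on_label }
        .map { |token| token[2] }
end

p label_tokens("{ one: 1 }")
```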

#on_lbrace(token) ⇒ Object

Called when the lexer matches a {. Note a #{ match calls #on_embexpr_beg.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 275

def on_lbrace(token)
  log "LBRACE: '#{token}'"
  current_line = LexedLine.new(super, lineno)
  lbrace_changed
  notify_lbrace_observers(current_line, lineno, column)
  super(token)
end

#on_lbracket(token) ⇒ Object

Called when the lexer matches a [.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 286

def on_lbracket(token)
  log "LBRACKET: '#{token}'"
  current_line = LexedLine.new(super, lineno)
  lbracket_changed
  notify_lbracket_observers(current_line, lineno, column)
  super(token)
end

#on_lparen(token) ⇒ Object

Called when the lexer matches a (.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 297

def on_lparen(token)
  log "LPAREN: '#{token}'"
  lparen_changed
  notify_lparen_observers(lineno, column)
  super(token)
end

#on_nl(token) ⇒ Object

Called when the lexer matches a newline. This is the first thing that exists on a new line, NOT the last!



# File 'lib/tailor/lexer.rb', line 305

def on_nl(token)
  log "NL"
  current_line = LexedLine.new(super, lineno)

  nl_changed
  notify_nl_observers(current_line, lineno, column)

  super(token)
end

#on_op(token) ⇒ Object

Called when the lexer matches an operator.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 318

def on_op(token)
  log "OP: '#{token}'"
  super(token)
end

#on_period(token) ⇒ Object

Called when the lexer matches a period.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 326

def on_period(token)
  log "PERIOD: '#{token}'"

  period_changed
  notify_period_observers(current_line_of_text.length, lineno, column)

  super(token)
end

#on_qwords_beg(token) ⇒ Object

Called when the lexer matches ‘%w’. The statement is closed by :on_tstring_end.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 339

def on_qwords_beg(token)
  log "QWORDS_BEG: '#{token}'"
  super(token)
end

#on_rbrace(token) ⇒ Object

Called when the lexer matches a }.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 347

def on_rbrace(token)
  log "RBRACE: '#{token}'"

  current_line = LexedLine.new(super, lineno)
  rbrace_changed
  notify_rbrace_observers(current_line, lineno, column)

  super(token)
end

#on_rbracket(token) ⇒ Object

Called when the lexer matches a ].

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 360

def on_rbracket(token)
  log "RBRACKET: '#{token}'"

  current_line = LexedLine.new(super, lineno)
  rbracket_changed
  notify_rbracket_observers(current_line, lineno, column)

  super(token)
end

#on_regexp_beg(token) ⇒ Object

Called when the lexer matches the beginning of a Regexp.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 373

def on_regexp_beg(token)
  log "REGEXP_BEG: '#{token}'"
  super(token)
end

#on_regexp_end(token) ⇒ Object

Called when the lexer matches the end of a Regexp.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 381

def on_regexp_end(token)
  log "REGEXP_END: '#{token}'"
  super(token)
end

#on_rparen(token) ⇒ Object

Called when the lexer matches a ).

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 389

def on_rparen(token)
  log "RPAREN: '#{token}'"

  current_line = LexedLine.new(super, lineno)
  rparen_changed
  notify_rparen_observers(current_line, lineno, column)

  super(token)
end

#on_semicolon(token) ⇒ Object

Called when the lexer matches a ;.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 402

def on_semicolon(token)
  log "SEMICOLON: '#{token}'"
  super(token)
end

#on_sp(token) ⇒ Object

Called when the lexer matches any type of space character.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 410

def on_sp(token)
  log "SP: '#{token}'; size: #{token.size}"
  l_token = Tailor::Lexer::Token.new(token)
  sp_changed
  notify_sp_observers(l_token, lineno, column)

  # Deal with lines that end with \
  if token == "\\\n"
    current_line = LexedLine.new(super, lineno)
    ignored_nl_changed
    notify_ignored_nl_observers(current_line, lineno, column)
  end
  super(token)
end
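The special case for a trailing backslash exists because Ripper reports a line continuation as a space token rather than as a newline event. A small sketch with Ripper.lex, where space_tokens is a hypothetical helper name:

```ruby
require 'ripper'

# Hypothetical helper: the text of every :on_sp token in the source.
def space_tokens(source)
  Ripper.lex(source)
        .select { |token| token[1] == :on_sp }
        .map { |token| token[2] }
end

# The source is: 1 +\<newline>2<newline>
p space_tokens("1 +\\\n2\n")
```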

#on_symbeg(token) ⇒ Object

Called when the lexer matches the : at the beginning of a Symbol.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 428

def on_symbeg(token)
  log "SYMBEG: '#{token}'"
  super(token)
end
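A Symbol literal is split into two tokens: the leading colon triggers #on_symbeg and the text part triggers #on_ident. The sketch below shows this with Ripper.lex, using symbol_tokens as a hypothetical helper name.

```ruby
require 'ripper'

# Hypothetical helper: [event, text] pairs for each token in the source.
def symbol_tokens(source)
  Ripper.lex(source).map { |token| [token[1], token[2]] }
end

p symbol_tokens(":name")
```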

#on_tlambda(token) ⇒ Object

Called when the lexer matches the -> as a lambda.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 436

def on_tlambda(token)
  log "TLAMBDA: '#{token}'"
  super(token)
end

#on_tlambeg(token) ⇒ Object

Called when the lexer matches the { that represents the beginning of a -> lambda.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 445

def on_tlambeg(token)
  log "TLAMBEG: '#{token}'"
  super(token)
end
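For a stabby lambda, the arrow and the opening brace each have their own event, so the brace does not reach #on_lbrace. A minimal sketch with Ripper.lex, where lambda_events is a hypothetical helper name:

```ruby
require 'ripper'

# Hypothetical helper: event names produced for a -> lambda.
def lambda_events(source)
  Ripper.lex(source).map { |token| token[1] }
end

p lambda_events("-> { 1 }")
```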

#on_tstring_beg(token) ⇒ Object

Called when the lexer matches the beginning of a String.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 453

def on_tstring_beg(token)
  log "TSTRING_BEG: '#{token}'"
  current_line = LexedLine.new(super, lineno)
  tstring_beg_changed
  notify_tstring_beg_observers(current_line, lineno)
  super(token)
end

#on_tstring_content(token) ⇒ Object

Called when the lexer matches the content of any String.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 464

def on_tstring_content(token)
  log "TSTRING_CONTENT: '#{token}'"
  super(token)
end

#on_tstring_end(token) ⇒ Object

Called when the lexer matches the end of a String.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 472

def on_tstring_end(token)
  log "TSTRING_END: '#{token}'"
  tstring_end_changed
  notify_tstring_end_observers(lineno)
  super(token)
end

#on_words_beg(token) ⇒ Object

Called when the lexer matches ‘%W’.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 482

def on_words_beg(token)
  log "WORDS_BEG: '#{token}'"
  super(token)
end

#on_words_sep(token) ⇒ Object

Called when the lexer matches the separators in a %w or %W (by default, this is a single space).

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 491

def on_words_sep(token)
  log "WORDS_SEP: '#{token}'"
  super(token)
end
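The word-array events (#on_qwords_beg, #on_words_beg, #on_words_sep) can be compared side by side with Ripper.lex. This is an illustrative sketch, and word_array_events is a hypothetical helper name.

```ruby
require 'ripper'

# Hypothetical helper: event names produced for a %w or %W literal.
def word_array_events(source)
  Ripper.lex(source).map { |token| token[1] }
end

p word_array_events("%w(a b)")
p word_array_events("%W(a b)")
```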