Class: Tailor::Lexer

Inherits:
Ripper::Lexer
  • Object
Includes:
LogSwitch::Mixin, CompositeObservable, LexerConstants
Defined in:
lib/tailor/lexer.rb,
lib/tailor/lexer/token.rb

Overview

Provides the main file parsing for tailor. For every lexer event that’s encountered, it calls the appropriate notifier method. Notifier methods are provided by CompositeObservable.
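To make "event" concrete, here is a minimal sketch that uses the standard library's Ripper.lex directly rather than Tailor::Lexer. The helper name scanner_events is hypothetical. Each event name maps to an on_<event> handler of the kind documented below.

```ruby
require 'ripper'

# Hypothetical helper: returns the scanner event names Ripper emits, in order.
# Each element returned by Ripper.lex starts with [[line, column], event, token].
def scanner_events(source)
  Ripper.lex(source).map { |tok| tok[1] }
end

p scanner_events("a, b\n")
# => [:on_ident, :on_comma, :on_sp, :on_ident, :on_nl]
```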

Defined Under Namespace

Classes: Token

Constant Summary

Constants included from LexerConstants

Tailor::LexerConstants::CONTINUATION_KEYWORDS, Tailor::LexerConstants::KEYWORDS_AND_MODIFIERS, Tailor::LexerConstants::KEYWORDS_TO_INDENT, Tailor::LexerConstants::LOOP_KEYWORDS, Tailor::LexerConstants::MODIFIERS, Tailor::LexerConstants::MULTILINE_OPERATORS

Instance Method Summary

Methods included from CompositeObservable

define_observer
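The handlers below call pairs such as comma_changed and notify_comma_observers. The following is only a rough sketch of how a define_observer-style macro could generate such methods. MiniObservable, DemoPublisher, CommaRecorder, add_comma_observer and comma_update are all hypothetical names, and the real CompositeObservable may work differently.

```ruby
# Hypothetical stand-in for CompositeObservable, for illustration only.
module MiniObservable
  # Defines add_<name>_observer, <name>_changed and notify_<name>_observers.
  def self.define_observer(name)
    define_method("add_#{name}_observer") do |observer|
      observers_for(name) << observer
    end

    define_method("#{name}_changed") do
      changed_flags[name] = true
    end

    define_method("notify_#{name}_observers") do |*args|
      if changed_flags[name]
        observers_for(name).each { |o| o.public_send("#{name}_update", *args) }
        changed_flags[name] = false
      end
    end
  end

  def observers_for(name)
    @observers ||= Hash.new { |hash, key| hash[key] = [] }
    @observers[name]
  end

  def changed_flags
    @changed_flags ||= {}
  end

  define_observer :comma
end

# Hypothetical observer: records every comma notification it receives.
class CommaRecorder
  attr_reader :seen

  def initialize
    @seen = []
  end

  def comma_update(line, lineno, column)
    @seen << [line, lineno, column]
  end
end

# Hypothetical publisher: mirrors the changed/notify pair used in #on_comma.
class DemoPublisher
  include MiniObservable

  def see_comma(line, lineno, column)
    comma_changed
    notify_comma_observers(line, lineno, column)
  end
end

publisher = DemoPublisher.new
recorder = CommaRecorder.new
publisher.add_comma_observer(recorder)
publisher.see_comma("a, b", 1, 1)
p recorder.seen
```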

Constructor Details

#initialize(file) ⇒ Lexer

Returns a new instance of Lexer.

Parameters:

  • file (String)

    The string to lex, or name of the file to read and analyze.



# File 'lib/tailor/lexer.rb', line 21

def initialize(file)
  @original_file_text = if File.exist? file
    @file_name = file
    File.read(@file_name)
  else
    @file_name = "<notafile>"
    file
  end

  @file_text = ensure_trailing_newline(@original_file_text)
  @file_text = sub_line_ending_backslashes(@file_text)
  super @file_text
  @added_newline = @file_text != @original_file_text
end

Instance Method Details

#count_trailing_newlines(text) ⇒ Fixnum

Counts the number of newlines at the end of the file.

Parameters:

  • text (String)

    The file’s text.

Returns:

  • (Fixnum)

    The number of \n characters at the end of the file.



# File 'lib/tailor/lexer.rb', line 524

def count_trailing_newlines(text)
  if text.end_with? "\n"
    count = 0

    text.reverse.chars do |c|
      if c == "\n"
        count += 1
      else
        break
      end
    end

    count
  else
    0
  end
end
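For comparison, the same count can be sketched with a regular expression anchored at the end of the string. This is an alternative, not tailor's implementation, and trailing_newline_count is a hypothetical name.

```ruby
# Hypothetical alternative: measures the run of "\n" characters that ends the string.
def trailing_newline_count(text)
  text[/\n*\z/].length
end

p trailing_newline_count("a\n\n")  # => 2
p trailing_newline_count("a")      # => 0
```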

#current_line_of_text ⇒ String

The current line of text being examined.

Returns:

  • (String)

    The current line of text.



# File 'lib/tailor/lexer.rb', line 516

def current_line_of_text
  @file_text.split("\n").at(lineno - 1) || ''
end
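The || '' fallback matters because String#split drops trailing empty strings, so a blank line at the end of the text has no array entry. A small sketch of the same expression, taking the text and line number as arguments (line_at is a hypothetical name):

```ruby
# Hypothetical helper using the same expression as #current_line_of_text.
def line_at(text, lineno)
  text.split("\n").at(lineno - 1) || ''
end

p line_at("a\n\nb\n", 2)  # => "" (blank line in the middle is kept by split)
p line_at("a\n\n", 2)     # => "" (split returned ["a"], so the fallback applies)
```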

#ensure_trailing_newline(file_text) ⇒ String

Adds a newline to the end of the text if one doesn’t exist. Without doing this, Ripper won’t trigger a newline event for the last line of the file, which is required for some rulers to do their thing.

Parameters:

  • file_text (String)

    The text to check.

Returns:

  • (String)

    The file text with a newline at the end.



# File 'lib/tailor/lexer.rb', line 548

def ensure_trailing_newline(file_text)
  count_trailing_newlines(file_text) > 0 ? file_text : (file_text + "\n")
end
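The missing newline event can be observed with the standard library's Ripper.lex alone. In this sketch, newline_event? is a hypothetical helper:

```ruby
require 'ripper'

# Hypothetical helper: true if Ripper emits an :on_nl event for the source.
def newline_event?(source)
  Ripper.lex(source).any? { |tok| tok[1] == :on_nl }
end

p newline_event?("x = 1")    # => false
p newline_event?("x = 1\n")  # => true
```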

#lex ⇒ Object

Kicks off the process of parsing the file, publishing events as they are discovered.



# File 'lib/tailor/lexer.rb', line 38

def lex
  file_beg_changed
  notify_file_beg_observers(@file_name)

  super

  file_end_changed
  notify_file_end_observers(count_trailing_newlines(@original_file_text))
end

#on___end__(token) ⇒ Object

Called when the lexer matches __END__.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 500

def on___end__(token)
  log "__END__: '#{token}'"
  super(token)
end

#on_backref(token) ⇒ Object



# File 'lib/tailor/lexer.rb', line 48

def on_backref(token)
  log "BACKREF: '#{token}'"
  super(token)
end

#on_backtick(token) ⇒ Object

Called when the lexer matches the first ` in a `...` statement (the second matches :on_tstring_end; this may or may not be a Ruby bug).

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 57

def on_backtick(token)
  log "BACKTICK: '#{token}'"
  super(token)
end
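The asymmetry between the opening and closing backtick is visible with plain Ripper.lex. A short sketch, with events_and_tokens as a hypothetical helper:

```ruby
require 'ripper'

# Hypothetical helper: pairs each scanner event with the text it matched.
def events_and_tokens(source)
  Ripper.lex(source).map { |tok| [tok[1], tok[2]] }
end

p events_and_tokens("`ls`")
# => [[:on_backtick, "`"], [:on_tstring_content, "ls"], [:on_tstring_end, "`"]]
```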

#on_CHAR(token) ⇒ Object

Called when the lexer matches CHAR.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 508

def on_CHAR(token)
  log "CHAR: '#{token}'"
  super(token)
end

#on_comma(token) ⇒ Object

Called when the lexer matches a comma.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 65

def on_comma(token)
  log "COMMA: #{token}"
  log "Line length: #{current_line_of_text.length}"

  comma_changed
  notify_comma_observers(current_line_of_text, lineno, column)

  super(token)
end

#on_comment(token) ⇒ Object

Called when the lexer matches a #. The token includes the # as well as the content after it.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 79

def on_comment(token)
  log "COMMENT: '#{token}'"

  l_token = Tailor::Lexer::Token.new(token)
  lexed_line = LexedLine.new(super, lineno)
  comment_changed
  notify_comment_observers(l_token, lexed_line, @file_text, lineno, column)

  super(token)
end

#on_const(token) ⇒ Object

Called when the lexer matches a constant (including class names, of course).

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 94

def on_const(token)
  log "CONST: '#{token}'"

  l_token = Tailor::Lexer::Token.new(token)
  lexed_line = LexedLine.new(super, lineno)
  const_changed
  notify_const_observers(l_token, lexed_line, lineno, column)

  super(token)
end

#on_cvar(token) ⇒ Object

Called when the lexer matches a class variable.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 108

def on_cvar(token)
  log "CVAR: '#{token}'"
  super(token)
end

#on_embdoc(token) ⇒ Object

Called when the lexer matches the content inside a =begin/=end.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 116

def on_embdoc(token)
  log "EMBDOC: '#{token}'"
  super(token)
end

#on_embdoc_beg(token) ⇒ Object

Called when the lexer matches =begin.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 124

def on_embdoc_beg(token)
  log "EMBDOC_BEG: '#{token}'"
  super(token)
end

#on_embdoc_end(token) ⇒ Object

Called when the lexer matches =end.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 132

def on_embdoc_end(token)
  log "EMBDOC_END: '#{token}'"
  super(token)
end
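The three embdoc events fire in sequence for a =begin/=end block. A minimal sketch with Ripper.lex, where embdoc_events is a hypothetical helper:

```ruby
require 'ripper'

# Hypothetical helper: lists the events Ripper emits for an embedded document.
def embdoc_events(source)
  Ripper.lex(source).map { |tok| tok[1] }
end

p embdoc_events("=begin\nnotes\n=end\n")
# => [:on_embdoc_beg, :on_embdoc, :on_embdoc_end]
```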

#on_embexpr_beg(token) ⇒ Object

Called when the lexer matches a #{.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 140

def on_embexpr_beg(token)
  log "EMBEXPR_BEG: '#{token}'"
  embexpr_beg_changed
  notify_embexpr_beg_observers
  super(token)
end

#on_embexpr_end(token) ⇒ Object

Called when the lexer matches the } that closes a #{. Note that as of MRI 1.9.3-p125, this never gets called. Logged as a bug and fixed, but not yet released: bugs.ruby-lang.org/issues/6211.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 152

def on_embexpr_end(token)
  log "EMBEXPR_END: '#{token}'"
  embexpr_end_changed
  notify_embexpr_end_observers
  super(token)
end

#on_embvar(token) ⇒ Object



# File 'lib/tailor/lexer.rb', line 159

def on_embvar(token)
  log "EMBVAR: '#{token}'"
  super(token)
end

#on_float(token) ⇒ Object

Called when the lexer matches a Float.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 167

def on_float(token)
  log "FLOAT: '#{token}'"
  super(token)
end

#on_gvar(token) ⇒ Object

Called when the lexer matches a global variable.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 175

def on_gvar(token)
  log "GVAR: '#{token}'"
  super(token)
end

#on_heredoc_beg(token) ⇒ Object

Called when the lexer matches the beginning of a heredoc.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 183

def on_heredoc_beg(token)
  log "HEREDOC_BEG: '#{token}'"
  super(token)
end

#on_heredoc_end(token) ⇒ Object

Called when the lexer matches the end of a heredoc.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 191

def on_heredoc_end(token)
  log "HEREDOC_END: '#{token}'"
  super(token)
end
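As an illustration of what the two heredoc events match, this sketch filters Ripper.lex output down to them. heredoc_tokens is a hypothetical helper.

```ruby
require 'ripper'

# Hypothetical helper: keeps only the heredoc delimiter events and their text.
def heredoc_tokens(source)
  Ripper.lex(source)
        .select { |tok| [:on_heredoc_beg, :on_heredoc_end].include?(tok[1]) }
        .map { |tok| [tok[1], tok[2]] }
end

p heredoc_tokens("text = <<EOS\nhello\nEOS\n")
```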

#on_ident(token) ⇒ Object

Called when the lexer matches an identifier (method name, variable, the text part of a Symbol, etc.).

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 200

def on_ident(token)
  log "IDENT: '#{token}'"
  l_token = Tailor::Lexer::Token.new(token)
  lexed_line = LexedLine.new(super, lineno)
  ident_changed
  notify_ident_observers(l_token, lexed_line, lineno, column)
  super(token)
end

#on_ignored_nl(token) ⇒ Object

Called when the lexer matches a Ruby ignored newline. Ignored newlines occur when a newline is encountered, but the statement that was expressed on that line was not completed on that line.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 214

def on_ignored_nl(token)
  log "IGNORED_NL"

  current_line = LexedLine.new(super, lineno)
  ignored_nl_changed
  notify_ignored_nl_observers(current_line, lineno, column)

  super(token)
end
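A rough illustration of the difference between the two newline events, using Ripper.lex on a method call that spans two lines. newline_event_counts is a hypothetical helper.

```ruby
require 'ripper'

# Hypothetical helper: counts ignored and regular newline events in the source.
def newline_event_counts(source)
  events = Ripper.lex(source).map { |tok| tok[1] }
  { ignored_nl: events.count(:on_ignored_nl), nl: events.count(:on_nl) }
end

# The newline after the comma does not end the statement; the last one does.
p newline_event_counts("foo(1,\n    2)\n")
# => {:ignored_nl=>1, :nl=>1} (inspect formatting varies by Ruby version)
```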

#on_int(token) ⇒ Object

Called when the lexer matches an Integer.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 227

def on_int(token)
  log "INT: '#{token}'"
  super(token)
end

#on_ivar(token) ⇒ Object

Called when the lexer matches an instance variable.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 235

def on_ivar(token)
  log "IVAR: '#{token}'"
  super(token)
end

#on_kw(token) ⇒ Object

Called when the lexer matches a Ruby keyword.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 243

def on_kw(token)
  log "KW: #{token}"
  current_line = LexedLine.new(super, lineno)

  l_token = Tailor::Lexer::Token.new(token,
    {
      loop_with_do: current_line.loop_with_do?,
      full_line_of_text: current_line_of_text
    }
  )

  kw_changed
  notify_kw_observers(l_token, current_line, lineno, column)

  super(token)
end

#on_label(token) ⇒ Object

Called when the lexer matches a label (the first part in a non-rocket style Hash).

Example:

one: 1     # Matches one:

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 267

def on_label(token)
  log "LABEL: '#{token}'"
  super(token)
end
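A quick check with Ripper.lex shows that the label token includes the trailing colon, and that rocket-style keys do not produce this event. label_tokens is a hypothetical helper.

```ruby
require 'ripper'

# Hypothetical helper: returns the text of every :on_label token.
def label_tokens(source)
  Ripper.lex(source).select { |tok| tok[1] == :on_label }.map { |tok| tok[2] }
end

p label_tokens("h = { one: 1, :two => 2 }")  # => ["one:"]
```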

#on_lbrace(token) ⇒ Object

Called when the lexer matches a {. Note that a #{ match calls #on_embexpr_beg instead.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 276

def on_lbrace(token)
  log "LBRACE: '#{token}'"
  current_line = LexedLine.new(super, lineno)
  lbrace_changed
  notify_lbrace_observers(current_line, lineno, column)
  super(token)
end
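The split between a plain { and an interpolation opener can be seen with Ripper.lex. In this sketch, brace_events is a hypothetical helper.

```ruby
require 'ripper'

# Hypothetical helper: keeps only the two "opening brace" style events.
def brace_events(source)
  Ripper.lex(source)
        .map { |tok| tok[1] }
        .select { |event| [:on_lbrace, :on_embexpr_beg].include?(event) }
end

p brace_events('h = {}')   # => [:on_lbrace]
p brace_events('"a#{b}"')  # => [:on_embexpr_beg]
```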

#on_lbracket(token) ⇒ Object

Called when the lexer matches a [.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 287

def on_lbracket(token)
  log "LBRACKET: '#{token}'"
  current_line = LexedLine.new(super, lineno)
  lbracket_changed
  notify_lbracket_observers(current_line, lineno, column)
  super(token)
end

#on_lparen(token) ⇒ Object

Called when the lexer matches a (.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 298

def on_lparen(token)
  log "LPAREN: '#{token}'"
  lparen_changed
  notify_lparen_observers(lineno, column)
  super(token)
end

#on_nl(token) ⇒ Object

This is the first thing that exists on a new line, NOT the last!



# File 'lib/tailor/lexer.rb', line 306

def on_nl(token)
  log "NL"
  current_line = LexedLine.new(super, lineno)

  nl_changed
  notify_nl_observers(current_line, lineno, column)

  super(token)
end

#on_op(token) ⇒ Object

Called when the lexer matches an operator.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 319

def on_op(token)
  log "OP: '#{token}'"
  super(token)
end

#on_period(token) ⇒ Object

Called when the lexer matches a period.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 327

def on_period(token)
  log "PERIOD: '#{token}'"

  period_changed
  notify_period_observers(current_line_of_text.length, lineno, column)

  super(token)
end

#on_qwords_beg(token) ⇒ Object

Called when the lexer matches ‘%w’. The statement is ended by :on_tstring_end.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 340

def on_qwords_beg(token)
  log "QWORDS_BEG: '#{token}'"
  super(token)
end

#on_rbrace(token) ⇒ Object

Called when the lexer matches a }.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 348

def on_rbrace(token)
  log "RBRACE: '#{token}'"

  current_line = LexedLine.new(super, lineno)
  rbrace_changed
  notify_rbrace_observers(current_line, lineno, column)

  super(token)
end

#on_rbracket(token) ⇒ Object

Called when the lexer matches a ].

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 361

def on_rbracket(token)
  log "RBRACKET: '#{token}'"

  current_line = LexedLine.new(super, lineno)
  rbracket_changed
  notify_rbracket_observers(current_line, lineno, column)

  super(token)
end

#on_regexp_beg(token) ⇒ Object

Called when the lexer matches the beginning of a Regexp.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 374

def on_regexp_beg(token)
  log "REGEXP_BEG: '#{token}'"
  super(token)
end

#on_regexp_end(token) ⇒ Object

Called when the lexer matches the end of a Regexp.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 382

def on_regexp_end(token)
  log "REGEXP_END: '#{token}'"
  super(token)
end

#on_rparen(token) ⇒ Object

Called when the lexer matches a ).

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 390

def on_rparen(token)
  log "RPAREN: '#{token}'"

  current_line = LexedLine.new(super, lineno)
  rparen_changed
  notify_rparen_observers(current_line, lineno, column)

  super(token)
end

#on_semicolon(token) ⇒ Object

Called when the lexer matches a ;.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 403

def on_semicolon(token)
  log "SEMICOLON: '#{token}'"
  super(token)
end

#on_sp(token) ⇒ Object

Called when the lexer matches any type of space character.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 411

def on_sp(token)
  log "SP: '#{token}'; size: #{token.size}"
  l_token = Tailor::Lexer::Token.new(token)
  sp_changed
  notify_sp_observers(l_token, lineno, column)

  # Deal with lines that end with \
  if token == "\\\n"
    current_line = LexedLine.new(super, lineno)
    ignored_nl_changed
    notify_ignored_nl_observers(current_line, lineno, column)
  end
  super(token)
end

#on_symbeg(token) ⇒ Object

Called when the lexer matches the : at the beginning of a Symbol.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 429

def on_symbeg(token)
  log "SYMBEG: '#{token}'"
  super(token)
end

#on_tlambda(token) ⇒ Object

Called when the lexer matches the -> as a lambda.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 437

def on_tlambda(token)
  log "TLAMBDA: '#{token}'"
  super(token)
end

#on_tlambeg(token) ⇒ Object

Called when the lexer matches the { that represents the beginning of a -> lambda.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 446

def on_tlambeg(token)
  log "TLAMBEG: '#{token}'"
  super(token)
end
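Both lambda events can be seen together by lexing a small -> literal with Ripper.lex. lambda_tokens is a hypothetical helper.

```ruby
require 'ripper'

# Hypothetical helper: keeps only the lambda arrow and lambda-brace tokens.
def lambda_tokens(source)
  Ripper.lex(source)
        .select { |tok| [:on_tlambda, :on_tlambeg].include?(tok[1]) }
        .map { |tok| [tok[1], tok[2]] }
end

p lambda_tokens("f = -> { 1 }")
# => [[:on_tlambda, "->"], [:on_tlambeg, "{"]]
```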

#on_tstring_beg(token) ⇒ Object

Called when the lexer matches the beginning of a String.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 454

def on_tstring_beg(token)
  log "TSTRING_BEG: '#{token}'"
  current_line = LexedLine.new(super, lineno)
  tstring_beg_changed
  notify_tstring_beg_observers(current_line, lineno)
  super(token)
end

#on_tstring_content(token) ⇒ Object

Called when the lexer matches the content of any String.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 465

def on_tstring_content(token)
  log "TSTRING_CONTENT: '#{token}'"
  super(token)
end

#on_tstring_end(token) ⇒ Object

Called when the lexer matches the end of a String.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 473

def on_tstring_end(token)
  log "TSTRING_END: '#{token}'"
  tstring_end_changed
  notify_tstring_end_observers(lineno)
  super(token)
end

#on_words_beg(token) ⇒ Object

Called when the lexer matches ‘%W’.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 483

def on_words_beg(token)
  log "WORDS_BEG: '#{token}'"
  super(token)
end

#on_words_sep(token) ⇒ Object

Called when the lexer matches the separators in a %w or %W (by default, this is a single space).

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 492

def on_words_sep(token)
  log "WORDS_SEP: '#{token}'"
  super(token)
end
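To see how the word-list events relate, this sketch lexes a %w and a %W literal with Ripper.lex. word_list_events is a hypothetical helper.

```ruby
require 'ripper'

# Hypothetical helper: lists the events Ripper emits for a word-list literal.
def word_list_events(source)
  Ripper.lex(source).map { |tok| tok[1] }
end

p word_list_events("%w[a b]")
p word_list_events("%W[a b]")
```

In both cases the separator between the words should appear as :on_words_sep, and the closing delimiter as :on_tstring_end.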