Class: Tailor::Lexer
- Inherits:
- Ripper::Lexer
- Tailor::Lexer < Ripper::Lexer < Object
- Includes:
- LogSwitch::Mixin, CompositeObservable, LexerConstants
- Defined in:
- lib/tailor/lexer.rb,
lib/tailor/lexer/token.rb
Overview
This class provides the main file parsing for tailor. For every lexer event it encounters, it calls the matching notifier method. The notifier methods are provided by CompositeObservable.
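The event-to-notifier flow described above can be sketched with a much smaller Ripper::Lexer subclass. This is a minimal illustration, not tailor's implementation: the CommaSketchLexer class and its comma_positions array are hypothetical, and the array stands in for the observers that CompositeObservable would notify.

```ruby
require 'ripper'

# Hypothetical stand-in for Tailor::Lexer: it records comma positions in an
# array where tailor would call notify_comma_observers.
class CommaSketchLexer < Ripper::Lexer
  attr_reader :comma_positions

  def initialize(source)
    super(source)
    @comma_positions = []
  end

  # Ripper calls on_comma for every "," token it scans.
  def on_comma(token)
    @comma_positions << [lineno, column]
    super(token)
  end
end

lexer = CommaSketchLexer.new("foo(1, 2)\n")
lexer.lex
p lexer.comma_positions
```

Calling super(token) at the end of the handler hands the token back to Ripper's own bookkeeping, which is the same pattern the on_* methods documented below follow.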
Defined Under Namespace
Classes: Token
Constant Summary
Constants included from LexerConstants
Tailor::LexerConstants::CONTINUATION_KEYWORDS, Tailor::LexerConstants::KEYWORDS_AND_MODIFIERS, Tailor::LexerConstants::KEYWORDS_TO_INDENT, Tailor::LexerConstants::LOOP_KEYWORDS, Tailor::LexerConstants::MODIFIERS, Tailor::LexerConstants::MULTILINE_OPERATORS
Instance Method Summary
-
#count_trailing_newlines(text) ⇒ Fixnum
Counts the number of newlines at the end of the file.
-
#current_line_of_text ⇒ String
The current line of text being examined.
-
#ensure_trailing_newline(file_text) ⇒ String
Adds a newline to the end of the text if one doesn’t exist.
-
#initialize(file) ⇒ Lexer
constructor
A new instance of Lexer.
-
#lex ⇒ Object
This kicks off the process of parsing the file and publishing events as the events are discovered.
-
#on___end__(token) ⇒ Object
Called when the lexer matches __END__.
- #on_backref(token) ⇒ Object
-
#on_backtick(token) ⇒ Object
Called when the lexer matches the first ` in a `` statement (the second matches :on_tstring_end; this may or may not be a Ruby bug).
-
#on_CHAR(token) ⇒ Object
Called when the lexer matches a CHAR token (a character literal such as ?a).
-
#on_comma(token) ⇒ Object
Called when the lexer matches a comma.
-
#on_comment(token) ⇒ Object
Called when the lexer matches a #.
-
#on_const(token) ⇒ Object
Called when the lexer matches a constant (including class names, of course).
-
#on_cvar(token) ⇒ Object
Called when the lexer matches a class variable.
-
#on_embdoc(token) ⇒ Object
Called when the lexer matches the content inside a =begin/=end.
-
#on_embdoc_beg(token) ⇒ Object
Called when the lexer matches =begin.
-
#on_embdoc_end(token) ⇒ Object
Called when the lexer matches =end.
-
#on_embexpr_beg(token) ⇒ Object
Called when the lexer matches a #{.
-
#on_embexpr_end(token) ⇒ Object
Called when the lexer matches the } that closes a #{.
- #on_embvar(token) ⇒ Object
-
#on_float(token) ⇒ Object
Called when the lexer matches a Float.
-
#on_gvar(token) ⇒ Object
Called when the lexer matches a global variable.
-
#on_heredoc_beg(token) ⇒ Object
Called when the lexer matches the beginning of a heredoc.
-
#on_heredoc_end(token) ⇒ Object
Called when the lexer matches the end of a heredoc.
-
#on_ident(token) ⇒ Object
Called when the lexer matches an identifier (method name, variable, the text part of a Symbol, etc.).
-
#on_ignored_nl(token) ⇒ Object
Called when the lexer matches a Ruby ignored newline.
-
#on_int(token) ⇒ Object
Called when the lexer matches an Integer.
-
#on_ivar(token) ⇒ Object
Called when the lexer matches an instance variable.
-
#on_kw(token) ⇒ Object
Called when the lexer matches a Ruby keyword.
-
#on_label(token) ⇒ Object
Called when the lexer matches a label (the first part in a non-rocket style Hash).
-
#on_lbrace(token) ⇒ Object
Called when the lexer matches a {. Note a #{ match calls #on_embexpr_beg.
-
#on_lbracket(token) ⇒ Object
Called when the lexer matches a [.
-
#on_lparen(token) ⇒ Object
Called when the lexer matches a (.
-
#on_nl(token) ⇒ Object
This is the first thing that exists on a new line–NOT the last!
-
#on_op(token) ⇒ Object
Called when the lexer matches an operator.
-
#on_period(token) ⇒ Object
Called when the lexer matches a period.
-
#on_qwords_beg(token) ⇒ Object
Called when the lexer matches ‘%w’.
-
#on_rbrace(token) ⇒ Object
Called when the lexer matches a }.
-
#on_rbracket(token) ⇒ Object
Called when the lexer matches a ].
-
#on_regexp_beg(token) ⇒ Object
Called when the lexer matches the beginning of a Regexp.
-
#on_regexp_end(token) ⇒ Object
Called when the lexer matches the end of a Regexp.
-
#on_rparen(token) ⇒ Object
Called when the lexer matches a ).
-
#on_semicolon(token) ⇒ Object
Called when the lexer matches a ;.
-
#on_sp(token) ⇒ Object
Called when the lexer matches any type of space character.
-
#on_symbeg(token) ⇒ Object
Called when the lexer matches the : at the beginning of a Symbol.
-
#on_tlambda(token) ⇒ Object
Called when the lexer matches the -> as a lambda.
-
#on_tlambeg(token) ⇒ Object
Called when the lexer matches the { that represents the beginning of a -> lambda.
-
#on_tstring_beg(token) ⇒ Object
Called when the lexer matches the beginning of a String.
-
#on_tstring_content(token) ⇒ Object
Called when the lexer matches the content of any String.
-
#on_tstring_end(token) ⇒ Object
Called when the lexer matches the end of a String.
-
#on_words_beg(token) ⇒ Object
Called when the lexer matches ‘%W’.
-
#on_words_sep(token) ⇒ Object
Called when the lexer matches the separators in a %w or %W (by default, this is a single space).
Methods included from CompositeObservable
Constructor Details
#initialize(file) ⇒ Lexer
Returns a new instance of Lexer.
    # File 'lib/tailor/lexer.rb', line 21
    def initialize(file)
      @original_file_text = if File.exists? file
        @file_name = file
        File.open(@file_name, 'r').read
      else
        @file_name = "<notafile>"
        file
      end

      @file_text = ensure_trailing_newline(@original_file_text)
      @file_text = sub_line_ending_backslashes(@file_text)
      super @file_text
      @added_newline = @file_text != @original_file_text
    end
Instance Method Details
#count_trailing_newlines(text) ⇒ Fixnum
Counts the number of newlines at the end of the file.
    # File 'lib/tailor/lexer.rb', line 524
    def count_trailing_newlines(text)
      if text.end_with? "\n"
        count = 0

        text.reverse.chars do |c|
          if c == "\n"
            count += 1
          else
            break
          end
        end

        count
      else
        0
      end
    end
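As a quick check of the behavior described above, the same count can be computed with a single regular expression. This is only an illustrative sketch; trailing_newline_count is a hypothetical helper and is not part of tailor.

```ruby
# Hypothetical helper (not in tailor): counts the newlines that end the text
# by measuring the run of "\n" characters anchored at the end of the string.
def trailing_newline_count(text)
  text[/\n*\z/].length
end

p trailing_newline_count("puts 1")        # no trailing newline
p trailing_newline_count("puts 1\n")      # one trailing newline
p trailing_newline_count("puts 1\n\n\n")  # three trailing newlines
```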
#current_line_of_text ⇒ String
The current line of text being examined.
    # File 'lib/tailor/lexer.rb', line 516
    def current_line_of_text
      @file_text.split("\n").at(lineno - 1) || ''
    end
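The split-and-index approach has one edge case worth seeing: String#split drops trailing empty strings, so asking for a line past the last one yields nil, which the method turns into an empty string. A small sketch, where line_at is a hypothetical stand-in that takes the line number as an argument instead of reading Ripper's lineno:

```ruby
# Hypothetical stand-in for current_line_of_text; lineno is 1-based.
def line_at(text, lineno)
  text.split("\n").at(lineno - 1) || ''
end

text = "a = 1\nb = 2\n"

p line_at(text, 1)  # first line
p line_at(text, 2)  # second line
p line_at(text, 3)  # past the end: nil is replaced by ''
```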
#ensure_trailing_newline(file_text) ⇒ String
Adds a newline to the end of the text if one doesn’t exist. Without doing this, Ripper won’t trigger a newline event for the last line of the file, which is required for some rulers to do their thing.
    # File 'lib/tailor/lexer.rb', line 548
    def ensure_trailing_newline(file_text)
      count_trailing_newlines(file_text) > 0 ? file_text : (file_text + "\n")
    end
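The missing newline event is easy to observe with Ripper directly. The sketch below uses only Ripper.lex from the standard library; event_names is a hypothetical helper introduced for the illustration.

```ruby
require 'ripper'

# Hypothetical helper: returns just the event names Ripper emits for source.
def event_names(source)
  Ripper.lex(source).map { |_pos, event, _token| event }
end

without_newline = event_names("x = 1")
with_newline    = event_names("x = 1\n")

p without_newline.include?(:on_nl)
p with_newline.include?(:on_nl)
```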
#lex ⇒ Object
This kicks off the process of parsing the file and publishing events as the events are discovered.
    # File 'lib/tailor/lexer.rb', line 38
    def lex
      file_beg_changed
      notify_file_beg_observers(@file_name)

      super

      file_end_changed
      notify_file_end_observers(count_trailing_newlines(@original_file_text))
    end
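The way #lex brackets Ripper's own lex call with a begin and an end notification can be sketched as follows. This is a simplified miniature under stated assumptions: BracketedSketchLexer and its events array are hypothetical and take the place of the file_beg and file_end observers.

```ruby
require 'ripper'

# Hypothetical miniature of the lex override: events are appended to an array
# where tailor would notify its file_beg and file_end observers.
class BracketedSketchLexer < Ripper::Lexer
  attr_reader :events

  def initialize(source)
    super(source)
    @events = []
  end

  def lex
    @events << :file_beg
    tokens = super          # Ripper fires the on_* handlers during this call
    @events << :file_end
    tokens
  end

  def on_nl(token)
    @events << :nl
    super(token)
  end
end

sketch = BracketedSketchLexer.new("x = 1\n")
sketch.lex
p sketch.events
```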
#on___end__(token) ⇒ Object
Called when the lexer matches __END__.
    # File 'lib/tailor/lexer.rb', line 500
    def on___end__(token)
      log "__END__: '#{token}'"
      super(token)
    end
#on_backref(token) ⇒ Object
    # File 'lib/tailor/lexer.rb', line 48
    def on_backref(token)
      log "BACKREF: '#{token}'"
      super(token)
    end
#on_backtick(token) ⇒ Object
Called when the lexer matches the first ` in a `` statement (the second matches :on_tstring_end; this may or may not be a Ruby bug).
    # File 'lib/tailor/lexer.rb', line 57
    def on_backtick(token)
      log "BACKTICK: '#{token}'"
      super(token)
    end
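To see which events Ripper emits for a backtick command string on a given Ruby version, a short check such as the following may help. Only Ripper.lex from the standard library is used, and the sample command is arbitrary.

```ruby
require 'ripper'

# Lex a backtick command string and keep only the event names.
backtick_events = Ripper.lex("`ls`").map { |_pos, event, _token| event }

p backtick_events
```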
#on_CHAR(token) ⇒ Object
Called when the lexer matches a CHAR token (a character literal such as ?a).
    # File 'lib/tailor/lexer.rb', line 508
    def on_CHAR(token)
      log "CHAR: '#{token}'"
      super(token)
    end
#on_comma(token) ⇒ Object
Called when the lexer matches a comma.
    # File 'lib/tailor/lexer.rb', line 65
    def on_comma(token)
      log "COMMA: #{token}"
      log "Line length: #{current_line_of_text.length}"

      comma_changed
      notify_comma_observers(current_line_of_text, lineno, column)

      super(token)
    end
#on_comment(token) ⇒ Object
Called when the lexer matches a #. The token includes the # as well as the content after it.
    # File 'lib/tailor/lexer.rb', line 79
    def on_comment(token)
      log "COMMENT: '#{token}'"

      l_token = Tailor::Lexer::Token.new(token)
      lexed_line = LexedLine.new(super, lineno)
      comment_changed
      notify_comment_observers(l_token, lexed_line, @file_text, lineno, column)

      super(token)
    end
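The shape of the comment token described above (the # plus everything after it) can be checked with Ripper.lex. This is a sketch with an arbitrary sample line, not code from tailor.

```ruby
require 'ripper'

source = "x = 1 # note\n"

# Keep only the text of the tokens reported as comments.
comment_tokens = Ripper.lex(source)
                       .select { |_pos, event, _token| event == :on_comment }
                       .map { |_pos, _event, token| token }

p comment_tokens
```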
#on_const(token) ⇒ Object
Called when the lexer matches a constant (including class names, of course).
    # File 'lib/tailor/lexer.rb', line 94
    def on_const(token)
      log "CONST: '#{token}'"

      l_token = Tailor::Lexer::Token.new(token)
      lexed_line = LexedLine.new(super, lineno)
      const_changed
      notify_const_observers(l_token, lexed_line, lineno, column)

      super(token)
    end
#on_cvar(token) ⇒ Object
Called when the lexer matches a class variable.
    # File 'lib/tailor/lexer.rb', line 108
    def on_cvar(token)
      log "CVAR: '#{token}'"
      super(token)
    end
#on_embdoc(token) ⇒ Object
Called when the lexer matches the content inside a =begin/=end.
    # File 'lib/tailor/lexer.rb', line 116
    def on_embdoc(token)
      log "EMBDOC: '#{token}'"
      super(token)
    end
#on_embdoc_beg(token) ⇒ Object
Called when the lexer matches =begin.
    # File 'lib/tailor/lexer.rb', line 124
    def on_embdoc_beg(token)
      log "EMBDOC_BEG: '#{token}'"
      super(token)
    end
#on_embdoc_end(token) ⇒ Object
Called when the lexer matches =end.
    # File 'lib/tailor/lexer.rb', line 132
    def on_embdoc_end(token)
      log "EMBDOC_BEG: '#{token}'"
      super(token)
    end
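The three embedded-document events (on_embdoc_beg, on_embdoc, on_embdoc_end) arrive in order for a =begin/=end block. A minimal sketch using Ripper.lex with an arbitrary sample source:

```ruby
require 'ripper'

source = "=begin\nnotes\n=end\nx = 1\n"

# Event names in the order Ripper reports them.
embdoc_events = Ripper.lex(source).map { |_pos, event, _token| event }

p embdoc_events.first(3)
```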
#on_embexpr_beg(token) ⇒ Object
Called when the lexer matches a #{.
    # File 'lib/tailor/lexer.rb', line 140
    def on_embexpr_beg(token)
      log "EMBEXPR_BEG: '#{token}'"
      embexpr_beg_changed
      notify_embexpr_beg_observers
      super(token)
    end
#on_embexpr_end(token) ⇒ Object
Called when the lexer matches the } that closes a #{. Note that as of MRI 1.9.3-p125, this never gets called. Logged as a bug and fixed, but not yet released: bugs.ruby-lang.org/issues/6211.
    # File 'lib/tailor/lexer.rb', line 152
    def on_embexpr_end(token)
      log "EMBEXPR_END: '#{token}'"
      embexpr_end_changed
      notify_embexpr_end_observers
      super(token)
    end
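On Ruby versions that include the fix mentioned above, both the opening and closing events should appear for an interpolated string. A sketch for verifying that on a given interpreter, using only Ripper.lex:

```ruby
require 'ripper'

# Single quotes keep the #{...} literal so Ripper sees the interpolation.
source = '"a#{b}c"'

embexpr_events = Ripper.lex(source).map { |_pos, event, _token| event }

p embexpr_events.include?(:on_embexpr_beg)
p embexpr_events.include?(:on_embexpr_end)
```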
#on_embvar(token) ⇒ Object
    # File 'lib/tailor/lexer.rb', line 159
    def on_embvar(token)
      log "EMBVAR: '#{token}'"
      super(token)
    end
#on_float(token) ⇒ Object
Called when the lexer matches a Float.
    # File 'lib/tailor/lexer.rb', line 167
    def on_float(token)
      log "FLOAT: '#{token}'"
      super(token)
    end
#on_gvar(token) ⇒ Object
Called when the lexer matches a global variable.
    # File 'lib/tailor/lexer.rb', line 175
    def on_gvar(token)
      log "GVAR: '#{token}'"
      super(token)
    end
#on_heredoc_beg(token) ⇒ Object
Called when the lexer matches the beginning of a heredoc.
    # File 'lib/tailor/lexer.rb', line 183
    def on_heredoc_beg(token)
      log "HEREDOC_BEG: '#{token}'"
      super(token)
    end
#on_heredoc_end(token) ⇒ Object
Called when the lexer matches the end of a heredoc.
    # File 'lib/tailor/lexer.rb', line 191
    def on_heredoc_end(token)
      log "HEREDOC_END: '#{token}'"
      super(token)
    end
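The tokens that trigger the two heredoc events can be inspected with Ripper.lex. The sketch below uses an arbitrary heredoc and is meant only to show what each handler receives.

```ruby
require 'ripper'

source = "x = <<EOS\nhi\nEOS\n"

# Pair each event name with the token text it was reported for.
pairs = Ripper.lex(source).map { |_pos, event, token| [event, token] }

heredoc_beg = pairs.assoc(:on_heredoc_beg)
heredoc_end = pairs.assoc(:on_heredoc_end)

p heredoc_beg
p heredoc_end
```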
#on_ident(token) ⇒ Object
Called when the lexer matches an identifier (method name, variable, the text part of a Symbol, etc.).
    # File 'lib/tailor/lexer.rb', line 200
    def on_ident(token)
      log "IDENT: '#{token}'"
      l_token = Tailor::Lexer::Token.new(token)
      lexed_line = LexedLine.new(super, lineno)
      ident_changed
      notify_ident_observers(l_token, lexed_line, lineno, column)
      super(token)
    end
#on_ignored_nl(token) ⇒ Object
Called when the lexer matches a Ruby ignored newline. Ignored newlines occur when a newline is encountered, but the statement that was expressed on that line was not completed on that line.
    # File 'lib/tailor/lexer.rb', line 214
    def on_ignored_nl(token)
      log "IGNORED_NL"
      current_line = LexedLine.new(super, lineno)

      ignored_nl_changed
      notify_ignored_nl_observers(current_line, lineno, column)

      super(token)
    end
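The difference between an ignored newline and a regular one shows up when a statement spans two lines. In this sketch (plain Ripper.lex, arbitrary sample source) the newline inside the open parenthesis does not end the statement, while the final newline does.

```ruby
require 'ripper'

source = "foo(\n  1)\n"

newline_events = Ripper.lex(source).map { |_pos, event, _token| event }

p newline_events.count(:on_ignored_nl)  # newline inside the unfinished call
p newline_events.count(:on_nl)          # newline that ends the statement
```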
#on_int(token) ⇒ Object
Called when the lexer matches an Integer.
    # File 'lib/tailor/lexer.rb', line 227
    def on_int(token)
      log "INT: '#{token}'"
      super(token)
    end
#on_ivar(token) ⇒ Object
Called when the lexer matches an instance variable.
    # File 'lib/tailor/lexer.rb', line 235
    def on_ivar(token)
      log "IVAR: '#{token}'"
      super(token)
    end
#on_kw(token) ⇒ Object
Called when the lexer matches a Ruby keyword.
    # File 'lib/tailor/lexer.rb', line 243
    def on_kw(token)
      log "KW: #{token}"
      current_line = LexedLine.new(super, lineno)

      l_token = Tailor::Lexer::Token.new(token,
        {
          loop_with_do: current_line.loop_with_do?,
          full_line_of_text: current_line_of_text
        }
      )

      kw_changed
      notify_kw_observers(l_token, current_line, lineno, column)

      super(token)
    end
#on_label(token) ⇒ Object
Called when the lexer matches a label (the first part in a non-rocket style Hash).
Example:
one: 1 # Matches one:
    # File 'lib/tailor/lexer.rb', line 267
    def on_label(token)
      log "LABEL: '#{token}'"
      super(token)
    end
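The label tokens from the example above can be listed with Ripper.lex. This sketch uses an arbitrary two-key Hash literal and shows that each label token includes its trailing colon.

```ruby
require 'ripper'

source = "{ one: 1, two: 2 }"

# Collect the text of every token reported as a label.
labels = Ripper.lex(source)
               .select { |_pos, event, _token| event == :on_label }
               .map { |_pos, _event, token| token }

p labels
```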
#on_lbrace(token) ⇒ Object
Called when the lexer matches a {. Note a #{ match calls #on_embexpr_beg.
    # File 'lib/tailor/lexer.rb', line 276
    def on_lbrace(token)
      log "LBRACE: '#{token}'"
      current_line = LexedLine.new(super, lineno)
      lbrace_changed
      notify_lbrace_observers(current_line, lineno, column)
      super(token)
    end
#on_lbracket(token) ⇒ Object
Called when the lexer matches a [.
    # File 'lib/tailor/lexer.rb', line 287
    def on_lbracket(token)
      log "LBRACKET: '#{token}'"
      current_line = LexedLine.new(super, lineno)
      lbracket_changed
      notify_lbracket_observers(current_line, lineno, column)
      super(token)
    end
#on_lparen(token) ⇒ Object
Called when the lexer matches a (.
    # File 'lib/tailor/lexer.rb', line 298
    def on_lparen(token)
      log "LPAREN: '#{token}'"
      lparen_changed
      notify_lparen_observers(lineno, column)
      super(token)
    end
#on_nl(token) ⇒ Object
This is the first thing that exists on a new line–NOT the last!
    # File 'lib/tailor/lexer.rb', line 306
    def on_nl(token)
      log "NL"
      current_line = LexedLine.new(super, lineno)

      nl_changed
      notify_nl_observers(current_line, lineno, column)

      super(token)
    end
#on_op(token) ⇒ Object
Called when the lexer matches an operator.
    # File 'lib/tailor/lexer.rb', line 319
    def on_op(token)
      log "OP: '#{token}'"
      super(token)
    end
#on_period(token) ⇒ Object
Called when the lexer matches a period.
    # File 'lib/tailor/lexer.rb', line 327
    def on_period(token)
      log "PERIOD: '#{token}'"

      period_changed
      notify_period_observers(current_line_of_text.length, lineno, column)

      super(token)
    end
#on_qwords_beg(token) ⇒ Object
Called when the lexer matches ‘%w’. Statement is ended by a :on_words_end.
    # File 'lib/tailor/lexer.rb', line 340
    def on_qwords_beg(token)
      log "QWORDS_BEG: '#{token}'"
      super(token)
    end
#on_rbrace(token) ⇒ Object
Called when the lexer matches a }.
    # File 'lib/tailor/lexer.rb', line 348
    def on_rbrace(token)
      log "RBRACE: '#{token}'"
      current_line = LexedLine.new(super, lineno)

      rbrace_changed
      notify_rbrace_observers(current_line, lineno, column)

      super(token)
    end
#on_rbracket(token) ⇒ Object
Called when the lexer matches a ].
    # File 'lib/tailor/lexer.rb', line 361
    def on_rbracket(token)
      log "RBRACKET: '#{token}'"
      current_line = LexedLine.new(super, lineno)

      rbracket_changed
      notify_rbracket_observers(current_line, lineno, column)

      super(token)
    end
#on_regexp_beg(token) ⇒ Object
Called when the lexer matches the beginning of a Regexp.
    # File 'lib/tailor/lexer.rb', line 374
    def on_regexp_beg(token)
      log "REGEXP_BEG: '#{token}'"
      super(token)
    end
#on_regexp_end(token) ⇒ Object
Called when the lexer matches the end of a Regexp.
    # File 'lib/tailor/lexer.rb', line 382
    def on_regexp_end(token)
      log "REGEXP_END: '#{token}'"
      super(token)
    end
#on_rparen(token) ⇒ Object
Called when the lexer matches a ).
    # File 'lib/tailor/lexer.rb', line 390
    def on_rparen(token)
      log "RPAREN: '#{token}'"
      current_line = LexedLine.new(super, lineno)

      rparen_changed
      notify_rparen_observers(current_line, lineno, column)

      super(token)
    end
#on_semicolon(token) ⇒ Object
Called when the lexer matches a ;.
    # File 'lib/tailor/lexer.rb', line 403
    def on_semicolon(token)
      log "SEMICOLON: '#{token}'"
      super(token)
    end
#on_sp(token) ⇒ Object
Called when the lexer matches any type of space character.
    # File 'lib/tailor/lexer.rb', line 411
    def on_sp(token)
      log "SP: '#{token}'; size: #{token.size}"
      l_token = Tailor::Lexer::Token.new(token)
      sp_changed
      notify_sp_observers(l_token, lineno, column)

      # Deal with lines that end with \
      if token == "\\\n"
        current_line = LexedLine.new(super, lineno)
        ignored_nl_changed
        notify_ignored_nl_observers(current_line, lineno, column)
      end
      super(token)
    end
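The special case in the listing above exists because Ripper reports a backslash line continuation as a space token rather than as a newline event. The sketch below, which uses plain Ripper.lex on an arbitrary sample, lists the space tokens so that the backslash-newline token can be seen among them.

```ruby
require 'ripper'

# The Ruby string below holds: x = 1 + \<newline>  2<newline>
source = "x = 1 + \\\n  2\n"

# Collect the text of every token reported as on_sp.
space_tokens = Ripper.lex(source)
                     .select { |_pos, event, _token| event == :on_sp }
                     .map { |_pos, _event, token| token }

p space_tokens
p space_tokens.include?("\\\n")
```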
#on_symbeg(token) ⇒ Object
Called when the lexer matches the : at the beginning of a Symbol.
    # File 'lib/tailor/lexer.rb', line 429
    def on_symbeg(token)
      log "SYMBEG: '#{token}'"
      super(token)
    end
#on_tlambda(token) ⇒ Object
Called when the lexer matches the -> as a lambda.
    # File 'lib/tailor/lexer.rb', line 437
    def on_tlambda(token)
      log "TLAMBDA: '#{token}'"
      super(token)
    end
#on_tlambeg(token) ⇒ Object
Called when the lexer matches the { that represents the beginning of a -> lambda.
    # File 'lib/tailor/lexer.rb', line 446
    def on_tlambeg(token)
      log "TLAMBEG: '#{token}'"
      super(token)
    end
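For a -> lambda, the opening brace is reported as on_tlambeg rather than on_lbrace. A short sketch with Ripper.lex and an arbitrary lambda literal makes the distinction visible:

```ruby
require 'ripper'

lambda_events = Ripper.lex("-> { 1 }").map { |_pos, event, _token| event }

p lambda_events.include?(:on_tlambda)  # the ->
p lambda_events.include?(:on_tlambeg)  # the { that opens the lambda body
p lambda_events.include?(:on_lbrace)   # not used for this brace
```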
#on_tstring_beg(token) ⇒ Object
Called when the lexer matches the beginning of a String.
    # File 'lib/tailor/lexer.rb', line 454
    def on_tstring_beg(token)
      log "TSTRING_BEG: '#{token}'"
      current_line = LexedLine.new(super, lineno)
      tstring_beg_changed
      notify_tstring_beg_observers(current_line, lineno)
      super(token)
    end
#on_tstring_content(token) ⇒ Object
Called when the lexer matches the content of any String.
    # File 'lib/tailor/lexer.rb', line 465
    def on_tstring_content(token)
      log "TSTRING_CONTENT: '#{token}'"
      super(token)
    end
#on_tstring_end(token) ⇒ Object
Called when the lexer matches the end of a String.
    # File 'lib/tailor/lexer.rb', line 473
    def on_tstring_end(token)
      log "TSTRING_END: '#{token}'"
      tstring_end_changed
      notify_tstring_end_observers(lineno)
      super(token)
    end
#on_words_beg(token) ⇒ Object
Called when the lexer matches ‘%W’.
    # File 'lib/tailor/lexer.rb', line 483
    def on_words_beg(token)
      log "WORDS_BEG: '#{token}'"
      super(token)
    end
#on_words_sep(token) ⇒ Object
Called when the lexer matches the separators in a %w or %W (by default, this is a single space).
    # File 'lib/tailor/lexer.rb', line 492
    def on_words_sep(token)
      log "WORDS_SEP: '#{token}'"
      super(token)
    end
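The events produced for a %w literal, including the separator event, can be inspected with Ripper.lex. This is a sketch for checking the behavior on a particular Ruby version, with an arbitrary sample literal; printing the full list also shows which event closes the literal on that interpreter.

```ruby
require 'ripper'

words_events = Ripper.lex("%w(a b)").map { |_pos, event, _token| event }

p words_events
```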