Class: Tailor::Lexer
- Inherits:
- Ripper::Lexer (hierarchy: Object, Ripper::Lexer, Tailor::Lexer)
- Includes:
- LogSwitch::Mixin, CompositeObservable, LexerConstants
- Defined in:
- lib/tailor/lexer.rb,
lib/tailor/lexer/token.rb
Overview
Provides the main file parsing for tailor. For every lexer event it encounters, it calls the appropriate notifier method. Notifier methods are provided by CompositeObservable.
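As a rough illustration of this pattern, the sketch below subclasses Ripper::Lexer, overrides one scanner event handler, records where the event occurred, and then passes the token on with super. CommaSpotter and its commas list are hypothetical names used only for illustration; tailor itself publishes events through CompositeObservable's notifier methods rather than collecting them in an array.

```ruby
require 'ripper'

# Hypothetical example class, not part of tailor.
class CommaSpotter < Ripper::Lexer
  attr_reader :commas

  def initialize(source)
    super(source)
    @commas = []
  end

  # Ripper calls this for every comma it scans. lineno and column
  # describe the token currently being handled.
  def on_comma(token)
    @commas << [lineno, column]
    super(token)
  end
end

spotter = CommaSpotter.new("foo(1, 2)\n")
spotter.lex
p spotter.commas
```

Calling super at the end of the handler keeps Ripper::Lexer's own token bookkeeping intact, which is the same shape the handlers in this class follow.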
Defined Under Namespace
Classes: Token
Constant Summary
Constants included from LexerConstants
Tailor::LexerConstants::CONTINUATION_KEYWORDS, Tailor::LexerConstants::KEYWORDS_AND_MODIFIERS, Tailor::LexerConstants::KEYWORDS_TO_INDENT, Tailor::LexerConstants::LOOP_KEYWORDS, Tailor::LexerConstants::MODIFIERS, Tailor::LexerConstants::MULTILINE_OPERATORS
Instance Method Summary
-
#count_trailing_newlines(text) ⇒ Fixnum
Counts the number of newlines at the end of the file.
-
#current_line_of_text ⇒ String
The current line of text being examined.
-
#ensure_trailing_newline(file_text) ⇒ String
Adds a newline to the end of the text if one doesn’t exist.
-
#initialize(file) ⇒ Lexer
constructor
A new instance of Lexer.
-
#lex ⇒ Object
This kicks off the process of parsing the file and publishing events as the events are discovered.
-
#on___end__(token) ⇒ Object
Called when the lexer matches __END__.
- #on_backref(token) ⇒ Object
-
#on_backtick(token) ⇒ Object
Called when the lexer matches the first ` in a backtick (`...`) command string (the second ` matches :on_tstring_end; this may or may not be a Ruby bug).
-
#on_CHAR(token) ⇒ Object
Called when the lexer matches CHAR.
-
#on_comma(token) ⇒ Object
Called when the lexer matches a comma.
-
#on_comment(token) ⇒ Object
Called when the lexer matches a #.
-
#on_const(token) ⇒ Object
Called when the lexer matches a constant (including class names, of course).
-
#on_cvar(token) ⇒ Object
Called when the lexer matches a class variable.
-
#on_embdoc(token) ⇒ Object
Called when the lexer matches the content inside a =begin/=end.
-
#on_embdoc_beg(token) ⇒ Object
Called when the lexer matches =begin.
-
#on_embdoc_end(token) ⇒ Object
Called when the lexer matches =end.
-
#on_embexpr_beg(token) ⇒ Object
Called when the lexer matches a #{.
-
#on_embexpr_end(token) ⇒ Object
Called when the lexer matches the } that closes a #{.
- #on_embvar(token) ⇒ Object
-
#on_float(token) ⇒ Object
Called when the lexer matches a Float.
-
#on_gvar(token) ⇒ Object
Called when the lexer matches a global variable.
-
#on_heredoc_beg(token) ⇒ Object
Called when the lexer matches the beginning of a heredoc.
-
#on_heredoc_end(token) ⇒ Object
Called when the lexer matches the end of a heredoc.
-
#on_ident(token) ⇒ Object
Called when the lexer matches an identifier (method name, variable, the text part of a Symbol, etc.).
-
#on_ignored_nl(token) ⇒ Object
Called when the lexer matches a Ruby ignored newline.
-
#on_int(token) ⇒ Object
Called when the lexer matches an Integer.
-
#on_ivar(token) ⇒ Object
Called when the lexer matches an instance variable.
-
#on_kw(token) ⇒ Object
Called when the lexer matches a Ruby keyword.
-
#on_label(token) ⇒ Object
Called when the lexer matches a label (the first part in a non-rocket style Hash).
-
#on_lbrace(token) ⇒ Object
Called when the lexer matches a {. Note that a #{ match calls #on_embexpr_beg instead.
-
#on_lbracket(token) ⇒ Object
Called when the lexer matches a [.
-
#on_lparen(token) ⇒ Object
Called when the lexer matches a (.
-
#on_nl(token) ⇒ Object
This is the first thing that exists on a new line–NOT the last!
-
#on_op(token) ⇒ Object
Called when the lexer matches an operator.
-
#on_period(token) ⇒ Object
Called when the lexer matches a period.
-
#on_qwords_beg(token) ⇒ Object
Called when the lexer matches ‘%w’.
-
#on_rbrace(token) ⇒ Object
Called when the lexer matches a }.
-
#on_rbracket(token) ⇒ Object
Called when the lexer matches a ].
-
#on_regexp_beg(token) ⇒ Object
Called when the lexer matches the beginning of a Regexp.
-
#on_regexp_end(token) ⇒ Object
Called when the lexer matches the end of a Regexp.
-
#on_rparen(token) ⇒ Object
Called when the lexer matches a ).
-
#on_semicolon(token) ⇒ Object
Called when the lexer matches a ;.
-
#on_sp(token) ⇒ Object
Called when the lexer matches any type of space character.
-
#on_symbeg(token) ⇒ Object
Called when the lexer matches the : at the beginning of a Symbol.
-
#on_tlambda(token) ⇒ Object
Called when the lexer matches the -> as a lambda.
-
#on_tlambeg(token) ⇒ Object
Called when the lexer matches the { that represents the beginning of a -> lambda.
-
#on_tstring_beg(token) ⇒ Object
Called when the lexer matches the beginning of a String.
-
#on_tstring_content(token) ⇒ Object
Called when the lexer matches the content of any String.
-
#on_tstring_end(token) ⇒ Object
Called when the lexer matches the end of a String.
-
#on_words_beg(token) ⇒ Object
Called when the lexer matches ‘%W’.
-
#on_words_sep(token) ⇒ Object
Called when the lexer matches the separators in a %w or %W (by default, this is a single space).
Methods included from CompositeObservable
Constructor Details
#initialize(file) ⇒ Lexer
Returns a new instance of Lexer.
    # File 'lib/tailor/lexer.rb', line 21
    def initialize(file)
      @original_file_text = if File.exists? file
        @file_name = file
        File.open(@file_name, 'r').read
      else
        @file_name = "<notafile>"
        file
      end

      @file_text = ensure_trailing_newline(@original_file_text)
      @file_text = sub_line_ending_backslashes(@file_text)
      super @file_text
      @added_newline = @file_text != @original_file_text
    end
Instance Method Details
#count_trailing_newlines(text) ⇒ Fixnum
Counts the number of newlines at the end of the file.
    # File 'lib/tailor/lexer.rb', line 526
    def count_trailing_newlines(text)
      if text.end_with? "\n"
        count = 0

        text.reverse.chars do |c|
          if c == "\n"
            count += 1
          else
            break
          end
        end

        count
      else
        0
      end
    end
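The same count can also be expressed with a regular expression. The helper below is a minimal sketch for comparison only; trailing_newline_count is a hypothetical name and is not part of tailor.

```ruby
# Hypothetical helper, not part of tailor.
def trailing_newline_count(text)
  # \n*\z matches the (possibly empty) run of newlines at the very end
  # of the string, so its length is the number of trailing newlines.
  text[/\n*\z/].length
end

puts trailing_newline_count("puts 1")
puts trailing_newline_count("puts 1\n\n")
```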
#current_line_of_text ⇒ String
The current line of text being examined.
    # File 'lib/tailor/lexer.rb', line 518
    def current_line_of_text
      @file_text.split("\n").at(lineno - 1) || ''
    end
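The `|| ''` fallback matters because String#split drops trailing empty strings, so a blank line at the end of the text has no array entry. The sketch below mirrors the lookup in a standalone function; line_at is a hypothetical name, and the line number is passed in explicitly instead of coming from the lexer's lineno.

```ruby
# Hypothetical standalone version of the lookup, not part of tailor.
def line_at(text, lineno)
  # Line numbers are 1-based; Array#at returns nil past the end.
  text.split("\n").at(lineno - 1) || ''
end

text = "first\n\nthird\n\n\n"

p text.split("\n")   # trailing blank lines are not in the array
p line_at(text, 1)
p line_at(text, 2)   # a blank line in the middle is an empty string
p line_at(text, 5)   # a trailing blank line falls back to ''
```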
#ensure_trailing_newline(file_text) ⇒ String
Adds a newline to the end of the text if one doesn’t exist. Without doing this, Ripper won’t trigger a newline event for the last line of the file, which is required for some rulers to do their thing.
    # File 'lib/tailor/lexer.rb', line 550
    def ensure_trailing_newline(file_text)
      count_trailing_newlines(file_text) > 0 ? file_text : (file_text + "\n")
    end
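The effect of the missing newline can be observed with Ripper directly. This is a small sketch using only Ripper.lex from the standard library; count_nl_events is a hypothetical helper name.

```ruby
require 'ripper'

# Hypothetical helper: counts the :on_nl events Ripper reports for a source string.
def count_nl_events(source)
  Ripper.lex(source).count { |_pos, type, _tok| type == :on_nl }
end

puts count_nl_events("x = 1")     # no trailing newline in the source
puts count_nl_events("x = 1\n")   # trailing newline present
```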
#lex ⇒ Object
This kicks off the process of parsing the file and publishing events as the events are discovered.
    # File 'lib/tailor/lexer.rb', line 38
    def lex
      file_beg_changed
      notify_file_beg_observers(@file_name)

      super

      file_end_changed
      notify_file_end_observers(count_trailing_newlines(@original_file_text))
    end
#on___end__(token) ⇒ Object
Called when the lexer matches __END__.
    # File 'lib/tailor/lexer.rb', line 502
    def on___end__(token)
      log "__END__: '#{token}'"
      super(token)
    end
#on_backref(token) ⇒ Object
    # File 'lib/tailor/lexer.rb', line 48
    def on_backref(token)
      log "BACKREF: '#{token}'"
      super(token)
    end
#on_backtick(token) ⇒ Object
Called when the lexer matches the first ` in a backtick (`...`) command string (the second ` matches :on_tstring_end; this may or may not be a Ruby bug).
    # File 'lib/tailor/lexer.rb', line 57
    def on_backtick(token)
      log "BACKTICK: '#{token}'"
      super(token)
    end
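The asymmetry between the opening and closing backtick can be seen by lexing a command string with Ripper.lex. This is a minimal sketch; the variable name is illustrative.

```ruby
require 'ripper'

# Reduce each lexed element to its event type and token text.
backtick_events = Ripper.lex("`ls`").map { |_pos, type, tok| [type, tok] }

backtick_events.each { |type, tok| puts "#{type}: #{tok.inspect}" }
```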
#on_CHAR(token) ⇒ Object
Called when the lexer matches CHAR.
    # File 'lib/tailor/lexer.rb', line 510
    def on_CHAR(token)
      log "CHAR: '#{token}'"
      super(token)
    end
#on_comma(token) ⇒ Object
Called when the lexer matches a comma.
    # File 'lib/tailor/lexer.rb', line 65
    def on_comma(token)
      log "COMMA: #{token}"
      log "Line length: #{current_line_of_text.length}"

      comma_changed
      notify_comma_observers(current_line_of_text, lineno, column)

      super(token)
    end
#on_comment(token) ⇒ Object
Called when the lexer matches a #. The token includes the # as well as the content after it.
    # File 'lib/tailor/lexer.rb', line 79
    def on_comment(token)
      log "COMMENT: '#{token}'"

      l_token = Tailor::Lexer::Token.new(token)
      lexed_line = LexedLine.new(super, lineno)
      comment_changed
      notify_comment_observers(l_token, lexed_line, @file_text, lineno, column)

      super(token)
    end
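To see what the comment token contains, the sketch below lexes a line with a trailing comment using Ripper.lex and picks out the comment token. The variable names are illustrative.

```ruby
require 'ripper'

# Find the first comment event and keep only its token text.
_pos, _type, comment_text = Ripper.lex("x = 1 # note\n").find do |_p, type, _t|
  type == :on_comment
end

p comment_text
```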
#on_const(token) ⇒ Object
Called when the lexer matches a constant (including class names, of course).
    # File 'lib/tailor/lexer.rb', line 94
    def on_const(token)
      log "CONST: '#{token}'"

      l_token = Tailor::Lexer::Token.new(token)
      lexed_line = LexedLine.new(super, lineno)
      const_changed
      notify_const_observers(l_token, lexed_line, lineno, column)

      super(token)
    end
#on_cvar(token) ⇒ Object
Called when the lexer matches a class variable.
    # File 'lib/tailor/lexer.rb', line 108
    def on_cvar(token)
      log "CVAR: '#{token}'"
      super(token)
    end
#on_embdoc(token) ⇒ Object
Called when the lexer matches the content inside a =begin/=end.
    # File 'lib/tailor/lexer.rb', line 116
    def on_embdoc(token)
      log "EMBDOC: '#{token}'"
      super(token)
    end
#on_embdoc_beg(token) ⇒ Object
Called when the lexer matches =begin.
    # File 'lib/tailor/lexer.rb', line 124
    def on_embdoc_beg(token)
      log "EMBDOC_BEG: '#{token}'"
      super(token)
    end
#on_embdoc_end(token) ⇒ Object
Called when the lexer matches =end.
    # File 'lib/tailor/lexer.rb', line 132
    def on_embdoc_end(token)
      log "EMBDOC_BEG: '#{token}'"
      super(token)
    end
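The three embdoc events correspond to the three parts of a =begin/=end block. The sketch below lexes a small block comment with Ripper.lex to show them; the variable names are illustrative.

```ruby
require 'ripper'

embdoc_source = "=begin\nnotes\n=end\nx = 1\n"

# Keep only the embdoc-related events, in the order Ripper reports them.
embdoc_events = Ripper.lex(embdoc_source)
                      .map { |_pos, type, tok| [type, tok] }
                      .select { |type, _tok| type.to_s.start_with?("on_embdoc") }

embdoc_events.each { |type, tok| puts "#{type}: #{tok.inspect}" }
```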
#on_embexpr_beg(token) ⇒ Object
Called when the lexer matches a #{.
    # File 'lib/tailor/lexer.rb', line 140
    def on_embexpr_beg(token)
      log "EMBEXPR_BEG: '#{token}'"
      current_line = LexedLine.new(super, lineno)
      embexpr_beg_changed
      notify_embexpr_beg_observers(current_line, lineno, column)
      super(token)
    end
#on_embexpr_end(token) ⇒ Object
Called when the lexer matches the } that closes a #{. Note that as of MRI 1.9.3-p125, this never gets called. Logged as a bug and fixed in ruby 2.0.0-p0: bugs.ruby-lang.org/issues/6211.
    # File 'lib/tailor/lexer.rb', line 153
    def on_embexpr_end(token)
      log "EMBEXPR_END: '#{token}'"
      current_line = LexedLine.new(super, lineno)
      embexpr_end_changed
      notify_embexpr_end_observers(current_line, lineno, column)
      super(token)
    end
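On a Ruby version that includes the fix, both ends of an interpolation show up as separate events. A minimal sketch with Ripper.lex follows; the source string is single-quoted so that Ruby does not interpolate it before it is lexed.

```ruby
require 'ripper'

interpolated = '"total: #{count}"'

# Reduce each lexed element to its event type and token text.
embexpr_events = Ripper.lex(interpolated).map { |_pos, type, tok| [type, tok] }

embexpr_events.each { |type, tok| puts "#{type}: #{tok.inspect}" }
```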
#on_embvar(token) ⇒ Object
    # File 'lib/tailor/lexer.rb', line 161
    def on_embvar(token)
      log "EMBVAR: '#{token}'"
      super(token)
    end
#on_float(token) ⇒ Object
Called when the lexer matches a Float.
    # File 'lib/tailor/lexer.rb', line 169
    def on_float(token)
      log "FLOAT: '#{token}'"
      super(token)
    end
#on_gvar(token) ⇒ Object
Called when the lexer matches a global variable.
    # File 'lib/tailor/lexer.rb', line 177
    def on_gvar(token)
      log "GVAR: '#{token}'"
      super(token)
    end
#on_heredoc_beg(token) ⇒ Object
Called when the lexer matches the beginning of a heredoc.
    # File 'lib/tailor/lexer.rb', line 185
    def on_heredoc_beg(token)
      log "HEREDOC_BEG: '#{token}'"
      super(token)
    end
#on_heredoc_end(token) ⇒ Object
Called when the lexer matches the end of a heredoc.
    # File 'lib/tailor/lexer.rb', line 193
    def on_heredoc_end(token)
      log "HEREDOC_END: '#{token}'"
      super(token)
    end
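The heredoc events bracket the heredoc body. The sketch below lexes a short heredoc with Ripper.lex and keeps only the two heredoc events; the variable names are illustrative.

```ruby
require 'ripper'

heredoc_source = "text = <<EOS\nhello\nEOS\n"

# Keep only the events that mark the start and end of the heredoc.
heredoc_events = Ripper.lex(heredoc_source)
                       .map { |_pos, type, tok| [type, tok] }
                       .select { |type, _tok| [:on_heredoc_beg, :on_heredoc_end].include?(type) }

heredoc_events.each { |type, tok| puts "#{type}: #{tok.inspect}" }
```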
#on_ident(token) ⇒ Object
Called when the lexer matches an identifier (method name, variable, the text part of a Symbol, etc.).
    # File 'lib/tailor/lexer.rb', line 202
    def on_ident(token)
      log "IDENT: '#{token}'"
      l_token = Tailor::Lexer::Token.new(token)
      lexed_line = LexedLine.new(super, lineno)
      ident_changed
      notify_ident_observers(l_token, lexed_line, lineno, column)
      super(token)
    end
#on_ignored_nl(token) ⇒ Object
Called when the lexer matches a Ruby ignored newline. Ignored newlines occur when a newline is encountered, but the statement that was expressed on that line was not completed on that line.
    # File 'lib/tailor/lexer.rb', line 216
    def on_ignored_nl(token)
      log "IGNORED_NL"
      current_line = LexedLine.new(super, lineno)

      ignored_nl_changed
      notify_ignored_nl_observers(current_line, lineno, column)

      super(token)
    end
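The difference between an ignored newline and a regular one can be seen by lexing a statement that spans two lines. This is a minimal sketch using Ripper.lex; the variable names are illustrative.

```ruby
require 'ripper'

# The first newline falls inside an unfinished argument list;
# the second one ends the statement.
two_line_call = "sum = add(1,\n          2)\n"

newline_types = Ripper.lex(two_line_call)
                      .map { |_pos, type, _tok| type }
                      .select { |type| [:on_nl, :on_ignored_nl].include?(type) }

p newline_types
```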
#on_int(token) ⇒ Object
Called when the lexer matches an Integer.
    # File 'lib/tailor/lexer.rb', line 229
    def on_int(token)
      log "INT: '#{token}'"
      super(token)
    end
#on_ivar(token) ⇒ Object
Called when the lexer matches an instance variable.
    # File 'lib/tailor/lexer.rb', line 237
    def on_ivar(token)
      log "IVAR: '#{token}'"
      super(token)
    end
#on_kw(token) ⇒ Object
Called when the lexer matches a Ruby keyword.
    # File 'lib/tailor/lexer.rb', line 245
    def on_kw(token)
      log "KW: #{token}"
      current_line = LexedLine.new(super, lineno)

      l_token = Tailor::Lexer::Token.new(token,
        {
          loop_with_do: current_line.loop_with_do?,
          full_line_of_text: current_line_of_text
        }
      )

      kw_changed
      notify_kw_observers(l_token, current_line, lineno, column)

      super(token)
    end
#on_label(token) ⇒ Object
Called when the lexer matches a label (the first part in a non-rocket style Hash).
Example:
one: 1 # Matches one:
    # File 'lib/tailor/lexer.rb', line 269
    def on_label(token)
      log "LABEL: '#{token}'"
      super(token)
    end
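To contrast a label with a rocket-style key, the sketch below lexes one hash of each style with Ripper.lex and collects the label tokens. The helper name label_tokens is hypothetical.

```ruby
require 'ripper'

# Hypothetical helper: returns the text of every :on_label token in the source.
def label_tokens(source)
  Ripper.lex(source)
        .select { |_pos, type, _tok| type == :on_label }
        .map { |_pos, _type, tok| tok }
end

p label_tokens("{ one: 1 }")      # label-style key
p label_tokens("{ :one => 1 }")   # rocket-style key
```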
#on_lbrace(token) ⇒ Object
Called when the lexer matches a {. Note that a #{ match calls #on_embexpr_beg instead.
    # File 'lib/tailor/lexer.rb', line 278
    def on_lbrace(token)
      log "LBRACE: '#{token}'"
      current_line = LexedLine.new(super, lineno)
      lbrace_changed
      notify_lbrace_observers(current_line, lineno, column)
      super(token)
    end
#on_lbracket(token) ⇒ Object
Called when the lexer matches a [.
    # File 'lib/tailor/lexer.rb', line 289
    def on_lbracket(token)
      log "LBRACKET: '#{token}'"
      current_line = LexedLine.new(super, lineno)
      lbracket_changed
      notify_lbracket_observers(current_line, lineno, column)
      super(token)
    end
#on_lparen(token) ⇒ Object
Called when the lexer matches a (.
    # File 'lib/tailor/lexer.rb', line 300
    def on_lparen(token)
      log "LPAREN: '#{token}'"
      lparen_changed
      notify_lparen_observers(lineno, column)
      super(token)
    end
#on_nl(token) ⇒ Object
This is the first thing that exists on a new line–NOT the last!
    # File 'lib/tailor/lexer.rb', line 308
    def on_nl(token)
      log "NL"
      current_line = LexedLine.new(super, lineno)

      nl_changed
      notify_nl_observers(current_line, lineno, column)

      super(token)
    end
#on_op(token) ⇒ Object
Called when the lexer matches an operator.
    # File 'lib/tailor/lexer.rb', line 321
    def on_op(token)
      log "OP: '#{token}'"
      super(token)
    end
#on_period(token) ⇒ Object
Called when the lexer matches a period.
    # File 'lib/tailor/lexer.rb', line 329
    def on_period(token)
      log "PERIOD: '#{token}'"

      period_changed
      notify_period_observers(current_line_of_text.length, lineno, column)

      super(token)
    end
#on_qwords_beg(token) ⇒ Object
Called when the lexer matches ‘%w’. Statement is ended by a :on_words_end.
    # File 'lib/tailor/lexer.rb', line 342
    def on_qwords_beg(token)
      log "QWORDS_BEG: '#{token}'"
      super(token)
    end
#on_rbrace(token) ⇒ Object
Called when the lexer matches a }.
    # File 'lib/tailor/lexer.rb', line 350
    def on_rbrace(token)
      log "RBRACE: '#{token}'"
      current_line = LexedLine.new(super, lineno)

      rbrace_changed
      notify_rbrace_observers(current_line, lineno, column)

      super(token)
    end
#on_rbracket(token) ⇒ Object
Called when the lexer matches a ].
    # File 'lib/tailor/lexer.rb', line 363
    def on_rbracket(token)
      log "RBRACKET: '#{token}'"
      current_line = LexedLine.new(super, lineno)

      rbracket_changed
      notify_rbracket_observers(current_line, lineno, column)

      super(token)
    end
#on_regexp_beg(token) ⇒ Object
Called when the lexer matches the beginning of a Regexp.
    # File 'lib/tailor/lexer.rb', line 376
    def on_regexp_beg(token)
      log "REGEXP_BEG: '#{token}'"
      super(token)
    end
#on_regexp_end(token) ⇒ Object
Called when the lexer matches the end of a Regexp.
    # File 'lib/tailor/lexer.rb', line 384
    def on_regexp_end(token)
      log "REGEXP_END: '#{token}'"
      super(token)
    end
#on_rparen(token) ⇒ Object
Called when the lexer matches a ).
    # File 'lib/tailor/lexer.rb', line 392
    def on_rparen(token)
      log "RPAREN: '#{token}'"
      current_line = LexedLine.new(super, lineno)

      rparen_changed
      notify_rparen_observers(current_line, lineno, column)

      super(token)
    end
#on_semicolon(token) ⇒ Object
Called when the lexer matches a ;.
    # File 'lib/tailor/lexer.rb', line 405
    def on_semicolon(token)
      log "SEMICOLON: '#{token}'"
      super(token)
    end
#on_sp(token) ⇒ Object
Called when the lexer matches any type of space character.
    # File 'lib/tailor/lexer.rb', line 413
    def on_sp(token)
      log "SP: '#{token}'; size: #{token.size}"
      l_token = Tailor::Lexer::Token.new(token)
      sp_changed
      notify_sp_observers(l_token, lineno, column)

      # Deal with lines that end with \
      if token == "\\\n"
        current_line = LexedLine.new(super, lineno)
        ignored_nl_changed
        notify_ignored_nl_observers(current_line, lineno, column)
      end
      super(token)
    end
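The special case for "\\\n" exists because Ripper reports a backslash line continuation as a space token rather than as a newline event. The sketch below, which uses only Ripper.lex, collects the space tokens from a continued line; the variable names are illustrative.

```ruby
require 'ripper'

# The first line ends with a backslash, continuing the expression.
continued = "total = 1 +\\\n  2\n"

space_tokens = Ripper.lex(continued)
                     .select { |_pos, type, _tok| type == :on_sp }
                     .map { |_pos, _type, tok| tok }

p space_tokens
```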
#on_symbeg(token) ⇒ Object
Called when the lexer matches the : at the beginning of a Symbol.
    # File 'lib/tailor/lexer.rb', line 431
    def on_symbeg(token)
      log "SYMBEG: '#{token}'"
      super(token)
    end
#on_tlambda(token) ⇒ Object
Called when the lexer matches the -> as a lambda.
    # File 'lib/tailor/lexer.rb', line 439
    def on_tlambda(token)
      log "TLAMBDA: '#{token}'"
      super(token)
    end
#on_tlambeg(token) ⇒ Object
Called when the lexer matches the { that represents the beginning of a -> lambda.
    # File 'lib/tailor/lexer.rb', line 448
    def on_tlambeg(token)
      log "TLAMBEG: '#{token}'"
      super(token)
    end
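The opening brace of a -> lambda gets its own event, distinct from the brace that opens a hash. The sketch below lexes one of each with Ripper.lex and tallies the relevant event types; the variable names are illustrative.

```ruby
require 'ripper'

brace_source = "f = -> { 1 }\nh = { a: 1 }\n"

# Count how often each brace-related or lambda-related event occurs.
brace_counts = Hash.new(0)
Ripper.lex(brace_source).each do |_pos, type, _tok|
  brace_counts[type] += 1 if [:on_tlambda, :on_tlambeg, :on_lbrace].include?(type)
end

p brace_counts
```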
#on_tstring_beg(token) ⇒ Object
Called when the lexer matches the beginning of a String.
    # File 'lib/tailor/lexer.rb', line 456
    def on_tstring_beg(token)
      log "TSTRING_BEG: '#{token}'"
      current_line = LexedLine.new(super, lineno)
      tstring_beg_changed
      notify_tstring_beg_observers(current_line, lineno)
      super(token)
    end
#on_tstring_content(token) ⇒ Object
Called when the lexer matches the content of any String.
    # File 'lib/tailor/lexer.rb', line 467
    def on_tstring_content(token)
      log "TSTRING_CONTENT: '#{token}'"
      super(token)
    end
#on_tstring_end(token) ⇒ Object
Called when the lexer matches the end of a String.
    # File 'lib/tailor/lexer.rb', line 475
    def on_tstring_end(token)
      log "TSTRING_END: '#{token}'"
      tstring_end_changed
      notify_tstring_end_observers(lineno)
      super(token)
    end
#on_words_beg(token) ⇒ Object
Called when the lexer matches ‘%W’.
    # File 'lib/tailor/lexer.rb', line 485
    def on_words_beg(token)
      log "WORDS_BEG: '#{token}'"
      super(token)
    end
#on_words_sep(token) ⇒ Object
Called when the lexer matches the separators in a %w or %W (by default, this is a single space).
    # File 'lib/tailor/lexer.rb', line 494
    def on_words_sep(token)
      log "WORDS_SEP: '#{token}'"
      super(token)
    end
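The opening and separator events for word arrays can be inspected with Ripper.lex. The sketch below is illustrative only; the helper name event_types is hypothetical.

```ruby
require 'ripper'

# Hypothetical helper: returns the event types Ripper reports for the source.
def event_types(source)
  Ripper.lex(source).map { |_pos, type, _tok| type }
end

p event_types("%w[a b]")   # opens with :on_qwords_beg
p event_types("%W[a b]")   # opens with :on_words_beg
```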