Class: Tailor::Lexer
- Inherits:
- Ripper::Lexer (Tailor::Lexer < Ripper::Lexer < Object)
- Includes:
- LogSwitch::Mixin, CompositeObservable, LexerConstants
- Defined in:
- lib/tailor/lexer.rb,
lib/tailor/lexer/token.rb
Overview
This class provides the main file parsing for tailor. For every lexer event it encounters, it calls the matching notifier method. The notifier methods are provided by CompositeObservable.
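The event-to-notifier flow can be sketched without tailor itself. This is a minimal illustration that assumes only Ruby's standard ripper library; CommaCounter and its comma_update method are hypothetical names made up for the example, not part of tailor or CompositeObservable.

```ruby
require 'ripper'

# Hypothetical observer: records where each comma was seen.
class CommaCounter
  attr_reader :positions

  def initialize
    @positions = []
  end

  # Receives the same kind of arguments the lexer passes to its
  # comma observers: the line's text, the line number, and the column.
  def comma_update(line_text, lineno, column)
    @positions << [lineno, column]
  end
end

source   = "foo(1, 2)\nbar(3,4)\n"
lines    = source.split("\n")
observer = CommaCounter.new

# Ripper.lex yields [[lineno, column], event, token, state] per token.
Ripper.lex(source).each do |(lineno, column), event, token|
  next unless event == :on_comma
  observer.comma_update(lines[lineno - 1], lineno, column)
end

observer.positions # => [[1, 5], [2, 5]]
```

Line numbers are 1-based and columns are 0-based, as Ripper reports them.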
Defined Under Namespace
Classes: Token
Constant Summary
Constants included from LexerConstants
Tailor::LexerConstants::CONTINUATION_KEYWORDS, Tailor::LexerConstants::KEYWORDS_AND_MODIFIERS, Tailor::LexerConstants::KEYWORDS_TO_INDENT, Tailor::LexerConstants::LOOP_KEYWORDS, Tailor::LexerConstants::MODIFIERS, Tailor::LexerConstants::MULTILINE_OPERATORS
Instance Method Summary
-
#count_trailing_newlines(text) ⇒ Fixnum
Counts the number of newlines at the end of the file.
-
#current_line_of_text ⇒ String
The current line of text being examined.
-
#ensure_trailing_newline(file_text) ⇒ String
Adds a newline to the end of the text if one doesn’t exist.
-
#initialize(file) ⇒ Lexer
constructor
A new instance of Lexer.
-
#lex ⇒ Object
Starts lexing the file and publishes events as they are discovered.
-
#on___end__(token) ⇒ Object
Called when the lexer matches __END__.
- #on_backref(token) ⇒ Object
-
#on_backtick(token) ⇒ Object
Called when the lexer matches the first ` in a `` statement (the second matches :on_tstring_end; this may or may not be a Ruby bug).
-
#on_CHAR(token) ⇒ Object
Called when the lexer matches CHAR.
-
#on_comma(token) ⇒ Object
Called when the lexer matches a comma.
-
#on_comment(token) ⇒ Object
Called when the lexer matches a #.
-
#on_const(token) ⇒ Object
Called when the lexer matches a constant (including class names, of course).
-
#on_cvar(token) ⇒ Object
Called when the lexer matches a class variable.
-
#on_embdoc(token) ⇒ Object
Called when the lexer matches the content inside a =begin/=end.
-
#on_embdoc_beg(token) ⇒ Object
Called when the lexer matches =begin.
-
#on_embdoc_end(token) ⇒ Object
Called when the lexer matches =end.
-
#on_embexpr_beg(token) ⇒ Object
Called when the lexer matches a #{.
-
#on_embexpr_end(token) ⇒ Object
Called when the lexer matches the } that closes a #{.
- #on_embvar(token) ⇒ Object
-
#on_float(token) ⇒ Object
Called when the lexer matches a Float.
-
#on_gvar(token) ⇒ Object
Called when the lexer matches a global variable.
-
#on_heredoc_beg(token) ⇒ Object
Called when the lexer matches the beginning of a heredoc.
-
#on_heredoc_end(token) ⇒ Object
Called when the lexer matches the end of a heredoc.
-
#on_ident(token) ⇒ Object
Called when the lexer matches an identifier (method name, variable, the text part of a Symbol, etc.).
-
#on_ignored_nl(token) ⇒ Object
Called when the lexer matches a Ruby ignored newline.
-
#on_int(token) ⇒ Object
Called when the lexer matches an Integer.
-
#on_ivar(token) ⇒ Object
Called when the lexer matches an instance variable.
-
#on_kw(token) ⇒ Object
Called when the lexer matches a Ruby keyword.
-
#on_label(token) ⇒ Object
Called when the lexer matches a label (the first part in a non-rocket style Hash).
-
#on_lbrace(token) ⇒ Object
Called when the lexer matches a {. Note that a #{ match calls #on_embexpr_beg.
-
#on_lbracket(token) ⇒ Object
Called when the lexer matches a [.
-
#on_lparen(token) ⇒ Object
Called when the lexer matches a (.
-
#on_nl(token) ⇒ Object
This is the first thing that exists on a new line–NOT the last!
-
#on_op(token) ⇒ Object
Called when the lexer matches an operator.
-
#on_period(token) ⇒ Object
Called when the lexer matches a period.
-
#on_qwords_beg(token) ⇒ Object
Called when the lexer matches ‘%w’.
-
#on_rbrace(token) ⇒ Object
Called when the lexer matches a }.
-
#on_rbracket(token) ⇒ Object
Called when the lexer matches a ].
-
#on_regexp_beg(token) ⇒ Object
Called when the lexer matches the beginning of a Regexp.
-
#on_regexp_end(token) ⇒ Object
Called when the lexer matches the end of a Regexp.
-
#on_rparen(token) ⇒ Object
Called when the lexer matches a ).
-
#on_semicolon(token) ⇒ Object
Called when the lexer matches a ;.
-
#on_sp(token) ⇒ Object
Called when the lexer matches any type of space character.
-
#on_symbeg(token) ⇒ Object
Called when the lexer matches the : at the beginning of a Symbol.
-
#on_tlambda(token) ⇒ Object
Called when the lexer matches the -> as a lambda.
-
#on_tlambeg(token) ⇒ Object
Called when the lexer matches the { that represents the beginning of a -> lambda.
-
#on_tstring_beg(token) ⇒ Object
Called when the lexer matches the beginning of a String.
-
#on_tstring_content(token) ⇒ Object
Called when the lexer matches the content of any String.
-
#on_tstring_end(token) ⇒ Object
Called when the lexer matches the end of a String.
-
#on_words_beg(token) ⇒ Object
Called when the lexer matches ‘%W’.
-
#on_words_sep(token) ⇒ Object
Called when the lexer matches the separators in a %w or %W (by default, this is a single space).
Methods included from CompositeObservable
Constructor Details
#initialize(file) ⇒ Lexer
Returns a new instance of Lexer.
    # File 'lib/tailor/lexer.rb', line 21
    def initialize(file)
      @original_file_text = if File.exist? file
        @file_name = file
        File.read(@file_name)
      else
        @file_name = "<notafile>"
        file
      end

      @file_text = ensure_trailing_newline(@original_file_text)
      super @file_text
      @added_newline = @file_text != @original_file_text
    end
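The constructor accepts either a path or raw Ruby source: if the argument names an existing file, the file is read, otherwise the argument itself is treated as the text to lex. A minimal stand-alone sketch of that branch follows; resolve_source is a hypothetical helper name used only for illustration.

```ruby
# Hypothetical helper mirroring the constructor's branch: returns the
# name to report and the text to lex.
def resolve_source(file_or_text)
  if File.exist?(file_or_text)
    [file_or_text, File.read(file_or_text)]
  else
    # Not a path on disk, so treat the argument as source code.
    ['<notafile>', file_or_text]
  end
end

name, text = resolve_source("puts 'hi'")
name # => "<notafile>"
text # => "puts 'hi'"
```

One consequence of this design is that a mistyped path is lexed as if it were Ruby source instead of raising an error.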
Instance Method Details
#count_trailing_newlines(text) ⇒ Fixnum
Counts the number of newlines at the end of the file.
    # File 'lib/tailor/lexer.rb', line 523
    def count_trailing_newlines(text)
      if text.end_with? "\n"
        count = 0

        text.reverse.chars do |c|
          if c == "\n"
            count += 1
          else
            break
          end
        end

        count
      else
        0
      end
    end
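The same count can be expressed with one regular expression. The sketch below is an alternative formulation, not tailor's code; trailing_newline_count is a hypothetical name.

```ruby
# Compact equivalent (an assumption, not tailor's implementation):
# measure the run of "\n" characters anchored at the end of the string.
def trailing_newline_count(text)
  text[/\n*\z/].length
end

trailing_newline_count("a = 1")      # => 0
trailing_newline_count("a = 1\n")    # => 1
trailing_newline_count("a\n\nb\n\n") # => 2
```

Only the newlines at the very end are counted; blank lines in the middle of the text are ignored.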
#current_line_of_text ⇒ String
The current line of text being examined.
    # File 'lib/tailor/lexer.rb', line 515
    def current_line_of_text
      @file_text.split("\n").at(lineno - 1) || ''
    end
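The `|| ''` fallback matters because String#split drops trailing empty fields, so a blank line at the end of the text has no array entry. A stand-alone sketch of the lookup, with lineno passed in as a parameter (inside the lexer it comes from Ripper) and line_at as a hypothetical name:

```ruby
# Stand-alone version of the lookup, for illustration only.
def line_at(file_text, lineno)
  file_text.split("\n").at(lineno - 1) || ''
end

text = "a = 1\nb = 2\n\n"

line_at(text, 2) # => "b = 2"
line_at(text, 3) # => "" (the trailing blank line is not in the split array)
```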
#ensure_trailing_newline(file_text) ⇒ String
Adds a newline to the end of the text if one doesn’t exist. Without this, Ripper won’t trigger a newline event for the last line of the file, which some rulers need in order to do their work.
    # File 'lib/tailor/lexer.rb', line 547
    def ensure_trailing_newline(file_text)
      count_trailing_newlines(file_text) > 0 ? file_text : (file_text + "\n")
    end
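The missing newline event can be observed directly with Ruby's standard ripper library. This sketch uses plain Ripper.lex, not Tailor::Lexer; events_for is a hypothetical helper.

```ruby
require 'ripper'

# Returns only the event names that Ripper reports for the source.
def events_for(source)
  Ripper.lex(source).map { |_pos, event, _token| event }
end

without_newline = events_for("a = 1")
with_newline    = events_for("a = 1\n")

without_newline.include?(:on_nl) # => false
with_newline.include?(:on_nl)    # => true
```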
#lex ⇒ Object
Starts lexing the file and publishes events as they are discovered.
    # File 'lib/tailor/lexer.rb', line 37
    def lex
      file_beg_changed
      notify_file_beg_observers(@file_name)

      super

      file_end_changed
      notify_file_end_observers(count_trailing_newlines(@original_file_text))
    end
#on___end__(token) ⇒ Object
Called when the lexer matches __END__.
    # File 'lib/tailor/lexer.rb', line 499
    def on___end__(token)
      log "__END__: '#{token}'"
      super(token)
    end
#on_backref(token) ⇒ Object
    # File 'lib/tailor/lexer.rb', line 47
    def on_backref(token)
      log "BACKREF: '#{token}'"
      super(token)
    end
#on_backtick(token) ⇒ Object
Called when the lexer matches the first ` in a `` statement (the second matches :on_tstring_end; this may or may not be a Ruby bug).
    # File 'lib/tailor/lexer.rb', line 56
    def on_backtick(token)
      log "BACKTICK: '#{token}'"
      super(token)
    end
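The asymmetry between the opening and closing backtick shows up in plain Ripper output. A small sketch, assuming only the standard ripper library:

```ruby
require 'ripper'

# Reduce each token to [event, text] so the pairing is easy to read.
tokens = Ripper.lex("`ls`").map { |_pos, event, token| [event, token] }

tokens.first # => [:on_backtick, "`"]
tokens.last  # => [:on_tstring_end, "`"]
```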
#on_CHAR(token) ⇒ Object
Called when the lexer matches CHAR.
    # File 'lib/tailor/lexer.rb', line 507
    def on_CHAR(token)
      log "CHAR: '#{token}'"
      super(token)
    end
#on_comma(token) ⇒ Object
Called when the lexer matches a comma.
    # File 'lib/tailor/lexer.rb', line 64
    def on_comma(token)
      log "COMMA: #{token}"
      log "Line length: #{current_line_of_text.length}"

      comma_changed
      notify_comma_observers(current_line_of_text, lineno, column)

      super(token)
    end
#on_comment(token) ⇒ Object
Called when the lexer matches a #. The token includes the # as well as the content after it.
    # File 'lib/tailor/lexer.rb', line 78
    def on_comment(token)
      log "COMMENT: '#{token}'"

      l_token = Tailor::Lexer::Token.new(token)
      lexed_line = LexedLine.new(super, lineno)
      comment_changed
      notify_comment_observers(l_token, lexed_line, @file_text, lineno, column)

      super(token)
    end
#on_const(token) ⇒ Object
Called when the lexer matches a constant (including class names, of course).
    # File 'lib/tailor/lexer.rb', line 93
    def on_const(token)
      log "CONST: '#{token}'"

      l_token = Tailor::Lexer::Token.new(token)
      lexed_line = LexedLine.new(super, lineno)
      const_changed
      notify_const_observers(l_token, lexed_line, lineno, column)

      super(token)
    end
#on_cvar(token) ⇒ Object
Called when the lexer matches a class variable.
    # File 'lib/tailor/lexer.rb', line 107
    def on_cvar(token)
      log "CVAR: '#{token}'"
      super(token)
    end
#on_embdoc(token) ⇒ Object
Called when the lexer matches the content inside a =begin/=end.
    # File 'lib/tailor/lexer.rb', line 115
    def on_embdoc(token)
      log "EMBDOC: '#{token}'"
      super(token)
    end
#on_embdoc_beg(token) ⇒ Object
Called when the lexer matches =begin.
    # File 'lib/tailor/lexer.rb', line 123
    def on_embdoc_beg(token)
      log "EMBDOC_BEG: '#{token}'"
      super(token)
    end
#on_embdoc_end(token) ⇒ Object
Called when the lexer matches =end.
    # File 'lib/tailor/lexer.rb', line 131
    def on_embdoc_end(token)
      log "EMBDOC_END: '#{token}'"
      super(token)
    end
#on_embexpr_beg(token) ⇒ Object
Called when the lexer matches a #{.
    # File 'lib/tailor/lexer.rb', line 139
    def on_embexpr_beg(token)
      log "EMBEXPR_BEG: '#{token}'"
      embexpr_beg_changed
      notify_embexpr_beg_observers
      super(token)
    end
#on_embexpr_end(token) ⇒ Object
Called when the lexer matches the } that closes a #{. Note that as of MRI 1.9.3-p125, this never gets called. Logged as a bug and fixed, but not yet released: bugs.ruby-lang.org/issues/6211.
    # File 'lib/tailor/lexer.rb', line 151
    def on_embexpr_end(token)
      log "EMBEXPR_END: '#{token}'"
      embexpr_end_changed
      notify_embexpr_end_observers
      super(token)
    end
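To see which events surround an interpolation on a given Ruby, the token stream can be inspected with plain Ripper. This sketch assumes only the standard ripper library. Whether :on_embexpr_end appears depends on the interpreter version, as noted above, so the example only relies on the opening event.

```ruby
require 'ripper'

# %q keeps the #{...} literal, so Ripper sees an interpolated string.
source = %q("total: #{count}")
events = Ripper.lex(source).map { |_pos, event, _token| event }

events.first                     # => :on_tstring_beg
events.include?(:on_embexpr_beg) # => true
```

Printing `events` on the interpreter in use shows whether the closing brace is reported as :on_embexpr_end there.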
#on_embvar(token) ⇒ Object
    # File 'lib/tailor/lexer.rb', line 158
    def on_embvar(token)
      log "EMBVAR: '#{token}'"
      super(token)
    end
#on_float(token) ⇒ Object
Called when the lexer matches a Float.
    # File 'lib/tailor/lexer.rb', line 166
    def on_float(token)
      log "FLOAT: '#{token}'"
      super(token)
    end
#on_gvar(token) ⇒ Object
Called when the lexer matches a global variable.
    # File 'lib/tailor/lexer.rb', line 174
    def on_gvar(token)
      log "GVAR: '#{token}'"
      super(token)
    end
#on_heredoc_beg(token) ⇒ Object
Called when the lexer matches the beginning of a heredoc.
    # File 'lib/tailor/lexer.rb', line 182
    def on_heredoc_beg(token)
      log "HEREDOC_BEG: '#{token}'"
      super(token)
    end
#on_heredoc_end(token) ⇒ Object
Called when the lexer matches the end of a heredoc.
    # File 'lib/tailor/lexer.rb', line 190
    def on_heredoc_end(token)
      log "HEREDOC_END: '#{token}'"
      super(token)
    end
#on_ident(token) ⇒ Object
Called when the lexer matches an identifier (method name, variable, the text part of a Symbol, etc.).
    # File 'lib/tailor/lexer.rb', line 199
    def on_ident(token)
      log "IDENT: '#{token}'"
      l_token = Tailor::Lexer::Token.new(token)
      lexed_line = LexedLine.new(super, lineno)
      ident_changed
      notify_ident_observers(l_token, lexed_line, lineno, column)
      super(token)
    end
#on_ignored_nl(token) ⇒ Object
Called when the lexer matches a Ruby ignored newline. Ignored newlines occur when a newline is encountered, but the statement that was expressed on that line was not completed on that line.
    # File 'lib/tailor/lexer.rb', line 213
    def on_ignored_nl(token)
      log "IGNORED_NL"
      current_line = LexedLine.new(super, lineno)

      ignored_nl_changed
      notify_ignored_nl_observers(current_line, lineno, column)

      super(token)
    end
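The difference between an ignored newline and a regular one can be seen in a statement that spans two lines. A small sketch, assuming only the standard ripper library (`add` is just a placeholder identifier in the lexed source):

```ruby
require 'ripper'

# The first newline falls inside the argument list, so the statement
# is not finished there. The second newline ends the statement.
source = "total = add(1,\n  2)\n"
events = Ripper.lex(source).map { |_pos, event, _token| event }

events.count(:on_ignored_nl) # => 1
events.count(:on_nl)         # => 1
```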
#on_int(token) ⇒ Object
Called when the lexer matches an Integer.
    # File 'lib/tailor/lexer.rb', line 226
    def on_int(token)
      log "INT: '#{token}'"
      super(token)
    end
#on_ivar(token) ⇒ Object
Called when the lexer matches an instance variable.
    # File 'lib/tailor/lexer.rb', line 234
    def on_ivar(token)
      log "IVAR: '#{token}'"
      super(token)
    end
#on_kw(token) ⇒ Object
Called when the lexer matches a Ruby keyword.
    # File 'lib/tailor/lexer.rb', line 242
    def on_kw(token)
      log "KW: #{token}"
      current_line = LexedLine.new(super, lineno)

      l_token = Tailor::Lexer::Token.new(token,
        {
          loop_with_do: current_line.loop_with_do?,
          full_line_of_text: current_line_of_text
        }
      )

      kw_changed
      notify_kw_observers(l_token, current_line, lineno, column)

      super(token)
    end
#on_label(token) ⇒ Object
Called when the lexer matches a label (the first part in a non-rocket style Hash).
Example:
one: 1 # Matches one:
    # File 'lib/tailor/lexer.rb', line 266
    def on_label(token)
      log "LABEL: '#{token}'"
      super(token)
    end
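A quick way to see what counts as a label is to lex a Hash that mixes both key styles. This sketch assumes only the standard ripper library.

```ruby
require 'ripper'

tokens = Ripper.lex("{ one: 1, 'two' => 2 }").map do |_pos, event, token|
  [event, token]
end

# Only the non-rocket key is reported as a label, colon included.
labels = tokens.select { |event, _token| event == :on_label }.map(&:last)
labels # => ["one:"]
```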
#on_lbrace(token) ⇒ Object
Called when the lexer matches a {. Note that a #{ match calls #on_embexpr_beg.
    # File 'lib/tailor/lexer.rb', line 275
    def on_lbrace(token)
      log "LBRACE: '#{token}'"
      current_line = LexedLine.new(super, lineno)
      lbrace_changed
      notify_lbrace_observers(current_line, lineno, column)
      super(token)
    end
#on_lbracket(token) ⇒ Object
Called when the lexer matches a [.
    # File 'lib/tailor/lexer.rb', line 286
    def on_lbracket(token)
      log "LBRACKET: '#{token}'"
      current_line = LexedLine.new(super, lineno)
      lbracket_changed
      notify_lbracket_observers(current_line, lineno, column)
      super(token)
    end
#on_lparen(token) ⇒ Object
Called when the lexer matches a (.
    # File 'lib/tailor/lexer.rb', line 297
    def on_lparen(token)
      log "LPAREN: '#{token}'"
      lparen_changed
      notify_lparen_observers(lineno, column)
      super(token)
    end
#on_nl(token) ⇒ Object
This is the first thing that exists on a new line–NOT the last!
    # File 'lib/tailor/lexer.rb', line 305
    def on_nl(token)
      log "NL"
      current_line = LexedLine.new(super, lineno)

      nl_changed
      notify_nl_observers(current_line, lineno, column)

      super(token)
    end
#on_op(token) ⇒ Object
Called when the lexer matches an operator.
    # File 'lib/tailor/lexer.rb', line 318
    def on_op(token)
      log "OP: '#{token}'"
      super(token)
    end
#on_period(token) ⇒ Object
Called when the lexer matches a period.
    # File 'lib/tailor/lexer.rb', line 326
    def on_period(token)
      log "PERIOD: '#{token}'"

      period_changed
      notify_period_observers(current_line_of_text.length, lineno, column)

      super(token)
    end
#on_qwords_beg(token) ⇒ Object
Called when the lexer matches ‘%w’. Statement is ended by a :on_words_end.
    # File 'lib/tailor/lexer.rb', line 339
    def on_qwords_beg(token)
      log "QWORDS_BEG: '#{token}'"
      super(token)
    end
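The events around a %w literal can be checked with plain Ripper. This sketch assumes only the standard ripper library. On current MRI the closing delimiter arrives as :on_tstring_end, so it is worth confirming the closing event on the interpreter in use.

```ruby
require 'ripper'

events = Ripper.lex("%w[a b]").map { |_pos, event, _token| event }

events.first                   # => :on_qwords_beg
events.include?(:on_words_sep) # => true
events.last                    # => :on_tstring_end
```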
#on_rbrace(token) ⇒ Object
Called when the lexer matches a }.
    # File 'lib/tailor/lexer.rb', line 347
    def on_rbrace(token)
      log "RBRACE: '#{token}'"

      current_line = LexedLine.new(super, lineno)
      rbrace_changed
      notify_rbrace_observers(current_line, lineno, column)

      super(token)
    end
#on_rbracket(token) ⇒ Object
Called when the lexer matches a ].
    # File 'lib/tailor/lexer.rb', line 360
    def on_rbracket(token)
      log "RBRACKET: '#{token}'"

      current_line = LexedLine.new(super, lineno)
      rbracket_changed
      notify_rbracket_observers(current_line, lineno, column)

      super(token)
    end
#on_regexp_beg(token) ⇒ Object
Called when the lexer matches the beginning of a Regexp.
    # File 'lib/tailor/lexer.rb', line 373
    def on_regexp_beg(token)
      log "REGEXP_BEG: '#{token}'"
      super(token)
    end
#on_regexp_end(token) ⇒ Object
Called when the lexer matches the end of a Regexp.
    # File 'lib/tailor/lexer.rb', line 381
    def on_regexp_end(token)
      log "REGEXP_END: '#{token}'"
      super(token)
    end
#on_rparen(token) ⇒ Object
Called when the lexer matches a ).
    # File 'lib/tailor/lexer.rb', line 389
    def on_rparen(token)
      log "RPAREN: '#{token}'"

      current_line = LexedLine.new(super, lineno)
      rparen_changed
      notify_rparen_observers(current_line, lineno, column)

      super(token)
    end
#on_semicolon(token) ⇒ Object
Called when the lexer matches a ;.
    # File 'lib/tailor/lexer.rb', line 402
    def on_semicolon(token)
      log "SEMICOLON: '#{token}'"
      super(token)
    end
#on_sp(token) ⇒ Object
Called when the lexer matches any type of space character.
    # File 'lib/tailor/lexer.rb', line 410
    def on_sp(token)
      log "SP: '#{token}'; size: #{token.size}"
      l_token = Tailor::Lexer::Token.new(token)
      sp_changed
      notify_sp_observers(l_token, lineno, column)

      # Deal with lines that end with \
      if token == "\\\n"
        current_line = LexedLine.new(super, lineno)
        ignored_nl_changed
        notify_ignored_nl_observers(current_line, lineno, column)
      end
      super(token)
    end
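The special case for a trailing backslash exists because Ripper reports the backslash-newline pair as whitespace, not as a newline event. A small check with plain Ripper, assuming only the standard ripper library:

```ruby
require 'ripper'

# In this Ruby string literal, "\\\n" is a backslash followed by a newline.
source = "total = 1 + \\\n  2\n"
tokens = Ripper.lex(source).map { |_pos, event, token| [event, token] }

# Look for a space token that carries the line continuation.
continuation = tokens.find do |event, token|
  event == :on_sp && token.include?("\\\n")
end
```

If `continuation` is non-nil, the line continuation arrived as an :on_sp token, which is the case the handler above turns into an ignored-newline notification.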
#on_symbeg(token) ⇒ Object
Called when the lexer matches the : at the beginning of a Symbol.
    # File 'lib/tailor/lexer.rb', line 428
    def on_symbeg(token)
      log "SYMBEG: '#{token}'"
      super(token)
    end
#on_tlambda(token) ⇒ Object
Called when the lexer matches the -> as a lambda.
    # File 'lib/tailor/lexer.rb', line 436
    def on_tlambda(token)
      log "TLAMBDA: '#{token}'"
      super(token)
    end
#on_tlambeg(token) ⇒ Object
Called when the lexer matches the { that represents the beginning of a -> lambda.
    # File 'lib/tailor/lexer.rb', line 445
    def on_tlambeg(token)
      log "TLAMBEG: '#{token}'"
      super(token)
    end
#on_tstring_beg(token) ⇒ Object
Called when the lexer matches the beginning of a String.
    # File 'lib/tailor/lexer.rb', line 453
    def on_tstring_beg(token)
      log "TSTRING_BEG: '#{token}'"
      current_line = LexedLine.new(super, lineno)
      tstring_beg_changed
      notify_tstring_beg_observers(current_line, lineno)
      super(token)
    end
#on_tstring_content(token) ⇒ Object
Called when the lexer matches the content of any String.
    # File 'lib/tailor/lexer.rb', line 464
    def on_tstring_content(token)
      log "TSTRING_CONTENT: '#{token}'"
      super(token)
    end
#on_tstring_end(token) ⇒ Object
Called when the lexer matches the end of a String.
    # File 'lib/tailor/lexer.rb', line 472
    def on_tstring_end(token)
      log "TSTRING_END: '#{token}'"
      tstring_end_changed
      notify_tstring_end_observers(lineno)
      super(token)
    end
#on_words_beg(token) ⇒ Object
Called when the lexer matches ‘%W’.
    # File 'lib/tailor/lexer.rb', line 482
    def on_words_beg(token)
      log "WORDS_BEG: '#{token}'"
      super(token)
    end
#on_words_sep(token) ⇒ Object
Called when the lexer matches the separators in a %w or %W (by default, this is a single space).
    # File 'lib/tailor/lexer.rb', line 491
    def on_words_sep(token)
      log "WORDS_SEP: '#{token}'"
      super(token)
    end