Module: Yoga::Scanner
- Defined in:
- lib/yoga/scanner.rb
Overview
A scanner. It scans source text into a series of tokens. It scans lazily, producing tokens only when they are requested rather than all at once, which integrates nicely with the parser.
Constant Summary
- LINE =
A regular expression that matches any line terminator: \r\n, \n\r, \n, or \r.
/\r\n|\n\r|\n|\r/
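As a quick illustration of what the pattern matches, the sketch below applies the same regular expression to a string with mixed line endings. The `line_pattern` variable is a local copy made for demonstration; nothing here is loaded from the library.

```ruby
# Local copy of the LINE pattern shown above, for illustration only.
line_pattern = /\r\n|\n\r|\n|\r/

text = "one\r\ntwo\nthree\rfour"

# Each terminator is matched once. "\r\n" counts as a single terminator
# because the two-character alternatives are tried before the single ones.
terminators = text.scan(line_pattern)
line_count = terminators.size + 1
```

Listing the two-character alternatives first is what keeps a Windows-style ending from being counted as two lines.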
Instance Attribute Summary
-
#file ⇒ ::String
readonly
The file of the scanner.
Instance Method Summary
- #call {|eof_token| ... } ⇒ Object
-
#current_line ⇒ ::Numeric
protected
Returns the number of lines that have been covered so far in the scanner.
-
#emit(kind, source = @scanner[0]) ⇒ Yoga::Token
protected
Creates a scanner token with the given name and source.
-
#eof_token ⇒ Yoga::Token
protected
Returns a token that denotes that the scanner is done scanning.
-
#initialize(source, file = "<anon>") ⇒ Object
Initializes the scanner with the given source.
-
#location(size = 0) ⇒ Yoga::Location
protected
Returns a location at the current scanner position.
-
#match(matcher, kind = :"#{matcher}") ⇒ Yoga::Token?
protected
Attempts to match the given token.
-
#match_line(kind = false) ⇒ Boolean
protected
Matches a line.
-
#scan ⇒ Yoga::Token, true
abstract
The scanning method.
-
#symbol_negative_assertion ⇒ #to_s
protected
The negative assertion used for converting a symbol matcher to a regular expression.
-
#update_line_information ⇒ void
protected
private
Updates the line information for the scanner.
Instance Attribute Details
#file ⇒ ::String (readonly)
The file of the scanner. This can be overwritten to provide a descriptor for the file.
# File 'lib/yoga/scanner.rb', line 15

def file
  @file
end
Instance Method Details
#call {|token| ... } ⇒ self
#call ⇒ ::Enumerable<Scanner::Token>
# File 'lib/yoga/scanner.rb', line 38

def call
  return to_enum(:call) unless block_given?
  @scanner = StringScanner.new(@source)
  @line = 1

  until @scanner.eos?
    value = scan
    yield value unless value == true || !value
  end

  yield eof_token
  self
end
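The two call forms above follow a common Ruby idiom: return an enumerator when no block is given, otherwise yield items one at a time. The sketch below shows that idiom on its own. `WordScanner` is a hypothetical stand-in written for this example, not part of Yoga, and it yields plain strings rather than tokens.

```ruby
require "strscan"

# Hypothetical stand-in class (not part of Yoga) that mirrors the shape of
# #call: lazily yield one item per scan step, then a final end marker.
class WordScanner
  def initialize(source)
    @source = source
  end

  def call
    # Without a block, hand back an Enumerator over this same method.
    return to_enum(:call) unless block_given?

    scanner = StringScanner.new(@source)
    until scanner.eos?
      if scanner.scan(/\s+/)
        next # whitespace matched, nothing to yield
      elsif (word = scanner.scan(/\S+/))
        yield word
      end
    end
    yield :EOF
    self
  end
end

words = WordScanner.new("lazy scanning demo").call
first_two = words.first(2) # stops scanning after two items
all_items = words.to_a     # restarts from the beginning of the source
```

Because the enumerator re-enters the method on each traversal, a consumer such as a parser can pull only as many items as it needs.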
#current_line ⇒ ::Numeric (protected)
Returns the number of lines that have been covered so far in the scanner. The recommended approach is to cache this in an instance variable that is incremented whenever a new line is scanned, rather than recounting each time.
# File 'lib/yoga/scanner.rb', line 143

def current_line
  # @scanner.string[0..@scanner.charpos].scan(/\A|\r\n|\n\r|\n|\r/).size
  @line
end
#emit(kind, source = @scanner[0]) ⇒ Yoga::Token (protected)
Creates a scanner token with the given name and source. This grabs the location using #location, setting the size to the size of the source text. The source is frozen before initializing the token.
# File 'lib/yoga/scanner.rb', line 92

def emit(kind, source = @scanner[0])
  Token.new(kind.freeze, source.freeze, location(source.length))
end
#eof_token ⇒ Yoga::Token (protected)
Returns a token that denotes that the scanner is done scanning.
# File 'lib/yoga/scanner.rb', line 162

def eof_token
  emit(:EOF, "")
end
#initialize(source, file = "<anon>") ⇒ Object
Initializes the scanner with the given source. Once the source is set, it shouldn't be changed.
# File 'lib/yoga/scanner.rb', line 22

def initialize(source, file = "<anon>")
  @source = source
  @file = file
  @line = 1
  @last_line_at = 0
end
#location(size = 0) ⇒ Yoga::Location (protected)
Returns a location for the current scanner position. If a size is given, the column range starts that many characters before the current column and ends at the current column.
# File 'lib/yoga/scanner.rb', line 79

def location(size = 0)
  start = (@scanner.charpos - @last_line_at) + 1
  column = (start - size)..start
  Location.new(file, current_line, column)
end
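The column arithmetic is easier to follow with concrete numbers. The sketch below isolates it in a hypothetical helper, `column_range`, whose parameters stand in for `@scanner.charpos`, `@last_line_at`, and `size`; it returns a plain Range instead of a Yoga::Location.

```ruby
# Hypothetical helper mirroring the arithmetic in #location.
# charpos:      character offset of the scanner within the whole source
# last_line_at: offset of the first character of the current line
# size:         length of the text that was just consumed
def column_range(charpos, last_line_at, size = 0)
  start = (charpos - last_line_at) + 1
  (start - size)..start
end

# Source "ab\nfoo bar": the second line starts at offset 3. After
# consuming "foo" the scanner sits at offset 6.
with_size = column_range(6, 3, 3)
without_size = column_range(6, 3)
```

Columns are 1-based here, and the range ends at the column just after the consumed text, since the scanner position has already moved past it.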
#match(matcher, kind = :"#{matcher}") ⇒ Yoga::Token? (protected)
Attempts to match the given token. The first argument can be a string, a symbol, or a regular expression. If the matcher is a symbol, it is coerced into a regular expression with a negative lookahead appended (by default rejecting a following ASCII letter; see #symbol_negative_assertion) to prevent partial matches. If the matcher is a regular expression, it is left alone. Otherwise, #to_s is called on it and the result is passed to Regexp.escape. If the text matches at the current position, a token is returned (or true when kind is falsy, as in #match_line); otherwise, nil is returned. If a newline is consumed within a match, the scanner automatically updates the line and column information.
# File 'lib/yoga/scanner.rb', line 116

def match(matcher, kind = :"#{matcher}")
  matcher = case matcher
            when ::Symbol then /#{::Regexp.escape(matcher.to_s)}#{symbol_negative_assertion}/
            when ::Regexp then matcher
            else /#{::Regexp.escape(matcher.to_s)}/
            end

  return unless @scanner.scan(matcher)
  update_line_information
  ((kind && emit(kind)) || true)
end
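To see what each kind of matcher turns into, the coercion step can be pulled out on its own. `coerce_matcher` below is a hypothetical helper written for this illustration; its default lookahead is the string shown for #symbol_negative_assertion.

```ruby
# Hypothetical helper that reproduces only the coercion step of #match.
def coerce_matcher(matcher, assertion = "(?![a-zA-Z])")
  case matcher
  when ::Symbol then /#{::Regexp.escape(matcher.to_s)}#{assertion}/
  when ::Regexp then matcher
  else /#{::Regexp.escape(matcher.to_s)}/
  end
end

keyword = coerce_matcher(:if)    # symbol: escaped text plus lookahead
literal = coerce_matcher("+")    # string: escaped text only
pattern = coerce_matcher(/\d+/)  # regexp: returned unchanged

keyword_in_identifier = keyword.match?("iffy")   # lookahead rejects this
keyword_alone = keyword.match?("if (x)")
```

Escaping matters for strings such as "+", which would otherwise be read as a regular expression quantifier.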
#match_line(kind = false) ⇒ Boolean (protected)
Matches a line. This is separate in order to allow internal logic, such as line counting and caching, to be performed.
# File 'lib/yoga/scanner.rb', line 133

def match_line(kind = false)
  match(LINE, kind)
end
#scan ⇒ Yoga::Token, true
Please implement this method in order to make the class a scanner.
The scanning method. This should return one of two values: a Token, or true. nil should never be returned. This performs an incremental scan of the document; it returns one token at a time. If something matched, but should not emit a token, true should be returned. The implementing class should mark this as private or protected.
# File 'lib/yoga/scanner.rb', line 62

def scan
  fail NotImplementedError, "Please implement #{self.class}#scan"
end
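A rough sketch of how a scan implementation might honor this contract follows. To stay self-contained it does not load Yoga: `SketchScanner` is a simplified, hypothetical stand-in for the base module, its `match` takes only a pattern and an optional kind, and tokens are plain `[kind, text]` pairs rather than Yoga::Token objects.

```ruby
require "strscan"

# Hypothetical, simplified stand-in for the base module (not Yoga itself).
class SketchScanner
  def initialize(source)
    @scanner = StringScanner.new(source)
  end

  # Collects every token, skipping steps where scan returned true.
  def tokens
    result = []
    until @scanner.eos?
      value = scan
      result << value unless value == true || !value
    end
    result << [:EOF, ""]
  end

  private

  # Returns a token when kind is given, or true for a silent match.
  def match(pattern, kind = nil)
    return unless @scanner.scan(pattern)
    kind ? [kind, @scanner[0]] : true
  end
end

class CalcScanner < SketchScanner
  private

  # One step of the incremental scan: a token, or true for skipped input.
  def scan
    match(/\s+/) ||
      match(/\d+/, :NUMBER) ||
      match(/[-+*\/]/, :OP) ||
      fail(ArgumentError, "unexpected input at #{@scanner.charpos}")
  end
end

tokens = CalcScanner.new("12 + 3").tokens
```

Raising on unrecognized input, instead of returning nil, keeps the scan loop from spinning on a position it can never advance past.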
#symbol_negative_assertion ⇒ #to_s (protected)
The negative assertion used for converting a symbol matcher to a regular expression. This is used to prevent premature matching of other identifiers. For example, if module is a keyword and moduleA is an identifier, this negative assertion allows the following expression to match each correctly: match(:module) || match(/[a-zA-Z]+/, :IDENT).
# File 'lib/yoga/scanner.rb', line 155

def symbol_negative_assertion
  "(?![a-zA-Z])"
end
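The effect of the lookahead can be checked directly with StringScanner. In this sketch the keyword pattern is written out by hand with the assertion string shown above, and `first_token` is a hypothetical helper that classifies only the first word of its input.

```ruby
require "strscan"

# The assertion string shown above, applied by hand for illustration.
keyword = /module(?![a-zA-Z])/
ident   = /[a-zA-Z]+/

# Hypothetical helper: classify the first word of a source string.
def first_token(source, keyword, ident)
  scanner = StringScanner.new(source)
  if scanner.scan(keyword)
    [:module, scanner[0]]
  elsif scanner.scan(ident)
    [:IDENT, scanner[0]]
  end
end

as_keyword = first_token("module Foo", keyword, ident)
as_ident   = first_token("moduleA", keyword, ident)
```

Because the default lookahead covers only ASCII letters, input such as module1 would still match the keyword; a scanner whose identifiers allow digits or underscores would presumably override this method with a wider character class.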
#update_line_information ⇒ void (protected)
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
This method returns an undefined value.
Updates the line information for the scanner. This is called for any successful matches.
# File 'lib/yoga/scanner.rb', line 171

def update_line_information
  return unless (lines = @scanner[0].scan(LINE)).any?
  @line += lines.size
  line_index = @scanner.string.rindex(LINE, @scanner.charpos)
  @last_line_at = line_index < 0 ? 0 : line_index + 1
end
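The bookkeeping can be traced by hand on a small input. The sketch below repeats the same steps with local variables in place of the instance variables; `line_pattern` is a local copy of LINE, and the nil-versus-negative check on the index is left out because a terminator is known to exist in this input.

```ruby
require "strscan"

line_pattern = /\r\n|\n\r|\n|\r/ # local copy of LINE
scanner = StringScanner.new("a\nbb\nccc")
line = 1
last_line_at = 0

# Consume text that spans two line terminators, then update the counters
# in the same order as #update_line_information.
scanner.scan(/a\nbb\n/)
lines = scanner[0].scan(line_pattern)
if lines.any?
  line += lines.size
  # Find the last terminator at or before the current position.
  line_index = scanner.string.rindex(line_pattern, scanner.charpos)
  last_line_at = line_index + 1
end

# Column of the next character, computed the way #location does.
column = (scanner.charpos - last_line_at) + 1
```

After the match the scanner sits at the start of the third line, so the next location reported would be line 3, column 1.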