Class: Treebank::TokenStream
- Inherits: Object
  - Object
    - Treebank::TokenStream
- Includes: Enumerable
- Defined in: lib/treebank.rb
Overview
An enumerable list of tokens in a string representation of a tree.
This class enumerates over a source to produce the tokens needed to parse a string representation of a tree. The source is an enumerable object whose each method yields a sequence of String objects, for example a file or a single String. Each yielded string is split into tokens at left and right brackets and at whitespace. The default brackets are ‘(’ and ‘)’, but different delimiters may be passed to the constructor.
Treebank::TokenStream.new('(A (B c) (D))').collect
=> ["(", "A", "(", "B", "c", ")", "(", "D", ")", ")"]
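The source only needs to respond to each and yield strings. The overview call passes a bare String, which had an each method in Ruby 1.8. On Ruby 1.9 and later String no longer defines each, so a wrapped source would presumably be needed there. That is an assumption about running the library on a newer Ruby, not something this page states. The sketch below uses only the standard library and shows a few objects that meet the requirement:

```ruby
require 'stringio'

text = "(A (B c)\n (D))"

# Candidate sources; each one responds to #each and yields String objects.
sources = {
  'Array of strings' => [text],
  'String#each_line enumerator' => text.each_line,
  'StringIO' => StringIO.new(text)
}

sources.each do |label, source|
  chunks = []
  source.each { |chunk| chunks << chunk }
  puts "#{label}: #{chunks.inspect}"
end

# On Ruby 1.9 and later a bare String has no #each method.
puts "String responds to each? #{text.respond_to?(:each)}"
```

The Array yields the whole string once, while the enumerator and the StringIO yield it line by line. Either way the stream would see the same characters.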
Instance Attribute Summary
- #left ⇒ Object (readonly): The left delimiter.
- #right ⇒ Object (readonly): The right delimiter.
Instance Method Summary
- #each ⇒ Object: Enumerate the tokens in the source.
- #initialize(source, left = '(', right = ')') ⇒ TokenStream (constructor): Create a stream of tokens from an enumerable source.
Constructor Details
#initialize(source, left = '(', right = ')') ⇒ TokenStream
Create a stream of tokens from an enumerable source.
- source: The string stream to tokenize
- left: Left bracket symbol
- right: Right bracket symbol
# File 'lib/treebank.rb', line 50
def initialize(source, left = '(', right = ')')
  @source = source
  @left = left
  @right = right
  # Escape the '[' and ']' characters in the character class
  # regular expression.
  cc_left = (left == '[') ? "\\#{left}" : left
  cc_right = (right == ']') ? "\\#{right}" : right
  # Delimit by left and right brackets, e.g. /\(|\)|[^()]+/
  @s_regex = Regexp.new("\\#{@left}|\\#{@right}|[^#{cc_left}#{cc_right}]+")
end
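To see what the constructor's regular expression matches, the sketch below repeats its regex-building steps in a standalone helper. The name delimiter_regex is hypothetical and not part of the library. It is applied here with String#scan, which may differ from how the library uses the regex internally.

```ruby
# Hypothetical helper that repeats the constructor's regex-building steps.
def delimiter_regex(left = '(', right = ')')
  # '[' and ']' must be escaped inside a character class.
  cc_left = (left == '[') ? "\\#{left}" : left
  cc_right = (right == ']') ? "\\#{right}" : right
  Regexp.new("\\#{left}|\\#{right}|[^#{cc_left}#{cc_right}]+")
end

round = delimiter_regex
square = delimiter_regex('[', ']')

puts round.source   # \(|\)|[^()]+
puts square.source  # \[|\]|[^\[\]]+

# Each match is a single bracket or a run of non-bracket text.
p '(A (B c) (D))'.scan(round)
# ["(", "A ", "(", "B c", ")", " ", "(", "D", ")", ")"]
p '[A [B c]]'.scan(square)
# ["[", "A ", "[", "B c", "]", "]"]
```

The non-bracket runs still contain whitespace, so the regex alone does not produce the final tokens. A later whitespace split is presumably what turns "B c" into "B" and "c".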
Instance Attribute Details
#left ⇒ Object (readonly)
The left delimiter.
# File 'lib/treebank.rb', line 40
def left
  @left
end
#right ⇒ Object (readonly)
The right delimiter.
# File 'lib/treebank.rb', line 43
def right
  @right
end
Instance Method Details
#each ⇒ Object
Enumerate the tokens in the source.
# File 'lib/treebank.rb', line 63
def each
  @source.each do |string|
    tokenize_string(string) {|token| yield token}
  end
end
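The each method delegates to tokenize_string, whose body is not shown on this page. The following is a minimal, self-contained sketch of how the pieces might fit together. The class name SketchTokenStream is hypothetical, and its tokenize_string is an assumption (scan with the constructor's regex, then split on whitespace), not the library's actual implementation.

```ruby
# Hypothetical stand-in for Treebank::TokenStream, for illustration only.
class SketchTokenStream
  include Enumerable

  def initialize(source, left = '(', right = ')')
    @source = source
    cc_left = (left == '[') ? "\\#{left}" : left
    cc_right = (right == ']') ? "\\#{right}" : right
    @s_regex = Regexp.new("\\#{left}|\\#{right}|[^#{cc_left}#{cc_right}]+")
  end

  # Mirrors the documented #each: pass every source string to the tokenizer.
  def each
    # Returning an enumerator without a block is an addition in this sketch.
    return to_enum(:each) unless block_given?
    @source.each do |string|
      tokenize_string(string) { |token| yield token }
    end
  end

  private

  # Assumed behavior: find brackets and runs of other text, then split
  # each run on whitespace so that only non-empty tokens are yielded.
  def tokenize_string(string)
    string.scan(@s_regex) do |chunk|
      chunk.split.each { |token| yield token }
    end
  end
end

stream = SketchTokenStream.new(['(A (B c)', ' (D))'])
p stream.to_a
# ["(", "A", "(", "B", "c", ")", "(", "D", ")", ")"]
p stream.reject { |t| t == '(' || t == ')' }
# ["A", "B", "c", "D"]
p stream.first(2)
# ["(", "A"]
```

Because the class includes Enumerable and defines each, methods such as to_a, reject and first work without further code. The same applies to the collect call in the overview.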