Class: Tokipona::Tokenizer
- Inherits:
- Object
- Defined in:
- lib/tokipona/tokenizer.rb
Overview
Splits text into tokens (words and punctuation marks).
Class Method Summary
- .tokenize(text) ⇒ Array<String>
Instance Method Summary
- #initialize(text) ⇒ Tokenizer (constructor): A new instance of Tokenizer.
- #tokenize ⇒ Object
Constructor Details
#initialize(text) ⇒ Tokenizer
Returns a new instance of Tokenizer.
    # File 'lib/tokipona/tokenizer.rb', line 15
    def initialize(text)
      @text = text
    end
Class Method Details
.tokenize(text) ⇒ Array<String>
    # File 'lib/tokipona/tokenizer.rb', line 11
    def self.tokenize(text)
      new(text).tokenize
    end
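The class method is a convenience wrapper: it builds an instance and calls #tokenize on it, so both call styles should give the same result. The sketch below illustrates this with a stand-in copy of the class, transcribed from the listings on this page so that it runs without the gem installed; the sample sentence and the variable names via_class and via_instance are illustrative assumptions, not part of the library.

```ruby
# Stand-in copy of Tokipona::Tokenizer, transcribed from the source
# listings on this page. With the real gem loaded, this definition
# would not be needed.
module Tokipona
  class Tokenizer
    def self.tokenize(text)
      new(text).tokenize
    end

    def initialize(text)
      @text = text
    end

    def tokenize
      @text.scan(/\w+|[^\s]/)
    end
  end
end

sample = "toki, jan ale o!" # illustrative input

# One-step form: the class method creates the instance internally.
via_class = Tokipona::Tokenizer.tokenize(sample)

# Two-step form: construct the tokenizer, then tokenize.
via_instance = Tokipona::Tokenizer.new(sample).tokenize

p via_class
p via_class == via_instance
```

The one-step form is usually enough; the two-step form only matters if the instance needs to be passed around before tokenizing.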
Instance Method Details
#tokenize ⇒ Object
    # File 'lib/tokipona/tokenizer.rb', line 19
    def tokenize
      @text.scan(/\w+|[^\s]/)
    end
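To make the pattern in #tokenize concrete, here is a minimal sketch that applies the same regular expression directly with String#scan. The constant TOKEN_PATTERN, the helper sketch_tokenize, and the sample sentence are hypothetical names introduced only for this illustration; the regular expression itself is copied from the listing.

```ruby
# Same pattern as in #tokenize: either a run of word characters (\w+),
# or any single non-whitespace character ([^\s]).
TOKEN_PATTERN = /\w+|[^\s]/

# Hypothetical helper mirroring what #tokenize does with @text.
def sketch_tokenize(text)
  text.scan(TOKEN_PATTERN)
end

tokens = sketch_tokenize("mi moku e kili. sina pona!")
p tokens
# → ["mi", "moku", "e", "kili", ".", "sina", "pona", "!"]
```

Whitespace matches neither alternative, so it is dropped and acts only as a separator. Each punctuation character becomes its own token, so a run such as "..." yields three tokens. Because \w in Ruby covers only ASCII letters, digits, and underscore, a non-ASCII letter would fall through to the single-character branch; that is unlikely to matter for Toki Pona text, which is written with ASCII letters.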