Class: Ferret::Analysis::AsciiWhiteSpaceTokenizer

Inherits:
Object
Defined in:
ext/r_analysis.c

Overview

Summary

An AsciiWhiteSpaceTokenizer is a tokenizer that divides text at whitespace. Each maximal run of adjacent non-whitespace characters forms one token, so punctuation attached to a word stays part of that token.

Example

"Dave's résumé, at http://www.davebalmain.com/ 1234"
  => ["Dave's", "résumé,", "at", "http://www.davebalmain.com/", "1234"]
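
The splitting rule can be approximated in plain Ruby. This is only a sketch of the behaviour described here, not Ferret's C implementation or its API: the method name ascii_whitespace_tokens is hypothetical, and the set of characters treated as whitespace (space, tab, newline, vertical tab, form feed, carriage return) is an assumption based on the usual ASCII definition.

```ruby
# Hypothetical helper, not part of Ferret: approximates the tokenizer's rule
# by collecting every maximal run of non-whitespace characters.
# Assumed whitespace set: space, \t, \n, vertical tab (\x0B), \f, \r.
def ascii_whitespace_tokens(text)
  text.scan(/[^ \t\n\x0B\f\r]+/)
end

tokens = ascii_whitespace_tokens("Dave's résumé, at http://www.davebalmain.com/ 1234")
p tokens
```

Because only whitespace separates tokens, the comma after "résumé" and the trailing slash of the URL remain attached to their tokens.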