Class: Ferret::Analysis::AsciiWhiteSpaceAnalyzer

Inherits:
Object
Defined in:
ext/r_analysis.c

Overview

Summary

The AsciiWhiteSpaceAnalyzer recognizes tokens as maximal strings of non-whitespace characters. If implemented in Ruby, the AsciiWhiteSpaceAnalyzer would look like this:

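class AsciiWhiteSpaceAnalyzer
  # lower: when true, tokens are passed through AsciiLowerCaseFilter
  def initialize(lower = true)
    @lower = lower
  end

  # field is accepted but not used; every field is analyzed the same way
  def token_stream(field, str)
    if @lower
      return AsciiLowerCaseFilter.new(AsciiWhiteSpaceTokenizer.new(str))
    else
      return AsciiWhiteSpaceTokenizer.new(str)
    end
  end
end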

As you can see, it makes use of the AsciiWhiteSpaceTokenizer. Use WhiteSpaceAnalyzer instead if your text is in a multibyte encoding such as UTF-8.
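
To make the tokenization rule concrete, the following is a minimal pure-Ruby sketch of what "maximal strings of non-whitespace characters" plus ASCII-only lowercasing amounts to. It does not call Ferret: the helper name ascii_whitespace_tokens and its [text, start, end] return shape are assumptions made for illustration, and the real tokenizer and filter are implemented in C.

```ruby
# Hypothetical stand-in for AsciiWhiteSpaceTokenizer + AsciiLowerCaseFilter.
# This is an illustration only, not Ferret's implementation.
# Returns an array of [text, start_offset, end_offset] triples (byte offsets).
def ascii_whitespace_tokens(str, lower: true)
  tokens = []
  # Work on raw bytes so that no multibyte decoding takes place.
  bytes = str.b
  # Each match is a maximal run of bytes that are not ASCII whitespace.
  bytes.scan(/[^ \t\n\v\f\r]+/n) do
    m = Regexp.last_match
    text = m[0]
    # Lowercase only ASCII A-Z; every other byte is left untouched.
    text = text.tr('A-Z', 'a-z') if lower
    tokens << [text, m.begin(0), m.end(0)]
  end
  tokens
end

ascii_whitespace_tokens("Hello  World,\tFerret rocks!").each do |text, s, e|
  puts "#{text} [#{s}, #{e})"
end
```

Note that punctuation stays attached to its neighbouring word, because only whitespace separates tokens. In this sketch, non-ASCII bytes pass through unchanged and are never lowercased, which illustrates why WhiteSpaceAnalyzer is the better choice for UTF-8 text.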