Class: MiniSearch::Pipeline
- Inherits: Object
- Defined in: lib/mini_search/pipeline.rb
Overview
All the transformations and normalizations applied when indexing a document or running a search.
Instance Method Summary
- #execute(string) ⇒ Object
- #initialize(tokenizer, filters) ⇒ Pipeline (constructor)
  A new instance of Pipeline.
Constructor Details
#initialize(tokenizer, filters) ⇒ Pipeline
Returns a new instance of Pipeline.
# File 'lib/mini_search/pipeline.rb', line 7

def initialize(tokenizer, filters)
  @standard_tokenizer = MiniSearch::StandardWhitespaceTokenizer.new
  @tokenizer = tokenizer
  @filters = filters
end
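
A minimal usage sketch, assuming only the contract visible in this file: a tokenizer responds to #execute(string) and each filter responds to #execute(tokens). DowncaseFilter below is a hypothetical filter written for illustration, not part of the library.

# Hypothetical filter for illustration: lowercases every token.
# Any object responding to #execute(tokens) can act as a filter.
class DowncaseFilter
  def execute(tokens)
    tokens.map(&:downcase)
  end
end

pipeline = MiniSearch::Pipeline.new(
  MiniSearch::StandardWhitespaceTokenizer.new,
  [DowncaseFilter.new]
)

pipeline.execute('Red Wine')
# => ["red", "wine"]  (assuming the standard tokenizer splits on whitespace)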
Instance Method Details
#execute(string) ⇒ Object
# File 'lib/mini_search/pipeline.rb', line 13

def execute(string)
  # Since the filter model expects tokens that are tokenized by
  # the standard tokenizer, let's use that first.
  tokens = @standard_tokenizer.execute(string)

  # Apply filters
  filters_applied = @filters.reduce(tokens) do |filtered_tokens, filter|
    filter.execute(filtered_tokens)
  end

  # Return if our selected tokenizer is the standard tokenizer
  return filters_applied if @tokenizer.is_a? MiniSearch::StandardWhitespaceTokenizer

  # Execute non-standard tokenization after rejoining the tokens
  # that were tokenized with the StandardWhitespaceTokenizer
  @tokenizer.execute(filters_applied.join(' '))
end
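
A sketch of the non-standard-tokenizer branch, assuming the same contract as above (a tokenizer responds to #execute(string)). TrigramTokenizer is a hypothetical tokenizer written only to illustrate the rejoin-and-retokenize step; it is not part of the library.

# Hypothetical tokenizer for illustration: splits a string into
# overlapping 3-character ngrams.
class TrigramTokenizer
  def execute(string)
    string.chars.each_cons(3).map(&:join)
  end
end

pipeline = MiniSearch::Pipeline.new(TrigramTokenizer.new, [])

# Filters run on whitespace tokens first; because the configured
# tokenizer is not the StandardWhitespaceTokenizer, the filtered
# tokens are rejoined with spaces and tokenized again.
pipeline.execute('red wine')
# => ["red", "ed ", "d w", " wi", "win", "ine"]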