Class: Preprocessor::Stemming

Inherits:
Simple
  • Object
Defined in:
lib/svm_helper/preprocessors/stemming.rb

Overview

Preprocessor that extends Simple by stemming each element returned by Simple#clean_description with Lingua::Stemmer.

Author:

  • Andreas Eger

Constant Summary

Constants inherited from Simple

Preprocessor::Simple::CODE_TOKEN_FILTER, Preprocessor::Simple::EMAIL_FILTER, Preprocessor::Simple::GENDER_FILTER, Preprocessor::Simple::NEW_LINES, Preprocessor::Simple::STOPWORD_LOCATION, Preprocessor::Simple::SYMBOL_FILTER, Preprocessor::Simple::URL_FILTER, Preprocessor::Simple::WHITESPACE, Preprocessor::Simple::WORDS_IN_BRACKETS, Preprocessor::Simple::XML_TAG_FILTER

Constants included from ParallelHelper

ParallelHelper::THREAD_COUNT

Instance Attribute Summary

Attributes inherited from Simple

#language

Instance Method Summary

Methods inherited from Simple

#clean_title, #process, #strip_stopwords

Methods included from ParallelHelper

#p_map, #p_map_with_index, #parallel?

Constructor Details

#initialize(args = {}) ⇒ Stemming

Returns a new instance of Stemming.



# File 'lib/svm_helper/preprocessors/stemming.rb', line 11

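def initialize(args={})
  # Bare super forwards args to Simple#initialize, which is expected to set @language.
  super
  # One stemmer per instance, configured for that language.
  @stemmer = Lingua::Stemmer.new(language: @language)
end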

Instance Method Details

#clean_description(desc) ⇒ Object



# File 'lib/svm_helper/preprocessors/stemming.rb', line 19

def clean_description(desc)
  # Bare super passes desc to Simple#clean_description; each returned word is then stemmed.
  super.map { |w| @stemmer.stem(w) }
end
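
The override pattern above (call the parent's cleaning step, then stem each resulting word) can be tried without the gem installed. The following is a minimal sketch under stated assumptions: ToyStemmer, ToySimple and ToyStemming are hypothetical stand-ins invented for illustration, not part of svm_helper or ruby-stemmer. The toy stemmer only strips a few English suffixes, whereas Lingua::Stemmer applies the full Snowball algorithms, and the real Simple#clean_description likely does more filtering than lowercasing and splitting.

```ruby
# Hypothetical stand-in for Lingua::Stemmer: strips a few English suffixes.
class ToyStemmer
  SUFFIXES = %w[ing ed s].freeze

  def stem(word)
    # Only strip a suffix if at least three characters remain.
    suffix = SUFFIXES.find { |s| word.end_with?(s) && word.length > s.length + 2 }
    suffix ? word[0...-suffix.length] : word
  end
end

# Hypothetical stand-in for Preprocessor::Simple: lowercases and splits on whitespace.
class ToySimple
  def initialize(args = {})
    @language = args.fetch(:language, 'en')
  end

  def clean_description(desc)
    desc.downcase.split
  end
end

# Mirrors the structure of Preprocessor::Stemming shown above.
class ToyStemming < ToySimple
  def initialize(args = {})
    super                      # forwards args to ToySimple#initialize
    @stemmer = ToyStemmer.new  # the real class builds Lingua::Stemmer.new(language: @language)
  end

  def clean_description(desc)
    # Parent produces the word list; each word is then reduced to its stem.
    super.map { |w| @stemmer.stem(w) }
  end

  def label
    'with_stemming'
  end
end

tokens = ToyStemming.new(language: 'en').clean_description('Testing parsed documents')
p tokens # => ["test", "pars", "document"]
```

Because the subclass only post-processes the parent's result, any filtering done in the parent runs before stemming, so the stemmer never sees the text the parent removed.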

#label ⇒ Object



# File 'lib/svm_helper/preprocessors/stemming.rb', line 15

def label
  "with_stemming"
end