Module: Wordlist::Lexer::StopWords

Defined in:
lib/wordlist/lexer/stop_words.rb

Overview

Provides lazily loaded stop word lists for various languages, read from the .txt files in DIRECTORY.

Since:

  • 1.0.0

Constant Summary

DIRECTORY =

The directory containing the stop words .txt files.

Since:

  • 1.0.0

::File.expand_path(::File.join(__dir__,'..','..','..','data','stop_words'))
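The expression above walks three levels up from the directory of stop_words.rb and then descends into data/stop_words. A minimal sketch of that resolution, assuming a hypothetical install location (`/app/lib/wordlist/lexer` is an illustration, not a real path from the library):

```ruby
# Hypothetical stand-in for __dir__ inside lib/wordlist/lexer/stop_words.rb.
source_dir = '/app/lib/wordlist/lexer'

# File.join builds the relative walk; File.expand_path collapses the '..' segments.
directory = File.expand_path(File.join(source_dir, '..', '..', '..', 'data', 'stop_words'))

puts directory
```

The three `'..'` segments climb from `lexer` to `wordlist` to `lib` to the project root, so the result ends in `data/stop_words` directly under that root.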

Class Method Summary

Class Method Details

.[](lang) ⇒ Array<String>

Lazily loads the stop words for the given language and caches them for subsequent calls. Access to the cache is synchronized with a mutex.

Parameters:

  • lang (Symbol)

    The language to load.

Returns:

  • (Array<String>)

Since:

  • 1.0.0



# File 'lib/wordlist/lexer/stop_words.rb', line 62

def self.[](lang)
  @mutex.synchronize do
    @stop_words[lang] ||= read(lang)
  end
end
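The mutex-guarded memoization above can be tried in isolation. The following is a minimal sketch, not the library's code: `LazyStopWords`, its `reads` counter, and the temporary directory are hypothetical stand-ins introduced so the example runs without the gem installed.

```ruby
require 'tmpdir'

# Hypothetical stand-in that mirrors the lazy-loading pattern shown above.
module LazyStopWords
  @mutex      = Mutex.new
  @stop_words = {}
  @reads      = 0                            # counts disk reads (illustration only)
  @directory  = Dir.mktmpdir('stop_words')   # assumed data directory for the sketch

  class << self
    attr_reader :reads, :directory
  end

  # Reads one word per line from "<directory>/<lang>.txt".
  def self.read(lang)
    @reads += 1
    File.readlines(File.join(@directory, "#{lang}.txt")).map(&:chomp)
  end

  # Loads the list on first access; later calls return the cached Array.
  def self.[](lang)
    @mutex.synchronize do
      @stop_words[lang] ||= read(lang)
    end
  end
end

File.write(File.join(LazyStopWords.directory, 'en.txt'), "a\nthe\n")

first  = LazyStopWords[:en]
second = LazyStopWords[:en]
```

Both calls return the same Array object, and the file is read only once. Because the cached Array is shared, callers that mutate it would affect every later lookup.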

.path_for(lang) ⇒ String

Returns the path to the stop words .txt file for the given language.

Parameters:

  • lang (Symbol)

    The language whose stop words file to locate.

Returns:

  • (String)

Since:

  • 1.0.0



# File 'lib/wordlist/lexer/stop_words.rb', line 25

def self.path_for(lang)
  ::File.join(DIRECTORY,"#{lang}.txt")
end
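Since the language is interpolated into the file name, the path depends only on its string form. A small sketch of the same construction, assuming a hypothetical helper `sketch_path_for` and a hypothetical directory `/tmp/stop_words`:

```ruby
# Hypothetical helper mirroring the File.join call above, with the
# directory passed in explicitly instead of read from DIRECTORY.
def sketch_path_for(directory, lang)
  File.join(directory, "#{lang}.txt")
end

symbol_path = sketch_path_for('/tmp/stop_words', :en)
string_path = sketch_path_for('/tmp/stop_words', 'en')
```

Both calls build the same path here, although the documented parameter type is Symbol. The method only constructs the path; it does not check that the file exists.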

.read(lang) ⇒ Array<String>

Reads the stop words for the given language from its .txt file, one word per line, with trailing newlines removed.

Parameters:

  • lang (Symbol)

    The language to load.

Returns:

  • (Array<String>)

Raises:

  • (UnsupportedLanguage)

    The stop words .txt file for the given language does not exist.

Since:

  • 1.0.0



# File 'lib/wordlist/lexer/stop_words.rb', line 39

def self.read(lang)
  path = path_for(lang)

  unless ::File.file?(path)
    raise(UnsupportedLanguage,"unsupported language: #{lang}")
  end

  lines = ::File.readlines(path)
  lines.each(&:chomp!)
  lines
end