Class: CorpusProcessor::Generators::StanfordNer

Inherits:
Object
Defined in:
lib/corpus-processor/generators/stanford_ner.rb

Overview

The generator for Stanford NER corpora.

Generates a corpus in the format used for Stanford NER training: one token per line, with the word and its category label separated by a tab.

Instance Method Summary

Constructor Details

#initialize(categories = CorpusProcessor::Categories.default) ⇒ StanfordNer

Returns a new instance of StanfordNer.

Parameters:

  • categories (Hash) (defaults to: CorpusProcessor::Categories.default)

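    the category definitions loaded by Categories. The hash must contain an :output key, whose value maps each token category to the label written to the corpus.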



# File 'lib/corpus-processor/generators/stanford_ner.rb', line 8

def initialize categories = CorpusProcessor::Categories.default
  @categories = categories.fetch :output
end
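
As a rough illustration of what the constructor expects, the sketch below uses a plain hash with an :output entry. The category names and labels are hypothetical and are not taken from CorpusProcessor::Categories.default, whose actual contents may differ. Because the constructor uses fetch, a hash without :output raises KeyError instead of silently storing nil.

```ruby
# Hypothetical categories hash, for illustration only; the real defaults
# come from CorpusProcessor::Categories.default and may look different.
categories = {
  output: { person: 'PERSON', location: 'LOCATION' }
}

# This mirrors what the constructor stores in @categories.
output_labels = categories.fetch(:output)

# Hash#fetch raises KeyError when the key is absent.
missing_key_error = nil
begin
  {}.fetch(:output)
rescue KeyError => e
  missing_key_error = e
end
```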

Instance Method Details

#generate(tokens) ⇒ String

Generate the corpus from tokens.

Parameters:

  • tokens

    the tokens to write to the corpus; each must respond to word and category.

Returns:

  • (String)

    the generated corpus.



# File 'lib/corpus-processor/generators/stanford_ner.rb', line 17

def generate tokens
  tokens.map { |token|
    "#{ token.word }\t#{ @categories[token.category] }"
  }.join("\n") + "\n"
end
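
To make the output format concrete, here is a minimal self-contained sketch. SketchToken and StanfordNerSketch are hypothetical stand-ins written for this example so that it runs without the gem; only the body of generate is copied from the listing above. The categories and labels are likewise assumptions.

```ruby
# Hypothetical token type: anything responding to word and category works.
SketchToken = Struct.new(:word, :category)

# Stand-in that copies the two methods shown above, so the example
# does not depend on the corpus-processor gem being installed.
class StanfordNerSketch
  def initialize(categories)
    @categories = categories.fetch(:output)
  end

  def generate(tokens)
    tokens.map { |token|
      "#{token.word}\t#{@categories[token.category]}"
    }.join("\n") + "\n"
  end
end

# Assumed category labels, not the gem's defaults.
categories = { output: { person: 'PERSON', outside: 'O' } }

tokens = [
  SketchToken.new('Maria', :person),
  SketchToken.new('sings', :outside)
]

corpus = StanfordNerSketch.new(categories).generate(tokens)
# corpus == "Maria\tPERSON\nsings\tO\n"

# An empty token list still yields a single trailing newline.
empty_corpus = StanfordNerSketch.new(categories).generate([])
```

Two consequences of the implementation are worth noting: a category that is missing from the :output mapping interpolates as an empty string, leaving the word followed by a bare tab, and an empty token list produces "\n" rather than an empty string.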