Class: Corpus
- Inherits:
- Object (hierarchy: Object → Corpus)
- Defined in:
- lib/engine/corpus.rb
Instance Method Summary
- #add(document) ⇒ Object
- #entry_count ⇒ Object
- #initialize ⇒ Corpus (constructor): A new instance of Corpus.
- #load_from_directory(directory) ⇒ Object
- #token_count(token) ⇒ Object
Constructor Details
#initialize ⇒ Corpus
Returns a new instance of Corpus.
    # File 'lib/engine/corpus.rb', line 5
    def initialize
      @tokens = {}
    end
Instance Method Details
#add(document) ⇒ Object
    # File 'lib/engine/corpus.rb', line 13
    def add document
      document.each_token do |token|
        @tokens[token] = token_count(token) + 1
      end
    end
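A minimal usage sketch for #add. The listing only requires that the argument respond to each_token and yield one token at a time. The Document class is not shown on this page, so WordDocument below is a hypothetical stand-in that splits on whitespace; the real tokenizer may behave differently. The Corpus methods are copied from the listings on this page so that the snippet runs on its own.

```ruby
# Copy of the Corpus methods listed on this page, so the snippet is standalone.
class Corpus
  def initialize
    @tokens = {}
  end

  def add document
    document.each_token do |token|
      @tokens[token] = token_count(token) + 1
    end
  end

  def token_count token
    @tokens[token] || 0
  end
end

# Hypothetical stand-in for Document: yields whitespace-separated words.
class WordDocument
  def initialize(text)
    @text = text
  end

  def each_token(&block)
    @text.split.each(&block)
  end
end

corpus = Corpus.new
corpus.add WordDocument.new("to be or not to be")

puts corpus.token_count("to")    # 2, the token appears twice
puts corpus.token_count("or")    # 1
puts corpus.token_count("maybe") # 0, never added
```

Because #add relies only on each_token, any object that implements that method can be counted, which keeps tokenization out of Corpus.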
#entry_count ⇒ Object
    # File 'lib/engine/corpus.rb', line 9
    def entry_count
      @tokens.values.inject(0, :+)
    end
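As a rough illustration of what #entry_count computes, the sketch below applies the same expression to a plain Hash. The helper name total_occurrences and the sample counts are assumptions made for the example. The point is that the result is the sum of all per-token counts, not the number of distinct tokens.

```ruby
# Hypothetical helper mirroring the body of #entry_count on a plain Hash.
def total_occurrences(counts)
  counts.values.inject(0, :+)
end

counts = { "to" => 2, "be" => 2, "or" => 1, "not" => 1 }

puts total_occurrences(counts) # 6, every occurrence is counted
puts counts.size               # 4, the number of distinct tokens
puts total_occurrences({})     # 0, the initial value covers an empty corpus
```

Passing 0 as the initial value matters for the empty case: without it, inject on an empty array returns nil instead of 0.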
#load_from_directory(directory) ⇒ Object
    # File 'lib/engine/corpus.rb', line 19
    def load_from_directory directory
      Dir.glob("#{directory}/*.txt") do |entry|
        IO.foreach(entry, encoding: Encoding::UTF_8) do |line|
          add Document.new(line)
        end
      end
    end
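The following sketch exercises #load_from_directory against a temporary directory. Two behaviors visible in the listing are worth noting: only files matching *.txt directly inside the directory are read, and every line becomes its own Document. The Document class here is a hypothetical whitespace-splitting stand-in, since the real one is not shown on this page; the Corpus methods are copied from the listings so that the snippet runs on its own.

```ruby
require "tmpdir"

# Hypothetical stand-in for Document (not shown on this page).
class Document
  def initialize(text)
    @text = text
  end

  # Yields each whitespace-separated word; the real tokenizer may differ.
  def each_token(&block)
    @text.split.each(&block)
  end
end

# Copy of the Corpus methods from the listings on this page.
class Corpus
  def initialize
    @tokens = {}
  end

  def add document
    document.each_token do |token|
      @tokens[token] = token_count(token) + 1
    end
  end

  def load_from_directory directory
    Dir.glob("#{directory}/*.txt") do |entry|
      IO.foreach(entry, encoding: Encoding::UTF_8) do |line|
        add Document.new(line)
      end
    end
  end

  def token_count token
    @tokens[token] || 0
  end
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "a.txt"), "red green\nred\n")
  File.write(File.join(dir, "notes.md"), "red\n") # not matched by *.txt

  corpus = Corpus.new
  corpus.load_from_directory(dir)

  puts corpus.token_count("red")   # 2, both occurrences come from a.txt
  puts corpus.token_count("green") # 1
end
```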
#token_count(token) ⇒ Object
    # File 'lib/engine/corpus.rb', line 27
    def token_count token
      @tokens[token] || 0
    end
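The `|| 0` fallback in #token_count means an unseen token reports 0 instead of nil, which is what lets #add compute `token_count(token) + 1` the first time a token appears. The sketch below shows that behavior on a plain Hash; count_of is a hypothetical helper name. It also shows Hash.new(0) as one alternative way to get a zero default, which is not what the listing does.

```ruby
# Hypothetical helper mirroring the body of #token_count on a plain Hash.
def count_of(counts, token)
  counts[token] || 0
end

counts = { "be" => 2 }
puts count_of(counts, "be")     # 2
puts count_of(counts, "absent") # 0 rather than nil

# Alternative (not what the listing does): a Hash with a default value of 0.
defaulted = Hash.new(0)
defaulted["be"] += 1
puts defaulted["be"]     # 1
puts defaulted["absent"] # 0, and the lookup does not store the key
```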