Class: Linnaeus::Persistence

Inherits:
Linnaeus
Defined in:
lib/linnaeus/persistence.rb

Overview

The Redis persistence layer.

Instance Attribute Summary

Instance Method Summary

Methods inherited from Linnaeus

#count_word_occurrences

Constructor Details

#initialize(opts = {}) ⇒ Persistence

Returns a new instance of Persistence.



# File 'lib/linnaeus/persistence.rb', line 5

def initialize(opts = {})
  options = {
    redis_host: '127.0.0.1',
    redis_port: '6379',
    redis_db: 0,
    redis_scheme: "redis",
    redis_path: nil,
    redis_timeout: 5.0,
    redis_password: nil,
    redis_id: nil,
    redis_tcp_keepalive: 0,
    scope: nil
  }.merge(opts)

  @scope = options[:scope]

  if options[:redis_connection]
    @redis = options[:redis_connection]
  else
    @redis = Redis.new(
      host: options[:redis_host],
      port: options[:redis_port],
      db: options[:redis_db],
      scheme: options[:redis_scheme],
      path: options[:redis_path],
      timeout: options[:redis_timeout],
      password: options[:redis_password],
      id: options[:redis_id],
      tcp_keepalive: options[:redis_tcp_keepalive]
    )
  end

  self
end
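The option handling above can be sketched in isolation. This is a minimal illustration, not the library's code: the defaults hash is abbreviated from the listing, and the override values (`redis_db: 3`, `scope: 'spam'`) are made up for the example.

```ruby
# Defaults abbreviated from the constructor listing above.
defaults = {
  redis_host: '127.0.0.1',
  redis_port: '6379',
  redis_db: 0,
  scope: nil
}

# Caller-supplied keys win; keys the caller omits keep their defaults.
options = defaults.merge(redis_db: 3, scope: 'spam')

puts options[:redis_host] # 127.0.0.1 (default kept)
puts options[:redis_db]   # 3 (overridden)
puts options[:scope]      # spam (overridden)

# :redis_connection has no default, so it is only set when the caller passes it.
puts options.key?(:redis_connection) # false
```

Because `:redis_connection` is checked before `Redis.new` is called, passing an existing client (for example `Linnaeus::Persistence.new(redis_connection: my_client)`, where `my_client` is whatever client object the caller already holds) bypasses all of the `redis_*` connection options.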

Instance Attribute Details

#redis ⇒ Object

Returns the value of attribute redis.



# File 'lib/linnaeus/persistence.rb', line 3

def redis
  @redis
end

Instance Method Details

#add_categories(categories) ⇒ Object

Add categories to the Bayesian corpus.

Parameters

categories

A string or array of categories.



# File 'lib/linnaeus/persistence.rb', line 45

def add_categories(categories)
  @redis.sadd category_collection_key, categories
end

#cleanup_empty_words_in_category(category) ⇒ Object

Clean out words with a count of zero in a category. Used during untraining.

Parameters

category

A string representing a category.



# File 'lib/linnaeus/persistence.rb', line 124

def cleanup_empty_words_in_category(category)
  word_counts = @redis.hgetall base_category_key + category
  empty_words = word_counts.select { |word, count| count.to_i <= 0 }
  if empty_words == word_counts
    # Every word is empty, so drop the whole category hash.
    @redis.del base_category_key + category
  elsif empty_words.any?
    # Only some words are empty, so drop just those fields.
    @redis.hdel base_category_key + category, empty_words.keys
  end
end
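The branching in this method can be traced without a Redis server. The sketch below is illustrative only: the sample hash stands in for the `HGETALL` reply (Redis returns hash values as strings, which is why the method calls `to_i`), and the `action` symbols are hypothetical labels for the `DEL` and `HDEL` calls.

```ruby
# Sample HGETALL reply; values are strings, as Redis returns them.
word_counts = { 'grass' => '2', 'tree' => '0', 'bark' => '-1' }

# Same filter as the method: anything at or below zero counts as empty.
empty_words = word_counts.select { |_word, count| count.to_i <= 0 }

action =
  if empty_words == word_counts
    :delete_whole_hash # corresponds to DEL on the category key
  elsif empty_words.any?
    :delete_fields     # corresponds to HDEL on the empty fields
  else
    :nothing
  end

puts action                   # delete_fields
puts empty_words.keys.inspect # ["tree", "bark"]
```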

#clear_all_training_data ⇒ Object

Clear all training data from the backend.



# File 'lib/linnaeus/persistence.rb', line 80

def clear_all_training_data
  @redis.flushdb
end

#clear_training_data ⇒ Object

Clear training data for the scope associated with this instance.



# File 'lib/linnaeus/persistence.rb', line 85

def clear_training_data
  keys = @redis.keys(base_key.join(':') + '*')

  keys.each do |key|
    @redis.del key
  end
end
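To make the prefix match concrete, here is a small sketch that runs without Redis. The contents of `base_key` and the stored key names are assumptions invented for the example; the real `base_key` is defined elsewhere in the class and is not shown on this page. A plain array stands in for the keyspace, and `start_with?` stands in for the `KEYS "<prefix>*"` glob.

```ruby
# Hypothetical key layout; the real base_key is not shown on this page.
base_key = ['Linnaeus', 'spam']
prefix = base_key.join(':')

# Stand-in for the keys stored in Redis.
stored_keys = [
  'Linnaeus:spam:category',
  'Linnaeus:spam:cat:ham',
  'Linnaeus:other:category'
]

# KEYS "<prefix>*" returns every key that begins with the prefix.
matched = stored_keys.select { |key| key.start_with?(prefix) }

puts prefix          # Linnaeus:spam
puts matched.inspect # ["Linnaeus:spam:category", "Linnaeus:spam:cat:ham"]
```

Note that the Redis `KEYS` command walks the entire keyspace, so this method may be slow on a large database.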

#decrement_word_counts_for_category(category, word_occurrences) ⇒ Object

Decrement word counts within a category. This is used when removing a document from the corpus.

Parameters

category

A string representing a category.

word_occurrences

A hash mapping each word to the number of times it occurs in a document.



# File 'lib/linnaeus/persistence.rb', line 113

def decrement_word_counts_for_category(category, word_occurrences)
  word_occurrences.each do |word, count|
    @redis.hincrby base_category_key + category, word, -count
  end
end

#get_categories ⇒ Object

Get categories from the Bayesian corpus. This method takes no parameters.

Returns

An array of category names.



# File 'lib/linnaeus/persistence.rb', line 63

def get_categories
  @redis.smembers category_collection_key
end

#get_words_with_count_for_category(category) ⇒ Object

Get a list of words with their number of occurrences.

Parameters

category

A string representing a category.

Returns

A hash with the word counts for this category.



# File 'lib/linnaeus/persistence.rb', line 75

def get_words_with_count_for_category(category)
  @redis.hgetall base_category_key + category
end

#increment_word_counts_for_category(category, word_occurrences) ⇒ Object

Increment word counts within a category. This is used when adding a document to the corpus.

Parameters

category

A string representing a category.

word_occurrences

A hash mapping each word to the number of times it occurs in a document.



# File 'lib/linnaeus/persistence.rb', line 100

def increment_word_counts_for_category(category, word_occurrences)
  word_occurrences.each do |word, count|
    @redis.hincrby base_category_key + category, word, count
  end
end
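Incrementing and decrementing are symmetric: untraining a document applies the same counts negated. The sketch below shows this with an in-memory stand-in, not a real Redis client. The `counts` hash and the `hincrby` lambda are hypothetical substitutes that mimic how `HINCRBY` treats a missing field as 0.

```ruby
# In-memory stand-in for one category hash; missing fields start at 0.
counts = Hash.new(0)
hincrby = ->(field, amount) { counts[field] += amount }

# Word occurrences for one document (made-up sample data).
document = { 'grass' => 2, 'tree' => 1 }

# Training: add each word's occurrences.
document.each { |word, count| hincrby.call(word, count) }
after_train = counts.dup

# Untraining: apply the same counts negated.
document.each { |word, count| hincrby.call(word, -count) }

puts after_train.inspect # {"grass"=>2, "tree"=>1}
puts counts.values.sum   # 0
```

After untraining, the fields still exist with a count of zero, which is the state that `cleanup_empty_words_in_category` is documented to clean up.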

#remove_category(category) ⇒ Object

Remove categories from the Bayesian corpus.

Parameters

category

A string or array of categories.



# File 'lib/linnaeus/persistence.rb', line 54

def remove_category(category)
  @redis.srem category_collection_key, category
end