Class: Clusterer::Bayes

Inherits:
Object
Defined in:
lib/clusterer/bayes.rb

Overview

The Bayes class is the base class for the different Naive Bayes classifier variants. Its initialize method is protected, so objects of this class cannot be instantiated directly.

Bayes' formula is:

    P(y|x) = P(x|y) * P(y) / P(x)
    posterior = likelihood * prior / evidence

Given the evidence, the task is to predict the posterior. The Bayesian variants listed below differ in how they calculate the likelihood. Because the evidence is the same for every category, it is not calculated when computing the posterior. The posterior distribution over all possible categories sums to 1.
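As a rough illustration of the formula, the sketch below multiplies likelihoods by priors and then normalizes by their sum, so the evidence term never has to be computed on its own. The category names and all numbers are made up for the example; they are not taken from Clusterer.

```ruby
# Hypothetical priors P(y) and likelihoods P(x|y) for a single document x.
priors      = { good: 0.5, bad: 0.5 }
likelihoods = { good: 0.03, bad: 0.01 }

# Unnormalized posterior: likelihood * prior (the evidence P(x) is skipped).
unnormalized = priors.keys.to_h { |c| [c, likelihoods[c] * priors[c]] }

# Dividing by the sum plays the role of the evidence, so the values sum to 1.
total     = unnormalized.values.sum
posterior = unnormalized.transform_values { |v| v / total }

# The ranking of categories is the same before and after normalization.
best = posterior.max_by { |_category, value| value }.first
```

Since normalization does not change the ranking, a classifier that only needs the most probable category can compare the unnormalized products directly.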

Direct Known Subclasses

ComplementBayes, MultinomialBayes

Instance Attribute Summary

Instance Method Summary

Dynamic Method Handling

This class handles dynamic methods through the method_missing method.

#method_missing(name, *args) ⇒ Object

This method_missing implementation provides training and untraining methods that have the category appended to the method name. The call is forwarded to train or untrain, with the category passed as a symbol. For example:

train_good document


# File 'lib/clusterer/bayes.rb', line 91

def method_missing(name, *args)
  # Route train_<category> / untrain_<category> to train / untrain,
  # passing the document and the category as a symbol.
  if name.to_s =~ /\A(un)?train_(.+)\z/
    send("#{$1}train", args[0], $2.to_sym)
  else
    super
  end
end
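The dispatch pattern can be tried in isolation with a small stand-in class. This is a minimal sketch: ToyTrainer and its log attribute are hypothetical and only record calls, and the respond_to_missing? override is an addition that is not part of the source listing.

```ruby
# Hypothetical stand-in class; not the real Clusterer::Bayes.
class ToyTrainer
  attr_reader :log

  def initialize
    @log = []
  end

  def train(document, category)
    @log << [:train, document, category]
  end

  def untrain(document, category)
    @log << [:untrain, document, category]
  end

  # Forward train_<category> / untrain_<category> to train / untrain.
  def method_missing(name, *args)
    if name.to_s =~ /\A(un)?train_(.+)\z/
      send("#{$1}train", args[0], $2.to_sym)
    else
      super
    end
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    name.to_s.match?(/\A(un)?train_.+\z/) || super
  end
end

trainer = ToyTrainer.new
trainer.train_good("a fine document")   # forwarded to train(..., :good)
trainer.untrain_good("a fine document") # forwarded to untrain(..., :good)
```

Defining respond_to_missing? alongside method_missing is the usual Ruby convention, so that respond_to?(:train_good) also reports true.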

Instance Attribute Details

#categories ⇒ Object

An attribute that stores the different classes or categories.



# File 'lib/clusterer/bayes.rb', line 35

def categories
  @categories
end

Instance Method Details

#classify(document, weight = nil) ⇒ Object

For an input document, returns the category with the highest posterior probability. The weight argument is accepted but is not used in the method body.



# File 'lib/clusterer/bayes.rb', line 81

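def classify(document, weight = nil)
  # Posterior value for each category, indexed like @categories.
  posterior = distribution(document)
  # Pick the category whose index has the largest posterior value.
  @categories[(0..(@categories.size - 1)).max {|i,j| posterior[i] <=> posterior[j]}]
end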
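The selection step in classify is an argmax over category indices. The sketch below reproduces it on made-up data. It assumes, based on how the listing indexes posterior, that distribution returns values in the same order as the categories. The arrays here are hypothetical.

```ruby
# Hypothetical categories and posterior values, aligned by index.
categories = [:good, :bad, :neutral]
posterior  = [0.2, 0.7, 0.1]

# Same comparison as in classify: the index with the largest posterior.
index_via_max = (0..(categories.size - 1)).max { |i, j| posterior[i] <=> posterior[j] }

# An equivalent, slightly shorter form using max_by.
index_via_max_by = (0...categories.size).max_by { |i| posterior[i] }

prediction = categories[index_via_max]
```

Both forms select the same index when there is a single largest value. Which index is returned on a tie is best not relied on.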