Class: Knn
- Inherits: Object
- Defined in: lib/phisher/knn.rb
Overview
Knn: K-Nearest-Neighbor
The KNN algorithm is simple. Given a set of labeled training data <x, f(x)>, a new input is compared with each x to compute a distance. The input is then assigned the most frequent class among the k training points with the smallest distances.
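The idea described above can be sketched in a few lines of plain Ruby. This is a minimal standalone illustration, not the code in lib/phisher/knn.rb: the method name knn_classify and the [features, label] pair layout are assumptions made for the example, and Euclidean distance is used throughout.

```ruby
# Hypothetical standalone helper; not part of lib/phisher/knn.rb.
# training is an array of [features, label] pairs.
def knn_classify(training, query, k)
  # Euclidean distance from the query to every training example.
  scored = training.map do |features, label|
    sum = features.zip(query).sum { |a, b| (a - b)**2 }
    [Math.sqrt(sum), label]
  end

  # Keep the labels of the k closest examples.
  nearest_labels = scored.min_by(k) { |distance, _| distance }.map { |_, label| label }

  # Majority vote among those labels.
  nearest_labels.group_by { |label| label }.max_by { |_, group| group.size }.first
end

# Points 0..4 are labeled 0, points 5..9 are labeled 1.
training = (0...10).map { |i| [[i], i >= 5 ? 1 : 0] }
puts knn_classify(training, [1.5], 3)
puts knn_classify(training, [8.0], 3)
```

With this data the first query falls among points labeled 0 and the second among points labeled 1, so the sketch prints 0 and then 1.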
Usage Example:
knn = Knn.new
print "training Knn… "
10.times do |i|
  klazz = 0
  klazz = 1 if i >= 5
  knn.train([i], klazz)
end
puts "[done]"
knn.training_set.each_with_index { |point, index| p "point #{index}: #{point.label}" }
puts "Classifying a few inputs"
20.times do |i|
  test = i.to_f / 2
  print "#{test} => "
  # k is a required argument; 3 is an arbitrary choice for this example
  puts knn.classify([test], 3)
end
Instance Attribute Summary
- #default_distance ⇒ Object (readonly)
  Returns the value of attribute default_distance.
- #training_set ⇒ Object (readonly)
  Returns the value of attribute training_set.
Instance Method Summary
- #classify(data, k, &distance) ⇒ Object
  Returns the most frequent class among the k training points nearest to the data point.
- #initialize ⇒ Knn (constructor)
  A new instance of Knn.
- #train(data, label) ⇒ Object
  Adds an array to the training set under the given label.
Constructor Details
#initialize ⇒ Knn
Returns a new instance of Knn.
# File 'lib/phisher/knn.rb', line 37
def initialize()
  @training_set = []
  @default_distance = lambda do |array1, array2|
    squares_sum = array1.zip(array2).map do |item|
      (item[0] - item[1])**2
    end
    Math.sqrt(squares_sum.reduce(:+))
  end
end
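The default distance built in the constructor is the Euclidean distance between two equal-length numeric arrays. The snippet below applies the same formula outside the class as a worked example; the constant name EUCLIDEAN is hypothetical and exists only for this illustration.

```ruby
# Same formula as the default_distance lambda in the constructor,
# bound to a hypothetical constant so it can be called on its own.
EUCLIDEAN = lambda do |array1, array2|
  # Pair up coordinates and square each difference.
  squares_sum = array1.zip(array2).map { |a, b| (a - b)**2 }
  # The distance is the square root of the summed squares.
  Math.sqrt(squares_sum.reduce(:+))
end

p EUCLIDEAN.call([0, 0], [3, 4])  # => 5.0 (the 3-4-5 right triangle)
p EUCLIDEAN.call([1], [1])        # => 0.0 (identical points)
```

Both arrays should have the same length: zip pads a shorter second array with nil, and subtracting nil raises a TypeError.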
Instance Attribute Details
#default_distance ⇒ Object (readonly)
Returns the value of attribute default_distance.
# File 'lib/phisher/knn.rb', line 35
def default_distance
  @default_distance
end
#training_set ⇒ Object (readonly)
Returns the value of attribute training_set.
# File 'lib/phisher/knn.rb', line 34
def training_set
  @training_set
end
Instance Method Details
#classify(data, k, &distance) ⇒ Object
Returns the most frequent class among the k training points nearest to the data point.
Arguments:
{Array} data the array to classify
{Integer} k the number of nearest neighbors to consider
{block} distance an optional block that provides a custom distance function; defaults to the Euclidean default_distance
Returns:
The class that the data array should belong to
# File 'lib/phisher/knn.rb', line 59
def classify(data, k, &distance)
  # Fall back to the Euclidean default when no block is given
  if distance == nil
    distance = @default_distance
  end
  # Pair each training point's distance to the input with its label
  distances = @training_set.map do |training_point|
    [ distance.call(training_point.data, data), training_point.label ]
  end
  # Keep the k closest points and vote on their labels
  sorted_distances = distances.sort
  nearest_neighbors = sorted_distances.first(k)
  classes = nearest_neighbors.map { |neighbor| neighbor[1] }
  class_frequencies = get_class_frequencies(classes)
  most_frequent(class_frequencies)
end
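The listing calls get_class_frequencies and most_frequent, whose bodies are not shown on this page. The sketch below is one plausible way to write them, offered only to illustrate the voting step; the actual helpers in lib/phisher/knn.rb may differ, for instance in how ties are resolved.

```ruby
# Hypothetical helper bodies; the real ones in lib/phisher/knn.rb
# are not shown in this documentation.

# Counts how often each label occurs in the list of neighbor labels.
def get_class_frequencies(classes)
  classes.each_with_object(Hash.new(0)) { |label, counts| counts[label] += 1 }
end

# Returns the label with the highest count, or nil for an empty hash.
def most_frequent(class_frequencies)
  label, _count = class_frequencies.max_by { |_label, count| count }
  label
end

votes = [:spam, :ham, :spam]
frequencies = get_class_frequencies(votes)
p frequencies
p most_frequent(frequencies)  # => :spam
```

With two classes, choosing an odd k avoids tied votes, so the result does not depend on how the helper breaks ties.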
#train(data, label) ⇒ Object
Adds an array to the training set under the given label.
Arguments:
{Array} data the array that will be labeled
{symbol} label an identifier for the label
Returns:
The training set, including the newly added point
# File 'lib/phisher/knn.rb', line 83
def train(data, label)
  training_point = TrainingPoint.new data, label
  @training_set.push training_point
end
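TrainingPoint is not documented on this page. Judging from how classify reads it, it only needs to expose data and label, so a Struct is one minimal stand-in. The definition below is an assumption made for illustration, not the library's own class; it also shows why train returns the training set itself.

```ruby
# Assumption: TrainingPoint only needs data and label readers,
# which is all that classify uses. The real class may differ.
TrainingPoint = Struct.new(:data, :label)

training_set = []
# Array#push appends the point and returns the array it was called on.
result = training_set.push(TrainingPoint.new([1.0, 2.0], :phishing))

p result.equal?(training_set)  # => true
p training_set.first.label     # => :phishing
```

Because the return value is the array rather than the Knn instance, calls to train cannot be chained on the classifier.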