Class: SVMKit::NearestNeighbors::KNeighborsClassifier

Inherits:
Object
Includes:
Base::BaseEstimator, Base::Classifier
Defined in:
lib/svmkit/nearest_neighbors/k_neighbors_classifier.rb

Overview

KNeighborsClassifier implements a classifier based on the k-nearest neighbors rule. The current implementation uses the Euclidean distance to find the neighbors.

Examples:

estimator =
  SVMKit::NearestNeighbors::KNeighborsClassifier.new(n_neighbors: 5)
estimator.fit(training_samples, training_labels)
results = estimator.predict(testing_samples)
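
To make the k-nearest neighbors rule from the overview concrete, here is a minimal plain-Ruby sketch. It is not SVMKit's code: `euclidean_distance` and `knn_predict` are hypothetical helpers, plain Arrays stand in for Numo arrays, and the tie-breaking choice (smallest label wins) is an assumption made for this illustration.

```ruby
# Illustrative sketch only; helper names are hypothetical, not part of SVMKit.

# Euclidean distance between two equal-length Arrays of numbers.
def euclidean_distance(a, b)
  Math.sqrt(a.zip(b).sum { |p, q| (p - q)**2 })
end

# Predict the label of one query point by majority vote among its k nearest samples.
def knn_predict(query, samples, labels, k)
  # Rank training samples by distance to the query (index breaks distance ties).
  nearest = samples.each_index.sort_by { |i| [euclidean_distance(query, samples[i]), i] }.first(k)
  # Count one vote per neighbor.
  votes = Hash.new(0)
  nearest.each { |i| votes[labels[i]] += 1 }
  # Most votes wins; on a tied count, the smallest label wins (assumption of this sketch).
  votes.max_by { |label, count| [count, -label] }.first
end

samples = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1], [1.2, 0.8]]
labels  = [0, 0, 1, 1, 1]
prediction = knn_predict([0.2, 0.1], samples, labels, 3)
```

With these toy samples, the three nearest neighbors of the query are two points labeled 0 and one labeled 1, so the vote favors label 0.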

Instance Attribute Summary

Attributes included from Base::BaseEstimator

#params

Instance Method Summary

Constructor Details

#initialize(n_neighbors: 5) ⇒ KNeighborsClassifier

Create a new classifier with the nearest neighbor rule.

Parameters:

  • n_neighbors (Integer) (defaults to: 5)

    The number of neighbors.



# File 'lib/svmkit/nearest_neighbors/k_neighbors_classifier.rb', line 35

def initialize(n_neighbors: 5)
  @params = {}
  @params[:n_neighbors] = n_neighbors
  @prototypes = nil
  @labels = nil
  @classes = nil
end

Instance Attribute Details

#classes ⇒ Numo::Int32 (readonly)

Return the class labels.

Returns:

  • (Numo::Int32)

    (size: n_classes)



# File 'lib/svmkit/nearest_neighbors/k_neighbors_classifier.rb', line 30

def classes
  @classes
end

#labels ⇒ Numo::Int32 (readonly)

Return the labels of the prototypes.

Returns:

  • (Numo::Int32)

    (size: n_samples)



# File 'lib/svmkit/nearest_neighbors/k_neighbors_classifier.rb', line 26

def labels
  @labels
end

#prototypes ⇒ Numo::DFloat (readonly)

Return the prototypes for the nearest neighbor classifier.

Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_features])



# File 'lib/svmkit/nearest_neighbors/k_neighbors_classifier.rb', line 22

def prototypes
  @prototypes
end

Instance Method Details

#decision_function(x) ⇒ Numo::DFloat

Calculate confidence scores for samples.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The samples to compute the scores for.

Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_classes]) Confidence scores per sample for each class.



# File 'lib/svmkit/nearest_neighbors/k_neighbors_classifier.rb', line 59

def decision_function(x)
  distance_matrix = PairwiseMetric.euclidean_distance(x, @prototypes)
  n_samples, n_prototypes = distance_matrix.shape
  n_classes = @classes.size
  n_neighbors = [@params[:n_neighbors], n_prototypes].min
  scores = Numo::DFloat.zeros(n_samples, n_classes)
  n_samples.times do |m|
    neighbor_ids = distance_matrix[m, true].to_a.each_with_index.sort.map(&:last)[0...n_neighbors]
    neighbor_ids.each { |n| scores[m, @classes.to_a.index(@labels[n])] += 1.0 }
  end
  scores
end
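
The scoring step above can be sketched in plain Ruby to show what the returned matrix contains: each row holds one vote count per class, and the counts in a row sum to the effective number of neighbors. `vote_scores` is a hypothetical helper for illustration, Arrays stand in for Numo arrays, and the distance matrix is assumed to be precomputed.

```ruby
# Illustrative sketch only; `vote_scores` is a hypothetical helper, not SVMKit API.
def vote_scores(distance_matrix, labels, classes, n_neighbors)
  # Never ask for more neighbors than there are prototypes.
  k = [n_neighbors, labels.size].min
  distance_matrix.map do |row|
    scores = Array.new(classes.size, 0.0)
    # Sorting [distance, index] pairs breaks distance ties by the lower index.
    neighbor_ids = row.each_with_index.sort.map(&:last).first(k)
    # Each neighbor adds one vote to the column of its class.
    neighbor_ids.each { |n| scores[classes.index(labels[n])] += 1.0 }
    scores
  end
end

distances = [[0.1, 0.4, 0.3, 0.9],
             [0.8, 0.2, 0.7, 0.1]]
labels  = [2, 5, 2, 5]
classes = labels.uniq.sort
scores  = vote_scores(distances, labels, classes, 3)
```

In this toy input the first query gets two votes for class 2 and one for class 5, and the second query gets the reverse. Requesting more neighbors than the four available prototypes simply counts every prototype once.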

#fit(x, y) ⇒ KNeighborsClassifier

Fit the model with given training data.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The training data to be used for fitting the model.

  • y (Numo::Int32)

    (shape: [n_samples]) The labels to be used for fitting the model.

Returns:

  • (KNeighborsClassifier)

    The fitted classifier itself.



# File 'lib/svmkit/nearest_neighbors/k_neighbors_classifier.rb', line 48

def fit(x, y)
  @prototypes = Numo::DFloat.asarray(x.to_a)
  @labels = Numo::Int32.asarray(y.to_a)
  @classes = Numo::Int32.asarray(y.to_a.uniq.sort)
  self
end

#marshal_dump ⇒ Hash

Dump marshal data.

Returns:

  • (Hash)

    The marshal data about KNeighborsClassifier.



# File 'lib/svmkit/nearest_neighbors/k_neighbors_classifier.rb', line 95

def marshal_dump
  { params: params,
    prototypes: @prototypes,
    labels: @labels,
    classes: @classes }
end

#marshal_load(obj) ⇒ nil

Load marshal data.

Returns:

  • (nil)


# File 'lib/svmkit/nearest_neighbors/k_neighbors_classifier.rb', line 104

def marshal_load(obj)
  @params = obj[:params]
  @prototypes = obj[:prototypes]
  @labels = obj[:labels]
  @classes = obj[:classes]
  nil
end
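
These two hooks are what Ruby's standard `Marshal.dump` and `Marshal.load` call when an object defines them. The following sketch shows the round trip with a hypothetical stand-in class, `ToyEstimator`, which only mirrors the pattern and is not part of SVMKit.

```ruby
# Illustrative sketch only; ToyEstimator is a hypothetical stand-in class.
class ToyEstimator
  attr_reader :params, :labels

  def initialize(n_neighbors: 5)
    @params = { n_neighbors: n_neighbors }
    @labels = nil
  end

  def fit(labels)
    @labels = labels
    self
  end

  # Marshal.dump calls this hook and serializes the returned Hash.
  def marshal_dump
    { params: @params, labels: @labels }
  end

  # Marshal.load allocates the object without calling initialize,
  # then hands the deserialized Hash to this hook.
  def marshal_load(obj)
    @params = obj[:params]
    @labels = obj[:labels]
    nil
  end
end

original = ToyEstimator.new(n_neighbors: 3).fit([0, 1, 1])
restored = Marshal.load(Marshal.dump(original))
```

The restored object is a separate instance that carries the same parameters and stored labels as the original, so a fitted model saved this way would not need to be refitted after loading.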

#predict(x) ⇒ Numo::Int32

Predict class labels for samples.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The samples to predict the labels for.

Returns:

  • (Numo::Int32)

    (shape: [n_samples]) Predicted class label per sample.



# File 'lib/svmkit/nearest_neighbors/k_neighbors_classifier.rb', line 76

def predict(x)
  n_samples = x.shape.first
  decision_values = decision_function(x)
  Numo::Int32.asarray(Array.new(n_samples) { |n| @classes[decision_values[n, true].max_index] })
end
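
The prediction step maps each row of vote counts to the class with the most votes. A plain-Ruby sketch of that mapping follows; `predict_from_scores` is a hypothetical helper and Arrays stand in for Numo arrays. On a tied row this sketch picks the first maximum, which is an assumption of the sketch; the source does not state how `max_index` resolves ties.

```ruby
# Illustrative sketch only; `predict_from_scores` is a hypothetical helper.
def predict_from_scores(scores, classes)
  # For each row, find the position of the largest vote count
  # (Array#index returns the first match) and look up the class label there.
  scores.map { |row| classes[row.index(row.max)] }
end

classes   = [2, 5]
scores    = [[2.0, 1.0], [1.0, 2.0], [1.0, 1.0]]
predicted = predict_from_scores(scores, classes)
```

Because the class list is sorted, picking the first maximum means a tied vote goes to the smaller label in this sketch. Choosing an odd number of neighbors avoids ties in two-class problems.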

#score(x, y) ⇒ Float

Calculate the mean accuracy on the given testing data.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) Testing data.

  • y (Numo::Int32)

    (shape: [n_samples]) True labels for testing data.

Returns:

  • (Float)

    Mean accuracy



# File 'lib/svmkit/nearest_neighbors/k_neighbors_classifier.rb', line 87

def score(x, y)
  p = predict(x)
  n_hits = (y.to_a.map.with_index { |l, n| l == p[n] ? 1 : 0 }).inject(:+)
  n_hits / y.size.to_f
end
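
Mean accuracy is the fraction of samples whose predicted label equals the true label. A minimal plain-Ruby sketch of that computation, using a hypothetical helper `mean_accuracy` and made-up labels:

```ruby
# Illustrative sketch only; `mean_accuracy` is a hypothetical helper.
def mean_accuracy(true_labels, predicted_labels)
  # Count positions where the prediction matches the ground truth.
  n_hits = true_labels.zip(predicted_labels).count { |t, p| t == p }
  # Convert to Float before dividing to avoid integer division.
  n_hits / true_labels.size.to_f
end

accuracy = mean_accuracy([0, 1, 1, 0], [0, 1, 0, 0])
```

Here three of the four predictions match, so the accuracy is 0.75.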