Class: Annoy::AnnoyIndex

Inherits:
Object
Defined in:
lib/annoy.rb

Overview

AnnoyIndex is a class that provides approximate k-nearest neighbor search. Its methods mirror Annoy’s Python API (github.com/spotify/annoy#full-python-api).

Examples:

require 'annoy'

index = Annoy::AnnoyIndex.new(n_features: 100, metric: 'euclidean')

5000.times do |item_id|
  item_vec = Array.new(100) { rand - 0.5 }
  index.add_item(item_id, item_vec)
end

index.build(10)

index.get_nns_by_item(0, 100)

Constructor Details

#initialize(n_features:, metric: 'angular') ⇒ AnnoyIndex

Create a new search index.

Parameters:

  • n_features (Integer)

    The number of features (dimensions) of each stored vector.

  • metric (String) (defaults to: 'angular')

    The distance metric between vectors (‘angular’, ‘dot’, ‘hamming’, ‘euclidean’, or ‘manhattan’).

Raises:

  • (ArgumentError)


# File 'lib/annoy.rb', line 37

def initialize(n_features:, metric: 'angular')
  raise ArgumentError, 'Expect n_features to be Integer.' unless n_features.is_a?(Numeric)

  @n_features = n_features.to_i
  @metric = metric

  @index = case @metric
           when 'angular'
             AnnoyIndexAngular.new(@n_features)
           when 'dot'
             AnnoyIndexDotProduct.new(@n_features)
           when 'hamming'
             AnnoyIndexHamming.new(@n_features)
           when 'euclidean'
             AnnoyIndexEuclidean.new(@n_features)
           when 'manhattan'
             AnnoyIndexManhattan.new(@n_features)
           else
             raise ArgumentError, "No such metric: #{@metric}."
           end
end
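
To make the metric names more concrete, here is a minimal sketch of three of them in plain Ruby. The helper names are hypothetical and are not part of the gem. The 'angular' formula assumes the definition given in the Annoy README, sqrt(2 * (1 - cos(u, v))); 'dot' and 'hamming' are left out. Treat these as reference formulas, not as the gem's internal code.

```ruby
# Reference formulas for three of the supported metrics (hypothetical helpers).
# All helpers expect two Arrays of equal length.

def euclidean_distance(u, v)
  Math.sqrt(u.zip(v).sum { |a, b| (a - b)**2 })
end

def manhattan_distance(u, v)
  u.zip(v).sum { |a, b| (a - b).abs }
end

# Assumes the Annoy README definition: sqrt(2 * (1 - cos(u, v))).
# Both vectors must be non-zero.
def angular_distance(u, v)
  dot = u.zip(v).sum { |a, b| a * b }
  norm_u = Math.sqrt(u.sum { |a| a * a })
  norm_v = Math.sqrt(v.sum { |b| b * b })
  cosine = dot / (norm_u * norm_v)
  # Clamp at zero to absorb tiny negative values caused by rounding.
  Math.sqrt([2.0 * (1.0 - cosine), 0.0].max)
end

u = [0.0, 0.0]
v = [3.0, 4.0]
puts euclidean_distance(u, v)  # 5.0
puts manhattan_distance(u, v)  # 7.0
```

On small inputs, values like these can be compared against what #get_distance returns for the same pair of items.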

Instance Attribute Details

#metric ⇒ String (readonly)

Returns the distance metric of the index.

Returns:

  • (String)


# File 'lib/annoy.rb', line 31

def metric
  @metric
end

#n_features ⇒ Integer (readonly)

Returns the number of features of each indexed item.

Returns:

  • (Integer)


# File 'lib/annoy.rb', line 27

def n_features
  @n_features
end

Instance Method Details

#add_item(i, v) ⇒ Boolean

Add an item to be indexed.

Parameters:

  • i (Integer)

    The ID of item.

  • v (Array)

    The vector of item.

Returns:

  • (Boolean)


# File 'lib/annoy.rb', line 64

def add_item(i, v)
  @index.add_item(i, v)
end
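
Since every vector is expected to have n_features values, it can help to check vectors before passing them to add_item. The sketch below uses a hypothetical helper, checked_vector, which is not part of the gem. Whether the gem itself rejects vectors of the wrong length is not stated here, so validating up front is simply a cautious choice.

```ruby
# Hypothetical helper (not part of the gem): validate a vector before add_item.
def checked_vector(vec, n_features)
  raise ArgumentError, "expected Array, got #{vec.class}" unless vec.is_a?(Array)
  unless vec.size == n_features
    raise ArgumentError, "expected #{n_features} values, got #{vec.size}"
  end
  raise ArgumentError, 'all values must be Numeric' unless vec.all?(Numeric)

  # Normalise to Floats so integer and float inputs are handled alike.
  vec.map(&:to_f)
end

# Intended use with an index created as in the overview example:
#   index.add_item(item_id, checked_vector(item_vec, index.n_features))
p checked_vector([1, 2, 3], 3)  # [1.0, 2.0, 3.0]
```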

#build(n_trees, n_jobs: -1) ⇒ Boolean

Build a forest of index trees. After building, no more items can be added.

Parameters:

  • n_trees (Integer)

    The number of trees. More trees give higher search precision.

  • n_jobs (Integer) (defaults to: -1)

    The number of threads used to build the trees. If -1 is given, uses all available CPU cores.

Returns:

  • (Boolean)


# File 'lib/annoy.rb', line 73

def build(n_trees, n_jobs: -1)
  @index.build(n_trees, n_jobs)
end
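
As a rough illustration of what n_jobs: -1 stands for, the sketch below resolves the value to a concrete thread count. The helper resolved_n_jobs is hypothetical, and it assumes that "all available CPU cores" corresponds to what Ruby's Etc.nprocessors reports; the gem may determine the count differently.

```ruby
require 'etc'

# Hypothetical helper: the thread count that a given n_jobs value stands for,
# assuming -1 maps to the number of processors Ruby can see.
def resolved_n_jobs(n_jobs)
  n_jobs == -1 ? Etc.nprocessors : n_jobs
end

puts resolved_n_jobs(-1)  # machine dependent
puts resolved_n_jobs(4)   # 4
```

Passing an explicit positive value, for example index.build(10, n_jobs: 2), is the way to leave some cores free for other work.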

#get_distance(i, j) ⇒ Float or Integer

Calculate the distance between two items.

Parameters:

  • i (Integer)

    The ID of the first item.

  • j (Integer)

    The ID of the second item.

Returns:

  • (Float or Integer)


# File 'lib/annoy.rb', line 136

def get_distance(i, j)
  @index.get_distance(i, j)
end

#get_item(i) ⇒ Array

Return the vector of the given item.

Parameters:

  • i (Integer)

    The ID of item.

Returns:

  • (Array)


# File 'lib/annoy.rb', line 127

def get_item(i)
  @index.get_item(i)
end

#get_nns_by_item(i, n, search_k: -1, include_distances: false) ⇒ Array<Integer> or Array<Array<Integer>, Array<Float>>

Search for the n items closest to the given item.

Parameters:

  • i (Integer)

    The ID of query item.

  • n (Integer)

    The number of nearest neighbors.

  • search_k (Integer) (defaults to: -1)

    The maximum number of nodes inspected during the search. If -1 is given, it is set to n * n_trees.

  • include_distances (Boolean) (defaults to: false)

    The flag indicating whether to also return the corresponding distances.

Returns:

  • (Array<Integer> or Array<Array<Integer>, Array<Float>>)


# File 'lib/annoy.rb', line 108

def get_nns_by_item(i, n, search_k: -1, include_distances: false)
  @index.get_nns_by_item(i, n, search_k, include_distances)
end
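
The two optional parameters above can be sketched in plain Ruby. The helper effective_search_k is hypothetical and only mirrors the documented default for search_k; the sample result uses made-up values purely to show the shape returned when include_distances is true.

```ruby
# Hypothetical helper mirroring the documented default: -1 means n * n_trees.
def effective_search_k(n, n_trees, search_k = -1)
  search_k == -1 ? n * n_trees : search_k
end

puts effective_search_k(100, 10)        # 1000
puts effective_search_k(100, 10, 5000)  # 5000

# With include_distances: true the result is a pair [ids, distances].
# These values are invented for illustration, not real search output.
sample_result = [[0, 17, 42], [0.0, 0.31, 0.58]]
ids, distances = sample_result
ids.zip(distances).each do |id, dist|
  puts "item #{id} at distance #{dist}"
end
```

Destructuring the pair and zipping it keeps each ID next to its distance, which is usually the more convenient form for ranking or filtering.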

#get_nns_by_vector(v, n, search_k: -1, include_distances: false) ⇒ Array<Integer> or Array<Array<Integer>, Array<Float>>

Search for the n items closest to the given vector.

Parameters:

  • v (Array)

    The vector of query item.

  • n (Integer)

    The number of nearest neighbors.

  • search_k (Integer) (defaults to: -1)

    The maximum number of nodes inspected during the search. If -1 is given, it is set to n * n_trees.

  • include_distances (Boolean) (defaults to: false)

    The flag indicating whether to also return the corresponding distances.

Returns:

  • (Array<Integer> or Array<Array<Integer>, Array<Float>>)


# File 'lib/annoy.rb', line 119

def get_nns_by_vector(v, n, search_k: -1, include_distances: false)
  @index.get_nns_by_vector(v, n, search_k, include_distances)
end
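
Because the search is approximate, an exact brute-force search is a useful baseline on small data sets. The sketch below is a hypothetical helper, not part of the gem. It assumes Euclidean distance and treats each vector's position in the array as its item ID, and it returns the same [ids, distances] shape as a search with include_distances: true.

```ruby
# Exact n-nearest-neighbour search by brute force (Euclidean distance).
# Hypothetical helper for sanity-checking approximate results on small data.
def exact_nns_by_vector(items, query, n)
  items.each_with_index
       .map { |vec, id| [id, Math.sqrt(vec.zip(query).sum { |a, b| (a - b)**2 })] }
       .sort_by { |_id, dist| dist }
       .first(n)
       .transpose
end

items = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
ids, distances = exact_nns_by_vector(items, [0.9, 0.1], 2)
p ids  # [1, 0]
p distances

# A simple recall estimate against an approximate result could then be:
#   (approx_ids & ids).size.fdiv(ids.size)
```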

#load(filename, prefault: false) ⇒ Boolean

Load a search index from disk.

Parameters:

  • filename (String)

    The filename of search index.

  • prefault (Boolean) (defaults to: false)

    The flag indicating whether to pre-read the entire file into memory.

Returns:

  • (Boolean)


# File 'lib/annoy.rb', line 90

def load(filename, prefault: false)
  @index.load(filename, prefault)
end

#n_items ⇒ Integer

Return the number of items in the search index.

Returns:

  • (Integer)


# File 'lib/annoy.rb', line 142

def n_items
  @index.get_n_items
end

#n_trees ⇒ Integer

Return the number of trees in the search index.

Returns:

  • (Integer)


# File 'lib/annoy.rb', line 148

def n_trees
  @index.get_n_trees
end

#on_disk_build(filename) ⇒ Boolean

Prepare Annoy to build the index in the specified file instead of in RAM. Call this method before adding items; there is no need to save after building.

Parameters:

  • filename (String)

    The filename of search index.

Returns:

  • (Boolean)


# File 'lib/annoy.rb', line 157

def on_disk_build(filename)
  @index.on_disk_build(filename)
end

#save(filename, prefault: false) ⇒ Boolean

Save the search index to disk. After saving, no more items can be added.

Parameters:

  • filename (String)

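    The filename of search index.

  • prefault (Boolean) (defaults to: false)

    The flag indicating whether to pre-read the entire file into memory.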

Returns:

  • (Boolean)


# File 'lib/annoy.rb', line 81

def save(filename, prefault: false)
  @index.save(filename, prefault)
end

#seed(s) ⇒ Object

Set the seed for the random number generator.

Parameters:

  • s (Integer)


# File 'lib/annoy.rb', line 171

def seed(s)
  @index.set_seed(s)
end
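
For a fully repeatable experiment, the input vectors need to be reproducible as well. The sketch below generates them with Ruby's own seeded generator; random_vectors is a hypothetical helper. It seeds only the test data and is separate from #seed, which sets the seed used by the index.

```ruby
# Reproducible random item vectors from a seeded Ruby generator.
# Hypothetical helper; values fall in the range [-0.5, 0.5).
def random_vectors(count, n_features, seed)
  rng = Random.new(seed)
  Array.new(count) { Array.new(n_features) { rng.rand - 0.5 } }
end

a = random_vectors(3, 4, 42)
b = random_vectors(3, 4, 42)
puts a == b  # true

# Intended use alongside the index:
#   index.seed(42)
#   random_vectors(5000, 100, 42).each_with_index { |vec, id| index.add_item(id, vec) }
```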

#unload ⇒ Boolean

Unload the search index.

Returns:

  • (Boolean)


# File 'lib/annoy.rb', line 97

def unload
  @index.unload
end

#verbose(flag) ⇒ Object

Enable or disable verbose mode.

Parameters:

  • flag (Boolean)


# File 'lib/annoy.rb', line 164

def verbose(flag)
  @index.verbose(flag)
end