Class: Langchain::Vectorsearch::Chroma
- Defined in:
- lib/langchain/vectorsearch/chroma.rb
Constant Summary
Constants inherited from Base
Instance Attribute Summary
Attributes inherited from Base
Instance Method Summary
- #add_texts(texts:, ids: [], metadatas: []) ⇒ Hash
  Add a list of texts to the index.
- #ask(question:, k: 4) {|String| ... } ⇒ String
  Ask a question and return the answer.
- #create_default_schema ⇒ ::Chroma::Resources::Collection
  Create the collection with the default schema.
- #destroy_default_schema ⇒ bool
  Delete the default schema.
- #get_default_schema ⇒ ::Chroma::Resources::Collection
  Get the default schema.
- #initialize(url:, index_name:, llm:) ⇒ Chroma (constructor)
  Initialize the Chroma client.
- #remove_texts(ids:) ⇒ Hash
  Remove a list of texts from the index.
- #similarity_search(query:, k: 4) ⇒ Chroma::Resources::Embedding
  Search for similar texts.
- #similarity_search_by_vector(embedding:, k: 4) ⇒ Chroma::Resources::Embedding
  Search for similar texts by embedding.
- #update_texts(texts:, ids:, metadatas: []) ⇒ Object
Methods inherited from Base
#add_data, #generate_hyde_prompt, #generate_rag_prompt, #similarity_search_with_hyde
Methods included from DependencyHelper
Constructor Details
#initialize(url:, index_name:, llm:) ⇒ Chroma
Initialize the Chroma client
# File 'lib/langchain/vectorsearch/chroma.rb', line 19
def initialize(url:, index_name:, llm:)
  depends_on "chroma-db"

  ::Chroma.connect_host = url
  ::Chroma.logger = Langchain.logger
  ::Chroma.log_level = Langchain.logger.level

  @index_name = index_name

  super(llm: llm)
end
Instance Method Details
#add_texts(texts:, ids: [], metadatas: []) ⇒ Hash
Add a list of texts to the index
# File 'lib/langchain/vectorsearch/chroma.rb', line 36
def add_texts(texts:, ids: [], metadatas: [])
  embeddings = Array(texts).map.with_index do |text, i|
    ::Chroma::Resources::Embedding.new(
      id: ids[i] ? ids[i].to_s : SecureRandom.uuid,
      embedding: llm.embed(text: text).embedding,
      metadata: metadatas[i] || {},
      document: text # Do we actually need to store the whole original document?
    )
  end

  collection = ::Chroma::Resources::Collection.get(index_name)
  collection.add(embeddings)
end
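The per-text defaults in #add_texts (a generated UUID when no id is given, an empty hash when no metadata is given) can be illustrated without a Chroma server. The following is a minimal sketch only: `Record`, `FAKE_EMBED`, and `build_records` are hypothetical names introduced here, standing in for `::Chroma::Resources::Embedding` and the LLM embedding call, and are not part of the library.

```ruby
require "securerandom"

# Hypothetical stand-in for ::Chroma::Resources::Embedding so the sketch
# runs without the chroma-db gem.
Record = Struct.new(:id, :embedding, :metadata, :document, keyword_init: true)

# Hypothetical embedder standing in for the LLM call; it returns a fake
# one-element vector based on the text length.
FAKE_EMBED = ->(text) { [text.length.to_f] }

# Mirrors the mapping step of #add_texts: one record per text, with
# defaults applied by position.
def build_records(texts:, embed:, ids: [], metadatas: [])
  Array(texts).map.with_index do |text, i|
    Record.new(
      id: ids[i] ? ids[i].to_s : SecureRandom.uuid, # UUID when no id is supplied
      embedding: embed.call(text),
      metadata: metadatas[i] || {},                 # empty metadata by default
      document: text
    )
  end
end

records = build_records(
  texts: ["alpha", "beta"],
  embed: FAKE_EMBED,
  ids: [7],
  metadatas: [{source: "a.txt"}]
)
records.each { |r| puts "#{r.id} #{r.metadata}" }
```

Here the first record keeps the supplied id (converted to a string) and metadata, while the second falls back to a random UUID and an empty hash. Because `Array(texts)` is used, a single string is accepted as well as an array.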
#ask(question:, k: 4) {|String| ... } ⇒ String
Ask a question and return the answer
# File 'lib/langchain/vectorsearch/chroma.rb', line 127
def ask(question:, k: 4, &block)
  search_results = similarity_search(query: question, k: k)

  context = search_results.map do |result|
    result.document
  end

  context = context.join("\n---\n")

  prompt = generate_rag_prompt(question: question, context: context)

  messages = [{role: "user", content: prompt}]
  response = llm.chat(messages: messages, &block)

  response.context = context
  response
end
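The context-building step of #ask collects the `document` of each search result and joins them with a `---` separator line before the prompt is generated. A small sketch of just that step, assuming results respond to `document`; `Hit` and `build_context` are hypothetical names used only for illustration:

```ruby
# Hypothetical stand-in for a search result; only #document is needed here.
Hit = Struct.new(:document)

# Joins the retrieved documents into one context string, using the same
# "\n---\n" separator as #ask.
def build_context(search_results)
  search_results.map(&:document).join("\n---\n")
end

hits = [Hit.new("Chroma stores embeddings."), Hit.new("Collections group documents.")]
puts build_context(hits)
```

The separator keeps the retrieved passages visually distinct inside the single user message that is sent to the LLM.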
#create_default_schema ⇒ ::Chroma::Resources::Collection
Create the collection with the default schema
# File 'lib/langchain/vectorsearch/chroma.rb', line 74
def create_default_schema
  ::Chroma::Resources::Collection.create(index_name)
end
#destroy_default_schema ⇒ bool
Delete the default schema
# File 'lib/langchain/vectorsearch/chroma.rb', line 86
def destroy_default_schema
  ::Chroma::Resources::Collection.delete(index_name)
end
#get_default_schema ⇒ ::Chroma::Resources::Collection
Get the default schema
# File 'lib/langchain/vectorsearch/chroma.rb', line 80
def get_default_schema
  ::Chroma::Resources::Collection.get(index_name)
end
#remove_texts(ids:) ⇒ Hash
Remove a list of texts from the index
# File 'lib/langchain/vectorsearch/chroma.rb', line 66
def remove_texts(ids:)
  collection.delete(
    ids: ids.map(&:to_s)
  )
end
#similarity_search(query:, k: 4) ⇒ Chroma::Resources::Embedding
Search for similar texts
# File 'lib/langchain/vectorsearch/chroma.rb', line 94
def similarity_search(
  query:,
  k: 4
)
  embedding = llm.embed(text: query).embedding

  similarity_search_by_vector(
    embedding: embedding,
    k: k
  )
end
#similarity_search_by_vector(embedding:, k: 4) ⇒ Chroma::Resources::Embedding
Search for similar texts by embedding
# File 'lib/langchain/vectorsearch/chroma.rb', line 110
def similarity_search_by_vector(
  embedding:,
  k: 4
)
  # Requesting more results than the number of documents in the collection currently throws an error in Chroma DB
  # Temporary fix inspired by this comment: https://github.com/chroma-core/chroma/issues/301#issuecomment-1520494512
  count = collection.count
  n_results = [count, k].min

  collection.query(query_embeddings: [embedding], results: n_results)
end
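The workaround in #similarity_search_by_vector caps the requested number of results at the number of documents currently in the collection. The arithmetic can be sketched in isolation; `clamp_result_count` is a hypothetical helper name, not a method of this class:

```ruby
# Returns how many results to request: never more than the collection holds.
def clamp_result_count(collection_count, k)
  [collection_count, k].min
end

puts clamp_result_count(2, 4)  # small collection: ask for 2, not 4
puts clamp_result_count(10, 4) # large collection: k is used as given
```

As a consequence, callers may receive fewer than `k` results when the collection is small.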
#update_texts(texts:, ids:, metadatas: []) ⇒ Object
# File 'lib/langchain/vectorsearch/chroma.rb', line 50
def update_texts(texts:, ids:, metadatas: [])
  embeddings = Array(texts).map.with_index do |text, i|
    ::Chroma::Resources::Embedding.new(
      id: ids[i].to_s,
      embedding: llm.embed(text: text).embedding,
      metadata: metadatas[i] || {},
      document: text # Do we actually need to store the whole original document?
    )
  end

  collection.update(embeddings)
end