Class: BxBuilderChain::Vectorsearch::Base
- Inherits:
-
Object
- Object
- BxBuilderChain::Vectorsearch::Base
- Includes:
- DependencyHelper
- Defined in:
- lib/bx_builder_chain/vectorsearch/base.rb
Overview
Vector Databases
A vector database is a type of database that stores data as high-dimensional vectors, which are mathematical representations of features or attributes. Each vector has a fixed number of dimensions, ranging from tens to thousands depending on the complexity and granularity of the data.
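Closeness between such vectors is commonly measured with cosine similarity, which matches the `DEFAULT_METRIC` of `"cosine"` listed for this class. The following is a minimal sketch of that metric in plain Ruby; the `cosine_similarity` helper is hypothetical and is not part of BxBuilderChain.

```ruby
# Cosine similarity between two equal-length numeric vectors.
# The result lies in [-1.0, 1.0]; values near 1.0 mean the vectors
# point in nearly the same direction.
def cosine_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  raise ArgumentError, "zero vector has no direction" if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

puts cosine_similarity([1, 0], [0, 1]) # orthogonal vectors
```

Because the metric depends only on direction, scaling a vector by a positive constant does not change its similarity to other vectors.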
Available vector databases
Usage
-
Pick a vector database from the list.
-
Review its documentation to install the required gems and complete any setup it needs (creating an account, getting an API key, etc.).
-
Instantiate the vector database class:
weaviate = BxBuilderChain::Vectorsearch::Weaviate.new(
  url: ENV["WEAVIATE_URL"],
  api_key: ENV["WEAVIATE_API_KEY"],
  table_name: "Documents",
  llm: :openai, # or :cohere, :hugging_face, :google_palm, or :replicate
  llm_api_key: ENV["OPENAI_API_KEY"] # API key for the selected LLM
)

# You can instantiate other supported vector databases the same way:
milvus = BxBuilderChain::Vectorsearch::Milvus.new(...)
qdrant = BxBuilderChain::Vectorsearch::Qdrant.new(...)
pinecone = BxBuilderChain::Vectorsearch::Pinecone.new(...)
chroma = BxBuilderChain::Vectorsearch::Chroma.new(...)
pgvector = BxBuilderChain::Vectorsearch::Pgvector.new(...)
Schema Creation
`create_default_schema()` creates a default schema in your vector database.
search.create_default_schema
(We plan on offering customizable schema creation shortly)
Adding Data
You can add data with:
-
`add_data(paths:)` to add data from files of any supported type
my_pdf = BxBuilderChain.root.join("path/to/my.pdf")
my_text = BxBuilderChain.root.join("path/to/my.txt")
my_docx = BxBuilderChain.root.join("path/to/my.docx")
my_csv = BxBuilderChain.root.join("path/to/my.csv")

search.add_data(paths: [my_pdf, my_text, my_docx, my_csv])
-
`add_texts(texts:)` to only add textual data
search.add_texts(
  texts: [
    "Lorem Ipsum is simply dummy text of the printing and typesetting industry.",
    "Lorem Ipsum has been the industry's standard dummy text ever since the 1500s"
  ]
)
Retrieving Data
`similarity_search_by_vector(embedding:, k:)` searches the vector database for the `k` closest embeddings.
search.similarity_search_by_vector(
  embedding: ...,
  k: # number of results to be retrieved
)
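To illustrate what "closest" means here, the following is a minimal brute-force sketch over an in-memory array. The `nearest_texts` and `cosine` helpers and the `store` layout are hypothetical; real backends use their own indexes and query APIs rather than scanning every entry.

```ruby
# Cosine similarity between two non-zero, equal-length vectors.
def cosine(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
end

# Return the texts of the k entries whose embeddings are most similar
# to the query embedding, best match first.
def nearest_texts(store, embedding:, k:)
  store
    .max_by(k) { |entry| cosine(entry[:embedding], embedding) }
    .map { |entry| entry[:text] }
end

# Hypothetical stand-in for a vector store: text paired with its embedding.
store = [
  { text: "cats purr",    embedding: [1.0, 0.0] },
  { text: "dogs bark",    embedding: [0.0, 1.0] },
  { text: "kittens meow", embedding: [0.9, 0.1] }
]

results = nearest_texts(store, embedding: [1.0, 0.0], k: 2)
puts results.inspect
```

The query embedding must come from the same model that produced the stored embeddings; otherwise the similarity scores are not meaningful.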
`similarity_search(query:, k:)` generates an embedding for the query and searches the vector database for the `k` closest embeddings.
search.similarity_search(
  query: ...,
  k: # number of results to be retrieved
)
`ask(question:)` generates an embedding for the passed-in question, searches the vector database for the closest embeddings, and then passes the matching texts as context to the LLM to generate an answer to the question.
search.ask(question: "What is lorem ipsum?")
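The retrieve-then-answer flow described for `ask` can be sketched with injected stand-ins. Everything here is an assumption for illustration: the top-level `ask` helper, the `search` and `llm` callables, the default `k` of 4, and the prompt wording are hypothetical, and each concrete subclass implements its own version.

```ruby
# Retrieve context for a question, build a prompt, and hand it to an LLM.
# `search` and `llm` are hypothetical callables supplied by the caller.
def ask(question:, search:, llm:, k: 4)
  context = search.call(question, k).join("\n---\n")
  prompt = "Context:\n#{context}\n---\nQuestion: #{question}\n---\nAnswer:"
  llm.call(prompt)
end

# Fake retrieval: always returns the same passages, truncated to k.
fake_search = lambda do |_question, k|
  ["Lorem ipsum is dummy text.", "It dates from the 1500s."].first(k)
end

# Fake LLM: echoes the prompt so the assembled input is visible.
echo_llm = ->(prompt) { prompt }

answer = ask(question: "What is lorem ipsum?", search: fake_search, llm: echo_llm, k: 1)
puts answer
```

Injecting the collaborators keeps the sketch runnable without a database or an API key.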
Direct Known Subclasses
Constant Summary
- DEFAULT_METRIC =
"cosine"
Instance Attribute Summary
-
#client ⇒ Object
readonly
Returns the value of attribute client.
-
#llm ⇒ Object
readonly
Returns the value of attribute llm.
-
#table_name ⇒ Object
readonly
Returns the value of attribute table_name.
Class Method Summary
Instance Method Summary
- #add_data(paths:) ⇒ Object
-
#add_texts(**kwargs) ⇒ Object
Method supported by Vectorsearch DB to add a list of texts to the index.
-
#ask(**kwargs) ⇒ Object
Method supported by Vectorsearch DB to answer a question given a context (data) pulled from your Vectorsearch DB.
-
#create_default_schema ⇒ Object
Method supported by Vectorsearch DB to create a default schema.
-
#destroy_default_schema ⇒ Object
Method supported by Vectorsearch DB to delete the default schema.
- #generate_prompt(question:, context:, prompt_template: nil) ⇒ Object
-
#get_default_schema ⇒ Object
Method supported by Vectorsearch DB to retrieve a default schema.
-
#initialize(llm:) ⇒ Base
constructor
A new instance of Base.
-
#similarity_search(**kwargs) ⇒ Object
Method supported by Vectorsearch DB to search for similar texts in the index.
-
#similarity_search_by_vector(**kwargs) ⇒ Object
Method supported by Vectorsearch DB to search for similar texts in the index by the passed in vector.
-
#update_texts(**kwargs) ⇒ Object
Method supported by Vectorsearch DB to update a list of texts in the index.
Methods included from DependencyHelper
Constructor Details
#initialize(llm:) ⇒ Base
Returns a new instance of Base.
# File 'lib/bx_builder_chain/vectorsearch/base.rb', line 89

def initialize(llm:)
  @llm = llm
end
Instance Attribute Details
#client ⇒ Object (readonly)
Returns the value of attribute client.
# File 'lib/bx_builder_chain/vectorsearch/base.rb', line 84

def client
  @client
end
#llm ⇒ Object (readonly)
Returns the value of attribute llm.
# File 'lib/bx_builder_chain/vectorsearch/base.rb', line 84

def llm
  @llm
end
#table_name ⇒ Object (readonly)
Returns the value of attribute table_name.
# File 'lib/bx_builder_chain/vectorsearch/base.rb', line 84

def table_name
  @table_name
end
Class Method Details
.logger_options ⇒ Object
# File 'lib/bx_builder_chain/vectorsearch/base.rb', line 154

def self.logger_options
  {
    color: :blue
  }
end
Instance Method Details
#add_data(paths:) ⇒ Object
# File 'lib/bx_builder_chain/vectorsearch/base.rb', line 139

def add_data(paths:)
  raise ArgumentError, "Paths must be provided" if Array(paths).empty?

  texts = Array(paths)
    .flatten
    .map do |path|
      data = BxBuilderChain::Loader.new(path)&.load&.chunks
      data.map { |chunk| chunk[:text] }
    end

  texts.flatten!

  add_texts(texts: texts)
end
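The path-to-texts step of `add_data` can be sketched in isolation as follows. The `texts_from` helper and the `loader` callable are hypothetical stand-ins for `BxBuilderChain::Loader`, assumed to return an array of `{ text: ... }` hashes per path; the sketch also uses `flat_map` in place of the original `map` followed by `flatten!`.

```ruby
# Collect the text of every chunk from every path into one flat array.
# `loader` is a hypothetical callable standing in for the real Loader.
def texts_from(paths, loader:)
  raise ArgumentError, "Paths must be provided" if Array(paths).empty?

  # Array() accepts a single path, an array, or nil; flatten handles nesting.
  Array(paths).flatten.flat_map do |path|
    loader.call(path).map { |chunk| chunk[:text] }
  end
end

# Fake loader: pretends each file splits into two chunks.
fake_loader = lambda do |path|
  [{ text: "#{path} part 1" }, { text: "#{path} part 2" }]
end

texts = texts_from(["a.pdf", ["b.txt"]], loader: fake_loader)
puts texts.inspect
```

The resulting flat array of strings is what `add_data` then passes to `add_texts(texts:)`.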
#add_texts(**kwargs) ⇒ Object
Method supported by Vectorsearch DB to add a list of texts to the index
# File 'lib/bx_builder_chain/vectorsearch/base.rb', line 109

def add_texts(**kwargs)
  raise NotImplementedError, "#{self.class.name} does not support adding texts"
end
#ask(**kwargs) ⇒ Object
Method supported by Vectorsearch DB to answer a question given a context (data) pulled from your Vectorsearch DB.
# File 'lib/bx_builder_chain/vectorsearch/base.rb', line 130

def ask(**kwargs)
  raise NotImplementedError, "#{self.class.name} does not support asking questions"
end
#create_default_schema ⇒ Object
Method supported by Vectorsearch DB to create a default schema
# File 'lib/bx_builder_chain/vectorsearch/base.rb', line 99

def create_default_schema
  raise NotImplementedError, "#{self.class.name} does not support creating a default schema"
end
#destroy_default_schema ⇒ Object
Method supported by Vectorsearch DB to delete the default schema
# File 'lib/bx_builder_chain/vectorsearch/base.rb', line 104

def destroy_default_schema
  raise NotImplementedError, "#{self.class.name} does not support deleting a default schema"
end
#generate_prompt(question:, context:, prompt_template: nil) ⇒ Object
# File 'lib/bx_builder_chain/vectorsearch/base.rb', line 134

def generate_prompt(question:, context:, prompt_template: nil)
  template = prompt_template || BxBuilderChain.configuration.default_prompt_template
  template % {context: context, question: question}
end
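`generate_prompt` relies on Ruby's `String#%` with a hash argument, which fills `%{name}` placeholders by key. The sketch below shows that mechanism on its own; the template text is a hypothetical example, not the library's actual default template.

```ruby
# Hypothetical template; the real default comes from
# BxBuilderChain.configuration.default_prompt_template.
template = "Context:\n%{context}\n---\nQuestion: %{question}\n---\nAnswer:"

# String#% substitutes each %{key} placeholder with the matching hash value.
prompt = template % {
  context: "Lorem ipsum is placeholder text.",
  question: "What is lorem ipsum?"
}

puts prompt
```

A custom `prompt_template` therefore needs both `%{context}` and `%{question}` handled correctly: a placeholder with no matching key raises `KeyError`, and a literal percent sign in the template must be written as `%%`.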
#get_default_schema ⇒ Object
Method supported by Vectorsearch DB to retrieve a default schema
# File 'lib/bx_builder_chain/vectorsearch/base.rb', line 94

def get_default_schema
  raise NotImplementedError, "#{self.class.name} does not support retrieving a default schema"
end
#similarity_search(**kwargs) ⇒ Object
Method supported by Vectorsearch DB to search for similar texts in the index
# File 'lib/bx_builder_chain/vectorsearch/base.rb', line 119

def similarity_search(**kwargs)
  raise NotImplementedError, "#{self.class.name} does not support similarity search"
end
#similarity_search_by_vector(**kwargs) ⇒ Object
Method supported by Vectorsearch DB to search for similar texts in the index by the passed in vector. You must generate your own vector using the same LLM that generated the embeddings stored in the Vectorsearch DB.
# File 'lib/bx_builder_chain/vectorsearch/base.rb', line 125

def similarity_search_by_vector(**kwargs)
  raise NotImplementedError, "#{self.class.name} does not support similarity search by vector"
end
#update_texts(**kwargs) ⇒ Object
Method supported by Vectorsearch DB to update a list of texts in the index
# File 'lib/bx_builder_chain/vectorsearch/base.rb', line 114

def update_texts(**kwargs)
  raise NotImplementedError, "#{self.class.name} does not support updating texts"
end
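Most instance methods on this class raise `NotImplementedError` until a subclass overrides them. The following sketch shows that pattern with stand-in classes; `SketchBase` and `InMemorySearch` are hypothetical and only mimic the shape of the real base class and its backends.

```ruby
# Stand-in for the abstract base class; NOT the real
# BxBuilderChain::Vectorsearch::Base.
class SketchBase
  def add_texts(**kwargs)
    raise NotImplementedError, "#{self.class.name} does not support adding texts"
  end

  def update_texts(**kwargs)
    raise NotImplementedError, "#{self.class.name} does not support updating texts"
  end
end

# Hypothetical backend that implements add_texts only.
class InMemorySearch < SketchBase
  attr_reader :texts

  def initialize
    @texts = []
  end

  def add_texts(texts:)
    @texts.concat(texts)
  end
end

search = InMemorySearch.new
search.add_texts(texts: ["hello"])

begin
  search.update_texts(texts: ["hi"]) # not overridden, so the base method runs
rescue NotImplementedError => e
  puts e.message
end
```

Note that `NotImplementedError` descends from `ScriptError`, not `StandardError`, so a bare `rescue` does not catch it; callers that want to handle an unsupported operation must name the class explicitly, as above.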