Class: Langchain::LLM::Ollama

Inherits:
Base
  • Object
Defined in:
lib/langchain/llm/ollama.rb

Overview

Interface to the Ollama API. Available models: ollama.ai/library

Usage:

llm = Langchain::LLM::Ollama.new(url: ENV["OLLAMA_URL"], default_options: {})

Constant Summary

DEFAULTS =
{
  temperature: 0.0,
  completion_model: "llama3.1",
  embedding_model: "llama3.1",
  chat_model: "llama3.1"
}.freeze
EMBEDDING_SIZES =
{
  codellama: 4_096,
  "dolphin-mixtral": 4_096,
  llama2: 4_096,
  llama3: 4_096,
  "llama3.1": 4_096,
  llava: 4_096,
  mistral: 4_096,
  "mistral-openorca": 4_096,
  mixtral: 4_096,
  tinydolphin: 2_048
}.freeze
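
These defaults are merged with any default_options passed to the constructor, so individual keys can be overridden per instance. A minimal sketch (the model name and temperature below are illustrative):

llm = Langchain::LLM::Ollama.new(
  url: ENV["OLLAMA_URL"],
  # Overrides DEFAULTS via DEFAULTS.merge(default_options)
  default_options: {chat_model: "mistral", temperature: 0.2}
)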

Instance Attribute Summary

Instance Method Summary

Methods inherited from Base

#chat_parameters, #default_dimension

Methods included from DependencyHelper

#depends_on

Constructor Details

#initialize(url: "http://localhost:11434", api_key: nil, default_options: {}) ⇒ Ollama

Initialize the Ollama client

Parameters:

  • url (String) (defaults to: "http://localhost:11434")

    The URL of the Ollama instance

  • api_key (String) (defaults to: nil)

The API key to use. This is optional and is used when you expose the Ollama API through Open WebUI

  • default_options (Hash) (defaults to: {})

    The default options to use



# File 'lib/langchain/llm/ollama.rb', line 38

def initialize(url: "http://localhost:11434", api_key: nil, default_options: {})
  depends_on "faraday"
  @url = url
  @api_key = api_key
  @defaults = DEFAULTS.merge(default_options)
  chat_parameters.update(
    model: {default: @defaults[:chat_model]},
    temperature: {default: @defaults[:temperature]},
    template: {},
    stream: {default: false},
    response_format: {default: @defaults[:response_format]}
  )
  chat_parameters.remap(response_format: :format)
end
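
For example, connecting to a local instance, or to an instance exposed through Open WebUI, might look like this (the URL and API key values are placeholders):

# Local Ollama instance using the default URL
ollama = Langchain::LLM::Ollama.new

# Ollama exposed through Open WebUI; api_key is only needed in that setup
ollama = Langchain::LLM::Ollama.new(
  url: "https://webui.example.com/ollama",
  api_key: ENV["OLLAMA_API_KEY"]
)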

Instance Attribute Details

#defaults ⇒ Object (readonly)

Returns the value of attribute defaults.



# File 'lib/langchain/llm/ollama.rb', line 11

def defaults
  @defaults
end

#url ⇒ Object (readonly)

Returns the value of attribute url.



# File 'lib/langchain/llm/ollama.rb', line 11

def url
  @url
end

Instance Method Details

#chat(messages:, model: nil, **params, &block) ⇒ Langchain::LLM::OllamaResponse

Generate a chat completion

Example:

final_resp = ollama.chat(messages:) { |resp| print resp.chat_completion }
final_resp.total_tokens

The message object has the following fields:

  • role: the role of the message, either system, user or assistant

  • content: the content of the message

  • images (optional): a list of images to include in the message (for multimodal models such as llava)

Parameters:

  • messages (Array)

    The chat messages

  • model (String) (defaults to: nil)

    The model to use

  • params (Hash)

Unified chat parameters from [Langchain::LLM::Parameters::Chat::SCHEMA]

  • block (Proc)

    An optional block that is called with an OllamaResponse for each streamed chunk

Options Hash (**params):

  • :messages (Array<Hash>)

    Array of messages

  • :model (String)

    Model name

  • :format (String)

Format to return a response in. Currently the only accepted value is `json`

  • :temperature (Float)

    The temperature to use

  • :template (String)

The prompt template to use (overrides what is defined in the `Modelfile`)

Returns:

  • (Langchain::LLM::OllamaResponse)

# File 'lib/langchain/llm/ollama.rb', line 177

def chat(messages:, model: nil, **params, &block)
  parameters = chat_parameters.to_params(params.merge(messages:, model:, stream: block_given?)) # rubocop:disable Performance/BlockGivenWithExplicitBlock
  responses_stream = []

  client.post("api/chat", parameters) do |req|
    req.options.on_data = json_responses_chunk_handler do |parsed_chunk|
      responses_stream << parsed_chunk

      block&.call(OllamaResponse.new(parsed_chunk, model: parameters[:model]))
    end
  end

  generate_final_chat_completion_response(responses_stream, parameters[:model])
end
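
A usage sketch, assuming the message format described above (the model name is illustrative):

messages = [
  {role: "system", content: "You are a terse assistant."},
  {role: "user", content: "Why is the sky blue?"}
]

# Non-streaming: the final response is returned after generation completes
response = ollama.chat(messages: messages)
response.chat_completion

# Streaming: the block receives an OllamaResponse for each chunk,
# and the aggregated final response is still returned
final_resp = ollama.chat(messages: messages, model: "llama3.1") do |chunk|
  print chunk.chat_completion
end
final_resp.total_tokens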

#complete(prompt:, model: defaults[:completion_model], images: nil, format: nil, system: nil, template: nil, context: nil, raw: nil, mirostat: nil, mirostat_eta: nil, mirostat_tau: nil, num_ctx: nil, num_gqa: nil, num_gpu: nil, num_thread: nil, repeat_last_n: nil, repeat_penalty: nil, temperature: defaults[:temperature], seed: nil, stop: nil, tfs_z: nil, num_predict: nil, top_k: nil, top_p: nil, stop_sequences: nil, &block) ⇒ Langchain::LLM::OllamaResponse

Generate the completion for a given prompt

Example:

final_resp = ollama.complete(prompt:) { |resp| print resp.completion }
final_resp.total_tokens

Parameters:

  • prompt (String)

    The prompt to generate a completion for

  • model (String) (defaults to: defaults[:completion_model])

    The model to use

Returns:

  • (Langchain::LLM::OllamaResponse)

# File 'lib/langchain/llm/ollama.rb', line 78

def complete(
  prompt:,
  model: defaults[:completion_model],
  images: nil,
  format: nil,
  system: nil,
  template: nil,
  context: nil,
  raw: nil,
  mirostat: nil,
  mirostat_eta: nil,
  mirostat_tau: nil,
  num_ctx: nil,
  num_gqa: nil,
  num_gpu: nil,
  num_thread: nil,
  repeat_last_n: nil,
  repeat_penalty: nil,
  temperature: defaults[:temperature],
  seed: nil,
  stop: nil,
  tfs_z: nil,
  num_predict: nil,
  top_k: nil,
  top_p: nil,
  stop_sequences: nil,
  &block
)
  if stop_sequences
    stop = stop_sequences
  end

  parameters = {
    prompt: prompt,
    model: model,
    images: images,
    format: format,
    system: system,
    template: template,
    context: context,
    stream: block_given?, # rubocop:disable Performance/BlockGivenWithExplicitBlock
    raw: raw
  }.compact

  llm_parameters = {
    mirostat: mirostat,
    mirostat_eta: mirostat_eta,
    mirostat_tau: mirostat_tau,
    num_ctx: num_ctx,
    num_gqa: num_gqa,
    num_gpu: num_gpu,
    num_thread: num_thread,
    repeat_last_n: repeat_last_n,
    repeat_penalty: repeat_penalty,
    temperature: temperature,
    seed: seed,
    stop: stop,
    tfs_z: tfs_z,
    num_predict: num_predict,
    top_k: top_k,
    top_p: top_p
  }

  parameters[:options] = llm_parameters.compact
  responses_stream = []

  client.post("api/generate", parameters) do |req|
    req.options.on_data = json_responses_chunk_handler do |parsed_chunk|
      responses_stream << parsed_chunk

      block&.call(OllamaResponse.new(parsed_chunk, model: parameters[:model]))
    end
  end

  generate_final_completion_response(responses_stream, parameters[:model])
end
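
A usage sketch (the prompt, temperature, and stop sequences are illustrative):

# Non-streaming completion; stop_sequences takes precedence over stop
response = ollama.complete(
  prompt: "Write a haiku about the ocean.",
  temperature: 0.7,
  stop_sequences: ["\n\n"]
)
response.completion

# Streaming variant: print each chunk as it arrives
ollama.complete(prompt: "Count to five.") { |chunk| print chunk.completion }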

#default_dimensions ⇒ Integer

Returns the # of vector dimensions for the embeddings

Returns:

  • (Integer)

    The # of vector dimensions



# File 'lib/langchain/llm/ollama.rb', line 55

def default_dimensions
  # since Ollama can run multiple models, look it up or generate an embedding and return the size
  @default_dimensions ||=
    EMBEDDING_SIZES.fetch(defaults[:embedding_model].to_sym) do
      embed(text: "test").embedding.size
    end
end
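
For example, with the default "llama3.1" embedding model the size is looked up in EMBEDDING_SIZES; for a model missing from that table ("nomic-embed-text" below is only an illustration), a throwaway embedding is generated and its length returned:

ollama.default_dimensions # => 4096, looked up for "llama3.1"

llm = Langchain::LLM::Ollama.new(default_options: {embedding_model: "nomic-embed-text"})
llm.default_dimensions # not in EMBEDDING_SIZES, so it embeds "test" and returns the vector size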

#embed(text:, model: defaults[:embedding_model], mirostat: nil, mirostat_eta: nil, mirostat_tau: nil, num_ctx: nil, num_gqa: nil, num_gpu: nil, num_thread: nil, repeat_last_n: nil, repeat_penalty: nil, temperature: defaults[:temperature], seed: nil, stop: nil, tfs_z: nil, num_predict: nil, top_k: nil, top_p: nil) ⇒ Langchain::LLM::OllamaResponse

Generate an embedding for a given text

Parameters:

  • text (String)

    The text to generate an embedding for

  • model (String) (defaults to: defaults[:embedding_model])

    The model to use

  • options (Hash)

    Additional model options (mirostat, temperature, top_k, etc.) passed to Ollama as options

Returns:

  • (Langchain::LLM::OllamaResponse)

# File 'lib/langchain/llm/ollama.rb', line 200

def embed(
  text:,
  model: defaults[:embedding_model],
  mirostat: nil,
  mirostat_eta: nil,
  mirostat_tau: nil,
  num_ctx: nil,
  num_gqa: nil,
  num_gpu: nil,
  num_thread: nil,
  repeat_last_n: nil,
  repeat_penalty: nil,
  temperature: defaults[:temperature],
  seed: nil,
  stop: nil,
  tfs_z: nil,
  num_predict: nil,
  top_k: nil,
  top_p: nil
)
  parameters = {
    model: model,
    input: Array(text)
  }.compact

  llm_parameters = {
    mirostat: mirostat,
    mirostat_eta: mirostat_eta,
    mirostat_tau: mirostat_tau,
    num_ctx: num_ctx,
    num_gqa: num_gqa,
    num_gpu: num_gpu,
    num_thread: num_thread,
    repeat_last_n: repeat_last_n,
    repeat_penalty: repeat_penalty,
    temperature: temperature,
    seed: seed,
    stop: stop,
    tfs_z: tfs_z,
    num_predict: num_predict,
    top_k: top_k,
    top_p: top_p
  }

  parameters[:options] = llm_parameters.compact

  response = client.post("api/embed") do |req|
    req.body = parameters
  end

  OllamaResponse.new(response.body, model: parameters[:model])
end
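
A usage sketch; the dimension in the comment assumes the default "llama3.1" embedding model:

response = ollama.embed(text: "Ruby is a programmer's best friend")
response.embedding       # => Array of Floats
response.embedding.size  # => 4096 for llama3.1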

#summarize(text:) ⇒ String

Generate a summary for a given text

Parameters:

  • text (String)

    The text to generate a summary for

Returns:

  • (String)

    The summary



# File 'lib/langchain/llm/ollama.rb', line 257

def summarize(text:)
  prompt_template = Langchain::Prompt.load_from_path(
    file_path: Langchain.root.join("langchain/llm/prompts/ollama/summarize_template.yaml")
  )
  prompt = prompt_template.format(text: text)

  complete(prompt: prompt)
end
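
A usage sketch ("report.txt" is a placeholder path):

long_text = File.read("report.txt")

# Uses the bundled summarize_template.yaml prompt and delegates to #complete
summary = ollama.summarize(text: long_text)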