Class: Langchain::LLM::Ollama

Inherits:
Base
  • Object
Defined in:
lib/langchain/llm/ollama.rb

Overview

Interface to the Ollama API. Available models are listed at ollama.ai/library

Usage:

llm = Langchain::LLM::Ollama.new
llm = Langchain::LLM::Ollama.new(url: ENV["OLLAMA_URL"], default_options: {})

Constant Summary

DEFAULTS =
{
  temperature: 0.8,
  completion_model_name: "llama3",
  embeddings_model_name: "llama3",
  chat_completion_model_name: "llama3"
}.freeze
EMBEDDING_SIZES =
{
  codellama: 4_096,
  "dolphin-mixtral": 4_096,
  llama2: 4_096,
  llama3: 4_096,
  llava: 4_096,
  mistral: 4_096,
  "mistral-openorca": 4_096,
  mixtral: 4_096
}.freeze

Instance Attribute Summary

Instance Method Summary

Methods inherited from Base

#chat_parameters, #default_dimension

Methods included from DependencyHelper

#depends_on

Constructor Details

#initialize(url: "http://localhost:11434", default_options: {}) ⇒ Ollama

Initialize the Ollama client

Parameters:

  • url (String) (defaults to: "http://localhost:11434")

    The URL of the Ollama instance

  • default_options (Hash) (defaults to: {})

    The default options to use; deep-merged over DEFAULTS



# File 'lib/langchain/llm/ollama.rb', line 38

def initialize(url: "http://localhost:11434", default_options: {})
  depends_on "faraday"
  @url = url
  @defaults = DEFAULTS.deep_merge(default_options)
  chat_parameters.update(
    model: {default: @defaults[:chat_completion_model_name]},
    temperature: {default: @defaults[:temperature]},
    template: {},
    stream: {default: false}
  )
  chat_parameters.remap(response_format: :format)
end
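
For example, a client pointed at a non-default host with lower-temperature defaults might be configured as follows (the URL, model name, and option values are illustrative, not required settings):

llm = Langchain::LLM::Ollama.new(
  url: ENV.fetch("OLLAMA_URL", "http://localhost:11434"),
  default_options: {
    temperature: 0.2,                     # overrides DEFAULTS[:temperature] via deep_merge
    chat_completion_model_name: "llama3", # model must already be pulled on the Ollama server
    embeddings_model_name: "llama3"
  }
)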

Instance Attribute Details

#defaults ⇒ Object (readonly)

Returns the value of attribute defaults.



# File 'lib/langchain/llm/ollama.rb', line 14

def defaults
  @defaults
end

#url ⇒ Object (readonly)

Returns the value of attribute url.



# File 'lib/langchain/llm/ollama.rb', line 14

def url
  @url
end

Instance Method Details

#chat(params = {}) ⇒ Object

Generate a chat completion

The message object has the following fields:

  • role: the role of the message, either system, user, or assistant
  • content: the content of the message
  • images (optional): a list of images to include in the message (for multimodal models such as llava)

Parameters:

  • params (Hash) (defaults to: {})

    unified chat parameters from [Langchain::LLM::Parameters::Chat::SCHEMA]

Options Hash (params):

  • :model (String)

    Model name

  • :messages (Array<Hash>)

    Array of messages

  • :format (String)

    Format to return a response in. Currently the only accepted value is `json`

  • :temperature (Float)

    The temperature to use

  • :template (String)

    The prompt template to use (overrides what is defined in the `Modelfile`)

  • :stream (Boolean)

    Whether to stream the response. If false, the response is returned as a single response object rather than a stream of objects



# File 'lib/langchain/llm/ollama.rb', line 174

def chat(params = {})
  parameters = chat_parameters.to_params(params)

  response = client.post("api/chat") do |req|
    req.body = parameters
  end

  Langchain::LLM::OllamaResponse.new(response.body, model: parameters[:model])
end
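
A minimal usage sketch (the model name and messages are illustrative; with stream: false, the default, a single Langchain::LLM::OllamaResponse is returned):

llm = Langchain::LLM::Ollama.new

response = llm.chat(
  model: "llama3",
  messages: [
    {role: "system", content: "You are a helpful assistant."},
    {role: "user", content: "Why is the sky blue?"}
  ]
)
response.chat_completion # => the assistant's reply as a String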

#complete(prompt:, model: defaults[:completion_model_name], images: nil, format: nil, system: nil, template: nil, context: nil, stream: nil, raw: nil, mirostat: nil, mirostat_eta: nil, mirostat_tau: nil, num_ctx: nil, num_gqa: nil, num_gpu: nil, num_thread: nil, repeat_last_n: nil, repeat_penalty: nil, temperature: defaults[:temperature], seed: nil, stop: nil, tfs_z: nil, num_predict: nil, top_k: nil, top_p: nil, stop_sequences: nil, &block) ⇒ Langchain::LLM::OllamaResponse

Generate the completion for a given prompt

Parameters:

  • prompt (String)

    The prompt to generate a completion for

  • model (String) (defaults to: defaults[:completion_model_name])

    The model to use

  The remaining keyword arguments are passed through to the Ollama API; generation options such as temperature, top_k, and top_p are sent as the request's options.

Returns:

  • (Langchain::LLM::OllamaResponse)

    The completion response



# File 'lib/langchain/llm/ollama.rb', line 70

def complete(
  prompt:,
  model: defaults[:completion_model_name],
  images: nil,
  format: nil,
  system: nil,
  template: nil,
  context: nil,
  stream: nil,
  raw: nil,
  mirostat: nil,
  mirostat_eta: nil,
  mirostat_tau: nil,
  num_ctx: nil,
  num_gqa: nil,
  num_gpu: nil,
  num_thread: nil,
  repeat_last_n: nil,
  repeat_penalty: nil,
  temperature: defaults[:temperature],
  seed: nil,
  stop: nil,
  tfs_z: nil,
  num_predict: nil,
  top_k: nil,
  top_p: nil,
  stop_sequences: nil,
  &block
)
  if stop_sequences
    stop = stop_sequences
  end

  parameters = {
    prompt: prompt,
    model: model,
    images: images,
    format: format,
    system: system,
    template: template,
    context: context,
    stream: stream,
    raw: raw
  }.compact

  llm_parameters = {
    mirostat: mirostat,
    mirostat_eta: mirostat_eta,
    mirostat_tau: mirostat_tau,
    num_ctx: num_ctx,
    num_gqa: num_gqa,
    num_gpu: num_gpu,
    num_thread: num_thread,
    repeat_last_n: repeat_last_n,
    repeat_penalty: repeat_penalty,
    temperature: temperature,
    seed: seed,
    stop: stop,
    tfs_z: tfs_z,
    num_predict: num_predict,
    top_k: top_k,
    top_p: top_p
  }

  parameters[:options] = llm_parameters.compact

  response = ""

  client.post("api/generate") do |req|
    req.body = parameters

    req.options.on_data = proc do |chunk, size|
      chunk.split("\n").each do |line_chunk|
        json_chunk = begin
          JSON.parse(line_chunk)
        # In some instances the chunk exceeds the buffer size and the JSON parser fails
        rescue JSON::ParserError
          nil
        end

        response += json_chunk.dig("response") unless json_chunk.blank?

        # Yield each parsed chunk as it arrives so callers can stream the output
        yield json_chunk, size if block
      end
    end
  end

  Langchain::LLM::OllamaResponse.new(response, model: parameters[:model])
end
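
A minimal usage sketch (the prompt and option values are illustrative):

llm = Langchain::LLM::Ollama.new

response = llm.complete(
  prompt: "Name three Ruby web frameworks.",
  temperature: 0.2,
  stop_sequences: ["\n\n"]
)
response.completion # => the generated text as a String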

#default_dimensions ⇒ Integer

Returns the # of vector dimensions for the embeddings

Returns:

  • (Integer)

    The # of vector dimensions



# File 'lib/langchain/llm/ollama.rb', line 53

def default_dimensions
  # since Ollama can run multiple models, look it up or generate an embedding and return the size
  @default_dimensions ||=
    EMBEDDING_SIZES.fetch(defaults[:embeddings_model_name].to_sym) do
      embed(text: "test").embedding.size
    end
end
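
For example, with the default "llama3" embeddings model the dimension is looked up from EMBEDDING_SIZES without calling the API:

llm = Langchain::LLM::Ollama.new
llm.default_dimensions # => 4096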

#embed(text:, model: defaults[:embeddings_model_name], mirostat: nil, mirostat_eta: nil, mirostat_tau: nil, num_ctx: nil, num_gqa: nil, num_gpu: nil, num_thread: nil, repeat_last_n: nil, repeat_penalty: nil, temperature: defaults[:temperature], seed: nil, stop: nil, tfs_z: nil, num_predict: nil, top_k: nil, top_p: nil) ⇒ Langchain::LLM::OllamaResponse

Generate an embedding for a given text

Parameters:

  • text (String)

    The text to generate an embedding for

  • model (String) (defaults to: defaults[:embeddings_model_name])

    The model to use

  • options (Hash)

    Model generation options (mirostat, temperature, top_k, top_p, etc.) sent as the request's options

Returns:

  • (Langchain::LLM::OllamaResponse)

    The embedding response


# File 'lib/langchain/llm/ollama.rb', line 192

def embed(
  text:,
  model: defaults[:embeddings_model_name],
  mirostat: nil,
  mirostat_eta: nil,
  mirostat_tau: nil,
  num_ctx: nil,
  num_gqa: nil,
  num_gpu: nil,
  num_thread: nil,
  repeat_last_n: nil,
  repeat_penalty: nil,
  temperature: defaults[:temperature],
  seed: nil,
  stop: nil,
  tfs_z: nil,
  num_predict: nil,
  top_k: nil,
  top_p: nil
)
  parameters = {
    prompt: text,
    model: model
  }.compact

  llm_parameters = {
    mirostat: mirostat,
    mirostat_eta: mirostat_eta,
    mirostat_tau: mirostat_tau,
    num_ctx: num_ctx,
    num_gqa: num_gqa,
    num_gpu: num_gpu,
    num_thread: num_thread,
    repeat_last_n: repeat_last_n,
    repeat_penalty: repeat_penalty,
    temperature: temperature,
    seed: seed,
    stop: stop,
    tfs_z: tfs_z,
    num_predict: num_predict,
    top_k: top_k,
    top_p: top_p
  }

  parameters[:options] = llm_parameters.compact

  response = client.post("api/embeddings") do |req|
    req.body = parameters
  end

  Langchain::LLM::OllamaResponse.new(response.body, model: parameters[:model])
end
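
A minimal usage sketch (the text is illustrative):

llm = Langchain::LLM::Ollama.new

response = llm.embed(text: "Ruby is a programmer-friendly language")
response.embedding      # => Array of Floats
response.embedding.size # => 4096 with the default "llama3" model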

#summarize(text:) ⇒ Langchain::LLM::OllamaResponse

Generate a summary for a given text

Parameters:

  • text (String)

    The text to generate a summary for

Returns:

  • (Langchain::LLM::OllamaResponse)

    The completion response containing the generated summary



# File 'lib/langchain/llm/ollama.rb', line 249

def summarize(text:)
  prompt_template = Langchain::Prompt.load_from_path(
    file_path: Langchain.root.join("langchain/llm/prompts/ollama/summarize_template.yaml")
  )
  prompt = prompt_template.format(text: text)

  complete(prompt: prompt)
end
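
A minimal usage sketch (the text is illustrative; the summary is generated with the default completion model):

llm = Langchain::LLM::Ollama.new

response = llm.summarize(text: "Ollama lets you run large language models locally and exposes an HTTP API for completions, chat, and embeddings...")
response.completion # => the generated summary as a String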