Class: Langchain::LLM::LlamaCpp

Inherits:
Base
  • Object
Defined in:
lib/langchain/llm/llama_cpp.rb

Overview

A wrapper around the LlamaCpp.rb library

Gem requirements:

gem "llama_cpp"

Usage:

llama = Langchain::LLM::LlamaCpp.new(
  model_path: ENV["LLAMACPP_MODEL_PATH"],
  n_gpu_layers: Integer(ENV["LLAMACPP_N_GPU_LAYERS"]),
  n_threads: Integer(ENV["LLAMACPP_N_THREADS"])
)

Instance Attribute Summary

Attributes inherited from Base

#client, #defaults

Instance Method Summary

Methods inherited from Base

#chat, #chat_parameters, #default_dimension, #default_dimensions, #summarize

Methods included from DependencyHelper

#depends_on

Constructor Details

#initialize(model_path:, n_gpu_layers: 1, n_ctx: 2048, n_threads: 1, seed: 0) ⇒ LlamaCpp

Returns a new instance of LlamaCpp.

Parameters:

  • model_path (String)

    The path to the model to use

  • n_gpu_layers (Integer) (defaults to: 1)

    The number of GPU layers to use

  • n_ctx (Integer) (defaults to: 2048)

    The number of context tokens to use

  • n_threads (Integer) (defaults to: 1)

    The number of CPU threads to use

  • seed (Integer) (defaults to: 0)

    The seed to use



# File 'lib/langchain/llm/llama_cpp.rb', line 25

def initialize(model_path:, n_gpu_layers: 1, n_ctx: 2048, n_threads: 1, seed: 0)
  depends_on "llama_cpp"

  @model_path = model_path
  @n_gpu_layers = n_gpu_layers
  @n_ctx = n_ctx
  @n_threads = n_threads
  @seed = seed
end
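
For illustration, the constructor can also be given an explicit context window and seed; the model path below is a placeholder, not a model shipped with the gem:

llama = Langchain::LLM::LlamaCpp.new(
  model_path: "/path/to/model.gguf", # placeholder path to a local GGUF model
  n_ctx: 4096,                       # larger context window than the 2048 default
  seed: 42                           # fixed seed for repeatable generation
)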

Instance Attribute Details

#model_path ⇒ Object

Returns the value of attribute model_path.



# File 'lib/langchain/llm/llama_cpp.rb', line 17

def model_path
  @model_path
end

#n_ctx ⇒ Object

Returns the value of attribute n_ctx.



# File 'lib/langchain/llm/llama_cpp.rb', line 17

def n_ctx
  @n_ctx
end

#n_gpu_layers ⇒ Object

Returns the value of attribute n_gpu_layers.



# File 'lib/langchain/llm/llama_cpp.rb', line 17

def n_gpu_layers
  @n_gpu_layers
end

#n_threads=(value) ⇒ Object

Sets the attribute n_threads

Parameters:

  • value

    the value to set the attribute n_threads to.



# File 'lib/langchain/llm/llama_cpp.rb', line 18

def n_threads=(value)
  @n_threads = value
end

#seed ⇒ Object

Returns the value of attribute seed.



# File 'lib/langchain/llm/llama_cpp.rb', line 17

def seed
  @seed
end

Instance Method Details

#complete(prompt:, n_predict: 128) ⇒ String

Returns The completed prompt.

Parameters:

  • prompt (String)

    The prompt to complete

  • n_predict (Integer) (defaults to: 128)

    The number of tokens to predict

Returns:

  • (String)

    The completed prompt



# File 'lib/langchain/llm/llama_cpp.rb', line 51

def complete(prompt:, n_predict: 128)
  # contexts do not appear to be stateful when it comes to completion, so re-use the same one
  context = completion_context
  ::LLaMACpp.generate(context, prompt, n_predict: n_predict)
end
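
A minimal usage sketch for #complete, assuming an instance built as in the Usage section above (the prompt and token budget are illustrative):

llama = Langchain::LLM::LlamaCpp.new(model_path: ENV["LLAMACPP_MODEL_PATH"])
completion = llama.complete(prompt: "The capital of France is", n_predict: 32)
puts completion # String generated by ::LLaMACpp.generate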

#embed(text:) ⇒ Langchain::LLM::LlamaCppResponse

Returns The embedding response.

Parameters:

  • text (String)

    The text to embed

Returns:

  • (Langchain::LLM::LlamaCppResponse)

    The embedding response



# File 'lib/langchain/llm/llama_cpp.rb', line 37

def embed(text:)
  # contexts are kinda stateful when it comes to embeddings, so allocate one each time
  context = embedding_context

  embedding_input = @model.tokenize(text: text, add_bos: true)
  return unless embedding_input.size.positive?

  context.eval(tokens: embedding_input, n_past: 0)
  Langchain::LLM::LlamaCppResponse.new(context, model: context.model.desc)
end
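
A minimal usage sketch for #embed; it assumes the LlamaCppResponse wrapper exposes the vector via #embedding, as the other response classes in this library do:

llama = Langchain::LLM::LlamaCpp.new(model_path: ENV["LLAMACPP_MODEL_PATH"])
response = llama.embed(text: "Ruby is a programmer's best friend")
vector = response.embedding # Array<Float> extracted from the response wrapper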