Module: Tiktoken

Defined in:
lib/tiktoken_ruby.rb,
lib/tiktoken_ruby/version.rb

Defined Under Namespace

Classes: Encoding

Constant Summary

VERSION =
"0.0.6"

Class Method Summary

Class Method Details

.encoding_for_model(model_name) ⇒ Tiktoken::Encoding

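Gets the encoding for an OpenAI model. A name that starts with a known model prefix followed by a hyphen is resolved to that prefix's encoding.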

Examples:

Count tokens for text

enc = Tiktoken.encoding_for_model("gpt-4")
enc.encode("hello world").length #=> 2

Parameters:

  • model_name (Symbol|String)

    The name of the model to get the encoding for

Returns:

  • (Tiktoken::Encoding, nil)

    The encoding for the model, or nil if the model name is not recognized

# File 'lib/tiktoken_ruby.rb', line 35

def encoding_for_model(model_name)
  PREFIX_MODELS.each do |prefix|
    if model_name.to_s.start_with?("#{prefix}-")
      model_name = prefix
      break
    end
  end

  encoding_name = MODEL_TO_ENCODING_NAME[model_name.to_sym]
  return nil unless encoding_name

  get_encoding(encoding_name)
end
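
The prefix handling above can be tried in isolation. The following is a minimal sketch, not the gem's code: `SKETCH_PREFIX_MODELS`, `SKETCH_MODEL_TO_ENCODING_NAME`, and `resolve_encoding_name` are hypothetical names, and the table contents are illustrative assumptions, since the real `PREFIX_MODELS` and `MODEL_TO_ENCODING_NAME` values are not shown on this page.

```ruby
# Illustrative stand-ins for the gem's lookup tables (assumed values).
SKETCH_PREFIX_MODELS = ["gpt-4", "gpt-3.5-turbo"].freeze
SKETCH_MODEL_TO_ENCODING_NAME = {
  "gpt-4": :cl100k_base,
  "gpt-3.5-turbo": :cl100k_base,
  "text-davinci-003": :p50k_base
}.freeze

# Resolves a model name to an encoding name, or nil if it is unknown.
def resolve_encoding_name(model_name)
  name = model_name.to_s
  # A versioned name such as "gpt-4-0613" collapses to its prefix "gpt-4".
  prefix = SKETCH_PREFIX_MODELS.find { |p| name.start_with?("#{p}-") }
  name = prefix if prefix
  SKETCH_MODEL_TO_ENCODING_NAME[name.to_sym]
end

resolve_encoding_name("gpt-4-0613")    #=> :cl100k_base
resolve_encoding_name("no-such-model") #=> nil
```

Note that the match requires the hyphen after the prefix, so a name that merely begins with the same characters is looked up as-is rather than collapsed.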

.get_encoding(name) ⇒ Tiktoken::Encoding

Returns an encoding by name. If the encoding has not been loaded yet, it is loaded; otherwise the previously loaded instance is reused.

Examples:

Encode and decode text

enc = Tiktoken.get_encoding("cl100k_base")
enc.decode(enc.encode("hello world")) #=> "hello world"

Parameters:

  • name (Symbol|String)

    The name of the encoding to load

Returns:

  • (Tiktoken::Encoding, nil)

    The encoding, or nil if the name is not a supported encoding

# File 'lib/tiktoken_ruby.rb', line 22

def get_encoding(name)
  name = name.to_sym
  return nil unless SUPPORTED_ENCODINGS.include?(name)

  Tiktoken::Encoding.for_name_cached(name)
end
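
The reuse behavior described above can be illustrated with a small memoization sketch. This is an assumption about how a per-name cache might work, not the implementation of `Tiktoken::Encoding.for_name_cached`, which is not shown on this page; `SketchEncoding` and `sketch_get_encoding` are hypothetical names and the supported list is illustrative.

```ruby
# Hypothetical stand-in for an encoding class with a per-name cache.
class SketchEncoding
  SUPPORTED = [:r50k_base, :cl100k_base].freeze

  # Class-level instance variable holding one instance per encoding name.
  @cache = {}

  class << self
    # Builds the instance on first request, then returns the same object.
    def for_name_cached(name)
      @cache[name] ||= new(name)
    end
  end

  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# Mirrors the shape of get_encoding: normalize, validate, then fetch.
def sketch_get_encoding(name)
  name = name.to_sym
  return nil unless SketchEncoding::SUPPORTED.include?(name)

  SketchEncoding.for_name_cached(name)
end

a = sketch_get_encoding("cl100k_base")
b = sketch_get_encoding(:cl100k_base)
a.equal?(b) #=> true
```

Because the name is converted to a Symbol before the lookup, String and Symbol arguments share one cached instance in this sketch.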

.list_encoding_names ⇒ Array<Symbol>

Lists all the encodings that are supported

Returns:

  • (Array<Symbol>)

    The list of supported encodings



# File 'lib/tiktoken_ruby.rb', line 51

def list_encoding_names
  SUPPORTED_ENCODINGS
end
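
Since `get_encoding` returns nil for an unsupported name, a caller may prefer to check the name against this list and fail early. A minimal sketch, assuming a hypothetical helper `fetch_encoding_name!`; `known_names` stands in for the result of `Tiktoken.list_encoding_names`, and the values passed below are illustrative.

```ruby
# Returns the name as a Symbol if it is in known_names, otherwise raises.
def fetch_encoding_name!(name, known_names)
  sym = name.to_sym
  return sym if known_names.include?(sym)

  raise ArgumentError,
        "unknown encoding #{name.inspect}; expected one of: #{known_names.join(', ')}"
end

known = [:r50k_base, :cl100k_base] # illustrative list
fetch_encoding_name!("cl100k_base", known) #=> :cl100k_base
```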

.list_model_names ⇒ Array<Symbol>

Lists all the models that are supported

Returns:

  • (Array<Symbol>)

    The list of supported models



# File 'lib/tiktoken_ruby.rb', line 57

def list_model_names
  MODEL_TO_ENCODING_NAME.keys
end