Class: Tiktoken::Encoding
- Inherits: Object
- Defined in: lib/tiktoken_ruby/encoding.rb
Instance Attribute Summary
- #name ⇒ Object (readonly)
  Returns the value of attribute name.
Class Method Summary
- .for_name(encoding) ⇒ Tiktoken::Encoding
  Returns a new Tiktoken::Encoding instance for the requested encoding.
- .for_name_cached(encoding) ⇒ Tiktoken::Encoding
  Returns a Tiktoken::Encoding instance for the requested encoding. It reuses an existing instance if that encoding has already been loaded.
Instance Method Summary
- #decode(tokens) ⇒ String
  Decodes the tokens back into text.
- #encode(text, allowed_special: []) ⇒ Array<Integer>
  Encodes the text as a list of integer tokens.
- #encode_ordinary(text) ⇒ Array<Integer>
  Encodes the text as a list of integer tokens.
Instance Attribute Details
#name ⇒ Object (readonly)
Returns the value of attribute name.
# File 'lib/tiktoken_ruby/encoding.rb', line 4
def name
  @name
end
Class Method Details
.for_name(encoding) ⇒ Tiktoken::Encoding
Returns a new Tiktoken::Encoding instance for the requested encoding. The name may be given as a String or a Symbol; it is converted with to_sym before the lookup.
# File 'lib/tiktoken_ruby/encoding.rb', line 9
def self.for_name(encoding)
  Tiktoken::Encoding.new(Tiktoken::BpeFactory.send(encoding.to_sym), encoding.to_sym)
end
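The lookup above dispatches with `send` on a symbolized name, so a String and a Symbol select the same factory method. The following is a minimal sketch of that dispatch pattern only. `DemoFactory`, its `cl100k_base` method, and `demo_backend_for` are hypothetical stand-ins; the methods actually exposed by `Tiktoken::BpeFactory` are not listed on this page.

```ruby
# Hypothetical stand-in for Tiktoken::BpeFactory; the real module's
# methods are not shown on this page.
module DemoFactory
  def self.cl100k_base
    :cl100k_base_backend
  end
end

# Mirrors the shape of .for_name: symbolize the name, then dispatch with send.
def demo_backend_for(encoding)
  DemoFactory.send(encoding.to_sym)
end

from_string = demo_backend_for("cl100k_base")
from_symbol = demo_backend_for(:cl100k_base)
puts from_string == from_symbol

# In this stand-in, an unknown name has no matching method, so send
# raises NoMethodError.
begin
  demo_backend_for(:no_such_encoding)
rescue NoMethodError => e
  puts "unknown encoding: #{e.class}"
end
```

With this pattern, a misspelled encoding name surfaces as a `NoMethodError` from the factory rather than as a dedicated "unknown encoding" error, at least in the stand-in shown here.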
.for_name_cached(encoding) ⇒ Tiktoken::Encoding
Returns a Tiktoken::Encoding instance for the requested encoding. It reuses an existing instance if that encoding has already been loaded; otherwise it creates one with .for_name and stores it under the symbolized name.
# File 'lib/tiktoken_ruby/encoding.rb', line 17
def self.for_name_cached(encoding)
  @encodings ||= {}
  @encodings[encoding.to_sym] ||= Tiktoken::Encoding.for_name(encoding)
end
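To illustrate the difference between the two class methods, here is a small sketch of the same memoization pattern. `DemoEncoding` is a hypothetical stand-in that copies the structure of the source above without loading a real tokenizer.

```ruby
# Hypothetical stand-in that mirrors the two class methods above;
# it does not load any real tokenizer.
class DemoEncoding
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Always builds a fresh instance, like .for_name.
  def self.for_name(encoding)
    new(encoding.to_sym)
  end

  # Memoizes one instance per symbolized name, like .for_name_cached.
  def self.for_name_cached(encoding)
    @encodings ||= {}
    @encodings[encoding.to_sym] ||= for_name(encoding)
  end
end

fresh_a = DemoEncoding.for_name("demo")
fresh_b = DemoEncoding.for_name("demo")
cached_a = DemoEncoding.for_name_cached("demo")
cached_b = DemoEncoding.for_name_cached(:demo)

puts fresh_a.equal?(fresh_b)   # false: two distinct objects
puts cached_a.equal?(cached_b) # true: "demo" and :demo share one cache key
```

Because the cache key is the symbolized name, a String and a Symbol resolve to the same instance. The `||=` memoization is not guarded by a lock, so this sketch makes no claim about behavior under concurrent first calls.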
Instance Method Details
#decode(tokens) ⇒ String
Decodes the tokens back into text.
# File 'lib/tiktoken_ruby/encoding.rb', line 42
def decode(tokens)
  @ext_base_bpe.decode(tokens)
end
#encode(text, allowed_special: []) ⇒ Array<Integer>
Encodes the text as a list of integer tokens. Special (non-text) tokens that appear in the text are treated as ordinary text unless they are listed in the allowed_special array, as if the text had been escaped.
# File 'lib/tiktoken_ruby/encoding.rb', line 35
def encode(text, allowed_special: [])
  @ext_base_bpe.encode(text, allowed_special)
end
#encode_ordinary(text) ⇒ Array<Integer>
Encodes the text as a list of integer tokens without any special-token handling: text that matches a special (non-text) token is encoded as ordinary text.
# File 'lib/tiktoken_ruby/encoding.rb', line 26
def encode_ordinary(text)
  @ext_base_bpe.encode_ordinary(text)
end
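The distinction between #encode and #encode_ordinary is easier to see on a toy model. The sketch below is not the library's BPE: `ToyTokenizer` is a hypothetical byte-level tokenizer with one made-up special token and ID, and it assumes that `encode_ordinary` never recognizes special tokens while `encode` recognizes only those listed in `allowed_special`.

```ruby
# Toy byte-level tokenizer, NOT the real BPE: each byte is one token, and
# a single hypothetical special token gets an ID above the byte range.
class ToyTokenizer
  SPECIAL = { "<|endoftext|>" => 256 }.freeze

  # Special-token strings are never recognized; everything is plain bytes.
  def encode_ordinary(text)
    text.bytes
  end

  # Only special tokens listed in allowed_special map to their reserved ID.
  def encode(text, allowed_special: [])
    active = SPECIAL.keys & allowed_special
    return encode_ordinary(text) if active.empty?

    pattern = Regexp.union(active)
    # Splitting on a capture group keeps the matched special tokens.
    text.split(/(#{pattern.source})/).flat_map do |piece|
      active.include?(piece) ? [SPECIAL.fetch(piece)] : encode_ordinary(piece)
    end
  end

  # Maps reserved IDs back to their strings and bytes back to text.
  def decode(tokens)
    inverse = SPECIAL.invert
    bytes = tokens.flat_map { |t| inverse.key?(t) ? inverse[t].bytes : [t] }
    bytes.pack("C*").force_encoding(Encoding::UTF_8)
  end
end

toy = ToyTokenizer.new
text = "hi<|endoftext|>"

ordinary = toy.encode_ordinary(text)
escaped = toy.encode(text)
special = toy.encode(text, allowed_special: ["<|endoftext|>"])

puts ordinary == escaped          # true: with nothing allowed, both see plain text
puts special.inspect              # [104, 105, 256]
puts toy.decode(special) == text  # true
```

In this toy, the special token collapses to a single reserved ID only when it is explicitly allowed; otherwise its characters are tokenized like any other text, and decoding restores the original string in both cases.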