Class: Boxcars::VectorStore::SplitText

Inherits:
Object
Includes:
Boxcars::VectorStore
Defined in:
lib/boxcars/vector_store/split_text.rb

Overview

Splits a text into chunks of a given size, with a configurable overlap between chunks.

Instance Method Summary

Methods included from Boxcars::VectorStore

included

Constructor Details

#initialize(separator: "Search", chunk_size: 7, chunk_overlap: 3, text: "") ⇒ SplitText

Returns a new instance of SplitText.

Parameters:

  • separator (String) (defaults to: "Search")

    The string to use to split the text.

  • chunk_size (Integer) (defaults to: 7)

    The size of each chunk.

  • chunk_overlap (Integer) (defaults to: 3)

    The amount of overlap between chunks.

  • text (String) (defaults to: "")

    The text to split.
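
To make the interplay of chunk_size and chunk_overlap concrete, here is a minimal stand-alone sketch. It assumes sizes are counted in characters and uses a plain sliding window. The helper name `sliding_chunks` is hypothetical, and this is not the library's `merge_splits` logic, which is not shown on this page.

```ruby
# Hypothetical helper, not part of Boxcars: illustrates size and overlap with
# a character-based sliding window (the unit is an assumption).
def sliding_chunks(text, chunk_size: 7, chunk_overlap: 3)
  step = chunk_size - chunk_overlap
  raise ArgumentError, "chunk_overlap must be smaller than chunk_size" unless step.positive?

  # Each window starts `step` characters after the previous one, so
  # consecutive chunks share up to `chunk_overlap` characters.
  (0...text.length).step(step).map { |start| text[start, chunk_size] }
end

chunks = sliding_chunks("abcdefghijkl")
# chunks is ["abcdefg", "efghijk", "ijkl"]
```

With the defaults above (7 and 3), each chunk starts 4 characters after the previous one, which is why "efg" and "ijk" each appear in two neighbouring chunks.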



# File 'lib/boxcars/vector_store/split_text.rb', line 13

def initialize(separator: "Search", chunk_size: 7, chunk_overlap: 3, text: "")
  validate_params(separator, chunk_size, chunk_overlap, text)

  @separator = separator
  @chunk_size = chunk_size
  @chunk_overlap = chunk_overlap
  @text = text
end

Instance Method Details

#call ⇒ Object



# File 'lib/boxcars/vector_store/split_text.rb', line 22

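def call
  # Break the text on the separator, then merge the pieces back into chunks.
  splits = text.split(separator)
  merged_splits = merge_splits(splits, separator)

  # Safe navigation: returns nil if merge_splits returned nil, otherwise the sorted chunks.
  merged_splits&.sort
end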
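
The split and sort steps of #call can be tried in isolation. The sketch below is a stand-in rather than the class itself: `split_and_sort` is a hypothetical name, the `merge_splits` step is left out because its body is not shown here, and the pieces are assumed to be strings.

```ruby
# Hypothetical stand-in for the two visible steps of #call:
# split on the separator, then sort with safe navigation.
def split_and_sort(text, separator)
  splits = text.split(separator)
  splits&.sort
end

pieces = split_and_sort("pearSearchappleSearchmango", "Search")
# pieces is ["apple", "mango", "pear"]

# `&.` short-circuits on nil instead of raising NoMethodError.
nothing = nil&.sort
# nothing is nil
```

Because of the final sort, the result is ordered lexically rather than by position in the original text.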