Class: Baran::SentenceTextSplitter
- Inherits:
- TextSplitter
  - Object
    - TextSplitter
      - Baran::SentenceTextSplitter
- Defined in:
- lib/baran/sentence_text_splitter.rb
Instance Attribute Summary
Attributes inherited from TextSplitter
Instance Method Summary
- #initialize(chunk_size: 1024, chunk_overlap: 64) ⇒ SentenceTextSplitter (constructor): a new instance of SentenceTextSplitter.
- #splitted(text) ⇒ Object
Methods inherited from TextSplitter
Constructor Details
#initialize(chunk_size: 1024, chunk_overlap: 64) ⇒ SentenceTextSplitter
Returns a new instance of SentenceTextSplitter.
    # File 'lib/baran/sentence_text_splitter.rb', line 5
    def initialize(chunk_size: 1024, chunk_overlap: 64)
      super(chunk_size: chunk_size, chunk_overlap: chunk_overlap)
    end
Instance Method Details
#splitted(text) ⇒ Object
    # File 'lib/baran/sentence_text_splitter.rb', line 9
    def splitted(text)
      # Use a regex to split text based on the specified sentence-ending characters followed by whitespace
      text.scan(/[^.!?]+[.!?]+(?:\s+|\z)/).map(&:strip)
    end
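To make the behavior of the pattern in #splitted concrete, here is a minimal stand-alone sketch. It copies the regular expression from the source above into a plain method so it runs without the baran gem; the names SENTENCE_PATTERN and split_sentences are hypothetical and are not part of the library. It also illustrates two consequences of the pattern that may be worth knowing: trailing text with no terminator is dropped, and a period that is not followed by whitespace (as in a decimal number) causes the text before it to be skipped.

```ruby
# Stand-alone sketch, not the library's code path: the regex is copied from
# #splitted; SENTENCE_PATTERN and split_sentences are illustrative names.
SENTENCE_PATTERN = /[^.!?]+[.!?]+(?:\s+|\z)/

def split_sentences(text)
  # scan returns every non-overlapping match; strip removes the trailing
  # whitespace that the pattern consumes after each terminator.
  text.scan(SENTENCE_PATTERN).map(&:strip)
end

basic = split_sentences("Hello world. How are you? Fine!")
# => ["Hello world.", "How are you?", "Fine!"]

# Text after the last terminator never matches, so it is silently dropped.
unterminated = split_sentences("Hello world. No ending")
# => ["Hello world."]

# "1." is followed by "5", not whitespace, so no match can start before "5".
decimal = split_sentences("Version 1.5 is out.")
# => ["5 is out."]

p basic, unterminated, decimal
```

If input may contain decimals, abbreviations, or a final fragment without punctuation, it is probably worth checking the output of #splitted against the original text before relying on it.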