Class: Langchain::Processors::Pptx
Defined in: lib/langchain/processors/pptx.rb
Constant Summary
- EXTENSIONS = [".pptx"]
- CONTENT_TYPES = ["application/vnd.openxmlformats-officedocument.presentationml.presentation"]
Instance Method Summary
- #initialize ⇒ Pptx (constructor)
  A new instance of Pptx.
- #parse(data) ⇒ String
  Parse the document and return the text.
Methods included from DependencyHelper
Constructor Details
#initialize ⇒ Pptx
Returns a new instance of Pptx.
# File 'lib/langchain/processors/pptx.rb', line 9
def initialize(*)
  depends_on "power_point_pptx"
end
Instance Method Details
#parse(data) ⇒ String
Parse the document and return the text. The text sections of each slide are stripped and joined with single spaces, and slides are separated by a blank line.
# File 'lib/langchain/processors/pptx.rb', line 16
def parse(data)
  presentation = PowerPointPptx::Document.open(data)

  slides = presentation.slides
  contents = slides.map(&:content)
  text = contents.map do |sections|
    sections.map(&:strip).join(" ")
  end

  text.join("\n\n")
end
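To see what the joining step in #parse produces, here is a minimal sketch that runs without the power_point_pptx gem. FakeSlide is a hypothetical stand-in, not part of the library; it assumes each slide exposes a content array of strings, as the map(&:content) call suggests.

```ruby
# Hypothetical stand-in for a slide object. The real objects come from the
# power_point_pptx gem; this Struct only mimics a `content` array of strings.
FakeSlide = Struct.new(:content)

slides = [
  FakeSlide.new(["  Title ", "Subtitle\n"]),
  FakeSlide.new([" Bullet one", "Bullet two  "])
]

# Same shape as the body of #parse: collect each slide's sections,
# strip surrounding whitespace, and join sections with a single space.
contents = slides.map(&:content)
text = contents.map do |sections|
  sections.map(&:strip).join(" ")
end.join("\n\n") # slides are separated by a blank line

puts text
```

Each slide ends up as one line of text, so downstream code can split on "\n\n" to recover per-slide text if needed.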