Class: Langchain::Processors::Docx

Inherits:
Base
  • Object
Defined in:
lib/langchain/processors/docx.rb

Constant Summary

EXTENSIONS =
[".docx"]
CONTENT_TYPES =
["application/vnd.openxmlformats-officedocument.wordprocessingml.document"]
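
The two constants describe which files this processor is meant for. As a rough illustration only, the sketch below copies their values into local constants and uses a hypothetical helper, docx_candidate?, to check a path or MIME type against them. The helper and the case-insensitive extension match are assumptions, not part of Langchain.

```ruby
# Values copied from the constant summary above.
DOCX_EXTENSIONS = [".docx"].freeze
DOCX_CONTENT_TYPES = [
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
].freeze

# Hypothetical helper (not a Langchain method): true when either the file
# extension or the MIME type matches the values above.
def docx_candidate?(path: nil, content_type: nil)
  extension = File.extname(path.to_s).downcase # assumption: ignore case
  DOCX_EXTENSIONS.include?(extension) || DOCX_CONTENT_TYPES.include?(content_type)
end

puts docx_candidate?(path: "report.docx")
puts docx_candidate?(path: "notes.txt", content_type: "text/plain")
```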

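Instance Method Summary

  • #initialize ⇒ Docx (constructor): Returns a new instance of Docx.
  • #parse(data) ⇒ String: Parse the document and return the text.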

Methods included from DependencyHelper

#depends_on

Constructor Details

#initialize ⇒ Docx

Returns a new instance of Docx.



# File 'lib/langchain/processors/docx.rb', line 9

def initialize(*)
  # Declare the docx gem as a dependency (see DependencyHelper#depends_on)
  depends_on "docx"
end

Instance Method Details

#parse(data) ⇒ String

Parses the .docx document and returns its text.

Parameters:

  • data (File): the .docx file to parse; its contents are read in full with data.read

Returns:

  • (String): the text extracted from the document


# File 'lib/langchain/processors/docx.rb', line 16

def parse(data)
  # Read the whole file into an in-memory IO and let the docx gem open it
  ::Docx::Document
    .open(StringIO.new(data.read))
    .text
end
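
A minimal usage sketch for #parse, assuming the langchainrb and docx gems are installed. The helper name extract_docx_text and the file path are hypothetical. The file is opened in binary mode because a .docx file is a zip archive, and parse reads it through data.read.

```ruby
# Hypothetical helper: extract plain text from a .docx file on disk.
# Assumes the langchainrb and docx gems are installed.
def extract_docx_text(path)
  # Required lazily so the helper can be defined before the gems are loaded.
  require "langchain"

  processor = Langchain::Processors::Docx.new
  # parse calls data.read, so pass an open File rather than a path string.
  File.open(path, "rb") { |file| processor.parse(file) }
end

# Example call (the path is a placeholder):
#   text = extract_docx_text("report.docx")
#   puts text
```

Any object that responds to read should work in place of the File, since parse only calls data.read before handing the bytes to the docx gem.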