Class: BxBuilderChain::Processors::Pdf

Inherits:
Base
  • Object
Defined in:
lib/bx_builder_chain/processors/pdf.rb

Constant Summary

EXTENSIONS = [".pdf"]
CONTENT_TYPES = ["application/pdf"]
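
One plausible use of these constants is choosing a processor from a file name. The sketch below is an illustration only: `PdfProcessor` is a hypothetical stand-in that carries the same constant values, and `processor_for` is a hypothetical helper, not a method of bx_builder_chain.

```ruby
# Hypothetical stand-in carrying the same constants as the Pdf processor.
class PdfProcessor
  EXTENSIONS = [".pdf"]
  CONTENT_TYPES = ["application/pdf"]
end

# Hypothetical lookup helper (an assumption, not part of bx_builder_chain):
# returns the first class whose EXTENSIONS include the file's extension.
def processor_for(path, processors)
  ext = File.extname(path).downcase
  processors.find { |klass| klass::EXTENSIONS.include?(ext) }
end

chosen = processor_for("Report.PDF", [PdfProcessor])
missing = processor_for("notes.txt", [PdfProcessor])
```

Lower-casing the extension before the lookup is a design choice of this sketch, made so that `Report.PDF` still matches `".pdf"`.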

Instance Method Summary

Methods included from DependencyHelper

#depends_on

Constructor Details

#initialize ⇒ Pdf

Returns a new instance of Pdf.



# File 'lib/bx_builder_chain/processors/pdf.rb', line 9

def initialize(*)
  depends_on "pdf-reader"
  require "pdf-reader"
end

Instance Method Details

#parse(data) ⇒ String

Parses the document and returns its text as a single string, with the text of each page separated by a blank line.

Parameters:

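  • data (File) — the PDF to read; the method only calls data.read, so any IO-like object that responds to #read should work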

Returns:

  • (String)


# File 'lib/bx_builder_chain/processors/pdf.rb', line 17

def parse(data)
  ::PDF::Reader
    .new(StringIO.new(data.read))
    .pages
    .map(&:text)
    .join("\n\n")
end
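
The call chain in #parse can be tried without the pdf-reader gem installed. The sketch below rests on stated assumptions: `FakePage` and `FakeReader` are hypothetical stand-ins that mimic only the `pages` and `text` calls used above, and `parse_with` is a hypothetical helper that takes the reader class as an argument. It is not the library's code.

```ruby
require "stringio"

# Hypothetical stand-ins for PDF::Reader and its page objects (not part of
# pdf-reader or bx_builder_chain). Each form-feed-separated chunk is a "page".
FakePage = Struct.new(:text)

class FakeReader
  def initialize(io)
    @raw = io.read
  end

  def pages
    @raw.split("\f").map { |chunk| FakePage.new(chunk) }
  end
end

# Same call chain as Pdf#parse, with the reader class injected (an assumption
# made here so the example runs without the gem).
def parse_with(reader_class, data)
  reader_class
    .new(StringIO.new(data.read))
    .pages
    .map(&:text)
    .join("\n\n")
end

data = StringIO.new("page one\fpage two")
result = parse_with(FakeReader, data)
puts result
```

With the real class, the equivalent call would presumably look like `BxBuilderChain::Processors::Pdf.new.parse(File.open("report.pdf", "rb"))`, where `report.pdf` is a placeholder path and binary mode is a precaution rather than a documented requirement.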