Class: BxBuilderChain::Processors::Pdf
- Defined in: lib/bx_builder_chain/processors/pdf.rb
Constant Summary
- EXTENSIONS = [".pdf"]
- CONTENT_TYPES = ["application/pdf"]
Instance Method Summary
- #initialize ⇒ Pdf (constructor)
  A new instance of Pdf.
- #parse(data) ⇒ String
  Parse the document and return the text.
Methods included from DependencyHelper
Constructor Details
#initialize ⇒ Pdf
Returns a new instance of Pdf.
# File 'lib/bx_builder_chain/processors/pdf.rb', line 9
def initialize(*)
  depends_on "pdf-reader"
  require "pdf-reader"
end
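The constructor calls depends_on before require, which suggests the gem is checked for availability before it is loaded. The sketch below shows one way such a guard could work. The method name ensure_gem_loaded! and the error message are hypothetical and are not taken from DependencyHelper, whose actual implementation is not shown on this page.

```ruby
# Hypothetical guard for illustration only; NOT the real DependencyHelper#depends_on.
# Raises LoadError with a readable message when the named gem has not been activated.
def ensure_gem_loaded!(gem_name)
  return true if Gem.loaded_specs.key?(gem_name)

  raise LoadError, "Could not load #{gem_name}. Please add it to your Gemfile."
end

# Probe with a gem name that is assumed not to exist.
begin
  ensure_gem_loaded!("surely-not-a-real-gem-name")
  guard_result = :loaded
rescue LoadError => e
  guard_result = e.message
end
```

Failing early with a named gem tends to give a clearer error than a bare require failure deep inside a parse call.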
Instance Method Details
#parse(data) ⇒ String
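Parse the document and return the text. The data argument must respond to #read (for example, an open File); the text of each page is joined with a blank line.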
# File 'lib/bx_builder_chain/processors/pdf.rb', line 17
def parse(data)
  ::PDF::Reader
    .new(StringIO.new(data.read))
    .pages
    .map(&:text)
    .join("\n\n")
end
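The map/join step at the end of #parse can be tried without a real PDF. The sketch below uses a hypothetical FakePage struct in place of PDF::Reader page objects, and join_page_text is an illustrative helper name, not part of the library. It only assumes that each page responds to #text, as the source above does.

```ruby
# Hypothetical stand-in for a PDF::Reader page; only #text is needed here.
FakePage = Struct.new(:text)

# Mirrors the final step of #parse: collect each page's text and
# separate pages with a blank line.
def join_page_text(pages)
  pages.map(&:text).join("\n\n")
end

pages = [FakePage.new("First page"), FakePage.new("Second page")]
joined = join_page_text(pages)
```

With the pdf-reader gem installed, the real processor would presumably be called with any object that responds to #read, for instance BxBuilderChain::Processors::Pdf.new.parse(File.open("doc.pdf", "rb")), where doc.pdf is a placeholder path.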