Module: Docsplit::TransparentPDFs
- Included in: Docsplit
- Defined in: lib/docsplit/transparent_pdfs.rb
Overview
Provides a method that transparently converts non-PDF arguments to temporary PDFs. This lets Docsplit behave as if it natively supported formats such as doc, rtf, and ppt.
Instance Method Summary
- #ensure_pdfs(docs) ⇒ Object
Temporarily convert any non-PDF documents to PDFs before running them through further extraction.
- #is_pdf?(doc) ⇒ Boolean
Instance Method Details
#ensure_pdfs(docs) ⇒ Object
Temporarily convert any non-PDF documents to PDFs before running them through further extraction.
    # File 'lib/docsplit/transparent_pdfs.rb', line 9
    def ensure_pdfs(docs)
      [docs].flatten.map do |doc|
        if is_pdf?(doc)
          doc
        else
          tempdir = File.join(Dir.tmpdir, 'docsplit')
          extract_pdf([doc], {:output => tempdir})
          File.join(tempdir, File.basename(doc, File.extname(doc)) + '.pdf')
        end
      end
    end
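
To make the control flow of ensure_pdfs easier to follow, here is a minimal, self-contained sketch that runs without Docsplit or an office suite installed. Everything in it is an assumption made for illustration: `FakeSplitter` is a hypothetical stand-in for the host module, its `extract_pdf` is a stub that only records what it was asked to convert, and the PDF check is reduced to the file extension.

```ruby
require 'tmpdir'

# Hypothetical stand-in for the host module; this is NOT the real Docsplit.
module FakeSplitter
  CONVERTED = [] # records every document handed to the stub converter

  # Stub: the real extract_pdf would perform an actual conversion.
  def self.extract_pdf(docs, opts)
    CONVERTED.concat(docs)
  end

  # Same shape as ensure_pdfs above, with an extension-only PDF check.
  def self.ensure_pdfs(docs)
    # [docs].flatten accepts a single path or an array of paths.
    [docs].flatten.map do |doc|
      next doc if File.extname(doc).downcase == '.pdf' # PDFs pass through untouched

      tempdir = File.join(Dir.tmpdir, 'docsplit')
      extract_pdf([doc], { output: tempdir })
      # The converted file is expected at <tmpdir>/docsplit/<basename>.pdf
      File.join(tempdir, File.basename(doc, File.extname(doc)) + '.pdf')
    end
  end
end

paths = FakeSplitter.ensure_pdfs(['a.pdf', '/in/report.doc'])
puts paths
```

One consequence of deriving the temporary path from the basename alone: two inputs that share a basename, such as /a/report.doc and /b/report.rtf, map to the same temporary report.pdf, so the second conversion would presumably replace the first.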
#is_pdf?(doc) ⇒ Boolean
    # File 'lib/docsplit/transparent_pdfs.rb', line 21
    def is_pdf?(doc)
      File.extname(doc).downcase == '.pdf' || File.open(doc, 'rb', &:readline) =~ /\A\%PDF-\d+(\.\d+)?/
    end
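
The check above has two stages: a case-insensitive extension test, then a fallback that reads the first line of the file and looks for a `%PDF-<version>` header. The sketch below reproduces that logic as a standalone method so it can be tried in isolation. `looks_like_pdf?` is a hypothetical name, not part of Docsplit, and unlike the original it coerces the result to a strict true or false (the original returns the match index or nil from the `=~` branch, which Ruby treats as truthy or falsy respectively).

```ruby
require 'tempfile'

# Hypothetical standalone helper mirroring is_pdf?; not part of Docsplit's API.
def looks_like_pdf?(path)
  # Stage 1: a .pdf extension (any case) is accepted without opening the file.
  return true if File.extname(path).downcase == '.pdf'

  # Stage 2: read the first line in binary mode and look for the PDF header.
  !!(File.open(path, 'rb', &:readline) =~ /\A%PDF-\d+(\.\d+)?/)
end

# Demo: PDF content stored under a misleading extension.
Tempfile.create(['sample', '.bin']) do |f|
  f.write("%PDF-1.4\nrest of file")
  f.flush
  puts looks_like_pdf?(f.path)
end
```

Two edge cases follow from this shape. A path ending in .pdf is accepted even if the file does not exist, because the extension test short-circuits. For any other path the file must exist and be non-empty, since `IO#readline` raises `EOFError` at end of file.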