Class: FormatParser::PDFParser
- Inherits:
- Object
  - Object
  - FormatParser::PDFParser
- Includes:
- IOUtils
- Defined in:
- lib/parsers/pdf_parser.rb
Constant Summary
- PDF_MARKER =
  The first 9 bytes of a PDF should match this pattern, according to:
  https://stackoverflow.com/questions/3108201/detect-if-pdf-file-is-correct-header-pdf
  There are exceptions, however, which are left out for now.
  /%PDF-[12]\.[0-8]{1}/
- PDF_CONTENT_TYPE =
  'application/pdf'
Constants included from IOUtils
Instance Method Summary
- #call(io) ⇒ Object
- #likely_match?(filename) ⇒ Boolean
Methods included from IOUtils
#read_bytes, #read_fixed_point, #read_int, #safe_read, #safe_skip, #skip_bytes
Instance Method Details
#call(io) ⇒ Object
    # File 'lib/parsers/pdf_parser.rb', line 16
    def call(io)
      io = FormatParser::IOConstraint.new(io)
      header = safe_read(io, 9)
      return unless header =~ PDF_MARKER
      FormatParser::Document.new(format: :pdf, content_type: PDF_CONTENT_TYPE)
    rescue FormatParser::IOUtils::InvalidRead
      nil
    end
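The flow of #call can be tried without the gem installed using the minimal sketch below. It is not the library's code: `sniff_pdf`, the local `InvalidRead` class, and the local `safe_read` are hypothetical stand-ins, and the sketch assumes that the real `safe_read` raises `InvalidRead` when fewer bytes are available than requested. A plain Hash takes the place of `FormatParser::Document`.

```ruby
require 'stringio'

# Copied from the Constant Summary above.
PDF_MARKER = /%PDF-[12]\.[0-8]{1}/

# Hypothetical stand-in for FormatParser::IOUtils::InvalidRead.
class InvalidRead < StandardError; end

# Hypothetical stand-in for IOUtils#safe_read.
# Assumption: the real helper raises when it cannot read exactly n bytes.
def safe_read(io, n)
  buf = io.read(n)
  raise InvalidRead, "wanted #{n} bytes" if buf.nil? || buf.bytesize != n
  buf
end

# Mirrors the shape of #call: read 9 bytes, test them against the marker,
# and treat a short read as "not a PDF" instead of letting the error escape.
def sniff_pdf(io)
  header = safe_read(io, 9)
  return unless header =~ PDF_MARKER
  { format: :pdf, content_type: 'application/pdf' }
rescue InvalidRead
  nil
end

p sniff_pdf(StringIO.new("%PDF-1.7\n1 0 obj"))  # a Hash with format: :pdf
p sniff_pdf(StringIO.new("%PDF-1.9\n1 0 obj"))  # nil, minor version 9 is outside [0-8]
p sniff_pdf(StringIO.new("%PDF"))               # nil, fewer than 9 bytes available
```

Because the pattern is unanchored, it matches anywhere within the 9 bytes read, so a header with one leading byte before `%PDF-1.4` would also be accepted by this sketch.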
#likely_match?(filename) ⇒ Boolean
    # File 'lib/parsers/pdf_parser.rb', line 12
    def likely_match?(filename)
      filename =~ /\.(pdf|ai)$/i
    end
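As a quick illustration of the extension check, the sketch below copies the method body into a standalone method so that it runs without the gem. Note that `=~` returns the match index or `nil`, so the result is truthy or falsy and not strictly `true` or `false`.

```ruby
# Standalone copy of the method body shown above, for illustration only.
def likely_match?(filename)
  filename =~ /\.(pdf|ai)$/i
end

p likely_match?("report.PDF")     # 6 (truthy: the match is case-insensitive)
p likely_match?("logo.ai")        # 4 (truthy: the .ai extension is accepted too)
p likely_match?("notes.pdf.bak")  # nil (the extension must end the name)
```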