Class: Mindee::Extraction::PdfExtractor::ExtractedPdf

Inherits:
Object
Defined in:
lib/mindee/extraction/pdf_extractor/extracted_pdf.rb

Overview

An extracted sub-PDF: the byte contents of the extracted document together with its filename.

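Instance Attribute Summary

  • #filename ⇒ String (readonly)
    Name of the file.
  • #pdf_bytes ⇒ StreamIO (readonly)
    Byte contents of the PDF.

Instance Method Summary

  • #as_input_source ⇒ Mindee::Input::Source::BytesInputSource
    Returns the current PDF object as a usable BytesInputSource.
  • #initialize(pdf_bytes, filename) ⇒ ExtractedPdf (constructor)
    A new instance of ExtractedPdf.
  • #page_count ⇒ Integer
    Retrieves the page count of the PDF.
  • #write_to_file(output_path) ⇒ Object
    Writes the contents of the current PDF object to a file.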

Constructor Details

#initialize(pdf_bytes, filename) ⇒ ExtractedPdf

Returns a new instance of ExtractedPdf.

Parameters:

  • pdf_bytes (StreamIO)
  • filename (String)


# File 'lib/mindee/extraction/pdf_extractor/extracted_pdf.rb', line 19

def initialize(pdf_bytes, filename)
  @pdf_bytes = pdf_bytes
  @filename = filename
end

Instance Attribute Details

#filename ⇒ String (readonly)

Name of the file.

Returns:

  • (String)


# File 'lib/mindee/extraction/pdf_extractor/extracted_pdf.rb', line 15

def filename
  @filename
end

#pdf_bytes ⇒ StreamIO (readonly)

Byte contents of the PDF.

Returns:

  • (StreamIO)


# File 'lib/mindee/extraction/pdf_extractor/extracted_pdf.rb', line 11

def pdf_bytes
  @pdf_bytes
end

Instance Method Details

#as_input_source ⇒ Mindee::Input::Source::BytesInputSource

Returns the current PDF object as a usable BytesInputSource.



# File 'lib/mindee/extraction/pdf_extractor/extracted_pdf.rb', line 49

def as_input_source
  Mindee::Input::Source::BytesInputSource.new(@pdf_bytes.read, @filename)
end
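
Because #as_input_source calls read on the stored stream, the stream position matters. The sketch below uses only the standard library and assumes pdf_bytes behaves like a StringIO (the documented type is StreamIO, so this is an assumption); the placeholder bytes are not a valid PDF. It shows that a second read returns an empty string until the stream is rewound.

```ruby
require 'stringio'

# Assumption: the stream held by ExtractedPdf behaves like StringIO.
pdf_bytes = StringIO.new('%PDF-1.4 placeholder'.b)

first_read = pdf_bytes.read   # consumes the whole stream
second_read = pdf_bytes.read  # position is at EOF, so nothing is left

pdf_bytes.rewind              # move back to the start of the stream
third_read = pdf_bytes.read   # full contents are available again

puts first_read.bytesize
puts second_read.bytesize
puts third_read == first_read
```

If the same stream is read more than once, for example by calling #as_input_source after another consumer has read it, rewinding first may be needed to avoid an empty input source.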

#page_countInteger

Retrieves the page count of the PDF. Raises a RuntimeError if the PDF cannot be opened because of a TypeError.

Returns:

  • (Integer)


# File 'lib/mindee/extraction/pdf_extractor/extracted_pdf.rb', line 26

def page_count
  current_pdf = Mindee::PDF::PdfProcessor.open_pdf(pdf_bytes)
  current_pdf.pages.size
rescue TypeError
  raise 'Could not retrieve page count from Extracted PDF object.'
end
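
Since the method re-raises with a plain String message, callers see a RuntimeError. A minimal sketch of handling that on the caller side follows; GoodPdf, BrokenPdf, and safe_page_count are hypothetical stand-ins written for illustration and are not part of the Mindee library.

```ruby
# Stand-in with a working page_count (illustration only).
GoodPdf = Struct.new(:pages) do
  def page_count
    pages
  end
end

# Stand-in that mimics the documented failure mode:
# `raise` with a String message raises RuntimeError.
class BrokenPdf
  def page_count
    raise 'Could not retrieve page count from Extracted PDF object.'
  end
end

# Hypothetical helper: returns the page count, or nil when it cannot be read.
def safe_page_count(extracted_pdf)
  extracted_pdf.page_count
rescue RuntimeError => e
  warn "Skipping document: #{e.message}"
  nil
end

puts safe_page_count(GoodPdf.new(3)).inspect
puts safe_page_count(BrokenPdf.new).inspect
```

Note that the source above rescues only TypeError, so other exceptions raised while opening the PDF would propagate unchanged and may need their own handling.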

#write_to_file(output_path) ⇒ Object

Writes the contents of the current PDF object to a file.

Parameters:

  • output_path (String)

    Path to write to.



# File 'lib/mindee/extraction/pdf_extractor/extracted_pdf.rb', line 35

def write_to_file(output_path)
  raise 'Provided path is not a file' if File.directory?(output_path)
  raise 'Invalid save path provided' unless File.exist?(File.expand_path('..', output_path))

  # Append the .pdf extension only when it is missing.
  unless File.extname(output_path).downcase == '.pdf'
    base_path = File.expand_path('..', output_path)
    output_path = File.expand_path("#{File.basename(output_path)}.pdf", base_path)
  end

  # Write the stream's bytes rather than the IO object's #to_s representation.
  File.binwrite(output_path, @pdf_bytes.read)
end
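
The steps in this method (reject directories, check that the parent directory exists, ensure a .pdf extension, write the bytes) can be sketched with the standard library alone. The helpers normalize_pdf_path and write_pdf_stream are hypothetical names introduced for illustration, the stream is assumed to behave like a StringIO, and the placeholder bytes are not a valid PDF; this is not the library's implementation.

```ruby
require 'stringio'
require 'tmpdir'

# Hypothetical helper: appends ".pdf" unless the path already has that
# extension (compared case-insensitively).
def normalize_pdf_path(output_path)
  return output_path if File.extname(output_path).downcase == '.pdf'

  "#{output_path}.pdf"
end

# Hypothetical helper: validates the destination, then writes the stream's bytes.
def write_pdf_stream(pdf_bytes, output_path)
  raise ArgumentError, 'Provided path is not a file' if File.directory?(output_path)
  raise ArgumentError, 'Invalid save path provided' unless File.directory?(File.dirname(output_path))

  final_path = normalize_pdf_path(output_path)
  pdf_bytes.rewind # start from the beginning, whatever was read before
  File.binwrite(final_path, pdf_bytes.read)
  final_path
end

Dir.mktmpdir do |dir|
  stream = StringIO.new('%PDF-1.4 placeholder bytes'.b)
  written = write_pdf_stream(stream, File.join(dir, 'invoice'))
  puts File.basename(written)
end
```

Returning the final path is a design choice of this sketch: because the extension may be appended, the caller otherwise has no direct way to learn where the file ended up.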