Class: Mindee::Extraction::ExtractedImage

Inherits:
Object
Defined in:
lib/mindee/extraction/common/extracted_image.rb

Overview

Generic class for image extraction.

Constructor Details

#initialize(input_source, page_id, element_id) ⇒ ExtractedImage

Initializes the ExtractedImage with a buffer and an internal file name.

Parameters:

  • input_source (LocalInputSource)

    Local source for input.

  • page_id (Integer)

    ID of the page the element was found on.

  • element_id (Integer, nil)

    ID of the element on the page. Stored as 0 when nil.



# File 'lib/mindee/extraction/common/extracted_image.rb', line 27

def initialize(input_source, page_id, element_id)
  @buffer = StringIO.new(input_source.io_stream.read)
  @buffer.rewind
  extension = if input_source.pdf?
                'jpg'
              else
                File.extname(input_source.filename)
              end
  @internal_file_name = "#{input_source.filename}_p#{page_id}_#{element_id}.#{extension}"
  @page_id = page_id
  @element_id = element_id.nil? ? 0 : element_id
end
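
To make the naming rule in the constructor concrete, here is a minimal sketch that reproduces only the file-name logic from the listing above. `FakeSource` and `build_internal_file_name` are hypothetical stand-ins introduced for illustration; they are not part of the Mindee library. Read literally, the listing interpolates the result of `File.extname`, which keeps its leading dot, so non-PDF inputs appear to get two consecutive dots, and a `nil` element ID interpolates as an empty string even though `@element_id` is stored as 0.

```ruby
# Hypothetical stand-in for LocalInputSource: only the two members the
# constructor listing relies on.
FakeSource = Struct.new(:filename, :is_pdf) do
  def pdf?
    is_pdf
  end
end

# Mirrors the name-building lines of the listing; not the library's API.
def build_internal_file_name(input_source, page_id, element_id)
  extension = input_source.pdf? ? 'jpg' : File.extname(input_source.filename)
  "#{input_source.filename}_p#{page_id}_#{element_id}.#{extension}"
end

pdf_name = build_internal_file_name(FakeSource.new('invoice.pdf', true), 2, 1)
png_name = build_internal_file_name(FakeSource.new('receipt.png', false), 0, nil)

puts pdf_name # invoice.pdf_p2_1.jpg
puts png_name # receipt.png_p0_..png
```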

Instance Attribute Details

#buffer ⇒ Object (readonly)

Buffer object of the file's content.



# File 'lib/mindee/extraction/common/extracted_image.rb', line 17

def buffer
  @buffer
end

#element_id ⇒ Object (readonly)

ID of the element on a given page.



# File 'lib/mindee/extraction/common/extracted_image.rb', line 14

def element_id
  @element_id
end

#internal_file_name ⇒ Object (readonly)

Internal name for the file.



# File 'lib/mindee/extraction/common/extracted_image.rb', line 20

def internal_file_name
  @internal_file_name
end

#page_id ⇒ Object (readonly)

ID of the page the image was extracted from.



# File 'lib/mindee/extraction/common/extracted_image.rb', line 11

def page_id
  @page_id
end

Instance Method Details

#as_source ⇒ FileInputSource

Returns the file as a Mindee-compatible input source (a BytesInputSource in the source listing), built from the buffer's content and the internal file name.

Returns:

  • (FileInputSource)

    An input source wrapping the buffer's content.



# File 'lib/mindee/extraction/common/extracted_image.rb', line 66

def as_source
  @buffer.rewind
  Mindee::Input::Source::BytesInputSource.new(@buffer.read, @internal_file_name)
end
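
The `@buffer.rewind` call matters because a `StringIO` keeps a read position: once it has been read to the end, a further read returns an empty string. A small sketch using only Ruby's standard library (no Mindee classes involved) shows the effect:

```ruby
require 'stringio'

buffer = StringIO.new('image-bytes')

first_read = buffer.read   # consumes the whole buffer
second_read = buffer.read  # position is at the end, so nothing is left
buffer.rewind              # move the position back to the start
third_read = buffer.read   # full content is available again

puts first_read.inspect
puts second_read.inspect
puts third_read.inspect
```

Rewinding before each read is what lets the same buffer be handed out more than once, for example to `as_source` and then to `save_to_file`.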

#save_to_file(output_path, file_format = nil) ⇒ Object

Saves the extracted image to a file.

Parameters:

  • output_path (String)

    Path to save the file to.

  • file_format (String, nil) (defaults to: nil)

    Optional MiniMagick-compatible format for the file. Inferred from the file extension if not provided.

Raises:

  • (MindeeError)

    If an invalid path or filename is provided.



# File 'lib/mindee/extraction/common/extracted_image.rb', line 46

def save_to_file(output_path, file_format = nil)
  resolved_path = Pathname.new(output_path).realpath
  if file_format.nil?
    raise ArgumentError, 'Invalid file format.' if resolved_path.extname.delete('.').empty?

    file_format = resolved_path.extname.delete('.').upcase
  end
  @buffer.rewind
  image = MiniMagick::Image.read(@buffer)
  image.format file_format.downcase
  image.write resolved_path.to_s
rescue TypeError
  raise 'Invalid path/filename provided.'
rescue StandardError
  raise "Could not save file #{Pathname.new(output_path).basename}."
end
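
The format-inference step in the listing can be isolated as follows. This is a sketch under stated assumptions: `infer_format` is a hypothetical helper, not a library method, and it deliberately leaves out `realpath` and MiniMagick so that it runs without the target file existing.

```ruby
require 'pathname'

# Hypothetical helper: returns the explicit format if given, otherwise the
# extension of output_path without its leading dot.
def infer_format(output_path, file_format = nil)
  return file_format.downcase unless file_format.nil?

  extension = Pathname.new(output_path).extname.delete('.')
  raise ArgumentError, 'Invalid file format.' if extension.empty?

  extension.downcase
end

puts infer_format('out/page_1.PNG')     # png
puts infer_format('out/page_1', 'JPEG') # jpeg
```

In the listing itself, the `ArgumentError` for a missing extension is raised inside the method body, so the trailing `rescue StandardError` catches it and the caller sees the generic "Could not save file" error instead.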