Class: HexaPDF::Document::Files

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/hexapdf/document/files.rb

Overview

This class provides methods for managing file specifications of a PDF file.

Note that not every file specification in a given PDF file may be found, e.g. when a file specification is only a string. This class can therefore only handle file specifications that are indirect file specification dictionaries with the /Type key set.

Constructor Details

#initialize(document) ⇒ Files

Creates a new Files object for the given PDF document.



# File 'lib/hexapdf/document/files.rb', line 50

def initialize(document)
  @document = document
end

Instance Method Details

#add(file_or_io, name: nil, description: nil, mime_type: nil, embed: true) ⇒ Object

:call-seq:

files.add(filename, name: nil, description: nil, mime_type: nil, embed: true) -> file_spec
files.add(io, name:, description: nil, mime_type: nil)                        -> file_spec

Adds the file or IO to the PDF document and returns the corresponding file specification object.

Options:

name

The name that should be used for the file path. This name is also used for registering the file in the EmbeddedFiles name tree.

When a filename is given, the basename of the file is used by default for name if it is not specified.

description

A description of the file.

mime_type

The MIME type that should be set for the embedded file. It is only used when the file is actually embedded, i.e. when embed is true or an IO object is given.

embed

When an IO object is given, it is always embedded and this option is ignored.

When a filename is given and this option is true, then the file is embedded. Otherwise only a reference to it is stored.

See: HexaPDF::Type::FileSpecification



# File 'lib/hexapdf/document/files.rb', line 83

def add(file_or_io, name: nil, description: nil, mime_type: nil, embed: true)
  name ||= File.basename(file_or_io) if file_or_io.kind_of?(String)
  if name.nil?
    raise ArgumentError, "The name argument is mandatory when given an IO object"
  end

  spec = @document.add({Type: :Filespec})
  spec.path = name
  spec[:Desc] = description if description
  if embed || !file_or_io.kind_of?(String)
    spec.embed(file_or_io, name: name, mime_type: mime_type, register: true)
  end
  spec
end
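
The name defaulting and the embed decision in the method body can be tried out in isolation. The following is a minimal sketch that uses only the Ruby standard library and does not call HexaPDF; resolve_name and embed? are hypothetical helpers introduced here for illustration and are not part of the HexaPDF API.

```ruby
require 'stringio'

# Hypothetical helper mirroring the first lines of Files#add: a String is
# treated as a filename and defaults the name to its basename; anything else
# (an IO object) needs an explicit name.
def resolve_name(file_or_io, name: nil)
  name ||= File.basename(file_or_io) if file_or_io.kind_of?(String)
  if name.nil?
    raise ArgumentError, "The name argument is mandatory when given an IO object"
  end
  name
end

# Hypothetical helper mirroring the embed condition of Files#add: IO objects
# are always embedded, filenames only when embed is true.
def embed?(file_or_io, embed: true)
  embed || !file_or_io.kind_of?(String)
end

resolve_name("/tmp/reports/q1.pdf")                  # basename is used as name
resolve_name("/tmp/reports/q1.pdf", name: "Q1.pdf")  # explicit name wins
embed?("/tmp/reports/q1.pdf", embed: false)          # only a reference is stored
embed?(StringIO.new("data"), embed: false)           # IO: the embed option is ignored
```

Calling resolve_name with an IO object and no name raises ArgumentError, which matches the mandatory name: keyword in the second call sequence.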

#each(search: false) ⇒ Object

:call-seq:

files.each(search: false) {|file_spec| block }   -> files
files.each(search: false)                        -> Enumerator

Iterates over indirect file specification dictionaries of the PDF.

By default, only the file specifications in their standard locations, i.e. in the EmbeddedFiles name tree and in the page annotations, are returned. If the search option is true, all indirect objects are searched for file specification dictionaries, which can be much slower.



# File 'lib/hexapdf/document/files.rb', line 108

def each(search: false)
  return to_enum(__method__, search: search) unless block_given?

  if search
    @document.each do |obj|
      yield(obj) if obj.type == :Filespec
    end
  else
    seen = {}
    tree = @document.catalog[:Names] && @document.catalog[:Names][:EmbeddedFiles]
    tree&.each_entry do |_, spec|
      seen[spec] = true
      yield(spec)
    end

    @document.pages.each do |page|
      page.each_annotation do |annot|
        next unless annot[:Subtype] == :FileAttachment
        spec = annot[:FS]
        yield(spec) unless seen.key?(spec)
        seen[spec] = true
      end
    end
  end

  self
end
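
The default branch of the method combines two sources and uses a seen hash so that a specification referenced by an annotation is not yielded a second time. The following is a minimal sketch of that pattern using only the Ruby standard library; SpecList is a hypothetical stand-in that takes plain arrays instead of a name tree and page annotations, and it is not part of HexaPDF.

```ruby
# Hypothetical stand-in for the default (search: false) branch of Files#each:
# entries from the name tree come first, then annotation targets that have
# not been seen yet.
class SpecList
  include Enumerable

  def initialize(name_tree_specs, annotation_specs)
    @name_tree_specs = name_tree_specs
    @annotation_specs = annotation_specs
  end

  def each
    # Without a block, return an Enumerator, as Files#each does.
    return to_enum(__method__) unless block_given?

    seen = {}
    @name_tree_specs.each do |spec|
      seen[spec] = true
      yield(spec)
    end
    @annotation_specs.each do |spec|
      yield(spec) unless seen.key?(spec)
      seen[spec] = true
    end
    self
  end
end

specs = SpecList.new([:a, :b], [:b, :c, :c])
specs.each.to_a    # Enumerator form; repeated annotation targets are skipped
specs.map(&:to_s)  # Enumerable methods build on #each
```

Because the class includes Enumerable and each returns an Enumerator when called without a block, methods such as map, select and to_a work in the same way on the sketch as they would on a Files object.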