Class: HexaPDF::Type::ObjectStream
- Inherits: Stream
  - Object
  - Object
  - Dictionary
  - Stream
  - HexaPDF::Type::ObjectStream
- Defined in: lib/hexapdf/type/object_stream.rb
Overview
Represents PDF type ObjStm, object streams.
An object stream is a stream that can hold multiple indirect objects. Since the objects are stored inside the stream, filters can be used to compress the stream content and therefore represent the indirect objects more compactly than would be possible otherwise.
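To illustrate the size benefit described above, here is a minimal sketch that uses only Ruby's standard Zlib library, not HexaPDF itself. The helper names `sample_objects` and `storage_sizes` and the sample dictionaries are made up for this illustration; the sketch compares the size of several small objects stored back to back with the size of the same bytes after a single Flate pass.

```ruby
require 'zlib'

# Builds a few small, similar dictionaries in PDF syntax (illustrative data only).
def sample_objects(count)
  (1..count).map do |i|
    "<< /Type /Annot /Subtype /Link /Rect [0 #{i * 20} 100 #{i * 20 + 20}] >>"
  end
end

# Returns [raw_size, compressed_size] for the objects stored back to back.
def storage_sizes(objects)
  payload = objects.join(' ')
  [payload.bytesize, Zlib::Deflate.deflate(payload).bytesize]
end

raw, compressed = storage_sizes(sample_objects(20))
puts "uncompressed: #{raw} bytes, deflated together: #{compressed} bytes"
```

Because indirect objects outside of a stream cannot have a filter applied, the uncompressed figure roughly corresponds to writing them individually, while the deflated figure approximates what an object stream can achieve for repetitive content.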
How are Object Streams Used?
When an indirect object that resides in an object stream needs to be loaded, the object stream itself is parsed and loaded, and #parse_stream is invoked to obtain an ObjectStream::Data object representing the stored indirect objects. The requested indirect object is then loaded from this ObjectStream::Data object and returned. From a user’s perspective, nothing changes when an object is located inside an object stream instead of directly in a PDF file.
The indirect objects initially stored in the object stream are automatically added to the list of to-be-stored objects when #parse_stream is invoked. Additional objects can be assigned to the object stream via #add_object or deleted from it via #delete_object.
Before an object stream is written, it is necessary to invoke #write_objects so that the to-be-stored objects are serialized to the stream. This is automatically done by the Writer. A user thus only has to define which objects should reside in the object stream.
However, only objects that can be stored in an object stream are actually written to it. All other objects are deleted from the object stream (#delete_object) and written as regular indirect objects.
See PDF2.0 s7.5.7
Defined Under Namespace
Classes: Data
Constant Summary
Constants included from DictionaryFields
DictionaryFields::Boolean, DictionaryFields::PDFByteString, DictionaryFields::PDFDate
Instance Attribute Summary
Attributes inherited from Object
#data, #document, #must_be_indirect
Instance Method Summary
- #add_object(ref) ⇒ Object
  Adds the given object to the list of objects that should be stored in this object stream.
- #delete_object(ref) ⇒ Object
  Deletes the given object from the list of objects that should be stored in this object stream.
- #object_index(obj) ⇒ Object
  Returns the index into the array containing the to-be-stored objects for the given reference/PDF object.
- #parse_stream ⇒ Object
  Parses the stream and returns an ObjectStream::Data object that can be used for retrieving the objects defined by this object stream.
- #write_objects(revision) ⇒ Object
  Writes the added objects to the stream and returns a hash mapping all written objects to this object stream.
Methods inherited from Stream
#must_be_indirect?, #raw_stream, #set_filter, #stream, #stream=, #stream_decoder, #stream_encoder, #stream_source
Methods inherited from Dictionary
#[], #[]=, define_field, define_type, #delete, #each, each_field, #empty?, field, #key?, #to_hash, type, #type
Methods inherited from Object
#<=>, #==, #cache, #cached?, #clear_cache, deep_copy, #deep_copy, #document?, #eql?, field, #gen, #gen=, #hash, #indirect?, #initialize, #inspect, make_direct, #must_be_indirect?, #null?, #oid, #oid=, #type, #validate, #value, #value=
Constructor Details
This class inherits a constructor from HexaPDF::Object
Instance Method Details
#add_object(ref) ⇒ Object
Adds the given object to the list of objects that should be stored in this object stream.
The ref argument can be either a reference or any PDF object.

    # File 'lib/hexapdf/type/object_stream.rb', line 125
    def add_object(ref)
      return if object_index(ref)

      index = objects.size / 2
      objects[index] = ref
      objects[ref] = index
    end
#delete_object(ref) ⇒ Object
Deletes the given object from the list of objects that should be stored in this object stream.
The ref argument can be either a reference or a PDF object.

    # File 'lib/hexapdf/type/object_stream.rb', line 137
    def delete_object(ref)
      index = objects[ref]
      return unless index

      move_index = objects.size / 2 - 1
      objects[index] = objects[move_index]
      objects[objects[index]] = index
      objects.delete(ref)
      objects.delete(move_index)
    end
#object_index(obj) ⇒ Object
Returns the index into the array containing the to-be-stored objects for the given reference/PDF object.

    # File 'lib/hexapdf/type/object_stream.rb', line 151
    def object_index(obj)
      objects[obj]
    end
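The three methods above share one hash that stores both directions of the mapping (index to object and object to index), which is why the number of entries is always `objects.size / 2`. The following stand-alone sketch mirrors that bookkeeping. `ObjectList` and its method names are hypothetical and not part of HexaPDF, and plain symbols stand in for references.

```ruby
# Hypothetical stand-in for the to-be-stored list; not the HexaPDF class itself.
class ObjectList
  def initialize
    @objects = {} # holds both index => ref and ref => index entries
  end

  # Number of stored references (each one occupies two hash entries).
  def size
    @objects.size / 2
  end

  def index_of(ref)
    @objects[ref]
  end

  def add(ref)
    return if index_of(ref)

    index = size
    @objects[index] = ref
    @objects[ref] = index
  end

  def delete(ref)
    index = @objects[ref]
    return unless index

    last = size - 1
    @objects[index] = @objects[last]   # move the last entry into the gap
    @objects[@objects[index]] = index  # update the moved entry's reverse mapping
    @objects.delete(ref)
    @objects.delete(last)
  end

  # References in index order.
  def to_a
    (0...size).map {|i| @objects[i] }
  end
end

list = ObjectList.new
%i[a b c d].each {|ref| list.add(ref) }
list.delete(:b)
p list.to_a
```

Deleting by moving the last entry into the freed slot keeps the indices contiguous without shifting every later entry, at the cost of not preserving insertion order. The sketch assumes that references are never integers, since integer keys are reserved for the index direction.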
#parse_stream ⇒ Object
Parses the stream and returns an ObjectStream::Data object that can be used for retrieving the objects defined by this object stream.
The object references are also added to this object stream so that they are included when the object gets written.

    # File 'lib/hexapdf/type/object_stream.rb', line 113
    def parse_stream
      return @stream_data if defined?(@stream_data)
      data = stream
      oids, offsets = parse_oids_and_offsets(data)
      @objects ||= {}
      oids.each {|oid| add_object(Reference.new(oid, 0)) }
      @stream_data = Data.new(data, oids, offsets)
    end
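The listing relies on a helper, parse_oids_and_offsets, whose body is not shown here. As a rough illustration of the layout it has to deal with, the sketch below splits already decoded object stream data into its parts: the data starts with pairs of integers (object number and byte offset), and the offsets are relative to the position given by the /First entry. The method `split_object_stream` and its parameters are assumptions made for this example and do not reflect HexaPDF's actual implementation.

```ruby
# Hypothetical helper: splits decoded object stream data into { oid => source }.
# +n+ and +first+ correspond to the /N and /First entries of the stream dictionary.
def split_object_stream(data, n, first)
  # The header consists of n pairs "oid offset", separated by whitespace.
  numbers = data[0, first].split.map {|s| Integer(s) }
  pairs = numbers.each_slice(2).first(n)
  body = data[first..-1]

  pairs.each_with_index.map do |(oid, offset), i|
    # An object ends where the next one starts, or at the end of the data.
    stop = i + 1 < pairs.size ? pairs[i + 1][1] : body.size
    [oid, body[offset...stop].strip]
  end.to_h
end

header = '11 0 12 11 '
body = '<< /A 1 >> [1 2 3] '
p split_object_stream(header + body, 2, header.size)
```

Objects stored in an object stream always have generation number 0, which matches the `Reference.new(oid, 0)` call in the listing.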
#write_objects(revision) ⇒ Object
Call sequence: objstm.write_objects(revision) -> obj_to_stm_hash
Writes the added objects to the stream and returns a hash mapping all written objects to this object stream.
There are some reasons why an added object may not be stored in the stream:
- It has a generation number other than 0.
- It is a stream object.
- It doesn’t reside in the given Revision object.
Such objects are additionally deleted from the list of to-be-stored objects and are later written as indirect objects.

    # File 'lib/hexapdf/type/object_stream.rb', line 169
    def write_objects(revision)
      index = 0
      object_info = ''.b
      data = ''.b
      serializer = Serializer.new
      obj_to_stm = {}

      is_encrypt_dict = document.revisions.each.with_object({}) do |rev, hash|
        hash[rev.trailer[:Encrypt]] = true
      end
      while index < objects.size / 2
        obj = revision.object(objects[index])

        # Due to a bug in Adobe Acrobat, the Catalog may not be in an object stream if the
        # document is encrypted
        if obj.nil? || obj.null? || obj.gen != 0 || obj.kind_of?(Stream) || is_encrypt_dict[obj] ||
            obj.type == :Catalog ||
            obj.type == :Sig || obj.type == :DocTimeStamp ||
            (obj.respond_to?(:key?) && obj.key?(:ByteRange) && obj.key?(:Contents))
          delete_object(objects[index])
          next
        end

        obj_to_stm[obj] = self
        object_info << "#{obj.oid} #{data.size} "
        data << serializer.serialize(obj) << " "
        index += 1
      end

      value[:Type] = :ObjStm
      value[:N] = objects.size / 2
      value[:First] = object_info.size
      self.stream = object_info << data
      set_filter(:FlateDecode)

      obj_to_stm
    end
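The loop in the listing combines two ideas: objects that cannot be stored are removed without advancing the index, and every stored object contributes an "oid offset" pair to the header plus its serialized form to the body. The simplified sketch below models only that part. `Item` and `pack_items` are hypothetical names, objects arrive already serialized, and only the generation and stream checks are reproduced; the revision lookup, the checks for encryption dictionaries, the catalog and signatures, and the FlateDecode filter are left out.

```ruby
# Hypothetical stand-in for a PDF object; +source+ is its already serialized form.
Item = Struct.new(:oid, :gen, :is_stream, :source)

# Returns [stream_data, n, first, skipped] for the given items.
def pack_items(items)
  pending = items.dup
  header = ''.b
  body = ''.b
  skipped = []
  index = 0

  while index < pending.size
    item = pending[index]
    if item.gen != 0 || item.is_stream
      # Swap-with-last removal: the index is not advanced, so the item that
      # moved into this slot is examined on the next iteration.
      skipped << item
      pending[index] = pending.last
      pending.pop
      next
    end

    header << "#{item.oid} #{body.size} " # offset is relative to the body start
    body << item.source << ' '
    index += 1
  end

  [header + body, pending.size, header.size, skipped]
end

items = [
  Item.new(1, 0, false, '(a)'),
  Item.new(2, 1, false, '(b)'),   # generation other than 0
  Item.new(3, 0, true, 'stream'), # stream object
  Item.new(4, 0, false, '[4]'),
]
data, n, first, skipped = pack_items(items)
puts "N=#{n} First=#{first} data=#{data.inspect} skipped=#{skipped.map(&:oid).inspect}"
```

The returned count and header size play the role of the /N and /First entries that the real method stores in the stream dictionary.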