Class: HexaPDF::Document::Metadata
- Inherits:
-
Object
- Object
- HexaPDF::Document::Metadata
- Defined in:
- lib/hexapdf/document/metadata.rb
Overview
This class provides methods for reading and writing the document-level metadata.
When an instance is created (usually through HexaPDF::Document#metadata), the metadata is read from the document’s information dictionary (see HexaPDF::Type::Info) and made available through the various methods.
By default, the metadata is written to the information dictionary as well as to the document’s metadata stream (see HexaPDF::Type::Metadata) once the document is written. This can be controlled via the #write_info_dict and #write_metadata_stream methods.
While HexaPDF is able to write an XMP packet (using a limited form) to the document’s metadata stream, it provides no way for reading XMP metadata. If reading functionality or extended writing functionality is needed, make sure this class does not write the metadata and read/create the metadata stream yourself.
Caveats
-
Disabling writing to the information dictionary will only prevent parts from being written. The #producer is always written to the information dictionary as per the AGPL license terms. The #modification_date may be written depending on the arguments to HexaPDF::Document#write.
-
If writing the metadata stream is enabled, any existing metadata stream is completely overwritten. This means the metadata stream is not updated with the changed information.
Adding custom metadata properties
All the properties specified for the information dictionary are supported.
Furthermore, HexaPDF supports writing custom properties to the metadata stream. For this to work, the XMP namespaces in use need to be registered using #register_namespace. Additionally, the types of all XMP properties in use need to be registered using #register_property_type.
The following types for XMP properties are supported:
- String
-
Maps to the XMP simple string value. Values need to be of type String.
- Integer
-
Maps to the XMP integer core value type and gets formatted as string. Values need to be of type Integer.
- Date
-
Maps to the XMP simple string value, correctly formatted. Values need to be of type Time, Date, or DateTime.
- URI
-
Maps to the XMP simple value variant of URI. Values need to be of type String or URI.
- Boolean
-
Maps to the XMP simple string value, correctly formatted. Values need to be either true or false.
- OrderedArray
-
Maps to the XMP ordered array. Values need to be of type Array and items must be XMP simple values.
- UnorderedArray
-
Maps to the XMP unordered array. Values need to be of type Array and items must be simple values.
- LanguageArray
-
Maps to the XMP language alternatives array. Values need to be of type Array and items must either be strings (they are associated with the set default language) or LocalizedString instances.
See: PDF2.0 s14.3, www.adobe.com/products/xmp.html
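The type list above can be summarized as a lookup from XMP property type to the Ruby classes accepted for its values. The following is a minimal stdlib-only sketch, not HexaPDF's own validation: the constant ACCEPTED_CLASSES and the method acceptable_value? are hypothetical names introduced here for illustration, and the sketch does not check the items of array types.

```ruby
require 'date'
require 'time'
require 'uri'

# Hypothetical lookup table (not part of HexaPDF): XMP property type name
# => Ruby classes accepted for values of that type, as listed above.
ACCEPTED_CLASSES = {
  'String' => [String],
  'Integer' => [Integer],
  'Date' => [Time, Date, DateTime],
  'URI' => [String, URI::Generic],
  'Boolean' => [TrueClass, FalseClass],
  'OrderedArray' => [Array],
  'UnorderedArray' => [Array],
  'LanguageArray' => [Array],
}.freeze

# Returns true if the value's class is acceptable for the given type name.
# Array items are not inspected in this sketch.
def acceptable_value?(type, value)
  classes = ACCEPTED_CLASSES.fetch(type) do
    raise ArgumentError, "Unknown XMP property type '#{type}'"
  end
  classes.any? {|klass| value.is_a?(klass) }
end
```

A check like this could run before a value is stored, so that a mismatch surfaces early instead of when the metadata stream is serialized.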
Defined Under Namespace
Classes: LocalizedString
Constant Summary
- PREDEFINED_NAMESPACES =
Contains a mapping of predefined prefixes for XMP namespaces for metadata.
{
  "rdf" => "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
  "xmp" => "http://ns.adobe.com/xap/1.0/",
  "pdf" => "http://ns.adobe.com/pdf/1.3/",
  "dc" => "http://purl.org/dc/elements/1.1/",
  "x" => "adobe:ns:meta/",
  "pdfaid" => "http://www.aiim.org/pdfa/ns/id/",
}.freeze
- PREDEFINED_PROPERTIES =
Contains a mapping of predefined XMP properties to their types, i.e. from namespace to property and then type.
{
  "http://ns.adobe.com/xap/1.0/" => {
    'CreatorTool' => 'String',
    'CreateDate' => 'Date',
    'ModifyDate' => 'Date',
  }.freeze,
  "http://ns.adobe.com/pdf/1.3/" => {
    'Keywords' => 'String',
    'Producer' => 'String',
    'Trapped' => 'Boolean',
  }.freeze,
  "http://purl.org/dc/elements/1.1/" => {
    'creator' => 'OrderedArray',
    'description' => 'LanguageArray',
    'title' => 'LanguageArray',
  }.freeze,
  "http://www.aiim.org/pdfa/ns/id/" => {
    'part' => 'Integer',
    'conformance' => 'String',
  }.freeze,
}.freeze
Instance Method Summary
-
#author(value = :UNSET) ⇒ Object
:call-seq: metadata.author -> author or nil metadata.author(value) -> value.
-
#creation_date(value = :UNSET) ⇒ Object
:call-seq: metadata.creation_date -> creation_date or nil metadata.creation_date(value) -> value.
-
#creator(value = :UNSET) ⇒ Object
:call-seq: metadata.creator -> creator or nil metadata.creator(value) -> value.
-
#custom_metadata(data) ⇒ Object
Adds the given data string as custom metadata to the XMP document.
-
#default_language(value = :UNSET) ⇒ Object
:call-seq: metadata.default_language -> language metadata.default_language(value) -> value.
-
#delete(ns = nil, property = nil) ⇒ Object
:call-seq: metadata.delete metadata.delete(ns_prefix) metadata.delete(ns_prefix, name).
-
#initialize(document) ⇒ Metadata
constructor
Creates a new Metadata object for the given PDF document.
-
#keywords(value = :UNSET) ⇒ Object
:call-seq: metadata.keywords -> keywords or nil metadata.keywords(value) -> value.
-
#modification_date(value = :UNSET) ⇒ Object
:call-seq: metadata.modification_date -> modification_date or nil metadata.modification_date(value) -> value.
-
#namespace(ns) ⇒ Object
Returns the namespace URI associated with the given prefix.
-
#producer(value = :UNSET) ⇒ Object
:call-seq: metadata.producer -> producer or nil metadata.producer(value) -> value.
-
#property(ns, property, value = :UNSET) ⇒ Object
:call-seq: metadata.property(ns_prefix, name) -> property_value metadata.property(ns_prefix, name, value) -> value.
-
#register_namespace(prefix, uri) ⇒ Object
Registers the prefix for the given namespace uri.
-
#register_property_type(prefix, property, type) ⇒ Object
Registers the property for the namespace specified via prefix as the given type.
-
#subject(value = :UNSET) ⇒ Object
:call-seq: metadata.subject -> subject or nil metadata.subject(value) -> value.
-
#title(value = :UNSET) ⇒ Object
:call-seq: metadata.title -> title or nil metadata.title(value) -> value.
-
#trapped(value = :UNSET) ⇒ Object
:call-seq: metadata.trapped -> trapped or nil metadata.trapped(value) -> value.
-
#write_info_dict(value) ⇒ Object
Makes HexaPDF write the information dictionary if value is true.
-
#write_info_dict? ⇒ Boolean
Returns true if the information dictionary should be written.
-
#write_metadata_stream(value) ⇒ Object
Makes HexaPDF write the metadata stream if value is true.
-
#write_metadata_stream? ⇒ Boolean
Returns true if the metadata stream should be written.
Constructor Details
#initialize(document) ⇒ Metadata
Creates a new Metadata object for the given PDF document.
# File 'lib/hexapdf/document/metadata.rb', line 158
def initialize(document)
  @document = document
  @namespaces = PREDEFINED_NAMESPACES.dup
  @properties = PREDEFINED_PROPERTIES.transform_values(&:dup)
  @default_language = document.catalog[:Lang] || 'x-default'
  @metadata = Hash.new {|h, k| h[k] = {} }
  @custom_metadata = []
  write_info_dict(true)
  write_metadata_stream(true)
  @document.register_listener(:complete_objects, &method(:write_metadata))
end
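The constructor copies the frozen PREDEFINED_PROPERTIES constant with transform_values(&:dup) rather than a plain dup. The stdlib-only sketch below may help show why. It uses a hypothetical DEFAULTS constant in place of the real one: a plain dup yields an unfrozen outer hash whose inner hashes are still the frozen originals, so adding a property to an existing namespace would raise.

```ruby
# DEFAULTS is a hypothetical stand-in for a frozen, nested constant such as
# PREDEFINED_PROPERTIES; it is not part of HexaPDF.
DEFAULTS = {
  'http://ns.adobe.com/pdf/1.3/' => {'Keywords' => 'String'}.freeze,
}.freeze

shallow = DEFAULTS.dup                   # outer hash unfrozen, inner hashes shared
deep = DEFAULTS.transform_values(&:dup)  # inner hashes copied and unfrozen

# Writing through the shallow copy hits the frozen inner hash.
shallow_error =
  begin
    shallow['http://ns.adobe.com/pdf/1.3/']['Custom'] = 'String'
    nil
  rescue FrozenError => e
    e.class
  end

# Writing through the per-value copy works and leaves DEFAULTS untouched.
deep['http://ns.adobe.com/pdf/1.3/']['Custom'] = 'String'
```

With the per-value copy, each Metadata instance can presumably extend the predefined property types without affecting the shared constant.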
Instance Method Details
#author(value = :UNSET) ⇒ Object
:call-seq:
metadata.author -> author or nil
metadata.author(value) -> value
Returns the name of the person who created the document (author) if no argument is given. Otherwise sets the author to the given value.
The value nil is returned if the property is not set. Using nil as the value deletes the property from the metadata.
This metadata property is represented by the XMP name dc:creator.
# File 'lib/hexapdf/document/metadata.rb', line 308
def author(value = :UNSET)
  property('dc', 'creator', value)
end
#creation_date(value = :UNSET) ⇒ Object
:call-seq:
metadata.creation_date -> creation_date or nil
metadata.creation_date(value) -> value
Returns the date and time (a Time object) the document was created if no argument is given. Otherwise sets the creation date to the given value.
The value nil is returned if the property is not set. Using nil as the value deletes the property from the metadata.
This metadata property is represented by the XMP name xmp:CreateDate.
# File 'lib/hexapdf/document/metadata.rb', line 387
def creation_date(value = :UNSET)
  property('xmp', 'CreateDate', value)
end
#creator(value = :UNSET) ⇒ Object
:call-seq:
metadata.creator -> creator or nil
metadata.creator(value) -> value
Returns the name of the PDF processor that created the original document from which this PDF was converted if no argument is given. Otherwise sets the name of the creator tool to the given value.
The value nil is returned if the property is not set. Using nil as the value deletes the property from the metadata.
This metadata property is represented by the XMP name xmp:CreatorTool.
# File 'lib/hexapdf/document/metadata.rb', line 357
def creator(value = :UNSET)
  property('xmp', 'CreatorTool', value)
end
#custom_metadata(data) ⇒ Object
Adds the given data string as custom metadata to the XMP document.
The data string must contain a fully valid ‘rdf:Description’ element.
Using this method allows adding metadata like PDF/A schema definitions for which there is no direct support by HexaPDF.
# File 'lib/hexapdf/document/metadata.rb', line 258
def custom_metadata(data)
  @custom_metadata << data
end
#default_language(value = :UNSET) ⇒ Object
:call-seq:
metadata.default_language -> language
metadata.default_language(value) -> value
Returns the default language in RFC3066 format used for unlocalized strings if no argument is given. Otherwise sets the default language to the given language.
The initial default language is taken from the document catalog’s /Lang entry. If that is not set, the default language is assumed to be ‘x-default’.
# File 'lib/hexapdf/document/metadata.rb', line 180
def default_language(value = :UNSET)
  if value == :UNSET
    @default_language
  else
    @default_language = value
  end
end
#delete(ns = nil, property = nil) ⇒ Object
:call-seq:
metadata.delete
metadata.delete(ns_prefix)
metadata.delete(ns_prefix, name)
Deletes either all metadata properties, only the ones from a specific namespace, or a specific one.
# File 'lib/hexapdf/document/metadata.rb', line 269
def delete(ns = nil, property = nil)
  if ns.nil? && property.nil?
    @metadata.clear
  elsif property.nil?
    @metadata.delete(namespace(ns))
  else
    @metadata[namespace(ns)].delete(property)
  end
end
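The three call forms map onto three operations on a nested hash. As a rough illustration, the sketch below uses a plain local hash as a stand-in for the internal storage (namespace URI to property name to value), built with the same default block as in the constructor. The variable names and sample values are made up for this example.

```ruby
# Stand-in for the internal storage: namespace URI => {property name => value}.
# Missing namespaces are created on first access.
metadata = Hash.new {|h, k| h[k] = {} }
dc = 'http://purl.org/dc/elements/1.1/'
pdf = 'http://ns.adobe.com/pdf/1.3/'

metadata[dc]['title'] = 'Report'
metadata[dc]['creator'] = ['Example Author']
metadata[pdf]['Keywords'] = 'example'

metadata[dc].delete('title')   # corresponds to metadata.delete('dc', 'title')
after_single = metadata[dc].keys

metadata.delete(dc)            # corresponds to metadata.delete('dc')
after_namespace = metadata.keys

metadata.clear                 # corresponds to metadata.delete
```

Note that with such a default block, merely reading a namespace that holds no properties creates an empty entry for it.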
#keywords(value = :UNSET) ⇒ Object
:call-seq:
metadata.keywords -> keywords or nil
metadata.keywords(value) -> value
Returns the keywords associated with the document if no argument is given. Otherwise sets keywords to the given value.
The value nil is returned if the property is not set. Using nil as the value deletes the property from the metadata.
This metadata property is represented by the XMP name pdf:Keywords.
# File 'lib/hexapdf/document/metadata.rb', line 341
def keywords(value = :UNSET)
  property('pdf', 'Keywords', value)
end
#modification_date(value = :UNSET) ⇒ Object
:call-seq:
metadata.modification_date -> modification_date or nil
metadata.modification_date(value) -> value
Returns the date and time (a Time object) the document was most recently modified if no argument is given. Otherwise sets the modification date to the given value.
The value nil is returned if the property is not set. Using nil as the value deletes the property from the metadata.
This metadata property is represented by the XMP name xmp:ModifyDate.
# File 'lib/hexapdf/document/metadata.rb', line 402
def modification_date(value = :UNSET)
  property('xmp', 'ModifyDate', value)
end
#namespace(ns) ⇒ Object
Returns the namespace URI associated with the given prefix.
# File 'lib/hexapdf/document/metadata.rb', line 218
def namespace(ns)
  @namespaces.fetch(ns) do
    raise HexaPDF::Error, "Namespace prefix '#{ns}' not registered"
  end
end
#producer(value = :UNSET) ⇒ Object
:call-seq:
metadata.producer -> producer or nil
metadata.producer(value) -> value
Returns the name of the PDF processor that converted the original document to PDF if no argument is given. Otherwise sets the name of the producer to the given value.
The value nil is returned if the property is not set. Using nil as the value deletes the property from the metadata.
This metadata property is represented by the XMP name pdf:Producer.
# File 'lib/hexapdf/document/metadata.rb', line 372
def producer(value = :UNSET)
  property('pdf', 'Producer', value)
end
#property(ns, property, value = :UNSET) ⇒ Object
:call-seq:
metadata.property(ns_prefix, name) -> property_value
metadata.property(ns_prefix, name, value) -> value
Returns the value for the property specified via the namespace prefix ns_prefix and name if the value argument is not provided. Otherwise sets the property to value.
The value nil is returned if the property is not set. Using nil as the value deletes the property from the metadata.
# File 'lib/hexapdf/document/metadata.rb', line 241
def property(ns, property, value = :UNSET)
  ns = @metadata[namespace(ns)]
  if value == :UNSET
    ns[property]
  elsif value.nil?
    ns.delete(property)
  else
    ns[property] = value
  end
end
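The :UNSET default is what lets one method act as getter, setter, and deleter: because nil already means "delete", it cannot also signal "no argument given". The sketch below reproduces that pattern in a hypothetical PropertyStore class, which is a simplified stand-in without namespaces and not HexaPDF code.

```ruby
# Hypothetical stand-in (not HexaPDF::Document::Metadata) showing the combined
# getter/setter/deleter pattern with an :UNSET sentinel default.
class PropertyStore
  def initialize
    @values = {}
  end

  def property(name, value = :UNSET)
    if value == :UNSET
      @values[name]           # no value argument: read
    elsif value.nil?
      @values.delete(name)    # explicit nil: delete
    else
      @values[name] = value   # anything else: write
    end
  end
end

store = PropertyStore.new
store.property('title', 'Report')        # sets the value
current = store.property('title')        # reads 'Report'
store.property('title', nil)             # deletes the entry
after_delete = store.property('title')   # nil, since nothing is set
```

All the convenience accessors (#title, #author, #keywords, and so on) forward their optional argument to #property, so they inherit the same three behaviors.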
#register_namespace(prefix, uri) ⇒ Object
Registers the prefix for the given namespace uri.
# File 'lib/hexapdf/document/metadata.rb', line 213
def register_namespace(prefix, uri)
  @namespaces[prefix] = uri
end
#register_property_type(prefix, property, type) ⇒ Object
Registers the property for the namespace specified via prefix as the given type.
The argument type has to be one of the following: ‘String’, ‘Integer’, ‘Date’, ‘URI’, ‘Boolean’, ‘OrderedArray’, ‘UnorderedArray’, or ‘LanguageArray’.
# File 'lib/hexapdf/document/metadata.rb', line 228
def register_property_type(prefix, property, type)
  (@properties[namespace(prefix)] ||= {})[property] = type
end
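Registration is a two-step process: the prefix must be known before a property type can be attached to it, because the type is stored under the namespace URI rather than the prefix. The sketch below mirrors that logic in a hypothetical Registry class so it runs with the standard library alone. The class, the type_of helper, the use of KeyError instead of HexaPDF::Error, and the example.com namespace are all assumptions made for illustration.

```ruby
# Hypothetical stand-in (not HexaPDF code) for the namespace and property
# type registries.
class Registry
  def initialize
    @namespaces = {}
    @properties = {}
  end

  def register_namespace(prefix, uri)
    @namespaces[prefix] = uri
  end

  # Resolves a prefix to its URI; unknown prefixes raise an error.
  def namespace(prefix)
    @namespaces.fetch(prefix) do
      raise KeyError, "Namespace prefix '#{prefix}' not registered"
    end
  end

  # Types are keyed by namespace URI, so the prefix is resolved first.
  def register_property_type(prefix, property, type)
    (@properties[namespace(prefix)] ||= {})[property] = type
  end

  # Hypothetical lookup helper, added only for this example.
  def type_of(prefix, property)
    @properties.dig(namespace(prefix), property)
  end
end

registry = Registry.new
registry.register_namespace('ex', 'http://example.com/ns/1.0/')
registry.register_property_type('ex', 'reviewed', 'Boolean')
```

Calling register_property_type with a prefix that has not been registered fails in the namespace lookup, which is why #register_namespace has to come first.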
#subject(value = :UNSET) ⇒ Object
:call-seq:
metadata.subject -> subject or nil
metadata.subject(value) -> value
Returns the subject of the document if no argument is given. Otherwise sets the subject to the given value.
If the value is a LocalizedString, the language for the subject is taken from it. Otherwise the language specified via #default_language is used.
The value nil is returned if the property is not set. Using nil as the value deletes the property from the metadata.
This metadata property is represented by the XMP name dc:description.
# File 'lib/hexapdf/document/metadata.rb', line 326
def subject(value = :UNSET)
  property('dc', 'description', value)
end
#title(value = :UNSET) ⇒ Object
:call-seq:
metadata.title -> title or nil
metadata.title(value) -> value
Returns the document’s title if no argument is given. Otherwise sets the document’s title to the given value.
If the value is a LocalizedString, the language for the title is taken from it. Otherwise the language specified via #default_language is used.
The value nil is returned if the property is not set. Using nil as the value deletes the property from the metadata.
This metadata property is represented by the XMP name dc:title.
# File 'lib/hexapdf/document/metadata.rb', line 293
def title(value = :UNSET)
  property('dc', 'title', value)
end
#trapped(value = :UNSET) ⇒ Object
:call-seq:
metadata.trapped -> trapped or nil
metadata.trapped(value) -> value
Returns true if the document has been modified to include trapping information if no argument is given. Otherwise sets the trapped status to the given boolean value.
The value nil is returned if the property is not set. Using nil as the value deletes the property from the metadata.
This metadata property is represented by the XMP name pdf:Trapped.
# File 'lib/hexapdf/document/metadata.rb', line 417
def trapped(value = :UNSET)
  property('pdf', 'Trapped', value)
end
#write_info_dict(value) ⇒ Object
Makes HexaPDF write the information dictionary if value is true.
See the class documentation for caveats.
# File 'lib/hexapdf/document/metadata.rb', line 196
def write_info_dict(value)
  @write_info_dict = value
end
#write_info_dict? ⇒ Boolean
Returns true if the information dictionary should be written.
# File 'lib/hexapdf/document/metadata.rb', line 189
def write_info_dict?
  @write_info_dict
end
#write_metadata_stream(value) ⇒ Object
Makes HexaPDF write the metadata stream if value is true.
See the class documentation for caveats.
# File 'lib/hexapdf/document/metadata.rb', line 208
def write_metadata_stream(value)
  @write_metadata_stream = value
end
#write_metadata_stream? ⇒ Boolean
Returns true if the metadata stream should be written.
# File 'lib/hexapdf/document/metadata.rb', line 201
def write_metadata_stream?
  @write_metadata_stream
end