Class: PDF::Toolkit

Inherits:
Object
Extended by:
Forwardable
Includes:
Enumerable
Defined in:
lib/pdf/toolkit.rb

Overview

PDF::Toolkit can be used directly as a simple class, or subclassed and customized. The following two examples produce identical results.

my_pdf = PDF::Toolkit.open("somefile.pdf")
my_pdf.updated_at = Time.now # ModDate
my_pdf["SomeAttribute"] = "Some value"
my_pdf.save!

class MyDocument < PDF::Toolkit
  info_accessor :some_attribute
  def before_save
    self.updated_at = Time.now
  end
end
my_pdf = MyDocument.open("somefile.pdf")
my_pdf.some_attribute = "Some value"
my_pdf.save!

Note the use of a before_save callback in the second example. This is the only supported callback unless you use the experimental .loot_active_record class method.

Requirements

PDF::Toolkit requires pdftk, which is available from www.accesspdf.com/pdftk. For full functionality, also install xpdf from www.foolabs.com/xpdf. ActiveSupport (from Ruby on Rails) is also required, but this dependency may be removed in the future.

Limitations

pdftk writes timestamps in UTF-16, which pdfinfo does not handle correctly.

pdftk requires the owner password, even for simply querying the document.

Defined Under Namespace

Classes: Error, ExecutionError, FileNotSaved

Constant Summary

PDF_TOOLKIT_VERSION =
"0.5.0"

Constructor Details

#initialize(filename, input_password = nil) ⇒ Toolkit

Like open, except that the attributes are loaded lazily. Under most circumstances, open is preferred.



# File 'lib/pdf/toolkit.rb', line 233

def initialize(filename,input_password = nil)
  @filename = if filename.respond_to?(:to_str)
                filename.to_str
              elsif filename.kind_of?(self.class)
                filename.instance_variable_get("@filename")
              elsif filename.respond_to?(:path)
                filename.path
              else
                filename
              end
  @input_password = input_password || default_input_password
  @owner_password = default_owner_password
  @user_password  = default_user_password
  @permissions = default_permissions || []
  @new_info = {}
  callback(:after_initialize) if respond_to?(:after_initialize) && respond_to?(:callback)
  # reload
end
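
The filename coercion in the constructor can be tried in isolation. The following is a minimal stand-alone sketch, not the library's code: the helper name coerce_filename and the FakeUpload struct are assumptions made for this example, and the branch that accepts another PDF::Toolkit instance is omitted.

```ruby
# Hypothetical stand-alone version of the constructor's filename coercion.
# The branch that accepts another PDF::Toolkit instance is left out.
def coerce_filename(source)
  if source.respond_to?(:to_str)
    source.to_str          # String-like objects
  elsif source.respond_to?(:path)
    source.path            # File, Tempfile, and similar objects
  else
    source                 # anything else is stored unchanged
  end
end

# Stand-in for any object exposing #path (an assumption for this example).
FakeUpload = Struct.new(:path)

puts coerce_filename("report.pdf")
puts coerce_filename(FakeUpload.new("/tmp/upload.pdf"))
```

Accepting anything that responds to to_str or path means callers can pass a String or an open File without converting it first.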

Instance Attribute Details

#owner_password=(value) ⇒ Object (writeonly)

Sets the attribute owner_password

Parameters:

  • value

    the value to set the attribute owner_password to.



# File 'lib/pdf/toolkit.rb', line 253

def owner_password=(value)
  @owner_password = value
end

#pdf_ids ⇒ Object (readonly)

Returns the value of attribute pdf_ids.



# File 'lib/pdf/toolkit.rb', line 252

def pdf_ids
  @pdf_ids
end

#permissions ⇒ Object (readonly)

Returns the value of attribute permissions.



# File 'lib/pdf/toolkit.rb', line 252

def permissions
  @permissions
end

#user_password=(value) ⇒ Object (writeonly)

Sets the attribute user_password

Parameters:

  • value

    the value to set the attribute user_password to.



# File 'lib/pdf/toolkit.rb', line 253

def user_password=(value)
  @user_password = value
end

Class Method Details

.human_attribute_name(arg) ⇒ Object

:nodoc:



# File 'lib/pdf/toolkit.rb', line 160

def human_attribute_name(arg) #:nodoc:
  # Parenthesize the constant: without parentheses, defined? applies to the whole ternary.
  defined?(ActiveRecord::Base) ? ActiveRecord::Base.human_attribute_name(arg) : arg.gsub(/_/,' ')
end
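
The fallback branch is easy to check without ActiveRecord. The sketch below assumes ActiveRecord is not loaded and uses a hypothetical helper name, humanize_attribute. Because defined? binds more loosely than the ternary operator, the sketch tests the constant in an explicit, parenthesized condition.

```ruby
# Sketch of the fallback used when ActiveRecord is not loaded.
# humanize_attribute is a hypothetical name, not part of PDF::Toolkit.
def humanize_attribute(name)
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.human_attribute_name(name)
  else
    name.to_s.gsub(/_/, ' ')   # "updated_at" becomes "updated at"
  end
end

puts humanize_attribute("updated_at")
```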

.info_accessor(accessor_name, info_key = nil) ⇒ Object

Add an accessor for a key. If the key is omitted, it defaults to a camelized version of the accessor name (foo_bar becomes FooBar). The example below illustrates the defaults.

class MyDocument < PDF::Toolkit
  info_accessor :created_at, "CreationDate"
  info_accessor :updated_at, "ModDate"
  info_accessor :author
  [:subject, :title, :keywords, :producer, :creator].each do |key|
    info_accessor key
  end
end

MyDocument.open("document.pdf").created_at


# File 'lib/pdf/toolkit.rb', line 100

def info_accessor(accessor_name, info_key = nil)
  info_key ||= camelize_key(accessor_name)
  read_inheritable_attribute(:info_accessors)[accessor_name] = info_key
  define_method accessor_name do
    self[info_key]
  end
  define_method "#{accessor_name}=" do |value|
    self[info_key] = value
  end
end
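
The define_method technique used here can be sketched without pdftk. InfoBag below is a hypothetical stand-in class, and its camelize_key is an assumed implementation that only reproduces the documented foo_bar to FooBar behavior; the library's own helper is not shown in this page.

```ruby
# Minimal stand-in for a metadata-backed class; InfoBag is hypothetical.
class InfoBag
  # Assumed camelization: "foo_bar" -> "FooBar".
  def self.camelize_key(name)
    name.to_s.split('_').map { |part| part.capitalize }.join
  end

  # Define a reader and a writer that delegate to [] and []=.
  def self.info_accessor(accessor_name, info_key = nil)
    info_key ||= camelize_key(accessor_name)
    define_method(accessor_name) { self[info_key] }
    define_method("#{accessor_name}=") { |value| self[info_key] = value }
  end

  def initialize
    @info = {}
  end

  def [](key)
    @info[key]
  end

  def []=(key, value)
    @info[key] = value
  end

  info_accessor :updated_at, "ModDate"  # explicit key
  info_accessor :some_attribute         # defaults to "SomeAttribute"
end

bag = InfoBag.new
bag.some_attribute = "Some value"
puts bag["SomeAttribute"]
```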

.loot_active_record ⇒ Object

This method will require and include validations, callbacks, and timestamping from ActiveRecord. Use at your own risk.



# File 'lib/pdf/toolkit.rb', line 135

def loot_active_record
  require 'active_support'
  require 'active_record'
  # require 'active_record/validations'
  # require 'active_record/callbacks'
  # require 'active_record/timestamp'

  unless defined? @@looted_active_record
    @@looted_active_record = true
    meta = (class <<self; self; end)
    alias_method :initialize_ar_hack, :initialize
    include ActiveRecord::Validations
    include ActiveRecord::Callbacks
    include ActiveRecord::Timestamp
    alias_method :initialize, :initialize_ar_hack

    cattr_accessor :record_timestamps # nil by default

    meta.send(:define_method,:default_timezone) do
      # Parenthesized so that defined? tests only the constant, not the whole ternary.
      defined?(ActiveRecord::Base) ? ActiveRecord::Base.default_timezone : :local
    end
  end
  self
end

.open(filename, input_password = nil) ⇒ Object

Create a new object associated with filename and read in the associated metadata.

my_pdf = PDF::Toolkit.open("document.pdf")


# File 'lib/pdf/toolkit.rb', line 225

def self.open(filename,input_password = nil)
  object = new(filename,input_password)
  object.reload
  object
end
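
The difference between open (eager) and new (lazy) can be illustrated with a small model. LazyDocument and its fake read_data are assumptions for this sketch; no PDF is read.

```ruby
# LazyDocument is a hypothetical stand-in; read_data fakes the metadata read.
class LazyDocument
  attr_reader :loads

  def self.open(filename)
    object = new(filename)
    object.reload          # eager: metadata is read immediately
    object
  end

  def initialize(filename)
    @filename = filename
    @loads = 0             # counts how often metadata was read
  end

  def reload
    read_data
    self
  end

  def [](key)
    read_data unless @info # lazy: read on first access
    @info[key]
  end

  private

  def read_data
    @loads += 1
    @info = { "Title" => "Loaded from #{@filename}" }
  end
end

lazy  = LazyDocument.new("document.pdf")
eager = LazyDocument.open("document.pdf")
puts lazy.loads   # no read has happened yet
puts eager.loads  # one read performed by open
```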

.pdftk(*args, &block) ⇒ Object

Invoke pdftk with the given arguments, plus dont_ask. If :mode or a block is given, IO::popen is called. Otherwise, Kernel#system is used.

result = PDF::Toolkit.pdftk(*%w(foo.pdf bar.pdf cat output baz.pdf))
io = PDF::Toolkit.pdftk("foo.pdf","dump_data","output","-",:mode => 'r')
PDF::Toolkit.pdftk("foo.pdf","dump_data","output","-") { |io| io.read }


# File 'lib/pdf/toolkit.rb', line 118

def pdftk(*args,&block)
  options = args.last.is_a?(Hash) ? args.pop : {}
  args << "dont_ask"
  args << options
  result = call_program(executables[:pdftk],*args,&block)
  return block_given? ? $?.success? : result
end
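
The argument handling can be sketched separately from the program call. prepare_pdftk_args is a hypothetical helper that only mirrors how the trailing options hash is split off and dont_ask is appended; it does not invoke pdftk.

```ruby
# Hypothetical helper mirroring how the argument list is prepared;
# it does not run pdftk.
def prepare_pdftk_args(*args)
  options = args.last.is_a?(Hash) ? args.pop : {}
  args << "dont_ask"       # always appended to the pdftk arguments
  [args, options]
end

args, options = prepare_pdftk_args("foo.pdf", "dump_data", "output", "-", :mode => 'r')
puts args.inspect
puts options.inspect
```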

.pdftotext(file, outfile = nil, &block) ⇒ Object

Invoke pdftotext. If outfile is omitted, returns an IO object for the output.



# File 'lib/pdf/toolkit.rb', line 128

def pdftotext(file,outfile = nil,&block)
  call_program(executables[:pdftotext],file,
    outfile||"-",:mode => (outfile ? nil : 'r'),&block)
end

Instance Method Details

#[](key) ⇒ Object

Read a metadata attribute.

author = my_pdf["Author"]

See info_accessor for an alternate syntax.



# File 'lib/pdf/toolkit.rb', line 352

def [](key)
  key = lookup_key(key)
  return @new_info[key.to_s] if @new_info.has_key?(key.to_s)
  ensure_loaded
  @info[key.to_s]
end

#[]=(key, value) ⇒ Object

Write a metadata attribute.

my_pdf["Author"] = author

See info_accessor for an alternate syntax.



# File 'lib/pdf/toolkit.rb', line 365

def []=(key,value)
  key = lookup_key(key)
  @new_info[key.to_s] = value
end
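
Reads and writes work against two hashes: saved values and pending changes. The sketch below models that layout with a hypothetical PendingInfo class; key lookup through lookup_key is left out, so keys are used as given.

```ruby
# PendingInfo is a hypothetical stand-in for the two-hash layout.
class PendingInfo
  def initialize(saved)
    @info = saved          # values as last read from the file
    @new_info = {}         # unsaved changes
  end

  # Pending changes take priority over saved values.
  def [](key)
    return @new_info[key.to_s] if @new_info.has_key?(key.to_s)
    @info[key.to_s]
  end

  # Writes only touch the pending hash until the document is saved.
  def []=(key, value)
    @new_info[key.to_s] = value
  end
end

doc = PendingInfo.new("Author" => "Original Author")
doc["Author"] = "New Author"
puts doc["Author"]
```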

#delete(key) ⇒ Object

Remove the metadata attribute from the file.



# File 'lib/pdf/toolkit.rb', line 384

def delete(key)
  key = lookup_key(key)
  if @info.has_key?(key) || !@pages
    @new_info[key] = nil
  else
    @new_info.delete(key)
  end
end

#delete_if(&block) ⇒ Object

Remove each metadata attribute for which the given block returns true. The following would remove all timestamps.

my_pdf.delete_if {|key,value| value.kind_of?(Time)}


# File 'lib/pdf/toolkit.rb', line 410

def delete_if(&block)
  reject!(&block)
  self
end

#has_key?(value) ⇒ Boolean Also known as: key?

True if the file has the given metadata attribute.

Returns:

  • (Boolean)


# File 'lib/pdf/toolkit.rb', line 376

def has_key?(value)
  ensure_loaded
  value = lookup_key(value)
  (@info.has_key?(value) || @new_info.has_key?(value)) && !!(self[value])
end

#merge!(hash) ⇒ Object

Add the specified attributes to the file. If symbols are given as keys, they are camelized.

my_pdf.merge!("Author" => "Dave Thomas", :title => "Programming Ruby")



# File 'lib/pdf/toolkit.rb', line 419

def merge!(hash)
  hash.each do |k,v|
    @new_info[lookup_key(k)] = v
  end
  self
end
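
The effect of mixing string and symbol keys can be sketched with a plain hash. normalize_key is a hypothetical stand-in for lookup_key, whose implementation is not shown here; it assumes that only symbols are camelized and strings are kept unchanged.

```ruby
# Hypothetical key normalization: symbols are camelized, strings kept as-is.
def normalize_key(key)
  return key unless key.is_a?(Symbol)
  key.to_s.split('_').map { |part| part.capitalize }.join
end

pending = {}
{ "Author" => "Dave Thomas", :title => "Programming Ruby" }.each do |k, v|
  pending[normalize_key(k)] = v
end
puts pending.inspect
```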

#new_record? ⇒ Boolean

:nodoc:

Returns:

  • (Boolean)


# File 'lib/pdf/toolkit.rb', line 343

def new_record? #:nodoc:
  !@new_filename.nil?
end

#page_count ⇒ Object Also known as: pages



# File 'lib/pdf/toolkit.rb', line 255

def page_count
  read_data unless @pages
  @pages
end

#path ⇒ Object

Path to the file.



# File 'lib/pdf/toolkit.rb', line 263

def path
  @new_filename || @filename
end

#reject!(&block) ⇒ Object

Like delete_if, except that nil is returned if no attributes were removed.



# File 'lib/pdf/toolkit.rb', line 394

def reject!(&block)
  ensure_loaded
  ret = nil
  each do |key,value|
    if yield(key,value)
      ret = self
      delete(key)
    end
  end
  ret
end
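
Ruby's built-in Hash follows the same return-value convention for delete_if and reject!, so the difference can be observed without a PDF. The sketch uses a plain Hash as a stand-in for a document's metadata; the keys and values are made up for the example.

```ruby
info = { "Author" => "A", "ModDate" => Time.at(0) }

# delete_if always returns the receiver, even when nothing matches.
same = info.delete_if { |_key, value| value.is_a?(Integer) }

# reject! returns nil when nothing was removed...
unchanged = info.reject! { |_key, value| value.is_a?(Integer) }

# ...and the receiver when at least one entry was removed.
changed = info.reject! { |_key, value| value.kind_of?(Time) }

puts same.equal?(info)
puts unchanged.inspect
puts changed.equal?(info)
```

The nil result makes reject! convenient when the caller needs to know whether a save is necessary at all.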

#reload ⇒ Object

Reload (or load) the file’s metadata.



# File 'lib/pdf/toolkit.rb', line 277

def reload
  @new_info = {}
  read_data
  # run_callbacks_for(:after_load)
  self
end

#save ⇒ Object

Commit changes to the PDF. The return value is a boolean reflecting the success of the operation. (This should always be true unless you are using .loot_active_record.)



# File 'lib/pdf/toolkit.rb', line 287

def save
  create_or_update
end

#save! ⇒ Object

Like save, except that a FileNotSaved exception is raised if the operation fails.

TODO: ensure no ActiveRecord::RecordInvalid errors make it through.



# File 'lib/pdf/toolkit.rb', line 294

def save!
  if save
    self
  else
    raise FileNotSaved
  end
end

#save_as(filename) ⇒ Object

Save to a different file. A new object is returned if the operation succeeded. Otherwise, nil is returned.



# File 'lib/pdf/toolkit.rb', line 304

def save_as(filename)
  dup.save_as!(filename)
rescue FileNotSaved
  nil
end

#save_as!(filename) ⇒ Object

Save to a different file. The existing object is modified. An exception is raised if the operation fails.



# File 'lib/pdf/toolkit.rb', line 312

def save_as!(filename)
  @new_filename = filename
  save!
  self
rescue ActiveRecord::RecordInvalid
  raise FileNotSaved
end
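
The relationship between save, save!, save_as, and save_as! can be modeled with a small class. FakeDocument and its writable flag are assumptions for this sketch; nothing is written to disk, and the FileNotSaved class here is a local stand-in rather than PDF::Toolkit::FileNotSaved.

```ruby
# Hypothetical stand-ins; FakeDocument never touches the file system.
class FileNotSaved < StandardError; end

class FakeDocument
  attr_reader :path

  def initialize(path, writable = true)
    @path = path
    @writable = writable
  end

  # Boolean result, no exception.
  def save
    @writable
  end

  # Raises instead of returning false.
  def save!
    save ? self : raise(FileNotSaved)
  end

  # Modifies the receiver.
  def save_as!(filename)
    @path = filename
    save!
  end

  # Works on a copy and converts the exception into nil.
  def save_as(filename)
    dup.save_as!(filename)
  rescue FileNotSaved
    nil
  end
end

doc  = FakeDocument.new("a.pdf")
copy = doc.save_as("b.pdf")
puts doc.path
puts copy.path
puts FakeDocument.new("locked.pdf", false).save_as("c.pdf").inspect
```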

#to_hash ⇒ Object

Create a hash from the file’s metadata.



# File 'lib/pdf/toolkit.rb', line 335

def to_hash
  ensure_loaded
  @info.merge(@new_info).reject {|key,value| value.nil?}
end
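
The merge-then-reject step can be tried with two plain hashes. The sample keys and values are made up for this sketch; a nil value stands for an attribute that has been marked for deletion but not yet saved.

```ruby
saved   = { "Author" => "A", "Keywords" => "draft" }
pending = { "Title" => "T", "Keywords" => nil }  # nil marks a pending deletion

# Pending values win; entries marked nil are dropped.
result = saved.merge(pending).reject { |_key, value| value.nil? }
puts result.inspect
```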

#to_s ⇒ Object

:nodoc:



# File 'lib/pdf/toolkit.rb', line 328

def to_s #:nodoc:
  "#<#{self.class}:#{path}>"
end

#to_text(filename = nil, &block) ⇒ Object

Invoke pdftotext on the file and return an IO object for reading the results.

text = my_pdf.to_text.read


# File 'lib/pdf/toolkit.rb', line 324

def to_text(filename = nil,&block)
  self.class.send(:pdftotext,@filename,filename,&block)
end

#update_attribute(key, value) ⇒ Object



# File 'lib/pdf/toolkit.rb', line 370

def update_attribute(key,value)
  self[key] = value
  save
end

#version ⇒ Object

Retrieve the file’s version as a symbol.

my_pdf.version # => :"1.4"


# File 'lib/pdf/toolkit.rb', line 270

def version
  @version ||= File.open(@filename) do |io|
    io.read(8)[5..-1].to_sym
  end
end
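
The header parsing relies on a PDF file starting with a marker such as %PDF-1.4. The sketch below applies the same slicing to an in-memory stream; pdf_version is a hypothetical helper name and the sample bytes are made up.

```ruby
require 'stringio'

# A PDF starts with a header such as "%PDF-1.4"; the version follows
# the five-character "%PDF-" prefix.
def pdf_version(io)
  io.read(8)[5..-1].to_sym
end

header = StringIO.new("%PDF-1.4\n%remaining bytes omitted")
puts pdf_version(header).inspect
```

Reading exactly eight bytes assumes a three-character version such as 1.4 or 1.7.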