Class: Moab::SignatureCatalog

Inherits:
Manifest
  • Object
Includes:
HappyMapper
Defined in:
lib/moab/signature_catalog.rb

Overview

Note:

Copyright © 2012 by The Board of Trustees of the Leland Stanford Junior University. All rights reserved. See LICENSE for details.

A digital object’s Signature Catalog is derived from a filtered aggregation of the file inventories of a digital object’s set of versions (see #update). It has an entry for every file (identified by FileSignature) found in any of the versions, along with a record of the SDR storage location that was used to preserve a single file instance. Once this catalog has been populated, it has multiple uses:

  • The signature index is used to determine which files of a newly submitted object version are new additions and which are duplicates of files previously ingested (see #version_additions). When a new version contains a mixture of added files and files carried over from the previous version, we only need to store the files from the new version that have unique file signatures.

  • Reconstruction of an object version (see Moab::StorageObject#reconstruct_version) requires a combination of a full version’s FileInventory and the SignatureCatalog.

  • The catalog can also be used for performing consistency checks between manifest files and storage.
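
The first use above can be illustrated with a minimal sketch. It assumes that signatures compare by value and act as Hash keys; `SketchSignature`, `SketchEntry`, and `partition_by_catalog` are hypothetical stand-ins, not part of Moab's API.

```ruby
# Hypothetical stand-ins for Moab::FileSignature and Moab::SignatureCatalogEntry.
# Struct instances compare by value, so equal signatures map to the same Hash key.
SketchSignature = Struct.new(:size, :md5)
SketchEntry = Struct.new(:signature, :version_id, :group_id, :path)

# Split an incoming version's [signature, path] pairs into new additions
# and duplicates of files already recorded in the index.
def partition_by_catalog(index, incoming)
  incoming.partition { |signature, _path| !index.key?(signature) }
end

known = SketchSignature.new(10, 'aaa')
index = { known => SketchEntry.new(known, 1, 'content', 'page-1.jpg') }

incoming = [
  [SketchSignature.new(10, 'aaa'), 'page-1.jpg'], # carried over from version 1
  [SketchSignature.new(25, 'bbb'), 'page-2.jpg']  # content not seen before
]

additions, duplicates = partition_by_catalog(index, incoming)
puts additions.map(&:last).inspect  # ["page-2.jpg"]
puts duplicates.map(&:last).inspect # ["page-1.jpg"]
```

Only the files in `additions` would need to be stored for the new version; the duplicates can be located through their existing catalog entries.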

Data Model

Examples:

See Also:

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(opts = {}) ⇒ SignatureCatalog

Returns a new instance of SignatureCatalog.



# File 'lib/moab/signature_catalog.rb', line 35

def initialize(opts={})
  @entries = Array.new
  @signature_hash = OrderedHash.new
  super(opts)
end

Instance Attribute Details

#block_count ⇒ Integer

Returns the total disk usage (in 1 kB blocks) of all data files (estimating du -k result) (dynamically calculated).

Returns:

  • (Integer)

    The total disk usage (in 1 kB blocks) of all data files (estimating du -k result) (dynamically calculated)



# File 'lib/moab/signature_catalog.rb', line 84

attribute :block_count, Integer, :tag => 'blockCount', :on_save => Proc.new {|t| t.to_s}
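
As a rough illustration of how a du -k style estimate can be derived from file sizes, the sketch below rounds each file up to a whole number of 1 kB blocks and sums the results. This rounding rule is an assumption made for the example; the page does not show how Moab computes the value, and `estimate_block_count` is a hypothetical helper.

```ruby
# Assumed block size for the estimate: 1 kB, matching the "1 kB blocks" wording.
SKETCH_BLOCK_SIZE = 1024

# Sum the number of whole blocks each file would occupy, rounding up per file.
def estimate_block_count(file_sizes)
  file_sizes.sum { |size| (size + SKETCH_BLOCK_SIZE - 1) / SKETCH_BLOCK_SIZE }
end

puts estimate_block_count([1, 1024, 1025]) # 1 + 1 + 2 = 4
```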

#byte_count ⇒ Integer

Returns the total size (in bytes) of all data files (dynamically calculated).

Returns:

  • (Integer)

    The total size (in bytes) of all data files (dynamically calculated)



# File 'lib/moab/signature_catalog.rb', line 76

attribute :byte_count, Integer, :tag => 'byteCount', :on_save => Proc.new {|t| t.to_s}

#catalog_datetime ⇒ Time

Returns the datetime at which the catalog was updated.

Returns:

  • (Time)

    The datetime at which the catalog was updated



# File 'lib/moab/signature_catalog.rb', line 56

attribute :catalog_datetime, Time, :tag => 'catalogDatetime', :on_save => Proc.new {|t| t.to_s}

#digital_object_id ⇒ String

Returns the object ID (druid).

Returns:

  • (String)

    The object ID (druid)



# File 'lib/moab/signature_catalog.rb', line 43

attribute :digital_object_id, String, :tag => 'objectId'

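#entries ⇒ Array<SignatureCatalogEntry>

Returns the set of catalog entries, one for each distinct file signature found in the object’s versions.

Returns:

  • (Array<SignatureCatalogEntry>)

    The set of catalog entries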



# File 'lib/moab/signature_catalog.rb', line 98

has_many :entries, SignatureCatalogEntry, :tag => 'entry'

#file_count ⇒ Integer

Returns the total number of data files (dynamically calculated).

Returns:

  • (Integer)

    The total number of data files (dynamically calculated)



# File 'lib/moab/signature_catalog.rb', line 68

attribute :file_count, Integer, :tag => 'fileCount', :on_save => Proc.new {|t| t.to_s}

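#signature_hash ⇒ OrderedHash

Returns an index having FileSignature objects as keys and Moab::SignatureCatalogEntry objects as values.

Returns:

  • (OrderedHash)

    An index having FileSignature objects as keys and Moab::SignatureCatalogEntry objects as values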



# File 'lib/moab/signature_catalog.rb', line 107

def signature_hash
  @signature_hash
end

#version_id ⇒ Integer

Returns the ordinal version number.

Returns:

  • (Integer)

    The ordinal version number



# File 'lib/moab/signature_catalog.rb', line 47

attribute :version_id, Integer, :tag => 'versionId', :key => true, :on_save => Proc.new {|n| n.to_s}

Instance Method Details

#add_entry(entry) ⇒ void

This method returns an undefined value.

Adds a new entry to the catalog and to the #signature_hash index.

Parameters:

  • entry (SignatureCatalogEntry)

    The new catalog entry



# File 'lib/moab/signature_catalog.rb', line 112

def add_entry(entry)
  @signature_hash[entry.signature] = entry
  entries << entry
end

#catalog_filepath(file_signature) ⇒ String

Returns the object-relative path of the file having the specified signature.

Parameters:

  • file_signature (FileSignature)

    The signature of the file whose path is sought

Returns:

  • (String)

    The object-relative path of the file having the specified signature

Raises:

  • (FileNotFoundException)



# File 'lib/moab/signature_catalog.rb', line 119

def catalog_filepath(file_signature)
  catalog_entry = @signature_hash[file_signature]
  raise FileNotFoundException, "catalog entry not found for #{file_signature.fixity.inspect} in #{@digital_object_id} - #{@version_id}" if catalog_entry.nil?
  catalog_entry.storage_path
end
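
The lookup-or-raise behaviour can be sketched in isolation. The names below (`SketchFileNotFound`, `LookupSignature`, `LookupEntry`, `sketch_catalog_filepath`) are hypothetical stand-ins for the Moab classes, and the storage path string is an illustrative value only.

```ruby
# Hypothetical stand-ins for FileNotFoundException, FileSignature and
# SignatureCatalogEntry; not Moab's own classes.
class SketchFileNotFound < StandardError; end
LookupSignature = Struct.new(:size, :md5)
LookupEntry = Struct.new(:signature, :storage_path)

# Return the stored path for a signature, or raise if the index has no entry.
def sketch_catalog_filepath(index, signature)
  entry = index[signature]
  raise SketchFileNotFound, "catalog entry not found for #{signature.to_a.inspect}" if entry.nil?
  entry.storage_path
end

sig = LookupSignature.new(10, 'aaa')
index = { sig => LookupEntry.new(sig, 'v0001/data/content/page-1.jpg') }

puts sketch_catalog_filepath(index, LookupSignature.new(10, 'aaa'))

begin
  sketch_catalog_filepath(index, LookupSignature.new(99, 'zzz'))
rescue SketchFileNotFound => e
  puts "missing: #{e.message}"
end
```

Raising on a missing entry, rather than returning nil, surfaces an inconsistency between an inventory and the catalog at the point of lookup.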

#composite_key ⇒ String

Returns the unique identifier concatenating digital object id with version id.

Returns:

  • (String)

    The unique identifier concatenating digital object id with version id



# File 'lib/moab/signature_catalog.rb', line 50

def composite_key
  @digital_object_id + '-' + StorageObject.version_dirname(@version_id)
end
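
For illustration, the sketch below shows the shape such a key could take. It assumes that StorageObject.version_dirname produces a zero-padded directory name such as v0003; that format is an assumption not confirmed on this page, and both helper names are hypothetical.

```ruby
# Assumption: version directories are named "v" plus a four-digit,
# zero-padded version number. This stands in for StorageObject.version_dirname.
def sketch_version_dirname(version_id)
  format('v%04d', version_id)
end

# Join the object id and the version directory name with a hyphen.
def sketch_composite_key(digital_object_id, version_id)
  "#{digital_object_id}-#{sketch_version_dirname(version_id)}"
end

puts sketch_composite_key('druid:ab123cd4567', 3) # druid:ab123cd4567-v0003
```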

#normalize_group_signatures(group, group_pathname = nil) ⇒ void

This method returns an undefined value.

Inspects and upgrades the group’s signature data to include all desired checksums.

Parameters:

  • group (FileGroup)

    A group of the files from a file inventory

  • group_pathname (Pathname) (defaults to: nil)

    The location of the directory containing the group’s files



# File 'lib/moab/signature_catalog.rb', line 128

def normalize_group_signatures(group, group_pathname=nil)
  unless  group_pathname.nil?
    group_pathname = Pathname(group_pathname)
    raise "Could not locate #{group_pathname}" unless group_pathname.exist?
  end
  group.files.each do |file|
    unless file.signature.complete?
      if @signature_hash.has_key?(file.signature)
        file.signature = @signature_hash.find {|k,v| k == file.signature}[0]
      elsif group_pathname
        file_pathname = group_pathname.join(file.instances[0].path)
        file.signature = file.signature.normalized_signature(file_pathname)
      end
    end
  end
end
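
The catalog branch of this method relies on an incomplete signature being treated as equal to a stored, complete one. The sketch below shows one way that can work, assuming equality is decided by size plus whichever checksums both signatures carry, and that the hash value depends on size alone. `PartialSignature` and `upgrade_signature` are hypothetical; Moab's FileSignature may define these rules differently.

```ruby
# Hypothetical stand-in for Moab::FileSignature. The equality and hash rules
# here are assumptions made for this sketch.
class PartialSignature
  attr_reader :size, :md5, :sha1

  def initialize(size:, md5: nil, sha1: nil)
    @size = size
    @md5 = md5
    @sha1 = sha1
  end

  # Complete when every checksum is present.
  def complete?
    !(md5.nil? || sha1.nil?)
  end

  # Equal when sizes match and every checksum present on both sides agrees.
  def eql?(other)
    return false unless size == other.size

    shared = %i[md5 sha1].select { |name| send(name) && other.send(name) }
    !shared.empty? && shared.all? { |name| send(name) == other.send(name) }
  end
  alias == eql?

  # Hash on size only, so partial and complete signatures share a bucket.
  def hash
    size.hash
  end
end

# Swap an incomplete signature for the complete key already in the index.
def upgrade_signature(index, signature)
  return signature if signature.complete? || !index.key?(signature)

  index.find { |key, _entry| key == signature }[0]
end

full = PartialSignature.new(size: 10, md5: 'aaa', sha1: 'bbb')
index = { full => 'v0001/data/content/page-1.jpg' }
upgraded = upgrade_signature(index, PartialSignature.new(size: 10, md5: 'aaa'))
puts upgraded.complete? # true
puts upgraded.sha1      # bbb
```

Reusing the stored key avoids recomputing checksums from disk, which is why the method only reads the file when the catalog has no matching signature.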

#summary_fields ⇒ Array<String>

Returns the data fields to include in summary reports.

Returns:

  • (Array<String>)

    The data fields to include in summary reports



# File 'lib/moab/signature_catalog.rb', line 92

def summary_fields
  %w{digital_object_id version_id catalog_datetime file_count byte_count block_count}
end

#update(version_inventory, data_pathname) ⇒ void

This method returns an undefined value.

Compares the FileSignature entries in the new version’s FileInventory against the signatures in this catalog and creates new Moab::SignatureCatalogEntry additions to the catalog.

Examples:

Parameters:

  • version_inventory (FileInventory)

    The complete inventory of the files comprising a digital object version

  • data_pathname (Pathname)

    The location of the object’s data directory



# File 'lib/moab/signature_catalog.rb', line 151

def update(version_inventory, data_pathname)
  version_inventory.groups.each do |group|
    group.files.each do |file|
      unless @signature_hash.has_key?(file.signature)
        entry = SignatureCatalogEntry.new
        entry.version_id = version_inventory.version_id
        entry.group_id = group.group_id
        entry.path = file.instances[0].path
        if file.signature.complete?
          entry.signature = file.signature
        else
          file_pathname = data_pathname.join(group.group_id,entry.path)
          entry.signature = file.signature.normalized_signature(file_pathname)
        end
        add_entry(entry)
      end
    end
  end
  @version_id = version_inventory.version_id
  @catalog_datetime = Time.now
end
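
A simplified sketch of the same flow, run over two successive versions, shows that each entry keeps the version in which its content first appeared. It uses plain Hashes in place of FileInventory and FileGroup, assumes every signature is already complete (so the normalization step is omitted), and all names (`UpdSignature`, `UpdEntry`, `SketchCatalog`) are hypothetical.

```ruby
# Hypothetical stand-ins; signatures compare by value via Struct.
UpdSignature = Struct.new(:size, :md5)
UpdEntry = Struct.new(:version_id, :group_id, :path, :signature)

class SketchCatalog
  attr_reader :entries, :version_id

  def initialize
    @entries = []
    @index = {}
    @version_id = 0
  end

  # inventory has the shape { group_id => { path => signature } }.
  # Only signatures not yet in the index produce new entries.
  def update(version_id, inventory)
    inventory.each do |group_id, files|
      files.each do |path, signature|
        next if @index.key?(signature)

        entry = UpdEntry.new(version_id, group_id, path, signature)
        @index[signature] = entry
        @entries << entry
      end
    end
    @version_id = version_id
  end
end

v1 = { 'content' => { 'a.txt' => UpdSignature.new(5, 'm1') } }
v2 = { 'content' => { 'a.txt' => UpdSignature.new(5, 'm1'),
                      'b.txt' => UpdSignature.new(7, 'm2') } }

catalog = SketchCatalog.new
catalog.update(1, v1)
catalog.update(2, v2)
puts catalog.entries.map { |e| [e.version_id, e.path] }.inspect
# [[1, "a.txt"], [2, "b.txt"]]
```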

#version_additions(version_inventory) ⇒ FileInventory

Returns a filtered copy of the input inventory containing only those files that were added in this version.

Examples:

Parameters:

  • version_inventory (FileInventory)

    The complete inventory of the files comprising a digital object version

Returns:

  • (FileInventory)

    A filtered copy of the input inventory containing only those files that were added in this version



# File 'lib/moab/signature_catalog.rb', line 178

def version_additions(version_inventory)
  version_additions = FileInventory.new(:type=>'additions')
  version_additions.copy_ids(version_inventory)
  version_inventory.groups.each do |group|
    group_addtions = FileGroup.new(:group_id => group.group_id)
    group.files.each do |file|
      unless @signature_hash.has_key?(file.signature)
        group_addtions.add_file_instance(file.signature,file.instances[0])
      end
    end
    version_additions.groups << group_addtions if group_addtions.files.size > 0
  end
  version_additions
end
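
The filtering can be sketched with plain Hashes standing in for FileInventory and FileGroup. As in the method above, groups that contribute no new files are left out of the result. `AddSignature` and `sketch_version_additions` are hypothetical names, and the stored path is an illustrative value.

```ruby
# Hypothetical stand-in for Moab::FileSignature; compares by value.
AddSignature = Struct.new(:size, :md5)

# known:     Hash whose keys are signatures already in the catalog
# inventory: { group_id => { path => signature } }
# Returns the same shape, keeping only files with unseen signatures and
# dropping groups that end up empty.
def sketch_version_additions(known, inventory)
  inventory.each_with_object({}) do |(group_id, files), additions|
    added = files.reject { |_path, signature| known.key?(signature) }
    additions[group_id] = added unless added.empty?
  end
end

known = { AddSignature.new(5, 'm1') => 'v0001/data/content/a.txt' }
inventory = {
  'content'  => { 'a.txt' => AddSignature.new(5, 'm1'),
                  'b.txt' => AddSignature.new(7, 'm2') },
  'metadata' => { 'copy.txt' => AddSignature.new(5, 'm1') }
}

additions = sketch_version_additions(known, inventory)
puts additions.keys.inspect            # ["content"]
puts additions['content'].keys.inspect # ["b.txt"]
```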