Class: Moab::SignatureCatalog
- Inherits: Manifest (Object < Manifest < Moab::SignatureCatalog)
- Includes: HappyMapper
- Defined in: lib/moab/signature_catalog.rb
Overview
Copyright © 2012 by The Board of Trustees of the Leland Stanford Junior University. All rights reserved. See LICENSE for details.
A digital object’s Signature Catalog is derived from a filtered aggregation of the file inventories of the object’s set of versions (see #update). It has an entry for every file (identified by FileSignature) found in any of the versions, along with a record of the SDR storage location that was used to preserve a single instance of that file. Once the catalog has been populated, it has multiple uses:
- The signature index is used to determine which files of a newly submitted object version are new additions and which are duplicates of files previously ingested (see #version_additions). When a new version contains a mixture of added files and files carried over from the previous version, we only need to store the files from the new version that have unique file signatures.
- Reconstruction of an object version (see Moab::StorageObject#reconstruct_version) requires a combination of a full version’s FileInventory and the SignatureCatalog.
- The catalog can also be used for performing consistency checks between manifest files and storage.
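The first use can be illustrated with a minimal sketch. It assumes a plain Hash keyed by a simplified signature (size plus one checksum) and a hypothetical `ingest` helper; none of these names come from Moab, whose actual logic lives in #update and #version_additions.

```ruby
# Simplified stand-in for a file signature: size plus a single checksum.
# Struct instances with equal members are eql? and hash identically,
# so they can serve as Hash keys.
Signature = Struct.new(:size, :md5)

# Hypothetical helper: records a storage path for every signature that is not
# yet in the catalog and returns the paths that actually had to be stored.
def ingest(catalog, version_id, files)
  stored = []
  files.each do |path, signature|
    next if catalog.key?(signature) # duplicate of a previously ingested file
    catalog[signature] = "v#{version_id}/#{path}"
    stored << path
  end
  stored
end

catalog = {}
v1 = { 'a.txt' => Signature.new(10, 'aaa'), 'b.txt' => Signature.new(20, 'bbb') }
v2 = { 'a.txt' => Signature.new(10, 'aaa'), 'c.txt' => Signature.new(30, 'ccc') }

added_v1 = ingest(catalog, 1, v1)
added_v2 = ingest(catalog, 2, v2)
puts added_v1.inspect
puts added_v2.inspect
```

In this sketch the second version only stores `c.txt`, because the signature of `a.txt` is already indexed and still points at its version 1 location.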
Data Model
- SignatureCatalog = lookup table containing a cumulative collection of all files ever ingested
  - SignatureCatalogEntry [1..*] = a row in the lookup table containing storage information about a single file
    - FileSignature [1] = file fixity information
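The containment described above could be modelled roughly as follows. This is an illustrative sketch using plain Structs, not the HappyMapper-backed classes: the entry fields (version_id, group_id, path, signature) mirror the assignments shown in the #update listing, while the checksum fields on the signature and all `Sketch*` names are assumptions.

```ruby
# Fixity information for one file (the checksum fields are assumed for illustration).
SketchSignature = Struct.new(:size, :md5, :sha1)

# One row of the lookup table: where a single stored instance of the file lives.
SketchEntry = Struct.new(:version_id, :group_id, :path, :signature)

# The catalog: a cumulative list of entries plus an index from signature to entry.
SketchCatalog = Struct.new(:digital_object_id, :version_id, :entries, :signature_hash)

sig = SketchSignature.new(1024, 'md5-placeholder', 'sha1-placeholder')
entry = SketchEntry.new(1, 'content', 'page-1.jpg', sig)
sketch_catalog = SketchCatalog.new('druid:placeholder', 1, [entry], { sig => entry })

puts sketch_catalog.signature_hash[sig].path
```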
Instance Attribute Summary
- #block_count ⇒ Integer
  The total disk usage (in 1 kB blocks) of all data files (estimating the du -k result) (dynamically calculated).
- #byte_count ⇒ Integer
  The total size (in bytes) of all data files (dynamically calculated).
- #catalog_datetime ⇒ Time
  The datetime at which the catalog was updated.
- #digital_object_id ⇒ String
  The object ID (druid).
- #entries ⇒ Array<SignatureCatalogEntry>
  The set of entries comprising the catalog.
- #file_count ⇒ Integer
  The total number of data files (dynamically calculated).
- #signature_hash ⇒ OrderedHash
  An index having FileSignature objects as keys and SignatureCatalogEntry objects as values.
- #version_id ⇒ Integer
  The ordinal version number.
Instance Method Summary
- #add_entry(entry) ⇒ void
  Adds a new entry to the catalog and to the #signature_hash index.
- #catalog_filepath(file_signature) ⇒ String
  The object-relative path of the file having the specified signature.
- #composite_key ⇒ String
  The unique identifier concatenating digital object id with version id.
- #initialize(opts = {}) ⇒ SignatureCatalog (constructor)
  A new instance of SignatureCatalog.
- #normalize_group_signatures(group, group_pathname = nil) ⇒ void
  Inspects and upgrades the group’s signature data to include all desired checksums.
- #summary_fields ⇒ Array<String>
  The data fields to include in summary reports.
- #update(version_inventory, data_pathname) ⇒ void
  Compares the FileSignature entries in the new version’s FileInventory against the signatures in this catalog and creates new SignatureCatalogEntry additions to the catalog.
- #version_additions(version_inventory) ⇒ FileInventory
  Returns a filtered copy of the input inventory containing only those files that were added in this version.
Constructor Details
#initialize(opts = {}) ⇒ SignatureCatalog
Returns a new instance of SignatureCatalog.
    # File 'lib/moab/signature_catalog.rb', line 35
    def initialize(opts={})
      @entries = Array.new
      @signature_hash = OrderedHash.new
      super(opts)
    end
Instance Attribute Details
#block_count ⇒ Integer
Returns the total disk usage (in 1 kB blocks) of all data files (estimating the du -k result) (dynamically calculated).
    # File 'lib/moab/signature_catalog.rb', line 84
    attribute :block_count, Integer, :tag => 'blockCount', :on_save => Proc.new {|t| t.to_s}
#byte_count ⇒ Integer
Returns the total size (in bytes) of all data files (dynamically calculated).
    # File 'lib/moab/signature_catalog.rb', line 76
    attribute :byte_count, Integer, :tag => 'byteCount', :on_save => Proc.new {|t| t.to_s}
#catalog_datetime ⇒ Time
Returns the datetime at which the catalog was updated.
    # File 'lib/moab/signature_catalog.rb', line 56
    attribute :catalog_datetime, Time, :tag => 'catalogDatetime', :on_save => Proc.new {|t| t.to_s}
#digital_object_id ⇒ String
Returns the object ID (druid).
    # File 'lib/moab/signature_catalog.rb', line 43
    attribute :digital_object_id, String, :tag => 'objectId'
#entries ⇒ Array<SignatureCatalogEntry>
Returns the set of entries comprising the catalog.
    # File 'lib/moab/signature_catalog.rb', line 98
    has_many :entries, SignatureCatalogEntry, :tag => 'entry'
#file_count ⇒ Integer
Returns the total number of data files (dynamically calculated).
    # File 'lib/moab/signature_catalog.rb', line 68
    attribute :file_count, Integer, :tag => 'fileCount', :on_save => Proc.new {|t| t.to_s}
#signature_hash ⇒ OrderedHash
Returns an index having FileSignature objects as keys and Moab::SignatureCatalogEntry objects as values.
    # File 'lib/moab/signature_catalog.rb', line 107
    def signature_hash
      @signature_hash
    end
#version_id ⇒ Integer
Returns the ordinal version number.
    # File 'lib/moab/signature_catalog.rb', line 47
    attribute :version_id, Integer, :tag => 'versionId', :key => true, :on_save => Proc.new {|n| n.to_s}
Instance Method Details
#add_entry(entry) ⇒ void
This method returns an undefined value.
Adds a new entry to the catalog and to the #signature_hash index.
    # File 'lib/moab/signature_catalog.rb', line 112
    def add_entry(entry)
      @signature_hash[entry.signature] = entry
      entries << entry
    end
#catalog_filepath(file_signature) ⇒ String
Returns the object-relative path of the file having the specified signature.
    # File 'lib/moab/signature_catalog.rb', line 119
    def catalog_filepath(file_signature)
      catalog_entry = @signature_hash[file_signature]
      raise FileNotFoundException, "catalog entry not found for #{file_signature.fixity.inspect} in #{@digital_object_id} - #{@version_id}" if catalog_entry.nil?
      catalog_entry.storage_path
    end
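The lookup-or-raise pattern used by #catalog_filepath can be tried in isolation with the sketch below. The `LookupEntry` struct, the `SketchFileNotFound` error, and especially the `v0001/data/<group>/<path>` layout returned by `storage_path` are assumptions made for illustration; the real path is produced by SignatureCatalogEntry#storage_path, which is not shown in this listing.

```ruby
# Hypothetical error class standing in for FileNotFoundException.
class SketchFileNotFound < StandardError; end

LookupEntry = Struct.new(:version_id, :group_id, :path) do
  # Assumed layout: <zero-padded version dir>/data/<group>/<path>
  def storage_path
    File.join(format('v%04d', version_id), 'data', group_id, path)
  end
end

# Mirrors the shape of catalog_filepath: look up the entry, fail loudly if absent.
def sketch_filepath(index, signature)
  entry = index[signature]
  raise SketchFileNotFound, "catalog entry not found for #{signature.inspect}" if entry.nil?
  entry.storage_path
end

index = { 'sig-1' => LookupEntry.new(1, 'content', 'page-1.jpg') }
found = sketch_filepath(index, 'sig-1')
missing = begin
  sketch_filepath(index, 'sig-2')
rescue SketchFileNotFound => e
  e.message
end
puts found
puts missing
```

Raising rather than returning nil means a reconstruction that depends on a missing entry stops immediately instead of producing an incomplete version.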
#composite_key ⇒ String
Returns the unique identifier concatenating digital object id with version id.
    # File 'lib/moab/signature_catalog.rb', line 50
    def composite_key
      @digital_object_id + '-' + StorageObject.version_dirname(@version_id)
    end
#normalize_group_signatures(group, group_pathname = nil) ⇒ void
This method returns an undefined value.
Inspects and upgrades the group’s signature data to include all desired checksums.
    # File 'lib/moab/signature_catalog.rb', line 128
    def normalize_group_signatures(group, group_pathname=nil)
      unless group_pathname.nil?
        group_pathname = Pathname(group_pathname)
        raise "Could not locate #{group_pathname}" unless group_pathname.exist?
      end
      group.files.each do |file|
        unless file.signature.complete?
          if @signature_hash.has_key?(file.signature)
            file.signature = @signature_hash.find {|k,v| k == file.signature}[0]
          elsif group_pathname
            file_pathname = group_pathname.join(file.instances[0].path)
            file.signature = file.signature.normalized_signature(file_pathname)
          end
        end
      end
    end
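The `find {|k,v| k == file.signature}[0]` step only makes sense if an incomplete signature can match a complete one stored as a key in the index. The sketch below shows one way such matching could work; the `PartialSignature` class, its checksum names, and its equality rule (compare only the checksums both sides carry) are assumptions for illustration and are not taken from the FileSignature source.

```ruby
# Hypothetical signature whose equality only considers shared checksum types.
class PartialSignature
  attr_reader :size, :checksums

  def initialize(size, checksums)
    @size = size
    @checksums = checksums
  end

  # "Complete" here means both assumed checksum types are present.
  def complete?
    %i[md5 sha1].all? { |type| checksums.key?(type) }
  end

  def eql?(other)
    return false unless other.is_a?(PartialSignature) && size == other.size
    shared = checksums.keys & other.checksums.keys
    !shared.empty? && shared.all? { |type| checksums[type] == other.checksums[type] }
  end
  alias_method :==, :eql?

  # Hash on size only, so partial and complete signatures share a bucket.
  def hash
    size.hash
  end
end

complete = PartialSignature.new(10, { md5: 'aaa', sha1: 'bbb' })
partial  = PartialSignature.new(10, { md5: 'aaa' })
index    = { complete => 'stored entry' }

# The lookup succeeds with the partial signature; find then returns the stored
# (complete) key, which can replace the partial one.
upgraded = index.key?(partial) ? index.find { |k, _v| k == partial }[0] : partial
puts upgraded.complete?
```

Taking the key from the index, rather than keeping the caller's object, is what upgrades the signature: both compare as equal, but only the stored key carries every checksum.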
#summary_fields ⇒ Array<String>
Returns the data fields to include in summary reports.
    # File 'lib/moab/signature_catalog.rb', line 92
    def summary_fields
      %w{digital_object_id version_id catalog_datetime file_count byte_count block_count}
    end
#update(version_inventory, data_pathname) ⇒ void
This method returns an undefined value.
Compares the FileSignature entries in the new version’s FileInventory against the signatures in this catalog and creates new Moab::SignatureCatalogEntry additions to the catalog.
    # File 'lib/moab/signature_catalog.rb', line 151
    def update(version_inventory, data_pathname)
      version_inventory.groups.each do |group|
        group.files.each do |file|
          unless @signature_hash.has_key?(file.signature)
            entry = SignatureCatalogEntry.new
            entry.version_id = version_inventory.version_id
            entry.group_id = group.group_id
            entry.path = file.instances[0].path
            if file.signature.complete?
              entry.signature = file.signature
            else
              file_pathname = data_pathname.join(group.group_id,entry.path)
              entry.signature = file.signature.normalized_signature(file_pathname)
            end
            add_entry(entry)
          end
        end
      end
      @version_id = version_inventory.version_id
      @catalog_datetime = Time.now
    end
#version_additions(version_inventory) ⇒ FileInventory
Returns a filtered copy of the input inventory containing only those files that were added in this version.
    # File 'lib/moab/signature_catalog.rb', line 178
    def version_additions(version_inventory)
      version_additions = FileInventory.new(:type=>'additions')
      version_additions.copy_ids(version_inventory)
      version_inventory.groups.each do |group|
        group_addtions = FileGroup.new(:group_id => group.group_id)
        group.files.each do |file|
          unless @signature_hash.has_key?(file.signature)
            group_addtions.add_file_instance(file.signature,file.instances[0])
          end
        end
        version_additions.groups << group_addtions if group_addtions.files.size > 0
      end
      version_additions
    end
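The group-level filtering performed by #version_additions, including the omission of groups that contribute nothing new, can be sketched with plain Hashes. The nested `{ group_id => { path => signature } }` shape, the string signatures, and the `sketch_additions` name are hypothetical simplifications of FileInventory and FileGroup.

```ruby
# Hypothetical inventory shape: { group_id => { path => signature } }.
def sketch_additions(known_signatures, inventory)
  additions = {}
  inventory.each do |group_id, files|
    # Keep only files whose signature has not been seen before.
    new_files = files.reject { |_path, signature| known_signatures.include?(signature) }
    # Groups with nothing new are left out of the result entirely.
    additions[group_id] = new_files unless new_files.empty?
  end
  additions
end

known = ['sig-a', 'sig-b']
inventory = {
  'content'  => { 'a.txt' => 'sig-a', 'c.txt' => 'sig-c' },
  'metadata' => { 'b.xml' => 'sig-b' }
}
result = sketch_additions(known, inventory)
puts result.inspect
```

Here the `metadata` group disappears from the result because its only file is already catalogued, which matches the `group_addtions.files.size > 0` guard in the listing above.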