Class: Moab::FileGroup
- Inherits: Serializer::Serializable
- Inheritance chain: Object → Serializer::Serializable → Moab::FileGroup
- Includes: HappyMapper
- Defined in: lib/moab/file_group.rb
Overview
Copyright © 2012 by The Board of Trustees of the Leland Stanford Junior University. All rights reserved. See LICENSE for details.
A container for a standard subset of a digital object's FileManifestation objects. Used to segregate depositor content from repository metadata files. This is a child element of FileInventory, which contains a full example.
Data Model
- FileInventory = container for recording information about a collection of related files
  - FileGroup [1..*] = subset allowing segregation of content and metadata files
    - FileManifestation [1..*] = snapshot of a file's filesystem characteristics
      - FileSignature [1] = file fixity information
      - FileInstance [1..*] = filepath and timestamp of any physical file having that signature
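The containment hierarchy above can be sketched with plain Ruby structs. This is only an illustration of the cardinalities: the struct names, fields, and sample values below are hypothetical stand-ins, not the Moab classes themselves.

```ruby
# Hypothetical stand-ins for the Moab classes (illustration only).
Signature     = Struct.new(:size, :digest)         # file fixity information
Instance      = Struct.new(:path, :datetime)       # one physical copy of a file
Manifestation = Struct.new(:signature, :instances) # 1 signature, 1..* instances
Group         = Struct.new(:group_id, :files)      # 1..* manifestations
Inventory     = Struct.new(:groups)                # 1..* groups

# Two physical files with identical content share one manifestation.
logo = Manifestation.new(
  Signature.new(12, 'placeholder-digest-1'),
  [Instance.new('images/logo.png', '2012-01-01T00:00:00Z'),
   Instance.new('backup/logo.png', '2012-01-02T00:00:00Z')]
)
desc = Manifestation.new(
  Signature.new(34, 'placeholder-digest-2'),
  [Instance.new('descMetadata.xml', '2012-01-01T00:00:00Z')]
)

inventory = Inventory.new([Group.new('content', [logo]),
                           Group.new('metadata', [desc])])

# List every file path, keyed by the group that holds it.
paths_by_group = inventory.groups.to_h do |group|
  [group.group_id, group.files.flat_map { |file| file.instances.map(&:path) }]
end
```

Keeping content and metadata in separate groups means each can be listed or compared on its own, which is the segregation the overview describes.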
Instance Attribute Summary
- #base_directory ⇒ Object
  Returns the value of attribute base_directory.
- #block_count ⇒ Integer
  The total disk usage (in 1 kB blocks) of all data files, estimating the result of du -k (dynamically calculated).
- #byte_count ⇒ Integer
  The total size (in bytes) of all data files (dynamically calculated).
- #data_source ⇒ String
  The directory location or other source of this group's file data.
- #file_count ⇒ Integer
  The total number of data files (dynamically calculated).
- #files ⇒ Array<FileManifestation>
  The set of files comprising the group.
- #group_id ⇒ String
  The name of the file group.
- #signature_hash ⇒ Hash<FileSignature, FileManifestation>
  The actual in-memory store for the collection of FileManifestation objects that are contained in this file group.
Instance Method Summary
- #add_file(manifestation) ⇒ void
  Add a single FileManifestation object to this group.
- #add_file_instance(signature, instance) ⇒ void
  Add a single FileSignature, FileInstance key/value pair to this group.
- #add_physical_file(pathname, _validated = nil) ⇒ void
  Add a single physical file's data to the array of files in this group.
- #group_from_bagit_subdir(directory, signatures_from_bag, recursive = true) ⇒ FileGroup
  Harvest a directory (using a digest hash for fixity data) and add all files to the file group.
- #group_from_directory(directory, recursive = true) ⇒ FileGroup
  Harvest a directory and add all files to the file group.
- #harvest_directory(path, recursive, validated = nil) ⇒ void
  Traverse a directory tree and add all files to the file group. Note that unlike Find.find and Dir.glob, Pathname passes through symbolic links.
- #initialize(opts = {}) ⇒ FileGroup (constructor)
  A new instance of FileGroup.
- #is_descendent_of_base?(pathname) ⇒ Boolean
  Test whether the given pathname is a descendent of #base_directory. FIXME: shouldn't this method be named descendent_of_base?
- #path_hash ⇒ Hash<String,FileSignature>
  An index of file paths, used to test for existence of a filename in this file group.
- #path_hash_subset(signature_subset) ⇒ Hash<String,FileSignature>
  A hash of pathname to signature containing a subset of the filenames in this file group, e.g., …
- #path_list ⇒ Array<String>
  The list of file paths in this group.
- #remove_file_having_path(path) ⇒ void
  Remove the file having the given path from this group; for example, the manifest inventory does not contain a file entry for itself.
- #summary_fields ⇒ Array<String>
  The data fields to include in summary reports.
Methods inherited from Serializer::Serializable
#array_to_hash, deep_diff, #diff, #key, #key_name, #summary, #to_hash, #to_json, #to_yaml, #variable_names, #variables
Constructor Details
#initialize(opts = {}) ⇒ FileGroup
Returns a new instance of FileGroup.
    # File 'lib/moab/file_group.rb', line 24
    def initialize(opts = {})
      @signature_hash = {}
      @data_source = ''
      @signatures_from_bag = nil # prevents later warning: instance variable @signatures_from_bag not initialized
      super(opts)
    end
Instance Attribute Details
#base_directory ⇒ Object
Returns the value of attribute base_directory.
    # File 'lib/moab/file_group.rb', line 165
    def base_directory
      @base_directory
    end
#block_count ⇒ Integer
Returns the total disk usage (in 1 kB blocks) of all data files, estimating the result of du -k (dynamically calculated).

    # File 'lib/moab/file_group.rb', line 57
    attribute :block_count, Integer, tag: 'blockCount', on_save: proc { |i| i.to_s }
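How the estimate is computed is not shown on this page. One plausible approach, given here purely as an assumption, is to round each file's byte size up to whole 1024-byte blocks and sum the results. The constant and helper names below are hypothetical.

```ruby
# Assumed block size: 1 kB, matching the "1 kB blocks" wording above.
BLOCK_SIZE = 1024

# Hypothetical helper: number of whole blocks needed to hold byte_size bytes.
# Integer arithmetic rounds up without going through floats.
def blocks_for(byte_size)
  (byte_size + BLOCK_SIZE - 1) / BLOCK_SIZE
end

sizes = [0, 1, 1024, 1025, 4096] # made-up file sizes in bytes

total_bytes  = sizes.sum
total_blocks = sizes.sum { |size| blocks_for(size) }
```

A real du -k reports filesystem allocation, which depends on the filesystem's own block size, so a figure computed this way is an approximation only.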
#byte_count ⇒ Integer
Returns the total size (in bytes) of all data files (dynamically calculated).

    # File 'lib/moab/file_group.rb', line 49
    attribute :byte_count, Integer, tag: 'byteCount', on_save: proc { |i| i.to_s }
#data_source ⇒ String
Returns the directory location or other source of this group's file data.

    # File 'lib/moab/file_group.rb', line 37
    attribute :data_source, String, tag: 'dataSource'
#file_count ⇒ Integer
Returns the total number of data files (dynamically calculated).

    # File 'lib/moab/file_group.rb', line 41
    attribute :file_count, Integer, tag: 'fileCount', on_save: proc { |i| i.to_s }
#files ⇒ Array<FileManifestation>
Returns the set of files comprising the group.

    # File 'lib/moab/file_group.rb', line 70
    has_many :files, FileManifestation, tag: 'file'
#group_id ⇒ String
Returns the name of the file group.

    # File 'lib/moab/file_group.rb', line 33
    attribute :group_id, String, tag: 'groupId', key: true
#signature_hash ⇒ Hash<FileSignature, FileManifestation>
Returns the actual in-memory store for the collection of Moab::FileManifestation objects that are contained in this file group.

    # File 'lib/moab/file_group.rb', line 78
    def signature_hash
      @signature_hash
    end
Instance Method Details
#add_file(manifestation) ⇒ void
This method returns an undefined value.
Adds a single Moab::FileManifestation object to this group.

    # File 'lib/moab/file_group.rb', line 131
    def add_file(manifestation)
      manifestation.instances.each do |instance|
        add_file_instance(manifestation.signature, instance)
      end
    end
#add_file_instance(signature, instance) ⇒ void
This method returns an undefined value.
Adds a single Moab::FileSignature, Moab::FileInstance key/value pair to this group. Data is actually stored in the #signature_hash.

    # File 'lib/moab/file_group.rb', line 142
    def add_file_instance(signature, instance)
      manifestation = signature_hash[signature] || begin
        FileManifestation.new.tap do |file_manifestation|
          file_manifestation.signature = signature
          signature_hash[signature] = file_manifestation
        end
      end
      manifestation.instances << instance
    end
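The find-or-create pattern in #add_file_instance can be tried out with plain Ruby objects. The sketch below is not the Moab implementation: Entry, store, and add_instance are hypothetical names, and strings stand in for signatures and instances.

```ruby
# Hypothetical stand-in for a manifestation: one signature, many instances.
Entry = Struct.new(:signature, :instances)

store = {} # signature => Entry, playing the role of signature_hash

# Look up the entry for a signature, creating it on first sight,
# then append the new instance to it.
add_instance = lambda do |signature, instance|
  entry = store[signature] ||= Entry.new(signature, [])
  entry.instances << instance
end

add_instance.call('sig-a', 'a/one.txt')
add_instance.call('sig-a', 'b/copy-of-one.txt') # same content, second location
add_instance.call('sig-b', 'two.txt')
```

Keying the store by signature means that files with identical content collapse into a single entry with several instances, however many paths they appear under.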
#add_physical_file(pathname, _validated = nil) ⇒ void
This method returns an undefined value.
Adds a single physical file's data to the array of files in this group. If fixity data was supplied in bag manifests, then that data is used.

    # File 'lib/moab/file_group.rb', line 233
    def add_physical_file(pathname, _validated = nil)
      pathname = Pathname.new(pathname).expand_path
      instance = FileInstance.new.instance_from_file(pathname, @base_directory)
      if @signatures_from_bag && @signatures_from_bag[pathname]
        signature = @signatures_from_bag[pathname]
        signature = signature.normalized_signature(pathname) unless signature.complete?
      else
        signature = FileSignature.new.signature_from_file(pathname)
      end
      add_file_instance(signature, instance)
    end
#group_from_bagit_subdir(directory, signatures_from_bag, recursive = true) ⇒ FileGroup
Harvests a directory (using a digest hash for fixity data), adds all files to the file group, and returns the group.

    # File 'lib/moab/file_group.rb', line 186
    def group_from_bagit_subdir(directory, signatures_from_bag, recursive = true)
      @signatures_from_bag = signatures_from_bag
      group_from_directory(directory, recursive)
    end
#group_from_directory(directory, recursive = true) ⇒ FileGroup
Harvests a directory, adds all files to the file group, and returns the group.

    # File 'lib/moab/file_group.rb', line 195
    def group_from_directory(directory, recursive = true)
      self.base_directory = directory
      @data_source = @base_directory.to_s
      harvest_directory(directory, recursive)
      self
    rescue Exception # Errno::ENOENT
      @data_source = directory.to_s
      self
    end
#harvest_directory(path, recursive, validated = nil) ⇒ void
This method returns an undefined value.
Traverses a directory tree and adds all files to the file group. Note that unlike Find.find and Dir.glob, Pathname passes through symbolic links.

    # File 'lib/moab/file_group.rb', line 213
    def harvest_directory(path, recursive, validated = nil)
      pathname = Pathname.new(path).expand_path
      validated ||= is_descendent_of_base?(pathname)
      pathname.children.sort.each do |child|
        next if child.basename.to_s == '.DS_Store'

        if child.directory?
          harvest_directory(child, recursive, validated) if recursive
        else
          add_physical_file(child, validated)
        end
      end
      nil
    end
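The traversal pattern (sorted Pathname#children, skipping .DS_Store, optional recursion) can be exercised on a temporary directory using only the standard library. In this sketch collect_files is a hypothetical helper that gathers relative paths rather than building a file group.

```ruby
require 'pathname'
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: collect paths of regular entries under dir,
# relative to base, descending into subdirectories only when recursive.
def collect_files(dir, base, recursive, found = [])
  Pathname.new(dir).children.sort.each do |child|
    next if child.basename.to_s == '.DS_Store'

    if child.directory?
      collect_files(child, base, recursive, found) if recursive
    else
      found << child.relative_path_from(base).to_s
    end
  end
  found
end

harvested_all = harvested_top = nil
Dir.mktmpdir do |tmp|
  base = Pathname.new(tmp).realpath
  FileUtils.mkdir_p(base + 'sub')
  FileUtils.touch([base + 'a.txt', base + '.DS_Store', base + 'sub' + 'b.txt'])

  harvested_all = collect_files(base, base, true)  # whole tree
  harvested_top = collect_files(base, base, false) # top level only
end
```

Sorting the children makes the harvest order deterministic across platforms, since the order returned by the filesystem is not guaranteed.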
#is_descendent_of_base?(pathname) ⇒ Boolean
Tests whether the given pathname is a descendent of #base_directory. Raises MoabRuntimeError if base_directory has not been set, or if the pathname is not a descendent. FIXME: shouldn't this method be named descendent_of_base?

    # File 'lib/moab/file_group.rb', line 171
    def is_descendent_of_base?(pathname)
      raise(MoabRuntimeError, 'base_directory has not been set') if @base_directory.nil?

      is_descendent = false
      pathname.expand_path.ascend { |ancestor| is_descendent ||= (ancestor == @base_directory) }
      # FIXME: shouldn't it simply return false?
      raise(MoabRuntimeError, "#{pathname} is not a descendent of #{@base_directory}") unless is_descendent

      is_descendent
    end
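The ancestor walk relies on Pathname#ascend, which yields the path itself and then each parent up to the root. A minimal sketch of the same test, written as a hypothetical within? helper that returns a plain boolean instead of raising (the variant the inline FIXME hints at), might look like this:

```ruby
require 'pathname'

# Hypothetical helper: true when candidate is base itself or lies beneath it.
# Both paths are expanded first so that '..' segments are resolved.
def within?(base, candidate)
  base = Pathname.new(base).expand_path
  Pathname.new(candidate).expand_path.ascend.any? { |ancestor| ancestor == base }
end

inside   = within?('/data/obj', '/data/obj/content/a.txt')
outside  = within?('/data/obj', '/data/other/a.txt')
lookback = within?('/data/obj', '/data/obj/../other/a.txt') # escapes via '..'
sibling  = within?('/data/obj', '/data/obj2/a.txt')         # shares a string prefix only
same     = within?('/data/obj', '/data/obj')                # ascend yields self first
```

Comparing whole path components avoids the false positive that a naive string prefix check would give for the sibling case.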
#path_hash ⇒ Hash<String,FileSignature>
Returns an index of file paths, used to test for existence of a filename in this file group.

    # File 'lib/moab/file_group.rb', line 83
    def path_hash
      path_hash = {}
      signature_hash.each do |signature, manifestation|
        manifestation.instances.each do |instance|
          path_hash[instance.path] = signature
        end
      end
      path_hash
    end
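The index is an inversion of the signature store: one signature maps to many paths, so each path maps back to exactly one signature. A small sketch with plain hashes follows; the names and string values are hypothetical stand-ins for the Moab objects.

```ruby
# Stand-in for signature_hash: signature => list of instance paths.
signature_store = {
  'sig-a' => ['a/one.txt', 'b/copy-of-one.txt'],
  'sig-b' => ['two.txt']
}

# Invert it: path => signature.
path_index = signature_store.each_with_object({}) do |(signature, paths), index|
  paths.each { |path| index[path] = signature }
end

known   = path_index.key?('two.txt')     # existence test by path
unknown = path_index.key?('missing.txt')
```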
#path_hash_subset(signature_subset) ⇒ Hash<String,FileSignature>
Returns a hash of pathname to signature containing a subset of the filenames in this file group, e.g., …

    # File 'lib/moab/file_group.rb', line 102
    def path_hash_subset(signature_subset)
      # the structure of the `signature_hash` attr is documented above
      signature_hash
        .filter_map do |signature, manifestation|
          # filters out signatures not in the provided subset
          next unless signature_subset.include?(signature)

          # for each instance in the manifestation, return an array of its path and the signature from the above block
          manifestation.instances.map { |instance| [instance.path, signature] }
        end
        # the nested map operations above return e.g.: [[["intro-1.jpg", #<Moab::FileSignature>],...]]
        # which needs to be flattened one time to convert back into a hash
        .flatten(1)
        .to_h
    end
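The filter_map / flatten(1) / to_h chain is easier to follow with the intermediate value made visible. The following sketch uses plain strings in place of signatures and instances; the data is made up for illustration.

```ruby
# Stand-in for signature_hash: signature => list of instance paths.
store = {
  'sig-a' => ['one.txt'],
  'sig-b' => ['two.txt', 'copy/two.txt'],
  'sig-c' => ['three.txt']
}
subset = %w[sig-b sig-c]

# filter_map drops the nil produced by `next` and keeps the mapped arrays,
# giving one array of [path, signature] pairs per retained signature.
nested = store.filter_map do |signature, paths|
  next unless subset.include?(signature)

  paths.map { |path| [path, signature] }
end

# Flatten exactly one level so the pairs themselves survive, then build a hash.
result = nested.flatten(1).to_h
```

A full flatten would also break up the [path, signature] pairs, which is why the depth argument of 1 matters.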
#path_list ⇒ Array<String>
Returns the list of file paths in this group.

    # File 'lib/moab/file_group.rb', line 94
    def path_list
      files.collect { |file| file.instances.collect(&:path) }.flatten
    end
#remove_file_having_path(path) ⇒ void
This method returns an undefined value.
Removes the file having the given path from this group. For example, the manifest inventory does not contain a file entry for itself.

    # File 'lib/moab/file_group.rb', line 155
    def remove_file_having_path(path)
      signature = path_hash[path]
      signature_hash.delete(signature)
    end
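Because the deletion is keyed by signature rather than by path, removing one path also drops any other path that shares the same signature. The sketch below shows that effect with plain hashes; the names and values are hypothetical.

```ruby
# Stand-in for signature_hash: signature => list of instance paths.
store = {
  'sig-a' => ['a/one.txt', 'b/copy-of-one.txt'],
  'sig-b' => ['two.txt']
}

# Same two steps as the method: look up the signature by path, then delete it.
remove_by_path = lambda do |path|
  index = store.flat_map { |signature, paths| paths.map { |p| [p, signature] } }.to_h
  store.delete(index[path])
end

remove_by_path.call('a/one.txt')
remaining = store.values.flatten

# An unknown path looks up nil, and deleting a missing key leaves the hash unchanged.
remove_by_path.call('no/such/file.txt')
after_unknown = store.values.flatten
```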
#summary_fields ⇒ Array<String>
Returns the data fields to include in summary reports.

    # File 'lib/moab/file_group.rb', line 64
    def summary_fields
      %w[group_id file_count byte_count block_count]
    end