Class: Moab::FileGroup

Inherits:
Serializable
  • Object
Includes:
HappyMapper
Defined in:
lib/moab/file_group.rb

Overview

Note:

Copyright © 2012 by The Board of Trustees of the Leland Stanford Junior University. All rights reserved. See LICENSE for details.

A container for a standard subset of a digital object's FileManifestation objects. Used to segregate depositor content from repository metadata files. This is a child element of FileInventory, which contains a full example.

Data Model

  • FileInventory = container for recording information about a collection of related files

    • FileGroup [1..*] = subset that allows segregation of content and metadata files

      • FileManifestation [1..*] = snapshot of a file’s filesystem characteristics

        • FileSignature [1] = file fixity information

        • FileInstance [1..*] = filepath and timestamp of any physical file having that signature
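
As an informal illustration of the containment hierarchy above, the sketch below models it with plain Ruby structs. The Sketch* names and the sample values are hypothetical stand-ins, not the Moab classes, which also handle XML mapping and fixity.

```ruby
# Hypothetical stand-ins that mirror the containment hierarchy only.
SketchSignature     = Struct.new(:size, :md5)
SketchInstance      = Struct.new(:path, :datetime)
SketchManifestation = Struct.new(:signature, :instances)
SketchGroup         = Struct.new(:group_id, :files)
SketchInventory     = Struct.new(:groups)

# Build a tiny inventory: two groups, each with one manifestation.
def sketch_inventory
  page = SketchManifestation.new(
    SketchSignature.new(12, "made-up-md5-1"),
    [SketchInstance.new("page-1.jpg", "2012-01-01T00:00:00Z"),
     SketchInstance.new("copy/page-1.jpg", "2012-01-02T00:00:00Z")]
  )
  meta = SketchManifestation.new(
    SketchSignature.new(34, "made-up-md5-2"),
    [SketchInstance.new("descMetadata.xml", "2012-01-01T00:00:00Z")]
  )
  SketchInventory.new([SketchGroup.new("content", [page]),
                       SketchGroup.new("metadata", [meta])])
end

# List the paths recorded in each group.
sketch_inventory.groups.each do |group|
  paths = group.files.flat_map { |file| file.instances.map(&:path) }
  puts "#{group.group_id}: #{paths.join(', ')}"
end
```

One signature can own several instances, which is how two copies of the same content share a single manifestation.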

Constructor Details

#initialize(opts = {}) ⇒ FileGroup

Returns a new instance of FileGroup.



# File 'lib/moab/file_group.rb', line 26

def initialize(opts={})
  @signature_hash = OrderedHash.new
  @data_source = ""
  super(opts)
end

Instance Attribute Details

#base_directory ⇒ Pathname

Returns the full path used as the basis of the relative paths reported in Moab::FileInstance objects that are children of the Moab::FileManifestation objects contained in this file group.

Returns:

  • (Pathname)



# File 'lib/moab/file_group.rb', line 158

def base_directory
  @base_directory
end

#block_count ⇒ Integer

Returns the total disk usage (in 1 kB blocks) of all data files, estimating the result of du -k (dynamically calculated).

Returns:

  • (Integer)

    The total disk usage (in 1 kB blocks) of all data files (estimating du -k result) (dynamically calculated)



# File 'lib/moab/file_group.rb', line 58

attribute :block_count, Integer, :tag => 'blockCount', :on_save => Proc.new {|i| i.to_s}
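
To make the "1 kB blocks" estimate concrete, here is a minimal sketch. It assumes each file's size is rounded up to a whole number of 1024-byte blocks before summing; that rounding rule and the helper name `estimated_block_count` are assumptions for illustration, not taken from the Moab source shown here.

```ruby
# Assumption: every file occupies a whole number of 1024-byte blocks.
SKETCH_BLOCK_SIZE = 1024

# Sum the per-file block counts, rounding each size up with integer math.
def estimated_block_count(byte_sizes)
  byte_sizes.sum { |bytes| (bytes + SKETCH_BLOCK_SIZE - 1) / SKETCH_BLOCK_SIZE }
end

# Files of 1, 1024 and 1025 bytes need 1 + 1 + 2 blocks.
puts estimated_block_count([1, 1024, 1025])
```

Rounding per file rather than on the total is what lets the figure differ from byte_count divided by 1024.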

#byte_count ⇒ Integer

Returns the total size (in bytes) of all data files (dynamically calculated).

Returns:

  • (Integer)

    The total size (in bytes) of all data files (dynamically calculated)



# File 'lib/moab/file_group.rb', line 50

attribute :byte_count, Integer, :tag => 'byteCount', :on_save => Proc.new {|i| i.to_s}

#data_source ⇒ String

Returns the directory location or other source of this group's file data.

Returns:

  • (String)

    The directory location or other source of this group's file data



# File 'lib/moab/file_group.rb', line 38

attribute :data_source, String, :tag => 'dataSource'

#file_count ⇒ Integer

Returns the total number of data files (dynamically calculated).

Returns:

  • (Integer)

    The total number of data files (dynamically calculated)



# File 'lib/moab/file_group.rb', line 42

attribute :file_count, Integer, :tag => 'fileCount', :on_save => Proc.new {|i| i.to_s}

#files ⇒ Array<FileManifestation>

Returns the set of files comprising the group.

Returns:

  • (Array<FileManifestation>)



# File 'lib/moab/file_group.rb', line 72

has_many :files, FileManifestation, :tag => 'file'

#group_id ⇒ String

Returns the name of the file group.

Returns:

  • (String)

    The name of the file group



# File 'lib/moab/file_group.rb', line 34

attribute :group_id, String, :tag => 'groupId', :key => true

#signature_hash ⇒ OrderedHash<FileSignature, FileManifestation>

Returns the actual in-memory store for the collection of Moab::FileManifestation objects contained in this file group.

Returns:

  • (OrderedHash<FileSignature, FileManifestation>)



# File 'lib/moab/file_group.rb', line 80

def signature_hash
  @signature_hash
end

Instance Method Details

#add_file(manifestation) ⇒ void

This method returns an undefined value.

Adds a single Moab::FileManifestation object to this group.

Parameters:

  • manifestation (FileManifestation)

    The file manifestation to be added



# File 'lib/moab/file_group.rb', line 126

def add_file(manifestation)
  manifestation.instances.each do |instance|
    add_file_instance(manifestation.signature, instance)
  end
end

#add_file_instance(signature, instance) ⇒ void

This method returns an undefined value.

Adds a single Moab::FileSignature / Moab::FileInstance key/value pair to this group. The data is actually stored in #signature_hash.

Parameters:

  • signature (FileSignature)

    The signature of the file instance to be added

  • instance (FileInstance)

    The pathname and datetime of the file instance to be added



# File 'lib/moab/file_group.rb', line 137

def add_file_instance(signature,instance)
  if @signature_hash.has_key?(signature)
    manifestation = @signature_hash[signature]
  else
    manifestation = FileManifestation.new
    manifestation.signature = signature
    @signature_hash[signature] = manifestation
  end
  manifestation.instances << instance
end
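
The grouping behaviour of add_file_instance can be sketched without Moab at all. In the sketch below, `SketchSig` and `sketch_add_instance` are hypothetical names, a plain Hash stands in for the OrderedHash, and an array of paths stands in for a manifestation's instances.

```ruby
# Struct instances compare by value, so equal signatures share one hash key.
SketchSig = Struct.new(:size, :md5)

# Reuse the entry for a known signature, otherwise start a new one.
def sketch_add_instance(store, signature, path)
  store[signature] ||= []
  store[signature] << path
  store
end

store = {}
sketch_add_instance(store, SketchSig.new(3, "aaa"), "a.txt")
sketch_add_instance(store, SketchSig.new(3, "aaa"), "copy/a.txt")  # same content
sketch_add_instance(store, SketchSig.new(5, "bbb"), "b.txt")
store.each { |sig, paths| puts "#{sig.md5}: #{paths.inspect}" }
```

This only works when signatures with equal values hash and compare as equal, which is an assumption about the key class that the sketch satisfies by using a Struct.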

#add_physical_file(pathname, validated = nil) ⇒ void

This method returns an undefined value.

Adds a single physical file's data to the array of files in this group. If fixity data was supplied in bag manifests, that data is used.

Parameters:

  • pathname (Pathname, String)

    The location of the file to be added

  • validated (Boolean) (defaults to: nil)

    If true, the path has already been verified to be a descendant of #base_directory



# File 'lib/moab/file_group.rb', line 226

def add_physical_file(pathname, validated=nil)
  pathname=Pathname.new(pathname).expand_path
  validated ||= is_descendent_of_base?(pathname)
  instance = FileInstance.new.instance_from_file(pathname, @base_directory)
  if @signatures_from_bag && @signatures_from_bag[pathname]
    signature = @signatures_from_bag[pathname]
    unless signature.complete?
      signature = signature.normalized_signature(pathname)
    end
  else
    signature = FileSignature.new.signature_from_file(pathname)
  end
  add_file_instance(signature,instance)
end
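
The choice between supplied and computed fixity data can be sketched as follows. This is a simplified illustration: signatures are plain hashes, file content is passed as a string, and the set of fields that makes a signature "complete" (`SKETCH_REQUIRED`) is an assumption. All names are hypothetical.

```ruby
require "digest"

# Assumption: a signature counts as complete when it has all of these fields.
SKETCH_REQUIRED = [:size, :md5, :sha1].freeze

# Compute every field from the content itself.
def sketch_compute_signature(content)
  { size: content.bytesize,
    md5:  Digest::MD5.hexdigest(content),
    sha1: Digest::SHA1.hexdigest(content) }
end

# Prefer supplied fixity data; compute only what is missing.
def sketch_signature(content, supplied = nil)
  return sketch_compute_signature(content) if supplied.nil?
  return supplied if SKETCH_REQUIRED.all? { |key| supplied.key?(key) }
  sketch_compute_signature(content).merge(supplied)
end

p sketch_signature("abc")                            # nothing supplied
p sketch_signature("abc", { md5: "supplied-md5" })   # partial data is completed
```

Keeping the supplied values when merging means a checksum from the bag manifest is never silently replaced by a freshly computed one.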

#group_from_bagit_subdir(directory, signatures_from_bag, recursive = true) ⇒ FileGroup

Harvests a directory (using the digest hash for fixity data) and adds all files to the file group.

Parameters:

  • directory (Pathname, String)

    The directory whose children are to be added to the file group

  • signatures_from_bag (Hash<Pathname,Signature>)

    The fixity data already calculated for the files

  • recursive (Boolean) (defaults to: true)

    if true, descend into child directories

Returns:

  • (FileGroup)

    This file group (self), after harvesting



# File 'lib/moab/file_group.rb', line 179

def group_from_bagit_subdir(directory, signatures_from_bag, recursive=true)
  @signatures_from_bag = signatures_from_bag
  group_from_directory(directory, recursive)
end

#group_from_directory(directory, recursive = true) ⇒ FileGroup

Harvests a directory and adds all files to the file group.

Parameters:

  • directory (Pathname, String)

    The location of the files to harvest

  • recursive (Boolean) (defaults to: true)

    if true, descend into child directories

Returns:

  • (FileGroup)

    This file group (self), after harvesting



# File 'lib/moab/file_group.rb', line 188

def group_from_directory(directory, recursive=true)
  self.base_directory = directory
  @data_source = @base_directory.to_s
  harvest_directory(directory, recursive)
  self
rescue Exception # Errno::ENOENT
  @data_source = directory.to_s
  self
end

#harvest_directory(path, recursive, validated = nil) ⇒ void

This method returns an undefined value.

Traverses a directory tree and adds all files to the file group. Note that, unlike Find.find and Dir.glob, Pathname passes through symbolic links.

Parameters:

  • path (Pathname, String)

    pathname of the directory to be harvested

  • recursive (Boolean)

    if true, also harvest subdirectories

  • validated (Boolean) (defaults to: nil)

    If true, the path has already been verified to be a descendant of #base_directory

# File 'lib/moab/file_group.rb', line 206

def harvest_directory(path, recursive, validated=nil)
  pathname=Pathname.new(path).expand_path
  validated ||= is_descendent_of_base?(pathname)
  pathname.children.sort.each do |child|
    if child.basename.to_s == ".DS_Store"
      next
    elsif child.directory?
      harvest_directory(child,recursive, validated) if recursive
    else
      add_physical_file(child, validated)
    end
  end
  nil
end
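
The traversal pattern used above (sorted Pathname#children, skipping .DS_Store, optional recursion) can be tried in isolation. The sketch below is a simplified stand-in that returns relative paths instead of adding files to a group. `sketch_harvest` and `with_sample_tree` are hypothetical helpers, and the sample tree is created in a temporary directory.

```ruby
require "pathname"
require "tmpdir"

# Collect regular files beneath dir, relative to base, skipping .DS_Store.
def sketch_harvest(dir, recursive, base = dir)
  Pathname.new(dir).children.sort.flat_map do |child|
    if child.basename.to_s == ".DS_Store"
      []
    elsif child.directory?
      recursive ? sketch_harvest(child, recursive, base) : []
    else
      [child.relative_path_from(Pathname.new(base)).to_s]
    end
  end
end

# Create a throwaway tree: a.txt, .DS_Store and sub/b.txt.
def with_sample_tree
  Dir.mktmpdir do |tmp|
    root = Pathname.new(tmp)
    (root + "sub").mkpath
    (root + "a.txt").write("a")
    (root + ".DS_Store").write("")
    (root + "sub" + "b.txt").write("b")
    yield root
  end
end

with_sample_tree do |root|
  p sketch_harvest(root, true)   # descends into sub/
  p sketch_harvest(root, false)  # top-level files only
end
```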

#is_descendent_of_base?(pathname) ⇒ Boolean

Tests whether the given path is contained within #base_directory.

Parameters:

  • pathname (Pathname)

    The file path to be tested

Returns:

  • (Boolean)

    True if the path is contained within #base_directory; otherwise an exception is raised



# File 'lib/moab/file_group.rb', line 167

def is_descendent_of_base?(pathname)
  raise("base_directory has not been set") if @base_directory.nil?
  is_descendent = false
  pathname.expand_path.ascend {|ancestor| is_descendent ||= (ancestor == @base_directory)}
  raise("#{pathname} is not a descendent of #{@base_directory}") unless is_descendent
  is_descendent
end
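
The ancestor walk can be illustrated on its own. The helper below, `sketch_descendant?`, is a hypothetical name; unlike the method above it returns false instead of raising, which makes it easier to experiment with. The sample paths are made up and need not exist.

```ruby
require "pathname"

# True when base is the path itself or one of its ancestors.
def sketch_descendant?(path, base)
  base = Pathname.new(base).expand_path
  found = false
  # ascend yields the path, then its parent, and so on up to the root.
  Pathname.new(path).expand_path.ascend { |ancestor| found ||= (ancestor == base) }
  found
end

p sketch_descendant?("/data/obj/content/a.txt", "/data/obj")
p sketch_descendant?("/data/object/a.txt", "/data/obj")
```

Comparing whole ancestors rather than string prefixes is what keeps /data/object from being mistaken for a child of /data/obj.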

#path_hash ⇒ OrderedHash<String,FileSignature>

Returns an index of file paths, used to test for the existence of a filename in this file group.

Returns:

  • (OrderedHash<String,FileSignature>)



# File 'lib/moab/file_group.rb', line 85

def path_hash
  path_hash = OrderedHash.new
  @signature_hash.each do |signature,manifestation|
    manifestation.instances.each do |instance|
      path_hash[instance.path] = signature
    end
  end
  path_hash
end
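
The index built here is an inversion of the signature store: from signature => paths to path => signature. A minimal sketch follows, with a plain Hash of strings standing in for the real signature and manifestation objects. `sketch_path_index` is a hypothetical name.

```ruby
# Invert signature => [paths] into path => signature.
def sketch_path_index(signature_store)
  signature_store.each_with_object({}) do |(signature, paths), index|
    paths.each { |path| index[path] = signature }
  end
end

store = { "sig-a" => ["a.txt", "copy/a.txt"], "sig-b" => ["b.txt"] }
index = sketch_path_index(store)
puts index.key?("copy/a.txt")  # existence test by path
puts index["b.txt"]            # look up the signature for a path
```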

#path_hash_subset(signature_subset) ⇒ OrderedHash<String,FileSignature>

Returns a pathname/signature hash containing a subset of the filenames in this file group.

Parameters:

  • signature_subset (Array<FileSignature>)

    The signatures used to select the entries to return

Returns:

  • (OrderedHash<String,FileSignature>)



# File 'lib/moab/file_group.rb', line 103

def path_hash_subset(signature_subset)
  path_hash = OrderedHash.new
  signature_subset.each do |signature|
    manifestation = @signature_hash[signature]
    manifestation.instances.each do |instance|
      path_hash[instance.path] = signature
    end
  end
  path_hash
end

#path_list ⇒ Array<String>

Returns the list of file paths in this group.

Returns:

  • (Array<String>)

    The list of file paths in this group



# File 'lib/moab/file_group.rb', line 96

def path_list
  files.collect{|file| file.instances.collect{|instance| instance.path}}.flatten
end

#remove_file_having_path(path) ⇒ void

This method returns an undefined value.

Removes the file having the given path from this group. For example, the manifest inventory does not contain a file entry for itself.

Parameters:

  • path (String)

    The path of the file to be removed



# File 'lib/moab/file_group.rb', line 151

def remove_file_having_path(path)
  signature = self.path_hash[path]
  @signature_hash.delete(signature)
end
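
One consequence of the listing above is that deletion is keyed by signature, so removing one path drops the whole entry for that signature, including any other paths with identical content. The sketch below reproduces that behaviour with a plain Hash. `sketch_remove_by_path` is a hypothetical stand-in, not the Moab method.

```ruby
# Find the signature owning the path, then delete that whole entry.
def sketch_remove_by_path(signature_store, path)
  signature, _paths = signature_store.find { |_sig, paths| paths.include?(path) }
  signature_store.delete(signature)  # no-op when the path is unknown
  signature_store
end

store = { "sig-a" => ["a.txt", "copy/a.txt"], "sig-b" => ["b.txt"] }
sketch_remove_by_path(store, "a.txt")
p store  # copy/a.txt is gone too, because it shared sig-a
```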

#summary_fields ⇒ Array<String>

Returns the data fields to include in summary reports.

Returns:

  • (Array<String>)

    The data fields to include in summary reports



# File 'lib/moab/file_group.rb', line 65

def summary_fields
  %w{group_id file_count byte_count block_count}
end