Class: DerivativeRodeo::Generators::BaseGenerator
- Inherits:
-
Object
- Object
- DerivativeRodeo::Generators::BaseGenerator
- Defined in:
- lib/derivative_rodeo/generators/base_generator.rb
Overview
The Base Generator defines the interface and common methods.
Fundamentally, generators ensure that files end up at the specified location, based on the given #input_uris, #output_location_template, and #preprocessed_location_template.
In extending a BaseGenerator you:
-
must assign an #output_extension
-
must implement a #build_step method
-
may override #with_each_requisite_location_and_tmp_file_path
#generated_files is “where the magic happens”.
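The extension contract above can be sketched in plain Ruby. This is a minimal illustration, not the gem's code: SketchBaseGenerator and UpcaseTextGenerator are hypothetical stand-ins, and the class-level attr_accessor is an assumed simplification of the class_attribute the real class uses.

```ruby
# Stand-in for the documented contract; NOT the real DerivativeRodeo class.
class SketchBaseGenerator
  class << self
    # Hypothetical simplification of the gem's class_attribute.
    attr_accessor :output_extension
  end

  attr_reader :input_uris

  def initialize(input_uris:)
    @input_uris = Array(input_uris)
  end

  # Subclasses must implement this; the base only raises.
  def build_step(input_location:, output_location:, input_tmp_file_path:)
    raise NotImplementedError, "#{self.class}#build_step"
  end
end

# Hypothetical subclass that satisfies both "must" requirements.
class UpcaseTextGenerator < SketchBaseGenerator
  # Requirement 1: assign an output_extension.
  self.output_extension = 'upcase.txt'

  # Requirement 2: implement build_step, writing the derivative and
  # returning where it was written.
  def build_step(input_location:, output_location:, input_tmp_file_path:)
    File.write(output_location, File.read(input_tmp_file_path).upcase)
    output_location
  end
end
```

In this sketch locations are plain file paths; in the gem they are StorageLocations::BaseLocation objects.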
Direct Known Subclasses
AltoGenerator, CopyGenerator, HocrGenerator, MonochromeGenerator, PdfSplitGenerator, PlainTextGenerator, ThumbnailGenerator, WordCoordinatesGenerator
Class Attributes
-
#output_extension ⇒ String
The extension for generated files: a string that may contain periods (though likely not as the first character).
Attributes
-
#input_uris ⇒ Array<String>
readonly
The “original” files that we’ll be processing (via #generated_files).
-
#output_location_template ⇒ String
readonly
The template that defines where we’ll write the files derived from the #input_uris (via #generated_files).
-
#preprocessed_location_template ⇒ String, NilClass
readonly
The template that defines where we might find existing processed files for the given #input_uris (via #generated_files).
Instance Method Summary
- #build_step(input_location:, output_location:, input_tmp_file_path:) ⇒ StorageLocations::BaseLocation
-
#derive_preprocessed_template_from(input_location:, preprocessed_location_template:) ⇒ String
Some generators (e.g. PdfSplitGenerator) need to coerce the location template based on the input location.
-
#destination(input_location) ⇒ StorageLocations::BaseLocation
Returns the output location for the given :input_location.
-
#generated_files ⇒ Array<StorageLocations::BaseLocation>
Based on the #input_uris ensure that we have files at the given output location (as derived from the #output_location_template).
- #generated_uris ⇒ Array<String>
-
#initialize(input_uris:, output_location_template:, preprocessed_location_template: nil) ⇒ BaseGenerator
constructor
A new instance of BaseGenerator.
- #input_files ⇒ Array<StorageLocations::BaseLocation>
-
#run(command) ⇒ String
A bit of indirection to create a common interface for running a shell command.
- #valid_instantiation? ⇒ Boolean private
-
#with_each_requisite_location_and_tmp_file_path {|input_location, tmp_file_path| ... } ⇒ Object
The files that are required as part of #generated_files (or, more precisely, of #build_step).
Constructor Details
#initialize(input_uris:, output_location_template:, preprocessed_location_template: nil) ⇒ BaseGenerator
Returns a new instance of BaseGenerator.
# File 'lib/derivative_rodeo/generators/base_generator.rb', line 69
def initialize(input_uris:, output_location_template:, preprocessed_location_template: nil)
  @input_uris = Array.wrap(input_uris)
  @output_location_template = output_location_template
  @preprocessed_location_template = preprocessed_location_template

  return if valid_instantiation?

  raise Errors::ExtensionMissingError.new(klass: self.class)
end
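The constructor normalizes input_uris with Array.wrap, so a single URI and an array of URIs are both accepted. The sketch below shows that normalization in plain Ruby. The helper sketch_wrap is a hypothetical stand-in that follows ActiveSupport's documented Array.wrap behavior for the cases shown here, and the URIs are made up for illustration.

```ruby
# Hypothetical plain-Ruby stand-in for ActiveSupport's Array.wrap.
def sketch_wrap(value)
  # nil becomes an empty list rather than [nil].
  return [] if value.nil?

  # Array-like values are returned as arrays; anything else is wrapped.
  value.respond_to?(:to_ary) ? value.to_ary : [value]
end

sketch_wrap('file:///tmp/a.pdf')                       # a one-element array
sketch_wrap(['file:///tmp/a.pdf', 'file:///tmp/b.pdf']) # returned as-is
sketch_wrap(nil)                                       # an empty array
```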
Instance Attribute Details
#input_uris ⇒ Array<String> (readonly)
The “original” files that we’ll be processing (via #generated_files)
# File 'lib/derivative_rodeo/generators/base_generator.rb', line 44
def input_uris
  @input_uris
end
#output_extension ⇒ String
Returns the extension for generated files: a string that may contain periods (though likely not as the first character).
# File 'lib/derivative_rodeo/generators/base_generator.rb', line 36
class_attribute :output_extension
#output_location_template ⇒ String (readonly)
The template that defines where we’ll write the files derived from the #input_uris (via #generated_files)
# File 'lib/derivative_rodeo/generators/base_generator.rb', line 50
def output_location_template
  @output_location_template
end
#preprocessed_location_template ⇒ String, NilClass (readonly)
The template that defines where we might find existing processed files for the given #input_uris (via #generated_files)
# File 'lib/derivative_rodeo/generators/base_generator.rb', line 58
def preprocessed_location_template
  @preprocessed_location_template
end
Instance Method Details
#build_step(input_location:, output_location:, input_tmp_file_path:) ⇒ StorageLocations::BaseLocation
# File 'lib/derivative_rodeo/generators/base_generator.rb', line 102
def build_step(input_location:, output_location:, input_tmp_file_path:)
  raise NotImplementedError, "#{self.class}#build_step"
end
#derive_preprocessed_template_from(input_location:, preprocessed_location_template:) ⇒ String
Some generators (e.g. PdfSplitGenerator) need to coerce the location template based on the input location. Most often, however, the given :preprocessed_location_template is adequate and would be the typical returned value.
# File 'lib/derivative_rodeo/generators/base_generator.rb', line 292
def derive_preprocessed_template_from(input_location:, preprocessed_location_template:)
  preprocessed_location_template
end
#destination(input_location) ⇒ StorageLocations::BaseLocation
Returns the output location for the given :input_location. The file at that destination might or might not exist. When we have a #preprocessed_location_template, we’ll also check the preprocessed location for the file and, if it exists there, copy it to the target output location.
If the file exists in neither place, #build_step will create it.
# File 'lib/derivative_rodeo/generators/base_generator.rb', line 224
def destination(input_location)
  output_location = input_location.derived_file_from(template: output_location_template, extension: output_extension)

  if output_location.exist?
    message = "#{self.class}#destination :: " \
              "input_location file_uri #{input_location.file_uri} :: " \
              "Found output_location file_uri #{output_location.file_uri}."
    logger.info(message)
    return output_location
  end

  unless preprocessed_location_template
    message = "#{self.class}#destination :: " \
              "input_location file_uri #{input_location.file_uri} :: " \
              "No preprocessed_location_template provided " \
              "nor does a file exist at output_location file_uri #{output_location.file_uri}; " \
              "moving on to generation via #{self.class}#build_step."
    logger.info(message)
    return output_location
  end

  template = derive_preprocessed_template_from(input_location: input_location, preprocessed_location_template: preprocessed_location_template)

  preprocessed_location = input_location.derived_file_from(template: template, extension: output_extension)

  # We only want the location if it exists
  if preprocessed_location.exist?
    message = "#{self.class}#destination :: " \
              "input_location file_uri #{input_location.file_uri} :: " \
              "Found preprocessed_location file_uri #{preprocessed_location.file_uri}."
    logger.info(message)
    # Let's make sure we reap the fruits of the pre-processing; and don't worry that generator
    # will also write some logs.
    output_location = CopyGenerator.new(
      input_uris: [preprocessed_location.file_uri],
      output_location_template: output_location.file_uri
    ).generated_files.first

    return output_location
  end

  message = "#{self.class}#destination :: " \
            "input_location file_uri #{input_location.file_uri} :: " \
            "No file exists at preprocessed_location file_uri #{preprocessed_location.file_uri} " \
            "nor output_location file_uri #{output_location.file_uri}; " \
            "moving on to generation via #{self.class}#build_step."
  logger.info(message)

  # NOTE: The file does not exist at the output_location; but we pass this information along so
  # that the #build_step knows where to write the file.
  output_location
end
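The lookup order in #destination can be sketched with an in-memory Hash standing in for storage. This is an illustrative simplification, not the gem's implementation: sketch_destination, SKETCH_STORE, and the returned status symbols are all hypothetical, and the copy step here is a plain Hash assignment rather than a CopyGenerator run.

```ruby
# In-memory stand-in for storage: keys are URIs, values are file contents.
SKETCH_STORE = {}

# Hypothetical helper mirroring the documented lookup order.
# Returns a [status, output_uri] pair.
def sketch_destination(output_uri:, preprocessed_uri: nil, store: SKETCH_STORE)
  # 1. The derivative already exists at the output location.
  return [:found_output, output_uri] if store.key?(output_uri)

  # 2. There is no preprocessed location to consult; the caller must build.
  return [:needs_build, output_uri] if preprocessed_uri.nil?

  # 3. A preprocessed copy exists; copy it to the output location.
  if store.key?(preprocessed_uri)
    store[output_uri] = store[preprocessed_uri]
    return [:copied_from_preprocessed, output_uri]
  end

  # 4. The file exists in neither place; the caller must build.
  [:needs_build, output_uri]
end
```

In every branch the output location is returned, which matches the documented behavior: the caller checks whether a file exists there and, if not, hands the location to #build_step as the place to write.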
#generated_files ⇒ Array<StorageLocations::BaseLocation>
This is the method where the magic happens!
Based on the #input_uris ensure that we have files at the given output location (as derived from the #output_location_template). We ensure that by:
-
Checking if a file already exists at the output location
-
Copying a preprocessed file to the output location if a preprocessed file exists
-
Generating the file based on the input location
# File 'lib/derivative_rodeo/generators/base_generator.rb', line 123
def generated_files
  # TODO: Examples please
  return @generated_files if defined?(@generated_files)

  logger.info("Starting #{self.class}#generated_files with " \
              "input_uris: #{input_uris.inspect}, " \
              "output_location_template: #{output_location_template.inspect}, and " \
              "preprocessed_location_template: #{preprocessed_location_template.inspect}.")

  # As much as I would like to use map or returned values; given the implementations it's
  # better to explicitly require that; reducing downstream implementation headaches.
  #
  # In other words, this little bit of ugly in a method that has yet to change in a subclass
  # helps ease subclass implementations of the #with_each_requisite_location_and_tmp_file_path or
  # #build_step
  @generated_files = []

  # BaseLocation is like the Ruby `File` (Pathname) "File.exist?(path) :: location.exist?"
  # "file:///Users/jfriesen/.profile"
  with_each_requisite_location_and_tmp_file_path do |input_location, input_tmp_file_path|
    output_location = destination(input_location)
    @generated_files << if output_location.exist?
                          output_location
                        else
                          message = "#{self.class}#generated_files :: " \
                                    "input_location file_uri #{input_location.file_uri} :: " \
                                    "Generating output_location file_uri #{output_location.file_uri} via build_step."
                          logger.info(message)
                          build_step(input_location: input_location, output_location: output_location, input_tmp_file_path: input_tmp_file_path)
                        end
  end
  @generated_files
end
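Two patterns in #generated_files are worth isolating: memoizing with defined? and accumulating results from a yielding iterator instead of relying on its return value. The sketch below shows both under simplifying assumptions. SketchGenerator, its existing: argument, and the ".out" naming are hypothetical; strings stand in for location objects.

```ruby
class SketchGenerator
  attr_reader :build_count

  def initialize(inputs, existing: [])
    @inputs = inputs
    @existing = existing # outputs that are assumed to exist already
    @build_count = 0
  end

  def generated_files
    # defined? memoizes on first call, even if the result is an empty array.
    return @generated_files if defined?(@generated_files)

    # Accumulate explicitly, so the iterator's return value does not matter.
    @generated_files = []
    with_each_input do |input|
      output = "#{input}.out"
      @generated_files << (@existing.include?(output) ? output : build_step(output))
    end
    @generated_files
  end

  private

  # Subclasses could override this to yield different inputs.
  def with_each_input(&block)
    @inputs.each(&block)
  end

  # Counts invocations so the memoization is observable.
  def build_step(output)
    @build_count += 1
    output
  end
end
```

Because results are pushed onto @generated_files inside the block, an overriding iterator in a subclass only has to yield; it does not need to return anything in particular.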
#generated_uris ⇒ Array<String>
# File 'lib/derivative_rodeo/generators/base_generator.rb', line 160
def generated_uris
  # TODO: what do we do about nils?
  generated_files.map { |file| file&.file_uri }
end
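The safe-navigation operator in #generated_uris means a nil entry in the files list yields a nil URI rather than raising. A minimal sketch of that behavior follows. SketchLocation and sketch_generated_uris are hypothetical stand-ins, and dropping nils with compact is one possible caller-side choice, not something the source prescribes.

```ruby
# Stand-in for a location object that only knows its URI.
SketchLocation = Struct.new(:file_uri)

# Mirrors the documented mapping: nil entries pass through as nil.
def sketch_generated_uris(files)
  files.map { |file| file&.file_uri }
end

uris = sketch_generated_uris([SketchLocation.new('file:///tmp/a.txt'), nil])
# A caller that cannot tolerate nils could drop them explicitly.
present_uris = uris.compact
```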
#input_files ⇒ Array<StorageLocations::BaseLocation>
# File 'lib/derivative_rodeo/generators/base_generator.rb', line 201
def input_files
  @input_files ||= input_uris.map do |file_uri|
    DerivativeRodeo::StorageLocations::BaseLocation.from_uri(file_uri)
  end
end
#run(command) ⇒ String
A bit of indirection to create a common interface for running a shell command.
# File 'lib/derivative_rodeo/generators/base_generator.rb', line 302
def run(command)
  logger.debug "* Start command: #{command}"
  # TODO: What kind of error handling do we want?
  result = `#{command}`
  logger.debug "* Result: \n* #{result.gsub("\n", "\n* ")}"
  logger.debug "* End command: #{command}"
  result
end
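The source leaves error handling in #run as an open TODO. One possible approach, sketched here as an assumption rather than the gem's behavior, is to check the exit status after the backtick call and raise on failure. The names sketch_run, SKETCH_LOG, and SKETCH_LOGGER are hypothetical, and the example assumes a POSIX shell is available.

```ruby
require 'logger'
require 'stringio'

# Log to an in-memory buffer so the sketch has no side effects on disk.
SKETCH_LOG = StringIO.new
SKETCH_LOGGER = Logger.new(SKETCH_LOG)

def sketch_run(command, logger: SKETCH_LOGGER)
  logger.debug "* Start command: #{command}"
  result = `#{command}`
  # Hypothetical addition: fail loudly when the command exits non-zero.
  raise "Command failed (#{$?.exitstatus}): #{command}" unless $?.success?

  logger.debug "* End command: #{command}"
  result
end
```

Backticks pass the string to a shell, so the command should only ever be built from trusted values.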
#valid_instantiation? ⇒ Boolean
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
# File 'lib/derivative_rodeo/generators/base_generator.rb', line 85
def valid_instantiation?
  # TODO: Does this even make sense.
  # When we have a BaseGenerator and not one of its children or when we've assigned the
  # output_extension. instance_of? is more specific than is_a?
  instance_of?(DerivativeRodeo::Generators::BaseGenerator) || output_extension
end
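The check relies on instance_of? matching only the exact class, whereas is_a? also matches subclasses. The sketch below illustrates that difference with hypothetical classes (SketchParent, WithExtension, WithoutExtension). Unlike the source, which returns the truthy extension itself, this version returns a strict boolean.

```ruby
class SketchParent
  class << self
    # Hypothetical simplification of the gem's class_attribute.
    attr_accessor :output_extension
  end

  # The base class itself is always valid; a subclass is valid only
  # when it has assigned an output_extension.
  def valid_instantiation?
    instance_of?(SketchParent) || !self.class.output_extension.nil?
  end
end

class WithExtension < SketchParent
  self.output_extension = 'txt'
end

class WithoutExtension < SketchParent
end

SketchParent.new.valid_instantiation?     # true: exact base class
WithExtension.new.valid_instantiation?    # true: extension assigned
WithoutExtension.new.valid_instantiation? # false: subclass, no extension
```

Had the check used is_a?, every subclass instance would pass it, and a missing extension would go unnoticed.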
#with_each_requisite_location_and_tmp_file_path {|input_location, tmp_file_path| ... } ⇒ Object
The files that are required as part of #generated_files (or, more precisely, of #build_step).
This method is responsible for one thing:
-
yielding a StorageLocations::BaseLocation and the path (as String) to the file's location in the temporary working space.
This method allows child classes to modify the file_uris, for example to filter out files that are not of the correct type, or as a means of having “this” generator depend on another generator. The HocrGenerator requires that the input_location be monochrome, so it converts each given input_location. The PdfSplitGenerator uses this method to take each given PDF and generate one image per page. Those images are then treated as the requisite locations.
# File 'lib/derivative_rodeo/generators/base_generator.rb', line 191
def with_each_requisite_location_and_tmp_file_path
  input_files.each do |input_location|
    input_location.with_existing_tmp_path do |tmp_file_path|
      yield(input_location, tmp_file_path)
    end
  end
end
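An override that filters inputs, as described above, can be sketched as follows. All class names here (SketchFileLocation, SketchIterator, PdfOnlyIterator) are hypothetical stand-ins. The temporary-path behavior is an assumed simplification built on Ruby's Tempfile, not the gem's with_existing_tmp_path.

```ruby
require 'tempfile'

# Stand-in for a storage location; NOT the gem's BaseLocation.
class SketchFileLocation
  attr_reader :file_uri

  def initialize(file_uri, content)
    @file_uri = file_uri
    @content = content
  end

  # Yields the path of a temporary local copy, removed after the block.
  def with_existing_tmp_path
    Tempfile.create('sketch') do |file|
      file.write(@content)
      file.flush
      yield(file.path)
    end
  end
end

# Mirrors the documented default: yield every input with its tmp path.
class SketchIterator
  def initialize(input_files)
    @input_files = input_files
  end

  def with_each_requisite_location_and_tmp_file_path
    @input_files.each do |input_location|
      input_location.with_existing_tmp_path do |tmp_file_path|
        yield(input_location, tmp_file_path)
      end
    end
  end
end

# Hypothetical override that skips inputs that are not PDFs.
class PdfOnlyIterator < SketchIterator
  def with_each_requisite_location_and_tmp_file_path
    super do |input_location, tmp_file_path|
      yield(input_location, tmp_file_path) if input_location.file_uri.end_with?('.pdf')
    end
  end
end
```

The override wraps super rather than reimplementing the loop, so the temporary file handling stays in one place.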