Module: Wukong::Hadoop::MapLogic
- Included in: HadoopRunner
- Defined in: lib/wukong-hadoop/runner/map_logic.rb
Overview
Implements logic for figuring out the correct mapper commandline given wu-hadoop's arguments.
Instance Method Summary
- #explicit_map_command? ⇒ true, false
  Were we given an explicit map command (like 'cut -f 1') or are we to introspect and construct the command?
- #explicit_map_processor? ⇒ true, false
  Were we given a processor to use as our mapper explicitly by name or are we to introspect to discover the correct processor?
- #explicit_mapper? ⇒ true, false
  Were we given an explicit mapper (either as a command or as a processor) or should we introspect to find one?
- #mapper_arg ⇒ String
  The argument that we should introspect on to turn into our mapper.
- #mapper_commandline ⇒ String
  Return the actual commandline used by the mapper, whether running in local or Hadoop mode.
- #mapper_name ⇒ String
  Return the name of the processor to use as the mapper.
- #mapper_needs_run_arg? ⇒ true, false
  Does the mapper commandline need an explicit --run argument?
Instance Method Details
#explicit_map_command? ⇒ true, false
Were we given an explicit map command (like 'cut -f 1') or are we to introspect and construct the command?
    # File 'lib/wukong-hadoop/runner/map_logic.rb', line 28
    def explicit_map_command?
      settings[:map_command]
    end
#explicit_map_processor? ⇒ true, false
Were we given a processor to use as our mapper explicitly by name or are we to introspect to discover the correct processor?
    # File 'lib/wukong-hadoop/runner/map_logic.rb', line 37
    def explicit_map_processor?
      settings[:mapper]
    end
#explicit_mapper? ⇒ true, false
Were we given an explicit mapper (either as a command or as a processor) or should we introspect to find one?
    # File 'lib/wukong-hadoop/runner/map_logic.rb', line 45
    def explicit_mapper?
      explicit_map_processor? || explicit_map_command?
    end
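The three predicates above can be tried in isolation with a minimal sketch. It assumes `settings` behaves like a plain Hash and passes it in as an argument instead of reading it from the runner; the processor name 'tokenizer' is a hypothetical placeholder. Note that, as written, the predicates return the setting itself (a truthy String or nil) rather than a strict true or false.

```ruby
# Stand-alone copies of the predicates, taking `settings` as an argument.
# Assumption: settings is a plain Hash keyed by Symbol.
def explicit_map_command?(settings)
  settings[:map_command]
end

def explicit_map_processor?(settings)
  settings[:mapper]
end

def explicit_mapper?(settings)
  explicit_map_processor?(settings) || explicit_map_command?(settings)
end

# An explicit shell command counts as an explicit mapper.
explicit_mapper?({ map_command: 'cut -f 1' })
# A processor named explicitly counts as well ('tokenizer' is hypothetical).
explicit_mapper?({ mapper: 'tokenizer' })
# With neither setting present the result is nil, so introspection is needed.
explicit_mapper?({})
```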
#mapper_arg ⇒ String
The argument that we should introspect on to turn into our mapper.
    # File 'lib/wukong-hadoop/runner/map_logic.rb', line 53
    def mapper_arg
      args.first
    end
#mapper_commandline ⇒ String
Return the actual commandline used by the mapper, whether running in local or Hadoop mode.
You should be able to copy, paste, and run this command unmodified to debug the mapper.
    # File 'lib/wukong-hadoop/runner/map_logic.rb', line 15
    def mapper_commandline
      return settings[:map_command] if explicit_map_command?
      arg = (mode == :hadoop ? File.basename(mapper_arg) : mapper_arg)
      [command_prefix, 'wu-local', arg].tap do |cmd|
        cmd << "--run=#{mapper_name}" if mapper_needs_run_arg?
        cmd << non_wukong_hadoop_params_string
      end.compact.map(&:to_s).reject(&:empty?).join(' ')
    end
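To see how the array-building chain in this method turns its pieces into a single string, here is a rough sketch. The helper `build_commandline` and its parameters are hypothetical stand-ins for `command_prefix`, the script argument, `mapper_name`, and `non_wukong_hadoop_params_string`; the option `--min_length=3` and the prefix 'bundle exec' are invented for illustration.

```ruby
# Hypothetical helper mirroring the tap/compact/reject/join chain.
# nil entries and empty strings are dropped before joining with spaces.
def build_commandline(prefix, script, run_name, extra_params)
  [prefix, 'wu-local', script].tap do |cmd|
    cmd << "--run=#{run_name}" if run_name
    cmd << extra_params
  end.compact.map(&:to_s).reject(&:empty?).join(' ')
end

# In Hadoop mode only the script's basename is used.
script = File.basename('/path/to/my_job.rb')

build_commandline(nil, script, 'tokenizer', '--min_length=3')
# => "wu-local my_job.rb --run=tokenizer --min_length=3"

build_commandline('bundle exec', 'tokenizer', nil, '')
# => "bundle exec wu-local tokenizer"
```

Because nil and empty entries are filtered out, a missing prefix or an empty parameter string leaves no stray spaces in the result.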
#mapper_name ⇒ String
Return the name of the processor to use as the mapper.
Will raise a Wukong::Error if a given mapper is invalid or if none can be guessed.
Most of the logic that examines explicit command line arguments and checks for the existence of named processors or files is here.
    # File 'lib/wukong-hadoop/runner/map_logic.rb', line 80
    def mapper_name
      case
      when explicit_mapper?
        if processor_registered?(settings[:mapper])
          settings[:mapper]
        else
          raise Error.new("No such processor: '#{settings[:mapper]}'")
        end
      when map_only? && processor_registered?(mapper_arg)
        mapper_arg
      when map_only? && file_is_processor?(mapper_arg)
        processor_name_from_file(mapper_arg)
      when single_job_arg? && explicit_reducer? && processor_registered?(mapper_arg)
        mapper_arg
      when separate_map_and_reduce_args? && processor_registered?(mapper_arg)
        mapper_arg
      when separate_map_and_reduce_args? && file_is_processor?(mapper_arg)
        processor_name_from_file(mapper_arg)
      when processor_registered?('mapper')
        'mapper'
      else
        raise Error.new("Could not find a processor to use as a mapper")
      end
    end
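The order of precedence in this method can be illustrated with a deliberately reduced sketch. It keeps only three steps (an explicitly named processor, then the first argument, then the conventional name 'mapper') and leaves out the job-shape checks (`map_only?`, `single_job_arg?`, `separate_map_and_reduce_args?`) and the file-based lookups. The helper `resolve_mapper`, the `registered` list, and the use of ArgumentError in place of Wukong::Error are all assumptions made for illustration.

```ruby
# Reduced, hypothetical version of the lookup order.
# registered:    names of known processors (stand-in for processor_registered?)
# explicit_name: value of settings[:mapper], or nil
# first_arg:     value of mapper_arg
def resolve_mapper(registered, explicit_name, first_arg)
  if explicit_name
    # An explicit name must exist; there is no fallback to guessing.
    return explicit_name if registered.include?(explicit_name)
    raise ArgumentError, "No such processor: '#{explicit_name}'"
  end
  return first_arg if registered.include?(first_arg)
  return 'mapper'  if registered.include?('mapper')
  raise ArgumentError, 'Could not find a processor to use as a mapper'
end

registered = %w[tokenizer mapper]

resolve_mapper(registered, nil, 'tokenizer')   # the first argument is registered
resolve_mapper(registered, nil, 'my_job.rb')   # falls back to 'mapper'
```

An explicit but unregistered name raises immediately, which matches the first branch of the original method: no guessing happens once a name has been given.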
#mapper_needs_run_arg? ⇒ true, false
Does the mapper commandline need an explicit --run argument?
The --run argument is omitted when an explicit map command was given, or when the processor name matches the script name (with or without its .rb extension).
    # File 'lib/wukong-hadoop/runner/map_logic.rb', line 63
    def mapper_needs_run_arg?
      return false if settings[:map_command]
      return false if mapper_arg.to_s == mapper_name.to_s
      return false if File.basename(mapper_arg.to_s, '.rb') == mapper_name.to_s
      true
    end
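The three early returns can be exercised with a small stand-alone sketch. The helper `needs_run_arg?` is hypothetical: it takes the map command, the mapper argument, and the resolved processor name as plain parameters instead of reading them from the runner, and the paths and names used are invented.

```ruby
# Hypothetical stand-alone version of the check.
def needs_run_arg?(map_command, mapper_arg, mapper_name)
  return false if map_command
  return false if mapper_arg.to_s == mapper_name.to_s
  # File.basename with a suffix strips both the directory and the extension.
  return false if File.basename(mapper_arg.to_s, '.rb') == mapper_name.to_s
  true
end

needs_run_arg?(nil, 'tokenizer', 'tokenizer')               # => false
needs_run_arg?(nil, 'jobs/tokenizer.rb', 'tokenizer')       # => false
needs_run_arg?(nil, 'jobs/my_job.rb', 'tokenizer')          # => true
needs_run_arg?('cut -f 1', 'jobs/my_job.rb', 'tokenizer')   # => false
```

Only the third call needs --run, because the script name alone does not identify which processor to run.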