Class: UniqueFirstReduce

Inherits:
ReduceBase
Defined in:
lib/mrtoolkit.rb

Overview

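Emits the first record of each unique field and can drop a given number of leading columns. Going by the constructor below, the first argument sets how many columns are kept (default 1) and the second sets how many leading columns are dropped (default 0, so nothing is dropped unless requested).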

Instance Attribute Summary

Attributes inherited from Stage

#errors, #in_fields, #in_sep, #out_fields, #out_sep

Instance Method Summary

Methods inherited from ReduceBase

#process, #process_begin, #process_each, #process_end, #process_end_internal, #process_internal, #process_term, #run

Methods inherited from Stage

#catch_errors, #copy_struct, #emit, #emit_separator, #field, #field_separator, #new_input, #new_output, #prepare, #process_step, #write_out

Constructor Details

#initialize(*args) ⇒ UniqueFirstReduce

Returns a new instance of UniqueFirstReduce.



# File 'lib/mrtoolkit.rb', line 693

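def initialize(*args)
  # args[0]: number of columns to keep, stored as a zero-based upper index.
  if args[0]
    @n = args[0].to_i - 1
  else
    @n = 0   # default: keep one column (col0)
  end
  # args[1]: number of leading columns to skip, stored the same way.
  if args[1]
    @m = args[1].to_i - 1
  else
    @m = -1  # default: (0..-1) is empty, so no columns are skipped
  end
end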
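
The argument handling above can be tried in isolation. The following is a minimal sketch, not part of mrtoolkit: `parse_counts` is a hypothetical helper that repeats the same arithmetic and returns the values that would be stored in `@n` and `@m`.

```ruby
# Hypothetical helper mirroring the argument handling in #initialize.
# Returns [n, m], the zero-based upper indexes for kept and skipped columns.
def parse_counts(*args)
  n = args[0] ? args[0].to_i - 1 : 0   # no first argument: keep one column
  m = args[1] ? args[1].to_i - 1 : -1  # no second argument: skip nothing
  [n, m]
end

p parse_counts            # [0, -1]
p parse_counts("3")       # [2, -1]
p parse_counts("3", "2")  # [2, 1]
```

Arguments go through `to_i`, so string counts are accepted as well as integers.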

Instance Method Details

#declare ⇒ Object



# File 'lib/mrtoolkit.rb', line 706

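def declare
  # Input fields: the skipped leading columns first, then the kept columns.
  (0..@m).each {|i| field "skip#{i}"}
  (0..@n).each {|i| field "col#{i}"}

  # Output fields: only the kept columns.
  (0..@n).each {|i| emit "col#{i}"}
end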
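
To see which field names `declare` produces for given values of `@n` and `@m`, the ranges can be evaluated outside the class. This is a sketch only: `declared_names` is a hypothetical helper, and it collects names into arrays rather than calling the inherited `field` and `emit` methods.

```ruby
# Hypothetical helper listing the names that #declare would pass to
# `field` (inputs) and `emit` (outputs) for the given @n and @m.
def declared_names(n, m)
  skipped = (0..m).map { |i| "skip#{i}" }  # empty when m is -1
  kept    = (0..n).map { |i| "col#{i}" }
  { inputs: skipped + kept, outputs: kept }
end

p declared_names(0, -1)  # defaults: one kept column, nothing skipped
p declared_names(1, 0)   # two kept columns after one skipped column
```

With `m` set to -1 the range `0..-1` is empty, which is how the default avoids declaring any skip fields.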

#process_init(input, output) ⇒ Object

Copies over all destination fields.



# File 'lib/mrtoolkit.rb', line 713

def process_init(input, output)
  copy_struct(input, output, @m+1)
end
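
The body of `copy_struct` is not shown on this page. Assuming its third argument is the number of leading input fields to pass over before copying, the effect of `@m+1` can be sketched as follows. `copy_with_offset` and the two struct types are hypothetical stand-ins, and the real helper may behave differently.

```ruby
# Hypothetical stand-in for copy_struct (an assumption, not the real helper):
# fills each output field from the input, starting `offset` fields in.
def copy_with_offset(input, output, offset)
  output.members.each_with_index do |name, i|
    output[name] = input[i + offset]
  end
  output
end

# Illustrative record shapes: one skipped column and two kept columns.
input_type  = Struct.new(:skip0, :col0, :col1)
output_type = Struct.new(:col0, :col1)

row = copy_with_offset(input_type.new("x", "a", "b"), output_type.new, 1)
p row.to_a  # ["a", "b"]
```

Under this assumption, an offset of `@m+1` equals the number of declared skip fields, so the output starts at the first kept column.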