Module: Cascading::TextOperations

Included in:
Assembly
Defined in:
lib/cascading/text_operations.rb

Overview

Module of pipe assemblies that wrap operations defined in Cascading's cascading.operation.text package. They are split out into this module only to group similar functionality.

Mapping of DSL pipes into Cascading text operations:

parse_date: DateParser
format_date: DateFormatter
join_fields: FieldJoiner

Instance Method Details

#format_date(input_field, date_format, into_field, options = {}) ⇒ Object

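Converts the timestamp in input_field into a formatted date string in into_field, using the specified date_format (a Java SimpleDateFormat pattern such as 'yyyy/MM/dd'). Both input_field and into_field must each name exactly one field. All fields are emitted by default; pass :output in options to choose a different set.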

Example:

format_date 'timestamp', 'yyyy/MM/dd', 'text_date'


# File 'lib/cascading/text_operations.rb', line 35

def format_date(input_field, date_format, into_field, options = {})
  output = options[:output] || all_fields # Overrides Cascading default

  input_field = fields(input_field)
  raise "input_field must declare exactly one field, was '#{input_field}'" unless input_field.size == 1
  into_field = fields(into_field)
  raise "into_field must declare exactly one field, was '#{into_field}'" unless into_field.size == 1

  each(
    input_field,
    :function => Java::CascadingOperationText::DateFormatter.new(into_field, date_format),
    :output => output
  )
end
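
The effect of format_date on a single tuple can be approximated in plain Ruby, without Cascading or JRuby. This is only an illustrative sketch: format_date_sketch is a hypothetical helper, the tuple is modelled as a Hash, the timestamp is assumed to be milliseconds since the epoch interpreted in UTC, and the pattern is a Ruby strftime string ('%Y/%m/%d') rather than the Java pattern ('yyyy/MM/dd') that the real pipe takes.

```ruby
# Hypothetical stand-in for what format_date does to one tuple.
# Assumptions: the tuple is a Hash, the timestamp is in milliseconds
# since the epoch, and formatting happens in UTC.
def format_date_sketch(tuple, input_field, strftime_format, into_field)
  millis = tuple.fetch(input_field)
  time = Time.at(millis / 1000).utc
  # Keep every existing field and add the formatted one, mirroring the
  # all_fields default output of the real pipe.
  tuple.merge(into_field => time.strftime(strftime_format))
end

row = { 'timestamp' => 1_325_376_000_000 } # 2012-01-01 00:00:00 UTC
result = format_date_sketch(row, 'timestamp', '%Y/%m/%d', 'text_date')
puts result['text_date'] # => 2012/01/01
```

The original timestamp field is still present in the result, which corresponds to the default :output of all fields.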

#join_fields(input_fields, delimiter, into_field, options = {}) ⇒ Object

Joins the values of input_fields into the single field into_field, separated by delimiter. into_field must name exactly one field. All fields are emitted by default; pass :output in options to choose a different set.

Example:

join_fields ['field1', 'field2'], ',', 'comma_separated'


# File 'lib/cascading/text_operations.rb', line 54

def join_fields(input_fields, delimiter, into_field, options = {})
  output = options[:output] || all_fields # Overrides Cascading default

  into_field = fields(into_field)
  raise "into_field must declare exactly one field, was '#{into_field}'" unless into_field.size == 1

  each(
    input_fields,
    :function => Java::CascadingOperationText::FieldJoiner.new(into_field, delimiter.to_s),
    :output => output
  )
end
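
As a rough illustration of the result join_fields produces for one tuple, the sketch below does the same join in plain Ruby. join_fields_sketch is a hypothetical helper and the tuple is modelled as a Hash; it is not the FieldJoiner implementation.

```ruby
# Hypothetical stand-in for what join_fields does to one tuple.
# Assumption: the tuple is a Hash keyed by field name.
def join_fields_sketch(tuple, input_fields, delimiter, into_field)
  # Like the real pipe, coerce the delimiter to a String before joining.
  joined = input_fields.map { |name| tuple.fetch(name).to_s }.join(delimiter.to_s)
  # Keep the input fields and add the joined one (all_fields default).
  tuple.merge(into_field => joined)
end

row = { 'field1' => 'a', 'field2' => 'b' }
result = join_fields_sketch(row, ['field1', 'field2'], ',', 'comma_separated')
puts result['comma_separated'] # => a,b
```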

#parse_date(input_field, date_format, into_field, options = {}) ⇒ Object

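Parses the text in input_field as a date, using the provided date_format (a Java SimpleDateFormat pattern such as 'yyyy/MM/dd'), and writes the resulting timestamp into into_field. Both input_field and into_field must each name exactly one field. All fields are emitted by default; pass :output in options to choose a different set.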

Example:

parse_date 'text_date', 'yyyy/MM/dd', 'timestamp'


# File 'lib/cascading/text_operations.rb', line 15

def parse_date(input_field, date_format, into_field, options = {})
  output = options[:output] || all_fields # Overrides Cascading default

  input_field = fields(input_field)
  raise "input_field must declare exactly one field, was '#{input_field}'" unless input_field.size == 1
  into_field = fields(into_field)
  raise "into_field must declare exactly one field, was '#{into_field}'" unless into_field.size == 1

  each(
    input_field,
    :function => Java::CascadingOperationText::DateParser.new(into_field, date_format),
    :output => output
  )
end
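
The inverse direction can be sketched the same way. Here parse_date_sketch is a hypothetical plain-Ruby helper, the tuple is modelled as a Hash, the text is assumed to be in UTC, and the result is assumed to be milliseconds since the epoch. The pattern is a Ruby strptime string, not the Java pattern the real pipe takes.

```ruby
require 'date'

# Hypothetical stand-in for what parse_date does to one tuple.
# Assumptions: the tuple is a Hash, the text carries no time zone and is
# read as UTC, and the output is milliseconds since the epoch.
def parse_date_sketch(tuple, input_field, strptime_format, into_field)
  parsed = DateTime.strptime(tuple.fetch(input_field), strptime_format)
  # Keep the input field and add the parsed timestamp (all_fields default).
  tuple.merge(into_field => parsed.to_time.to_i * 1000)
end

row = { 'text_date' => '2012/01/01' }
result = parse_date_sketch(row, 'text_date', '%Y/%m/%d', 'timestamp')
puts result['timestamp'] # => 1325376000000
```

Text that does not match the pattern raises an error in this sketch; how the real DateParser handles such input is not covered here.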