Class: Cascading::Scope

Inherits:
Object
Defined in:
lib/cascading/scope.rb

Overview

Scope is a wrapper for the private Cascading c.f.p.Scope (cascading.flow.planner.Scope) object, which is used to connect the dataflow graph by resolving fields. cascading.jruby wraps this facility so that it can propagate field names at composition time (not Cascading plan time) in the same way the planner will later propagate them.

Constructor Details

#initialize(scope) ⇒ Scope

Construct a Scope given the Cascading c.f.p.Scope to wrap.



11
12
13
# File 'lib/cascading/scope.rb', line 11

def initialize(scope)
  @scope = scope
end

Instance Attribute Details

#scope ⇒ Object

Returns the value of attribute scope.



# File 'lib/cascading/scope.rb', line 8

def scope
  @scope
end

Class Method Details

.empty_scope(name) ⇒ Object

Build an empty Scope, wrapping an empty c.f.p.Scope.



# File 'lib/cascading/scope.rb', line 27

def self.empty_scope(name)
  Scope.new(Java::CascadingFlowPlanner::Scope.new(name))
end

.flow_scope(name) ⇒ Object

Build a c.f.p.Scope for a Flow, which is empty except for its name.



# File 'lib/cascading/scope.rb', line 22

def self.flow_scope(name)
  Java::CascadingFlowPlanner::Scope.new(name)
end

.outgoing_scope(flow_element, incoming_scopes) ⇒ Object

Build a Scope for an arbitrary flow element. This is used to update the Scope at each stage in a pipe Assembly.



# File 'lib/cascading/scope.rb', line 44

def self.outgoing_scope(flow_element, incoming_scopes)
  java_scopes = incoming_scopes.compact.map{ |s| s.scope }
  Scope.new(outgoing_scope_for(flow_element, java.util.HashSet.new(java_scopes)))
end
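
The unwrapping step in outgoing_scope can be tried in isolation. The sketch below is not the library's code: SketchScope and unwrap_incoming are hypothetical names, Symbols stand in for Java c.f.p.Scope objects, and Ruby's Set stands in for java.util.HashSet. It only illustrates that nil entries are dropped and repeated scopes collapse to one.

```ruby
require 'set'

# Hypothetical stand-in for the wrapper documented here; the real class
# wraps a Java c.f.p.Scope rather than a Symbol.
class SketchScope
  attr_reader :scope

  def initialize(scope)
    @scope = scope
  end
end

# Mirrors the first line of outgoing_scope: drop nils, unwrap each wrapper,
# then collect the results into a set (assumption: Set behaves like
# java.util.HashSet for this purpose).
def unwrap_incoming(incoming_scopes)
  Set.new(incoming_scopes.compact.map { |s| s.scope })
end

left  = SketchScope.new(:left_java_scope)
right = SketchScope.new(:right_java_scope)

# A nil entry and a repeated scope are both tolerated.
unwrapped = unwrap_incoming([left, nil, right, left])
puts unwrapped.size
```

Because of the compact call, callers can pass an array with missing entries without filtering it first.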

.source_scope(name, tap, flow_scope) ⇒ Object

Build a Scope for a single source Tap. The flow_scope is propagated through this call into a new Scope.



# File 'lib/cascading/scope.rb', line 33

def self.source_scope(name, tap, flow_scope)
  incoming_scopes = java.util.HashSet.new
  incoming_scopes.add(flow_scope)
  java_scope = outgoing_scope_for(tap, incoming_scopes)
  # Taps and Pipes don't name their outgoing scopes like other FlowElements
  java_scope.name = name
  Scope.new(java_scope)
end

Instance Method Details

#copy ⇒ Object

Return a new Scope that wraps a copy of the underlying c.f.p.Scope; relies upon the copy constructor of c.f.p.Scope.



# File 'lib/cascading/scope.rb', line 17

def copy
  Scope.new(Java::CascadingFlowPlanner::Scope.new(@scope))
end
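
A minimal sketch of the copy pattern, assuming the copy constructor yields an independent object. CopyPlannerStub and CopySketchScope are hypothetical stand-ins written in plain Ruby; the real method delegates to the Java c.f.p.Scope copy constructor, whose exact copying behavior is not shown on this page.

```ruby
# Hypothetical stand-in for c.f.p.Scope: accepts either a name or another
# stub to copy, loosely imitating the overloaded Java constructors.
class CopyPlannerStub
  attr_accessor :name

  def initialize(name_or_scope)
    @name =
      if name_or_scope.is_a?(CopyPlannerStub)
        name_or_scope.name.dup # copy constructor path
      else
        name_or_scope          # plain name path
      end
  end
end

# Hypothetical stand-in for the wrapper, with the same shape as Scope#copy.
class CopySketchScope
  attr_reader :scope

  def initialize(scope)
    @scope = scope
  end

  # Wrap a fresh copy of the underlying object in a new wrapper.
  def copy
    CopySketchScope.new(CopyPlannerStub.new(@scope))
  end
end

original  = CopySketchScope.new(CopyPlannerStub.new('input'))
duplicate = original.copy

# Renaming the copy leaves the original untouched in this sketch.
duplicate.scope.name = 'renamed'
puts original.scope.name
puts duplicate.scope.name
```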

#grouping_fields ⇒ Object

The grouping fields of the Scope, which indicate the keys of a group/cogroup.



# File 'lib/cascading/scope.rb', line 57

def grouping_fields
  @scope.out_grouping_fields
end

#to_s ⇒ Object

Prints a detailed description of this Scope, including its type and various selectors, fields, and key fields. Data is bubbled up directly from the Cascading c.f.p.Scope. This output can be useful for debugging the propagation of fields through your job (see Flow#debug_scope and Assembly#debug_scope, which both rely upon this method).



# File 'lib/cascading/scope.rb', line 66

def to_s
  kind = 'Unknown'
  kind = 'Tap'   if @scope.tap?
  kind = 'Group' if @scope.group?
  kind = 'Each'  if @scope.each?
  kind = 'Every' if @scope.every?
  <<-END
Scope name: #{@scope.name}
  Kind: #{kind}
  Key selectors:     #{scope_fields_to_s(:key_selectors)}
  Sorting selectors: #{scope_fields_to_s(:sorting_selectors)}
  Remainder fields:  #{scope_fields_to_s(:remainder_fields)}
  Declared fields:   #{scope_fields_to_s(:declared_fields)}
  Arguments
    selector:   #{scope_fields_to_s(:arguments_selector)}
    declarator: #{scope_fields_to_s(:arguments_declarator)}
  Out grouping
    selector:   #{scope_fields_to_s(:out_grouping_selector)}
    fields:     #{scope_fields_to_s(:out_grouping_fields)}
    key fields: #{scope_fields_to_s(:key_selectors)}
  Out values
    selector: #{scope_fields_to_s(:out_values_selector)}
    fields:   #{scope_fields_to_s(:out_values_fields)}
END
end
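
The kind label in to_s comes from a run of conditional assignments, so if more than one predicate were ever true, the last one checked would win. The sketch below isolates that logic; KindStub and kind_of are hypothetical names, and the stub only imitates the four predicates that to_s calls on the wrapped scope.

```ruby
# Hypothetical stand-in exposing the four predicates that to_s consults.
class KindStub
  def initialize(flags = {})
    @flags = flags
  end

  def tap?
    @flags.fetch(:tap, false)
  end

  def group?
    @flags.fetch(:group, false)
  end

  def each?
    @flags.fetch(:each, false)
  end

  def every?
    @flags.fetch(:every, false)
  end
end

# Same sequence of assignments as in to_s, extracted into a helper.
def kind_of(scope)
  kind = 'Unknown'
  kind = 'Tap'   if scope.tap?
  kind = 'Group' if scope.group?
  kind = 'Each'  if scope.each?
  kind = 'Every' if scope.every?
  kind
end

puts kind_of(KindStub.new)
puts kind_of(KindStub.new({ group: true }))
```

A scope that matches none of the predicates is reported as Unknown rather than raising an error.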

#values_fields ⇒ Object

The values fields of the Scope, which indicate the fields in the current dataflow tuple.



# File 'lib/cascading/scope.rb', line 51

def values_fields
  @scope.out_values_fields
end
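
To make the difference between values_fields and grouping_fields concrete, here is a small sketch with made-up field names. FieldsStub and FieldsSketchScope are hypothetical stand-ins; in the real class both readers delegate to a Java c.f.p.Scope and return Cascading field objects, not Ruby arrays.

```ruby
# Hypothetical stand-in for c.f.p.Scope exposing only the two readers that
# the accessors delegate to; field lists are plain arrays here.
FieldsStub = Struct.new(:out_values_fields, :out_grouping_fields)

# Hypothetical stand-in for the wrapper, with the same two delegating readers.
class FieldsSketchScope
  def initialize(scope)
    @scope = scope
  end

  # Fields in the current dataflow tuple.
  def values_fields
    @scope.out_values_fields
  end

  # Keys of the group/cogroup.
  def grouping_fields
    @scope.out_grouping_fields
  end
end

# Made-up field names for a stage that groups by 'word'.
grouped = FieldsSketchScope.new(FieldsStub.new(%w[word count], %w[word]))
puts "values:   #{grouped.values_fields.inspect}"
puts "grouping: #{grouped.grouping_fields.inspect}"
```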