Class: Cascading::Node

Inherits:
Object
Defined in:
lib/cascading/base.rb

Overview

A Node is a Cascade, Flow, or Assembly, all of which are composite structures that describe the hierarchy of your job. A Cascade may contain many Flows, and both a Flow and an Assembly may contain many Assemblies (called branches when nested within an Assembly). Nodes are named, hold parent and child pointers, and track their children both by name and by insertion order.

Nodes must be uniquely named within the scope of their parent so that they can be unambiguously looked up when connecting pipes within a flow. However, we only ensure that children are uniquely named upon insertion; full uniqueness isn’t required until Node#find_child is called (this allows for name reuse in a few limited circumstances, which was important when migrating the Etsy workload to enforce these constraints).

Direct Known Subclasses

Assembly, Cascade, Flow

Constructor Details

#initialize(name, parent) ⇒ Node

A Node requires a name and a parent when it is constructed. Children are added later with Node#add_child.



# File 'lib/cascading/base.rb', line 20

def initialize(name, parent)
  @name = name
  @parent = parent
  @children = {}
  @child_names = []
  @last_child = nil
end
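
The following is a minimal sketch of how construction and registration fit together, assuming a stand-in class (SketchNode, hypothetical, not part of the library) that copies the constructor above and a simplified add_child without the uniqueness check. It suggests that passing a parent to the constructor only sets the upward pointer; the parent learns about the child when add_child is called.

```ruby
# Stand-in for illustration only; mirrors the constructor shown above and a
# simplified add_child. SketchNode is not part of the library.
class SketchNode
  attr_reader :name, :parent, :children, :child_names, :last_child

  def initialize(name, parent)
    @name = name
    @parent = parent
    @children = {}      # child name => child node
    @child_names = []   # child names in insertion order
    @last_child = nil   # most recently added child
  end

  def add_child(node)
    @children[node.name] = node
    @child_names << node.name
    @last_child = node
    node
  end
end

flow = SketchNode.new('flow', nil)            # a root has no parent
assembly = SketchNode.new('assembly', flow)   # sets only the upward pointer

names_before = flow.child_names.dup           # the parent does not list the child yet
flow.add_child(assembly)                      # registers the downward pointer
names_after = flow.child_names.dup
```

In this sketch, names_before is empty and names_after contains only 'assembly'. Whether the library's subclasses call add_child for you during construction is not shown in this section.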

Instance Attribute Details

#child_names ⇒ Object

Returns the names of this Node's children, in insertion order.



# File 'lib/cascading/base.rb', line 16

def child_names
  @child_names
end

#children ⇒ Object

Returns the Hash of this Node's children, keyed by child name.



# File 'lib/cascading/base.rb', line 16

def children
  @children
end

#last_child ⇒ Object

Returns the child most recently added with Node#add_child, or nil if no child has been added.



# File 'lib/cascading/base.rb', line 16

def last_child
  @last_child
end

#name ⇒ Object

Returns the name of this Node, as passed to the constructor.



# File 'lib/cascading/base.rb', line 16

def name
  @name
end

#parent ⇒ Object

Returns the parent of this Node, or nil if this Node is the root.



# File 'lib/cascading/base.rb', line 16

def parent
  @parent
end

Instance Method Details

#add_child(node) ⇒ Object

Children must be uniquely named within the scope of each Node. This ensures, for example, that two assemblies with the same name are not created within the same flow, which would make joins, unions, and sinks on them ambiguous. Raises AmbiguousNodeNameException if a child with the same name already exists.



# File 'lib/cascading/base.rb', line 32

def add_child(node)
  raise AmbiguousNodeNameException.new("Attempted to add '#{node.qualified_name}', but node named '#{node.name}' already exists") if @children[node.name]

  @children[node.name] = node
  @child_names << node.name
  @last_child = node
  node
end
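
As a rough illustration of the insertion-time check, the sketch below copies add_child and qualified_name from this page into a stand-in class. SketchNode is hypothetical, and AmbiguousNodeNameException is redefined here as a plain StandardError subclass purely so the example runs on its own; the library's own exception class may differ.

```ruby
# Stand-ins for illustration only; not the library's classes.
class AmbiguousNodeNameException < StandardError; end

class SketchNode
  attr_reader :name, :parent, :children, :child_names, :last_child

  def initialize(name, parent)
    @name = name
    @parent = parent
    @children = {}
    @child_names = []
    @last_child = nil
  end

  def qualified_name
    parent ? "#{parent.qualified_name}.#{name}" : name
  end

  # Same logic as the add_child listing above.
  def add_child(node)
    raise AmbiguousNodeNameException.new("Attempted to add '#{node.qualified_name}', but node named '#{node.name}' already exists") if @children[node.name]

    @children[node.name] = node
    @child_names << node.name
    @last_child = node
    node
  end
end

flow = SketchNode.new('flow', nil)
flow.add_child(SketchNode.new('users', flow))
flow.add_child(SketchNode.new('orders', flow))

# A second child named 'users' under the same parent is rejected.
error = begin
  flow.add_child(SketchNode.new('users', flow))
  nil
rescue AmbiguousNodeNameException => e
  e
end
```

Because the check runs before any state is modified, the rejected node leaves child_names and last_child untouched in this sketch.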

#describe(offset = '') ⇒ Object Also known as: desc

Produces a textual description of this Node. This method is overridden by all classes inheriting Node, so it serves mainly as a template for describing a node with children.



# File 'lib/cascading/base.rb', line 50

def describe(offset = '')
  "#{offset}#{name}:node\n#{child_names.map{ |child| children[child].describe("#{offset}  ") }.join("\n")}"
end
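
To make the template's output format concrete, here is a small sketch using a hypothetical stand-in class (SketchNode, not part of the library) that copies describe from the listing above. Since the library's subclasses override describe, their real output will differ.

```ruby
# Stand-in for illustration only; copies the describe template shown above.
class SketchNode
  attr_reader :name, :parent, :children, :child_names

  def initialize(name, parent)
    @name = name
    @parent = parent
    @children = {}
    @child_names = []
  end

  # Simplified add_child without the uniqueness check.
  def add_child(node)
    @children[node.name] = node
    @child_names << node.name
    node
  end

  def describe(offset = '')
    "#{offset}#{name}:node\n#{child_names.map{ |child| children[child].describe("#{offset}  ") }.join("\n")}"
  end
  alias desc describe
end

flow = SketchNode.new('flow', nil)
flow.add_child(SketchNode.new('a', flow))
flow.add_child(SketchNode.new('b', flow))

description = flow.describe
# Each child is indented two spaces further than its parent and listed in
# insertion order. Every node's line ends in "\n" and siblings are joined with
# another "\n", so a blank line separates 'a' from 'b' in this sketch.
```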

#find_child(name) ⇒ Object

In order to find a child, we require it to be uniquely named within this Node and its children. This ensures, for example, that branches in peer assemblies, or branches and assemblies, do not conflict in joins, unions, and sinks. Raises AmbiguousNodeNameException if more than one match is found.



# File 'lib/cascading/base.rb', line 59

def find_child(name)
  all_children_with_name = find_all_children_with_name(name)
  qualified_names = all_children_with_name.map{ |child| child.qualified_name }
  raise AmbiguousNodeNameException.new("Ambiguous lookup of child by name '#{name}'; found '#{qualified_names.join("', '")}'") if all_children_with_name.size > 1

  all_children_with_name.first
end
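
The sketch below illustrates how a name can pass the insertion-time check yet still be ambiguous at lookup time. It relies on several assumptions: SketchNode and the AmbiguousNodeNameException defined here are hypothetical stand-ins, and find_all_children_with_name is not shown on this page, so the recursive version below is a guess at its behavior (collect every descendant with the given name), not the library's implementation.

```ruby
# Stand-ins for illustration only; not the library's classes.
class AmbiguousNodeNameException < StandardError; end

class SketchNode
  attr_reader :name, :parent, :children, :child_names

  def initialize(name, parent)
    @name = name
    @parent = parent
    @children = {}
    @child_names = []
  end

  def qualified_name
    parent ? "#{parent.qualified_name}.#{name}" : name
  end

  def add_child(node)
    raise AmbiguousNodeNameException.new("node named '#{node.name}' already exists") if @children[node.name]
    @children[node.name] = node
    @child_names << node.name
    node
  end

  # ASSUMPTION: hypothetical helper that gathers matching descendants depth-first.
  def find_all_children_with_name(name)
    child_names.flat_map do |child_name|
      child = children[child_name]
      nested = child.find_all_children_with_name(name)
      child.name == name ? [child] + nested : nested
    end
  end

  # Same logic as the find_child listing above.
  def find_child(name)
    all_children_with_name = find_all_children_with_name(name)
    qualified_names = all_children_with_name.map{ |child| child.qualified_name }
    raise AmbiguousNodeNameException.new("Ambiguous lookup of child by name '#{name}'; found '#{qualified_names.join("', '")}'") if all_children_with_name.size > 1

    all_children_with_name.first
  end
end

flow = SketchNode.new('flow', nil)
left = flow.add_child(SketchNode.new('left', flow))
right = flow.add_child(SketchNode.new('right', flow))
# Both inserts succeed: 'filtered' is unique within each parent.
left_branch = left.add_child(SketchNode.new('filtered', left))
right.add_child(SketchNode.new('filtered', right))

found = flow.find_child('left')          # unique, so the node is returned
scoped = left.find_child('filtered')     # unique within 'left'
missing = flow.find_child('nope')        # no match, so nil
error = begin
  flow.find_child('filtered')            # two matches under 'flow'
  nil
rescue AmbiguousNodeNameException => e
  e
end
```

Under these assumptions, the lookup from flow fails while the same lookup scoped to left succeeds, which is the kind of limited name reuse the Overview describes.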

#qualified_name ⇒ Object

The qualified name of a node is formed by joining the names of all nodes on the path from the root to that node with '.'.



# File 'lib/cascading/base.rb', line 43

def qualified_name
  parent ? "#{parent.qualified_name}.#{name}" : name
end
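
A short sketch of the recursion, using a hypothetical stand-in class (SketchNode, not part of the library) that copies qualified_name from the listing above. Only the parent pointers matter here, so no children are registered.

```ruby
# Stand-in for illustration only; copies qualified_name as shown above.
class SketchNode
  attr_reader :name, :parent

  def initialize(name, parent)
    @name = name
    @parent = parent
  end

  def qualified_name
    parent ? "#{parent.qualified_name}.#{name}" : name
  end
end

cascade = SketchNode.new('cascade', nil)
flow = SketchNode.new('flow', cascade)
assembly = SketchNode.new('assembly', flow)

root_name = cascade.qualified_name    # a root's qualified name is just its name
leaf_name = assembly.qualified_name   # ancestors' names are prefixed, root first
```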

#root ⇒ Object

Returns the root Node, the topmost parent of the hierarchy (typically a Cascade or Flow).



# File 'lib/cascading/base.rb', line 69

def root
  return self unless parent
  parent.root
end
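
As a brief sketch, the walk up the parent pointers can be exercised with a hypothetical stand-in class (SketchNode, not part of the library) that copies root from the listing above.

```ruby
# Stand-in for illustration only; copies root as shown above.
class SketchNode
  attr_reader :name, :parent

  def initialize(name, parent)
    @name = name
    @parent = parent
  end

  def root
    return self unless parent
    parent.root
  end
end

cascade = SketchNode.new('cascade', nil)
flow = SketchNode.new('flow', cascade)
assembly = SketchNode.new('assembly', flow)

top_from_leaf = assembly.root   # follows assembly -> flow -> cascade
top_from_root = cascade.root    # a node without a parent is its own root
```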