Class: Fog::AWS::EMR::Real
- Inherits:
-
Object
- Object
- Fog::AWS::EMR::Real
- Defined in:
- lib/fog/aws/emr.rb,
lib/fog/aws/requests/emr/run_job_flow.rb,
lib/fog/aws/requests/emr/add_job_flow_steps.rb,
lib/fog/aws/requests/emr/describe_job_flows.rb,
lib/fog/aws/requests/emr/add_instance_groups.rb,
lib/fog/aws/requests/emr/terminate_job_flows.rb,
lib/fog/aws/requests/emr/modify_instance_groups.rb,
lib/fog/aws/requests/emr/set_termination_protection.rb
Instance Method Summary collapse
-
#add_instance_groups(job_flow_id, options = {}) ⇒ Object
adds an instance group to a running cluster docs.amazonwebservices.com/ElasticMapReduce/latest/API/API_AddInstanceGroups.html ==== Parameters * JobFlowId <~String> - Job flow in which to add the instance groups * InstanceGroups<~Array> - Instance Groups to add * ‘BidPrice’<~String> - Bid price for each Amazon EC2 instance in the instance group when launching nodes as Spot Instances, expressed in USD.
-
#add_job_flow_steps(job_flow_id, options = {}) ⇒ Object
adds new steps to a running job flow.
-
#describe_job_flows(options = {}) ⇒ Object
returns a list of job flows that match all of the supplied parameters.
-
#initialize(options = {}) ⇒ Real
constructor
Initialize connection to EMR.
-
#modify_instance_groups(options = {}) ⇒ Object
modifies the number of nodes and configuration settings of an instance group..
- #reload ⇒ Object
- #run_hive(name, options = {}) ⇒ Object
-
#run_job_flow(name, options = {}) ⇒ Object
creates and starts running a new job flow docs.amazonwebservices.com/ElasticMapReduce/latest/API/API_RunJobFlow.html ==== Parameters * AdditionalInfo <~String> - A JSON string for selecting additional features.
-
#set_termination_protection(is_protected, options = {}) ⇒ Object
locks a job flow so the Amazon EC2 instances in the cluster cannot be terminated by user intervention.
-
#terminate_job_flows(options = {}) ⇒ Object
shuts a list of job flows down.
Constructor Details
#initialize(options = {}) ⇒ Real
Initialize connection to EMR
Notes
options parameter must include values for :aws_access_key_id and :aws_secret_access_key in order to create a connection
Examples
emr = EMR.new(
:aws_access_key_id => your_aws_access_key_id,
:aws_secret_access_key => your_aws_secret_access_key
)
Parameters
-
options<~Hash> - config arguments for connection. Defaults to {}.
-
region<~String> - optional region to use. For instance, in ‘eu-west-1’, ‘us-east-1’ and etc.
-
Returns
-
EMR object with connection to AWS.
64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 |
# File 'lib/fog/aws/emr.rb', line 64 def initialize(={}) @aws_access_key_id = [:aws_access_key_id] @aws_secret_access_key = [:aws_secret_access_key] @connection_options = [:connection_options] || {} @hmac = Fog::HMAC.new('sha256', @aws_secret_access_key) [:region] ||= 'us-east-1' @host = [:host] || "elasticmapreduce.#{[:region]}.amazonaws.com" @path = [:path] || '/' @persistent = [:persistent] || false @port = [:port] || 443 @scheme = [:scheme] || 'https' @connection = Fog::Connection.new("#{@scheme}://#{@host}:#{@port}#{@path}", @persistent, @connection_options) @region = [:region] end |
Instance Method Details
#add_instance_groups(job_flow_id, options = {}) ⇒ Object
adds an instance group to a running cluster docs.amazonwebservices.com/ElasticMapReduce/latest/API/API_AddInstanceGroups.html
Parameters
-
JobFlowId <~String> - Job flow in which to add the instance groups
-
InstanceGroups<~Array> - Instance Groups to add
-
‘BidPrice’<~String> - Bid price for each Amazon EC2 instance in the instance group when launching nodes as Spot Instances, expressed in USD.
-
‘InstanceCount’<~Integer> - Target number of instances for the instance group
-
‘InstanceRole’<~String> - MASTER | CORE | TASK The role of the instance group in the cluster
-
‘InstanceType’<~String> - The Amazon EC2 instance type for all instances in the instance group
-
‘MarketType’<~String> - ON_DEMAND | SPOT Market type of the Amazon EC2 instances used to create a cluster node
-
‘Name’<~String> - Friendly name given to the instance group.
-
Returns
-
response<~Excon::Response>:
-
body<~Hash>:
-
23 24 25 26 27 28 29 30 31 32 33 34 |
# File 'lib/fog/aws/requests/emr/add_instance_groups.rb', line 23 def add_instance_groups(job_flow_id, ={}) if instance_groups = .delete('InstanceGroups') .merge!(Fog::AWS.indexed_param('InstanceGroups.member.%d', [*instance_groups])) end request({ 'Action' => 'AddInstanceGroups', 'JobFlowId' => job_flow_id, :parser => Fog::Parsers::AWS::EMR::AddInstanceGroups.new, }.merge()) end |
#add_job_flow_steps(job_flow_id, options = {}) ⇒ Object
adds new steps to a running job flow. docs.amazonwebservices.com/ElasticMapReduce/latest/API/API_AddJobFlowSteps.html
Parameters
-
JobFlowId <~String> - A string that uniquely identifies the job flow
-
Steps <~Array> - A list of steps to be executed by the job flow
-
‘ActionOnFailure’<~String> - TERMINATE_JOB_FLOW | CANCEL_AND_WAIT | CONTINUE Specifies the action to take if the job flow step fails
-
‘HadoopJarStep’<~Array> - Specifies the JAR file used for the job flow step
-
‘Args’<~String list> - A list of command line arguments passed to the JAR file’s main function when executed.
-
‘Jar’<~String> - A path to a JAR file run during the step.
-
‘MainClass’<~String> - The name of the main class in the specified Java file. If not specified, the JAR file should specify a Main-Class in its manifest file
-
‘Properties’<~Array> - A list of Java properties that are set when the step runs. You can use these properties to pass key value pairs to your main function
-
‘Key’<~String> - The unique identifier of a key value pair
-
‘Value’<~String> - The value part of the identified key
-
-
-
‘Name’<~String> - The name of the job flow step
-
Returns
-
response<~Excon::Response>:
-
body<~Hash>:
-
26 27 28 29 30 31 32 33 34 35 36 37 |
# File 'lib/fog/aws/requests/emr/add_job_flow_steps.rb', line 26 def add_job_flow_steps(job_flow_id, ={}) if steps = .delete('Steps') .merge!(Fog::AWS.serialize_keys('Steps', steps)) end request({ 'Action' => 'AddJobFlowSteps', 'JobFlowId' => job_flow_id, :parser => Fog::Parsers::AWS::EMR::AddJobFlowSteps.new, }.merge()) end |
#describe_job_flows(options = {}) ⇒ Object
returns a list of job flows that match all of the supplied parameters. docs.amazonwebservices.com/ElasticMapReduce/latest/API/API_DescribeJobFlows.html
Parameters
-
CreatedAfter <~DateTime> - Return only job flows created after this date and time
-
CreatedBefore <~DateTime> - Return only job flows created before this date and time
-
JobFlowIds <~String list> - Return only job flows whose job flow ID is contained in this list
-
JobFlowStates <~String list> - RUNNING | WAITING | SHUTTING_DOWN | STARTING Return only job flows whose state is contained in this list
Returns
-
response<~Excon::Response>:
-
body<~Hash>:
-
-
JobFlows <~Array> - A list of job flows matching the parameters supplied.
-
AmiVersion <~String> - A list of bootstrap actions that will be run before Hadoop is started on the cluster nodes.
-
‘BootstrapActions’<~Array> - A list of the bootstrap actions run by the job flow
-
‘BootstrapConfig <~Array> - A description of the bootstrap action
-
‘Name’ <~String> - The name of the bootstrap action
-
‘ScriptBootstrapAction’ <~Array> - The script run by the bootstrap action.
-
‘Args’ <~String list> - A list of command line arguments to pass to the bootstrap action script.
-
‘Path’ <~String> - Location of the script to run during a bootstrap action.
-
-
-
-
‘ExecutionStatusDetail’<~Array> - Describes the execution status of the job flow
-
‘CreationDateTime <~DateTime> - The creation date and time of the job flow.
-
‘EndDateTime <~DateTime> - The completion date and time of the job flow.
-
‘LastStateChangeReason <~String> - Description of the job flow last changed state.
-
‘ReadyDateTime <~DateTime> - The date and time when the job flow was ready to start running bootstrap actions.
-
‘StartDateTime <~DateTime> - The start date and time of the job flow.
-
‘State <~DateTime> - COMPLETED | FAILED | TERMINATED | RUNNING | SHUTTING_DOWN | STARTING | WAITING | BOOTSTRAPPING The state of the job flow.
-
-
Instances <~Array> - A specification of the number and type of Amazon EC2 instances on which to run the job flow.
-
‘Ec2KeyName’<~String> - Specifies the name of the Amazon EC2 key pair that can be used to ssh to the master node as the user called “hadoop.
-
‘HadoopVersion’<~String> - “0.18” | “0.20” Specifies the Hadoop version for the job flow
-
‘InstanceCount’<~Integer> - The number of Amazon EC2 instances used to execute the job flow
-
‘InstanceGroups’<~Array> - Configuration for the job flow’s instance groups
-
‘BidPrice’ <~String> - Bid price for each Amazon EC2 instance in the instance group when launching nodes as Spot Instances, expressed in USD.
-
‘CreationDateTime’ <~DateTime> - The date/time the instance group was created.
-
‘EndDateTime’ <~DateTime> - The date/time the instance group was terminated.
-
‘InstanceGroupId’ <~String> - Unique identifier for the instance group.
-
‘InstanceRequestCount’<~Integer> - Target number of instances for the instance group
-
‘InstanceRole’<~String> - MASTER | CORE | TASK The role of the instance group in the cluster
-
‘InstanceRunningCount’<~Integer> - Actual count of running instances
-
‘InstanceType’<~String> - The Amazon EC2 instance type for all instances in the instance group
-
‘LastStateChangeReason’<~String> - Details regarding the state of the instance group
-
‘Market’<~String> - ON_DEMAND | SPOT Market type of the Amazon EC2 instances used to create a cluster
-
‘Name’<~String> - Friendly name for the instance group
-
‘ReadyDateTime’<~DateTime> - The date/time the instance group was available to the cluster
-
‘StartDateTime’<~DateTime> - The date/time the instance group was started
-
‘State’<~String> - PROVISIONING | STARTING | BOOTSTRAPPING | RUNNING | RESIZING | ARRESTED | SHUTTING_DOWN | TERMINATED | FAILED | ENDED State of instance group
-
-
-
‘KeepJobFlowAliveWhenNoSteps’ <~Boolean> - Specifies whether the job flow should terminate after completing all steps
-
‘MasterInstanceId’<~String> - The Amazon EC2 instance identifier of the master node
-
‘MasterInstanceType’<~String> - The EC2 instance type of the master node
-
‘MasterPublicDnsName’<~String> - The DNS name of the master node
-
‘NormalizedInstanceHours’<~Integer> - An approximation of the cost of the job flow, represented in m1.small/hours.
-
‘Placement’<~Array> - Specifies the Availability Zone the job flow will run in
-
‘AvailabilityZone’ <~String> - The Amazon EC2 Availability Zone for the job flow.
-
-
‘SlaveInstanceType’<~String> - The EC2 instance type of the slave nodes
-
‘TerminationProtected’<~Boolean> - Specifies whether to lock the job flow to prevent the Amazon EC2 instances from being terminated by API call, user intervention, or in the event of a job flow error
-
-
LogUri <~String> - Specifies the location in Amazon S3 to write the log files of the job flow. If a value is not provided, logs are not created
-
Name <~String> - The name of the job flow
-
Steps <~Array> - A list of steps to be executed by the job flow
-
‘ExecutionStatusDetail’<~Array> - Describes the execution status of the job flow
-
‘CreationDateTime <~DateTime> - The creation date and time of the job flow.
-
‘EndDateTime <~DateTime> - The completion date and time of the job flow.
-
‘LastStateChangeReason <~String> - Description of the job flow last changed state.
-
‘ReadyDateTime <~DateTime> - The date and time when the job flow was ready to start running bootstrap actions.
-
‘StartDateTime <~DateTime> - The start date and time of the job flow.
-
‘State <~DateTime> - COMPLETED | FAILED | TERMINATED | RUNNING | SHUTTING_DOWN | STARTING | WAITING | BOOTSTRAPPING The state of the job flow.
-
-
StepConfig <~Array> - The step configuration
-
‘ActionOnFailure’<~String> - TERMINATE_JOB_FLOW | CANCEL_AND_WAIT | CONTINUE Specifies the action to take if the job flow step fails
-
‘HadoopJarStep’<~Array> - Specifies the JAR file used for the job flow step
-
‘Args’<~String list> - A list of command line arguments passed to the JAR file’s main function when executed.
-
‘Jar’<~String> - A path to a JAR file run during the step.
-
‘MainClass’<~String> - The name of the main class in the specified Java file. If not specified, the JAR file should specify a Main-Class in its manifest file
-
‘Properties’<~Array> - A list of Java properties that are set when the step runs. You can use these properties to pass key value pairs to your main function
-
‘Key’<~String> - The unique identifier of a key value pair
-
‘Value’<~String> - The value part of the identified key
-
-
-
‘Name’<~String> - The name of the job flow step
-
-
82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 |
# File 'lib/fog/aws/requests/emr/describe_job_flows.rb', line 82 def describe_job_flows(={}) if job_ids = .delete('JobFlowIds') .merge!(Fog::AWS.serialize_keys('JobFlowIds', job_ids)) end if job_states = .delete('JobFlowStates') .merge!(Fog::AWS.serialize_keys('JobFlowStates', job_states)) end request({ 'Action' => 'DescribeJobFlows', :parser => Fog::Parsers::AWS::EMR::DescribeJobFlows.new, }.merge()) end |
#modify_instance_groups(options = {}) ⇒ Object
modifies the number of nodes and configuration settings of an instance group.. docs.amazonwebservices.com/ElasticMapReduce/latest/API/API_ModifyInstanceGroups.html
Parameters
-
InstanceGroups <~InstanceGroupModifyConfig list> - Instance groups to change
-
InstanceCount <~Integer> - Target size for instance group
-
InstanceGroupId <~String> - Unique ID of the instance group to expand or shrink
-
Returns
-
response<~Excon::Response>:
-
body<~Hash>
-
18 19 20 21 22 23 24 25 26 27 28 |
# File 'lib/fog/aws/requests/emr/modify_instance_groups.rb', line 18 def modify_instance_groups(={}) if job_ids = .delete('InstanceGroups') .merge!(Fog::AWS.serialize_keys('InstanceGroups', job_ids)) end request({ 'Action' => 'ModifyInstanceGroups', :parser => Fog::Parsers::AWS::EMR::ModifyInstanceGroups.new, }.merge()) end |
#reload ⇒ Object
80 81 82 |
# File 'lib/fog/aws/emr.rb', line 80 def reload @connection.reset end |
#run_hive(name, options = {}) ⇒ Object
71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 |
# File 'lib/fog/aws/requests/emr/run_job_flow.rb', line 71 def run_hive(name, ={}) steps = [] steps << { 'Name' => 'Setup Hive', 'HadoopJarStep' => { 'Jar' => 's3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar', 'Args' => ['s3://us-east-1.elasticmapreduce/libs/hive/hive-script', '--base-path', 's3://us-east-1.elasticmapreduce/libs/hive/', '--install-hive']}, 'ActionOnFailure' => 'TERMINATE_JOB_FLOW' } # To add a configuration step to the Hive flow, see the step below # steps << { # 'Name' => 'Install Hive Site Configuration', # 'HadoopJarStep' => { # 'Jar' => 's3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar', # 'Args' => ['s3://us-east-1.elasticmapreduce/libs/hive/hive-script', '--base-path', 's3://us-east-1.elasticmapreduce/libs/hive/', '--install-hive-site', '--hive-site=s3://my.bucket/hive/hive-site.xml']}, # 'ActionOnFailure' => 'TERMINATE_JOB_FLOW' # } ['Steps'] = steps if not ['Instances'].nil? ['Instances']['KeepJobFlowAliveWhenNoSteps'] = true end run_job_flow name, end |
#run_job_flow(name, options = {}) ⇒ Object
creates and starts running a new job flow docs.amazonwebservices.com/ElasticMapReduce/latest/API/API_RunJobFlow.html
Parameters
-
AdditionalInfo <~String> - A JSON string for selecting additional features.
-
BootstrapActions <~Array> - A list of bootstrap actions that will be run before Hadoop is started on the cluster nodes.
-
‘Name’<~String> - The name of the bootstrap action
-
‘ScriptBootstrapAction’<~Array> - The script run by the bootstrap action
-
‘Args’ <~Array> - A list of command line arguments to pass to the bootstrap action script
-
‘Path’ <~String> - Location of the script to run during a bootstrap action. Can be either a location in Amazon S3 or on a local file system.
-
-
-
Instances <~Array> - A specification of the number and type of Amazon EC2 instances on which to run the job flow.
-
‘Ec2KeyName’<~String> - Specifies the name of the Amazon EC2 key pair that can be used to ssh to the master node as the user called “hadoop.
-
‘HadoopVersion’<~String> - “0.18” | “0.20” Specifies the Hadoop version for the job flow
-
‘InstanceCount’<~Integer> - The number of Amazon EC2 instances used to execute the job flow
-
‘InstanceGroups’<~Array> - Configuration for the job flow’s instance groups
-
‘BidPrice’ <~String> - Bid price for each Amazon EC2 instance in the instance group when launching nodes as Spot Instances, expressed in USD.
-
‘InstanceCount’<~Integer> - Target number of instances for the instance group
-
‘InstanceRole’<~String> - MASTER | CORE | TASK The role of the instance group in the cluster
-
‘InstanceType’<~String> - The Amazon EC2 instance type for all instances in the instance group
-
‘MarketType’<~String> - ON_DEMAND | SPOT Market type of the Amazon EC2 instances used to create a cluster node
-
‘Name’<~String> - Friendly name given to the instance group.
-
-
‘KeepJobFlowAliveWhenNoSteps’ <~Boolean> - Specifies whether the job flow should terminate after completing all steps
-
‘MasterInstanceType’<~String> - The EC2 instance type of the master node
-
‘Placement’<~Array> - Specifies the Availability Zone the job flow will run in
-
‘AvailabilityZone’ <~String> - The Amazon EC2 Availability Zone for the job flow.
-
-
‘SlaveInstanceType’<~String> - The EC2 instance type of the slave nodes
-
‘TerminationProtected’<~Boolean> - Specifies whether to lock the job flow to prevent the Amazon EC2 instances from being terminated by API call, user intervention, or in the event of a job flow error
-
-
LogUri <~String> - Specifies the location in Amazon S3 to write the log files of the job flow. If a value is not provided, logs are not created
-
Name <~String> - The name of the job flow
-
Steps <~Array> - A list of steps to be executed by the job flow
-
‘ActionOnFailure’<~String> - TERMINATE_JOB_FLOW | CANCEL_AND_WAIT | CONTINUE Specifies the action to take if the job flow step fails
-
‘HadoopJarStep’<~Array> - Specifies the JAR file used for the job flow step
-
‘Args’<~String list> - A list of command line arguments passed to the JAR file’s main function when executed.
-
‘Jar’<~String> - A path to a JAR file run during the step.
-
‘MainClass’<~String> - The name of the main class in the specified Java file. If not specified, the JAR file should specify a Main-Class in its manifest file
-
‘Properties’<~Array> - A list of Java properties that are set when the step runs. You can use these properties to pass key value pairs to your main function
-
‘Key’<~String> - The unique identifier of a key value pair
-
‘Value’<~String> - The value part of the identified key
-
-
-
‘Name’<~String> - The name of the job flow step
-
Returns
-
response<~Excon::Response>:
-
body<~Hash>:
-
50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 |
# File 'lib/fog/aws/requests/emr/run_job_flow.rb', line 50 def run_job_flow(name, ={}) if bootstrap_actions = .delete('BootstrapActions') .merge!(Fog::AWS.serialize_keys('BootstrapActions', bootstrap_actions)) end if instances = .delete('Instances') .merge!(Fog::AWS.serialize_keys('Instances', instances)) end if steps = .delete('Steps') .merge!(Fog::AWS.serialize_keys('Steps', steps)) end request({ 'Action' => 'RunJobFlow', 'Name' => name, :parser => Fog::Parsers::AWS::EMR::RunJobFlow.new, }.merge()) end |
#set_termination_protection(is_protected, options = {}) ⇒ Object
locks a job flow so the Amazon EC2 instances in the cluster cannot be terminated by user intervention. docs.amazonwebservices.com/ElasticMapReduce/latest/API/API_SetTerminationProtection.html
Parameters
-
JobFlowIds <~String list> - list of strings that uniquely identify the job flows to protect
-
TerminationProtected <~Boolean> - indicates whether to protect the job flow
Returns
-
response<~Excon::Response>:
-
body<~Hash>
-
17 18 19 20 21 22 23 24 25 26 27 |
# File 'lib/fog/aws/requests/emr/set_termination_protection.rb', line 17 def set_termination_protection(is_protected, ={}) if job_ids = .delete('JobFlowIds') .merge!(Fog::AWS.serialize_keys('JobFlowIds', job_ids)) end request({ 'Action' => 'SetTerminationProtection', 'TerminationProtected' => is_protected, :parser => Fog::Parsers::AWS::EMR::SetTerminationProtection.new, }.merge()) end |
#terminate_job_flows(options = {}) ⇒ Object
shuts a list of job flows down. docs.amazonwebservices.com/ElasticMapReduce/latest/API/API_TerminateJobFlows.html
Parameters
-
JobFlowIds <~String list> - list of strings that uniquely identify the job flows to protect
Returns
-
response<~Excon::Response>:
-
body<~Hash>
-
16 17 18 19 20 21 22 23 24 25 |
# File 'lib/fog/aws/requests/emr/terminate_job_flows.rb', line 16 def terminate_job_flows(={}) if job_ids = .delete('JobFlowIds') .merge!(Fog::AWS.serialize_keys('JobFlowIds', job_ids)) end request({ 'Action' => 'TerminateJobFlows', :parser => Fog::Parsers::AWS::EMR::TerminateJobFlows.new, }.merge()) end |