Class: Mortar::Command::Local
Overview
Run select Pig commands on your local machine.
Instance Attribute Summary
Attributes inherited from Base
Instance Method Summary
-
#characterize ⇒ Object
local:characterize -f PARAMFILE.
-
#configure ⇒ Object
local:configure.
-
#illustrate ⇒ Object
local:illustrate PIGSCRIPT [ALIAS].
-
#luigi ⇒ Object
local:luigi SCRIPT.
-
#repl ⇒ Object
local:repl.
-
#run ⇒ Object
local:run SCRIPT.
-
#validate ⇒ Object
local:validate SCRIPT.
Methods inherited from Base
#api, #ask_public, #config_parameters, #get_error_message_context, #git, #initialize, #initialize_embedded_project, #luigi_parameters, namespace, #pig_parameters, #project, #register_api_call, #register_do, #register_project, #spark_script_arguments, #validate_project_name, #validate_project_structure
Methods included from Helpers
#action, #ask, #confirm, #copy_if_not_present_at_dest, #default_host, #deprecate, #display, #display_header, #display_object, #display_row, #display_table, #display_with_indent, #download_to_file, #ensure_dir_exists, #error, error_with_failure, error_with_failure=, extended, extended_into, #format_bytes, #format_date, #format_with_bang, #full_host, #get_terminal_environment, #home_directory, #host, #hprint, #hputs, included, included_into, #installed_with_omnibus?, #json_decode, #json_encode, #line_formatter, #longest, #output_with_bang, #pending_github_team_state_message, #quantify, #redisplay, #retry_on_exception, #running_on_a_mac?, #running_on_windows?, #set_buffer, #shell, #spinner, #status, #string_distance, #styled_array, #styled_error, #styled_hash, #styled_header, #suggestion, #test_name, #ticking, #time_ago, #truncate, #warning, #with_tty, #write_to_file
Constructor Details
This class inherits a constructor from Mortar::Command::Base
Instance Method Details
#characterize ⇒ Object
local:characterize -f PARAMFILE
Characterize will inspect your input data, inferring a schema and
generating keys, if needed. It will output CSV containing various
statistics about your data (most common values, percent null, etc.)
-f, --param-file PARAMFILE    # Load pig parameter values from a file
-g, --pigversion PIG_VERSION  # Set pig version. Options are <PIG_VERSION_OPTIONS>.
Load some data and emit statistics. PARAMFILE (Required):
LOADER=<full class path of loader function>
INPUT_SRC=<Location of the input data>
OUTPUT_PATH=<Relative path from project root for output>
INFER_TYPES=<when true, recursively infers types for input data>
Example paramfile:
LOADER=org.apache.pig.piggybank.storage.JsonLoader()
INPUT_SRC=s3n://twitter-gardenhose-mortar/example
OUTPUT_PATH=twitter_char
INFER_TYPES=true
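The paramfile layout above is a plain KEY=VALUE list, so it can be checked before a run. The following is a minimal sketch, not part of the mortar gem: `parse_paramfile` and `missing_characterize_keys` are hypothetical helper names, skipping `#` comment lines is an assumption, and treating all four documented keys as expected is also an assumption about what the command strictly requires.

```ruby
# Hypothetical helpers (not part of the mortar gem) for checking a
# local:characterize paramfile before running the command.
CHARACTERIZE_KEYS = %w[LOADER INPUT_SRC OUTPUT_PATH INFER_TYPES].freeze

# Parse KEY=VALUE lines into a Hash. Blank lines are skipped; skipping
# lines that start with '#' is an assumption about the file format.
def parse_paramfile(text)
  text.each_line.with_object({}) do |line, params|
    line = line.strip
    next if line.empty? || line.start_with?('#')

    # Split on the first '=' only, so values may themselves contain '='.
    key, value = line.split('=', 2)
    params[key.strip] = value.to_s.strip
  end
end

# Return the documented keys that the parsed paramfile does not define.
def missing_characterize_keys(params)
  CHARACTERIZE_KEYS.reject { |key| params.key?(key) }
end

sample = <<~PARAMS
  LOADER=org.apache.pig.piggybank.storage.JsonLoader()
  INPUT_SRC=s3n://twitter-gardenhose-mortar/example
  OUTPUT_PATH=twitter_char
  INFER_TYPES=true
PARAMS

params = parse_paramfile(sample)
puts missing_characterize_keys(params).inspect # prints []
```

An empty list means every documented key is present; any names printed are keys to add to the paramfile.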
# File 'lib/mortar/command/local.rb', line 102

def characterize
  validate_arguments!

  unless options[:param_file]
    error("Usage: mortar local:characterize -f PARAMFILE.\nMust specify parameter file. For detailed help run:\n\n mortar local:characterize -h")
  end

  #cd into the project root
  project_root = options[:project_root] ||= Dir.getwd
  unless File.directory?(project_root)
    error("No such directory #{project_root}")
  end

  Dir.chdir(project_root)

  gen = Mortar::Generators::CharacterizeGenerator.new
  gen.generate_characterize

  controlscript_name = "controlscripts/lib/characterize_control.py"
  gen = Mortar::Generators::CharacterizeGenerator.new
  gen.generate_characterize
  script = validate_script!(controlscript_name)
  params = config_parameters.concat(pig_parameters)

  ctrl = Mortar::Local::Controller.new
  ctrl.run(script, pig_version, params)
  gen.cleanup_characterize(project_root)
end
#configure ⇒ Object
local:configure
Install the dependencies needed to run this mortar project locally. Other mortar local: commands also perform this step automatically.
-g, --pigversion PIG_VERSION  # Set pig version. Options are <PIG_VERSION_OPTIONS>.
--project-root PROJECTDIR     # The root directory of the project if not the CWD
# File 'lib/mortar/command/local.rb', line 33

def configure
  validate_arguments!

  # cd into the project root
  project_root = options[:project_root] ||= Dir.getwd
  unless File.directory?(project_root)
    error("No such directory #{project_root}")
  end
  Dir.chdir(project_root)

  ctrl = Mortar::Local::Controller.new
  ctrl.install_and_configure(pig_version, nil)
end
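The project-root handling in the method above recurs in most of the local: commands: fall back to the current working directory when no root is given, then refuse to continue if the directory does not exist. A minimal standalone sketch of that pattern follows. `resolve_project_root` is a hypothetical name, the plain Hash stands in for the command's options, and raising `ArgumentError` is a substitute for the gem's `error` helper, whose behavior is not shown here.

```ruby
# Hypothetical helper illustrating the project-root fallback; not part of
# the mortar gem. `options` is a plain Hash standing in for the parsed
# command-line options.
def resolve_project_root(options)
  # Store the fallback back into the hash so later readers see the same root.
  root = options[:project_root] ||= Dir.getwd

  # Assumption: raise instead of calling the gem's `error` helper.
  raise ArgumentError, "No such directory #{root}" unless File.directory?(root)

  root
end

options = {}
root = resolve_project_root(options)
puts root                              # the current working directory
puts options[:project_root] == root    # the fallback is remembered
```

Because `||=` writes the default back into the hash, any later code that reads `options[:project_root]` sees the resolved directory instead of `nil`.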
#illustrate ⇒ Object
local:illustrate PIGSCRIPT [ALIAS]
Locally illustrate the effects and output of a pigscript. If an alias is specified, the command shows the data flow from the ancestor LOAD statements to the alias itself. If no alias is specified, it shows the data flow through all aliases in the script.
-s, --skippruning             # Don’t try to reduce the illustrate results to the smallest size possible.
-p, --parameter NAME=VALUE    # Set a pig parameter value in your script.
-f, --param-file PARAMFILE    # Load pig parameter values from a file.
-g, --pigversion PIG_VERSION  # Set pig version. Options are <PIG_VERSION_OPTIONS>.
--no_browser                  # Don’t open the illustrate results automatically in the browser.
--project-root PROJECTDIR     # The root directory of the project if not the CWD
Examples:
Illustrate all relations in the generate_regression_model_coefficients pigscript:
$ mortar local:illustrate pigscripts/generate_regression_model_coefficients.pig
# File 'lib/mortar/command/local.rb', line 148

def illustrate
  pigscript_name = shift_argument
  alias_name = shift_argument
  validate_arguments!
  skip_pruning = options[:skippruning] ||= false
  no_browser = options[:no_browser] ||= false

  unless pigscript_name
    error("Usage: mortar local:illustrate PIGSCRIPT [ALIAS]\nMust specify PIGSCRIPT.")
  end

  # cd into the project root
  project_root = options[:project_root] ||= Dir.getwd
  unless File.directory?(project_root)
    error("No such directory #{project_root}")
  end
  Dir.chdir(project_root)

  pigscript = validate_pigscript!(pigscript_name)
  params = config_parameters.concat(pig_parameters)

  ctrl = Mortar::Local::Controller.new
  ctrl.illustrate(pigscript, alias_name, pig_version, params, skip_pruning, no_browser)
end
#luigi ⇒ Object
local:luigi SCRIPT
Run a luigi pipeline script on your local machine in local scheduler mode. Any additional command line arguments will be passed directly to the luigi script.
--project-root PROJECTDIR   # The root directory of the project if not the CWD
-p, --parameter NAME=VALUE  # [deprecated] Instead, pass luigi parameters directly as options (see below)
-f, --param-file PARAMFILE  # [deprecated] Instead, pass luigi parameters directly as options (see below)
Examples:
Run the recsys luigi script with a parameter named date-interval
$ mortar local:luigi luigiscripts/recsys.py --date-interval 2012-04
# File 'lib/mortar/command/local.rb', line 240

def luigi
  script_name = shift_argument
  unless script_name
    error("Usage: mortar local:luigi SCRIPT\nMust specify SCRIPT.")
  end

  # cd into the project root
  project_root = options[:project_root] ||= Dir.getwd
  unless File.directory?(project_root)
    error("No such directory #{project_root}")
  end
  Dir.chdir(project_root)
  script = validate_luigiscript!(script_name)

  #Set git ref as environment variable for mortar-luigi to use when
  #running a MortarTask
  git_ref = sync_code_with_cloud()
  ENV['MORTAR_LUIGI_GIT_REF'] = git_ref

  # pick up standard luigi-style params provided by the user
  luigi_cli_parameters = luigi_parameters()

  # pick up old pig-style parameters (included for backwards compatibility)
  pig_style_parameters = pig_parameters()
  if pig_style_parameters.length > 0
    warn "[DEPRECATION] Passing luigi parameters with -p is deprecated. Please pass them directly (e.g. --myparam myvalue)"
  end
  luigi_cli_parameters.concat(pig_style_parameters)

  cli_parameters = \
    luigi_cli_parameters.sort_by { |p| p['name'] }.map { |arg| ["--#{arg['name']}", "#{arg['value']}"] }.flatten

  # get project configuration parameters
  project_config_params = config_parameters()

  ctrl = Mortar::Local::Controller.new
  ctrl.run_luigi(pig_version, script, cli_parameters, project_config_params)
end
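The parameter handling in the method above merges luigi-style and pig-style parameters, sorts them by name, and flattens them into a command-line argument list. The sketch below reproduces only that transformation in isolation. `to_cli_arguments` is a hypothetical name, and the `batch-size` parameter is invented for illustration; the `'name'`/`'value'` hash shape is taken from the source listing.

```ruby
# Hypothetical standalone version of the sort/map/flatten step; not a
# method of the mortar gem. Each parameter is a Hash with 'name' and
# 'value' keys, as in the source listing.
def to_cli_arguments(parameters)
  parameters
    .sort_by { |param| param['name'] }                            # stable, name-ordered output
    .map { |param| ["--#{param['name']}", param['value'].to_s] }  # one [flag, value] pair each
    .flatten                                                      # pairs -> flat argv-style list
end

luigi_style = [{ 'name' => 'date-interval', 'value' => '2012-04' }]
pig_style   = [{ 'name' => 'batch-size', 'value' => 10 }] # invented example parameter

# Array#concat appends in place, mirroring how the source merges the lists.
merged = luigi_style.concat(pig_style)

puts to_cli_arguments(merged).inspect
# prints ["--batch-size", "10", "--date-interval", "2012-04"]
```

Sorting by name makes the generated argument list deterministic regardless of the order in which the parameters were supplied.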
#repl ⇒ Object
local:repl
Start a local Pig REPL session.
-p, --parameter NAME=VALUE    # Set a pig parameter value in your script.
-f, --param-file PARAMFILE    # Load pig parameter values from a file.
-g, --pigversion PIG_VERSION  # Set pig version. Options are <PIG_VERSION_OPTIONS>.
# File 'lib/mortar/command/local.rb', line 217

def repl
  validate_arguments!

  params = config_parameters.concat(pig_parameters)

  ctrl = Mortar::Local::Controller.new
  ctrl.repl(pig_version, params)
end
#run ⇒ Object
local:run SCRIPT
Run a job on your local machine.
-p, --parameter NAME=VALUE    # Set a pig parameter value in your script.
-f, --param-file PARAMFILE    # Load pig parameter values from a file.
-g, --pigversion PIG_VERSION  # Set pig version. Options are <PIG_VERSION_OPTIONS>.
--project-root PROJECTDIR     # The root directory of the project if not the CWD
Examples:
Run the generate_regression_model_coefficients script locally.
$ mortar local:run pigscripts/generate_regression_model_coefficients.pig
# File 'lib/mortar/command/local.rb', line 60

def run
  script_name = shift_argument
  unless script_name
    error("Usage: mortar local:run SCRIPT\nMust specify SCRIPT.")
  end
  validate_arguments!

  # cd into the project root
  project_root = options[:project_root] ||= Dir.getwd
  unless File.directory?(project_root)
    error("No such directory #{project_root}")
  end
  Dir.chdir(project_root)
  script = validate_script!(script_name)
  params = config_parameters.concat(pig_parameters)

  ctrl = Mortar::Local::Controller.new
  ctrl.run(script, pig_version, params)
end
#validate ⇒ Object
local:validate SCRIPT
Locally validate the syntax of a script.
-p, --parameter NAME=VALUE    # Set a pig parameter value in your script.
-f, --param-file PARAMFILE    # Load pig parameter values from a file.
-g, --pigversion PIG_VERSION  # Set pig version. Options are <PIG_VERSION_OPTIONS>.
--project-root PROJECTDIR     # The root directory of the project if not the CWD
Examples:
Check the pig syntax of the generate_regression_model_coefficients pigscript locally.
$ mortar local:validate pigscripts/generate_regression_model_coefficients.pig
# File 'lib/mortar/command/local.rb', line 188

def validate
  script_name = shift_argument
  unless script_name
    error("Usage: mortar local:validate SCRIPT\nMust specify SCRIPT.")
  end
  validate_arguments!

  # cd into the project root
  project_root = options[:project_root] ||= Dir.getwd
  unless File.directory?(project_root)
    error("No such directory #{project_root}")
  end
  Dir.chdir(project_root)

  script = validate_script!(script_name)
  params = config_parameters.concat(pig_parameters)

  ctrl = Mortar::Local::Controller.new
  ctrl.validate(script, pig_version, params)
end