Class: OodCore::Job::Adapters::Fujitsu_TCS
- Inherits: OodCore::Job::Adapter
  - Object
  - OodCore::Job::Adapter
  - OodCore::Job::Adapters::Fujitsu_TCS
- Defined in: lib/ood_core/job/adapters/fujitsu_tcs.rb
Overview
An adapter object that describes the communication with a Fujitsu TCS resource manager for job management.
Defined Under Namespace
Classes: Batch
Constant Summary
- STATE_MAP =
Mapping of state codes for Fujitsu TCS resource manager

    {
      'ACC' => :queued,    # Accepted job submission
      'RJT' => :completed, # Rejected job submission
      'QUE' => :queued,    # Waiting for job execution
      'RNA' => :queued,    # Acquiring resources required for job execution
      'RNP' => :running,   # Executing prologue
      'RUN' => :running,   # Executing job
      'RNE' => :running,   # Executing epilogue
      'RNO' => :running,   # Waiting for completion of job termination processing
      'SPP' => :suspended, # Suspend in progress
      'SPD' => :suspended, # Suspended
      'RSM' => :running,   # Resume in progress
      'EXT' => :completed, # Exited job end execution
      'CCL' => :completed, # Exited job execution by interruption
      'HLD' => :suspended, # In fixed state due to users
      'ERR' => :completed, # In fixed state due to an error
    }
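As a minimal sketch of how a lookup against this table might work, the snippet below reproduces a subset of the codes and translates one code at a time. The helper name `state_for`, the constant `STATE_MAP_SAMPLE`, and the `:undetermined` fallback for unknown codes are assumptions made for illustration; this page does not list the adapter's own lookup helper.

```ruby
# Subset of the STATE_MAP shown above, copied so the sketch is self-contained.
STATE_MAP_SAMPLE = {
  'ACC' => :queued,
  'RUN' => :running,
  'SPD' => :suspended,
  'EXT' => :completed
}.freeze

# Hypothetical helper: translate a TCS state code into a generic state symbol.
# The :undetermined fallback is an assumption, not taken from this page.
def state_for(code)
  STATE_MAP_SAMPLE.fetch(code.to_s, :undetermined)
end

puts state_for('RUN')
puts state_for('???')
```

Using `fetch` with a default keeps an unrecognized code from raising `KeyError`, which may or may not match what the real adapter does.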
Instance Method Summary
- #delete(id) ⇒ void
  Delete the submitted job.
- #directive_prefix ⇒ Object
- #hold(id) ⇒ void
  Put the submitted job on hold.
- #info(id) ⇒ Info
  Retrieve job info from the resource manager.
- #info_all(attrs: nil) ⇒ Array<Info>
  Retrieve info for all jobs from the resource manager.
- #info_where_owner(owner, attrs: nil) ⇒ Array<Info>
  Retrieve info for all jobs for a given owner or owners from the resource manager.
- #initialize(opts = {}) ⇒ Fujitsu_TCS (constructor, private)
  A new instance of Fujitsu_TCS.
- #release(id) ⇒ void
  Release the job that is on hold.
- #status(id) ⇒ Status
  Retrieve job status from resource manager.
- #submit(script, after: [], afterok: [], afternotok: [], afterany: []) ⇒ String
  Submit a job with the attributes defined in the job template instance.
Methods inherited from OodCore::Job::Adapter
#accounts, #cluster_info, #info_all_each, #info_where_owner_each, #job_name_illegal_chars, #nodes, #queues, #sanitize_job_name, #supports_job_arrays?
Constructor Details
#initialize(opts = {}) ⇒ Fujitsu_TCS
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Returns a new instance of Fujitsu_TCS.
    # File 'lib/ood_core/job/adapters/fujitsu_tcs.rb', line 192

    def initialize(opts = {})
      o = opts.to_h.symbolize_keys

      @fujitsu_tcs = o.fetch(:fujitsu_tcs) { raise ArgumentError, "No Fujitsu TCS object specified. Missing argument: fujitsu_tcs" }
    end
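The constructor's option handling can be sketched without the library as follows. `SketchAdapter` and `buildable?` are hypothetical names, and `transform_keys(&:to_sym)` is used here as a plain-Ruby stand-in for ActiveSupport's `symbolize_keys`; the error message is copied from the listing above.

```ruby
# Hypothetical stand-in for the adapter's constructor logic.
class SketchAdapter
  attr_reader :backend

  def initialize(opts = {})
    # Accept string or symbol keys by normalizing them to symbols.
    o = opts.to_h.transform_keys(&:to_sym)
    # fetch with a block raises only when the key is absent.
    @backend = o.fetch(:fujitsu_tcs) do
      raise ArgumentError, "No Fujitsu TCS object specified. Missing argument: fujitsu_tcs"
    end
  end
end

# Hypothetical helper: reports whether construction succeeds for the given options.
def buildable?(opts)
  SketchAdapter.new(opts)
  true
rescue ArgumentError
  false
end

puts buildable?({ 'fujitsu_tcs' => Object.new })
puts buildable?({})
```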
Instance Method Details
#delete(id) ⇒ void
This method returns an undefined value.
Delete the submitted job
    # File 'lib/ood_core/job/adapters/fujitsu_tcs.rb', line 375

    def delete(id)
      @fujitsu_tcs.delete_job(id.to_s)
    rescue Batch::Error => e
      # assume successful job deletion if can't find job id
      raise JobAdapterError, e.message unless /\[ERR\.\] PJM .+ Job .+ does not exist/ =~ e.message
    end
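The "job does not exist" check used by #delete (and repeated in #hold, #release, #info, and #status) can be tried in isolation. This is a sketch only: `JOB_MISSING` and `job_missing?` are hypothetical names, and both sample messages are invented for illustration rather than captured from a real Fujitsu TCS installation.

```ruby
# Pattern copied from the listings on this page.
JOB_MISSING = /\[ERR\.\] PJM .+ Job .+ does not exist/

# Hypothetical helper: true when an error message means the job is already gone,
# which the adapter treats as a successful delete, hold, or release.
def job_missing?(message)
  JOB_MISSING.match?(message)
end

# Invented sample messages, not real scheduler output.
puts job_missing?("[ERR.] PJM 0000 pjdel Job 12345 does not exist.")
puts job_missing?("[ERR.] PJM 0000 pjdel Permission denied.")
```

Any other error text fails the match, so in the adapter it would be re-raised as a `JobAdapterError`.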
#directive_prefix ⇒ Object
    # File 'lib/ood_core/job/adapters/fujitsu_tcs.rb', line 382

    def directive_prefix
      '#PJM'
    end
#hold(id) ⇒ void
This method returns an undefined value.
Put the submitted job on hold
    # File 'lib/ood_core/job/adapters/fujitsu_tcs.rb', line 351

    def hold(id)
      @fujitsu_tcs.hold_job(id.to_s)
    rescue Batch::Error => e
      # assume successful job hold if can't find job id
      raise JobAdapterError, e.message unless /\[ERR\.\] PJM .+ Job .+ does not exist/ =~ e.message
    end
#info(id) ⇒ Info
Retrieve job info from the resource manager
    # File 'lib/ood_core/job/adapters/fujitsu_tcs.rb', line 288

    def info(id)
      id = id.to_s
      info_ary = @fujitsu_tcs.get_jobs(id: id).map do |v|
        parse_job_info(v)
      end

      # If no job was found we assume that it has completed
      info_ary.empty? ? Info.new(id: id, status: :completed) : info_ary.first # @fujitsu_tcs.get_jobs() must return only one element.
    rescue Batch::Error => e
      # set completed status if can't find job id
      if /\[ERR\.\] PJM .+ Job .+ does not exist/ =~ e.message
        Info.new(
          id: id,
          status: :completed
        )
      else
        raise JobAdapterError, e.message
      end
    end
#info_all(attrs: nil) ⇒ Array<Info>
Retrieve info for all jobs from the resource manager
    # File 'lib/ood_core/job/adapters/fujitsu_tcs.rb', line 275

    def info_all(attrs: nil)
      @fujitsu_tcs.get_jobs().map do |v|
        parse_job_info(v)
      end
    rescue Batch::Error => e
      raise JobAdapterError, e.message
    end
#info_where_owner(owner, attrs: nil) ⇒ Array<Info>
Retrieve info for all jobs for a given owner or owners from the resource manager
    # File 'lib/ood_core/job/adapters/fujitsu_tcs.rb', line 313

    def info_where_owner(owner, attrs: nil)
      owner = Array.wrap(owner).map(&:to_s).join('+')
      @fujitsu_tcs.get_jobs(owner: owner).map do |v|
        parse_job_info(v)
      end
    rescue Batch::Error => e
      raise JobAdapterError, e.message
    end
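The owner argument may be a single name or a list, and the listing above joins multiple owners with `+` before querying. A minimal sketch of that normalization follows; `owner_filter` is a hypothetical name, and `Array()` stands in for ActiveSupport's `Array.wrap` so the snippet runs with plain Ruby.

```ruby
# Hypothetical helper mirroring the first line of #info_where_owner.
# Array() is a plain-Ruby stand-in for ActiveSupport's Array.wrap.
def owner_filter(owner)
  Array(owner).map(&:to_s).join('+')
end

puts owner_filter('alice')
puts owner_filter(%w[alice bob])
```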
#release(id) ⇒ void
This method returns an undefined value.
Release the job that is on hold
    # File 'lib/ood_core/job/adapters/fujitsu_tcs.rb', line 363

    def release(id)
      @fujitsu_tcs.release_job(id.to_s)
    rescue Batch::Error => e
      # assume successful job release if can't find job id
      raise JobAdapterError, e.message unless /\[ERR\.\] PJM .+ Job .+ does not exist/ =~ e.message
    end
#status(id) ⇒ Status
Retrieve job status from resource manager
    # File 'lib/ood_core/job/adapters/fujitsu_tcs.rb', line 327

    def status(id)
      id = id.to_s
      jobs = @fujitsu_tcs.get_jobs(id: id)

      if job = jobs.detect { |j| j[:JOB_ID] == id }
        Status.new(state: get_state(job[:ST]))
      else
        # set completed status if can't find job id
        Status.new(state: :completed)
      end
    rescue Batch::Error => e
      # set completed status if can't find job id
      if /\[ERR\.\] PJM .+ Job .+ does not exist/ =~ e.message
        Status.new(state: :completed)
      else
        raise JobAdapterError, e.message
      end
    end
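The lookup in #status, where a job that is no longer listed is reported as completed, can be sketched with plain hashes. `sketch_status` and `SKETCH_STATES` are hypothetical names, the job records are invented, and the `:undetermined` fallback for unknown codes is an assumption; only the `:JOB_ID` and `:ST` keys come from the listing above.

```ruby
# Small stand-in for STATE_MAP so the sketch is self-contained.
SKETCH_STATES = { 'QUE' => :queued, 'RUN' => :running }.freeze

# Hypothetical helper: find the job by id and translate its state code.
# A job absent from the list is treated as completed, as in #status.
def sketch_status(jobs, id)
  id = id.to_s
  job = jobs.detect { |j| j[:JOB_ID] == id }
  return :completed unless job

  # The :undetermined fallback is an assumption, not taken from this page.
  SKETCH_STATES.fetch(job[:ST], :undetermined)
end

sample_jobs = [{ JOB_ID: '101', ST: 'RUN' }, { JOB_ID: '102', ST: 'QUE' }]
puts sketch_status(sample_jobs, 101)
puts sketch_status(sample_jobs, 999)
```

Calling `to_s` on the id first matters because the comparison is against string ids, so an integer argument would otherwise never match.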
#submit(script, after: [], afterok: [], afternotok: [], afterany: []) ⇒ String
Submit a job with the attributes defined in the job template instance
    # File 'lib/ood_core/job/adapters/fujitsu_tcs.rb', line 213

    def submit(script, after: [], afterok: [], afternotok: [], afterany: [])
      #after = Array(after).map(&:to_s)
      #afterok = Array(afterok).map(&:to_s)
      #afternotok = Array(afternotok).map(&:to_s)
      #afterany = Array(afterany).map(&:to_s)
      if !after.empty? || !afterok.empty? || !afternotok.empty? || !afterany.empty?
        raise JobAdapterError, "Dependency between jobs has not implemented yet."
      end

      # Set pjsub options
      args = []
      args.concat (script.rerunnable ? ["--restart"] : ["--norestart"]) unless script.rerunnable.nil?
      args.concat ["--mail-list", script.email.join(",")] unless script.email.nil?
      if script.email_on_started && script.email_on_terminated
        args.concat ["-m", "b,e"]
      elsif script.email_on_started
        args.concat ["-m", "b"]
      elsif script.email_on_terminated
        args.concat ["-m", "e"]
      end
      args.concat ["-N", script.job_name] unless script.job_name.nil?
      args.concat ["-o", script.output_path] unless script.output_path.nil?
      args.concat ['--mpi', "proc=#{script.cores}"] unless script.cores.nil?
      if script.error_path.nil?
        args.concat ["-j"]
      else
        args.concat ["-e", script.error_path]
      end
      args.concat ["-L", "rscgrp=" + script.queue_name] unless script.queue_name.nil?
      args.concat ["-p", script.priority] unless script.priority.nil?

      # start_time: <%= Time.local(2023,11,22,13,4).to_i %> in form.yml.erb
      args.concat ["--at", script.start_time.localtime.strftime("%C%y%m%d%H%M")] unless script.start_time.nil?
      args.concat ["-L", "elapse=" + seconds_to_duration(script.wall_time)] unless script.wall_time.nil?
      args.concat ["--bulk", "--sparam", script.job_array_request] unless script.job_array_request.nil?

      # Set environment variables
      envvars = script.job_environment.to_h
      args.concat ["-x", envvars.map{|k,v| "#{k}=#{v}"}.join(",")] unless envvars.empty?
      args.concat ["-X"] if script.copy_environment?

      # Set native options
      args.concat script.native if script.native

      # Set content
      content = if script.shell_path.nil?
        script.content
      else
        "#!#{script.shell_path}\n#{script.content}"
      end

      # Submit job
      @fujitsu_tcs.submit_string(content, args: args)
    rescue Batch::Error => e
      raise JobAdapterError, e.message
    end
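Three of the branches in #submit can be exercised on their own to see what argument fragments they produce. The helper names `mail_flags`, `at_flag`, and `env_flag` are hypothetical and exist only for this sketch; the flag strings and the `strftime` format are copied from the listing above.

```ruby
# Mirrors the -m branch: "b" when email_on_started, "e" when email_on_terminated.
def mail_flags(on_started, on_terminated)
  if on_started && on_terminated
    ["-m", "b,e"]
  elsif on_started
    ["-m", "b"]
  elsif on_terminated
    ["-m", "e"]
  else
    []
  end
end

# Mirrors the --at branch: century, year, month, day, hour, minute in local time.
def at_flag(start_time)
  ["--at", start_time.localtime.strftime("%C%y%m%d%H%M")]
end

# Mirrors the -x branch: NAME=value pairs joined by commas, omitted when empty.
def env_flag(envvars)
  envvars.empty? ? [] : ["-x", envvars.map { |k, v| "#{k}=#{v}" }.join(",")]
end

args = []
args.concat mail_flags(true, true)
args.concat at_flag(Time.local(2023, 11, 22, 13, 4))
args.concat env_flag({ "FOO" => "1", "BAR" => "2" })
p args
```

The start time in the sketch matches the `Time.local(2023,11,22,13,4)` example given in the source comment.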