Class: Google::Cloud::Bigquery::Job

Inherits:
Object
Defined in:
lib/google/cloud/bigquery/job.rb,
lib/google/cloud/bigquery/job/list.rb

Overview

Job

Represents a generic Job that may be performed on a Table.

The subclasses of Job represent the specific BigQuery job types: CopyJob, ExtractJob, LoadJob, and QueryJob.

A job instance is created when you call Project#query_job, Dataset#query_job, Table#copy_job, Table#extract_job, or Table#load_job.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

job = bigquery.query_job "SELECT COUNT(word) as count FROM " \
                         "`bigquery-public-data.samples.shakespeare`"

job.wait_until_done!

if job.failed?
  puts job.error
else
  puts job.data.first
end

Direct Known Subclasses

CopyJob, ExtractJob, LoadJob, QueryJob

Defined Under Namespace

Classes: List, ReservationUsage, ScriptStackFrame, ScriptStatistics

Instance Method Details

#cancel ⇒ Object

Cancels the job.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

query = "SELECT COUNT(word) as count FROM " \
        "`bigquery-public-data.samples.shakespeare`"

job = bigquery.query_job query

job.cancel


# File 'lib/google/cloud/bigquery/job.rb', line 398

def cancel
  ensure_service!
  resp = service.cancel_job job_id, location: location
  @gapi = resp.job
  true
end

#configuration ⇒ Object Also known as: config

The configuration for the job. Returns a hash.

# File 'lib/google/cloud/bigquery/job.rb', line 301

def configuration
  JSON.parse @gapi.configuration.to_json
end
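
Because the hash is produced by a JSON round trip, its keys are strings rather than symbols. The following is a minimal sketch of that effect; `fake_configuration` and `jsonify` are hypothetical names, and a plain Ruby hash stands in for the underlying API object, whose real key names may differ.

```ruby
require "json"

# Hypothetical stand-in for the API object behind Job#configuration.
def fake_configuration
  { query: { query: "SELECT 1" }, labels: { env: "test" } }
end

# Same round trip as the method body above: serialize, then parse.
def jsonify(object)
  JSON.parse object.to_json
end

config = jsonify fake_configuration
puts config["query"]["query"]  # string keys are present after the round trip
puts config[:query].inspect    # symbol keys are not
```

When reading values from the returned hash, index with strings.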

#created_at ⇒ Time?

The time when the job was created.

Returns:

  • (Time, nil)

    The creation time from the job statistics.



# File 'lib/google/cloud/bigquery/job.rb', line 175

def created_at
  Convert.millis_to_time @gapi.statistics.creation_time
end
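
The timestamp helper used above is internal to the library. As a rough illustration of the conversion it appears to perform, assuming the statistics value is an integer number of milliseconds since the Unix epoch, a hypothetical equivalent could look like this (the real `Convert.millis_to_time` may differ in detail):

```ruby
require "time"

# Hypothetical equivalent of the internal helper: converts epoch milliseconds
# to a Time, passing nil through unchanged.
def millis_to_time(millis)
  return nil if millis.nil?
  Time.at Rational(millis, 1000)
end

created = millis_to_time 1_600_000_000_123
puts created.utc.iso8601(3)
puts millis_to_time(nil).inspect
```

Using a Rational keeps the millisecond component exact rather than routing it through a Float.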

#delete ⇒ Boolean

Requests that the job be deleted. This call returns once the job has been deleted.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

job = bigquery.job "my_job"

job.delete

Returns:

  • (Boolean)

    Returns true if the job was deleted.



# File 'lib/google/cloud/bigquery/job.rb', line 421

def delete
  ensure_service!
  service.delete_job job_id, location: location
  true
end

#done? ⇒ Boolean

Checks if the job's state is DONE. When true, the job has stopped running. However, a DONE state does not mean that the job completed successfully. Use #failed? to determine whether an error occurred or the job succeeded.

Returns:

  • (Boolean)

    true when DONE, false otherwise.



# File 'lib/google/cloud/bigquery/job.rb', line 155

def done?
  return false if state.nil?
  "done".casecmp(state).zero?
end
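
To make the distinction between "finished" and "succeeded" concrete, here is a minimal sketch that classifies a job from a status hash of the shape returned by #status. The helper `job_outcome` is hypothetical, and the "state" key is an assumption; only "errorResult" is confirmed by the #error source on this page.

```ruby
# Hypothetical helper: classifies a job from its status hash.
# The state is compared case-insensitively, as in done?, and failure is
# decided solely by the presence of "errorResult", as in failed?.
def job_outcome(status)
  state = status["state"]
  return :unknown if state.nil?
  return state.downcase.to_sym unless "done".casecmp(state).zero?
  status["errorResult"].nil? ? :succeeded : :failed
end

puts job_outcome({ "state" => "RUNNING" })
puts job_outcome({ "state" => "DONE" })
puts job_outcome({ "state" => "DONE", "errorResult" => { "reason" => "notFound" } })
```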

#ended_at ⇒ Time?

The time when the job ended. This field is present when the job's state is DONE.

Returns:

  • (Time, nil)

    The end time from the job statistics.



# File 'lib/google/cloud/bigquery/job.rb', line 196

def ended_at
  Convert.millis_to_time @gapi.statistics.end_time
end

#error ⇒ Hash?

The last error for the job, if any errors have occurred. Returns a hash.

Returns:

  • (Hash, nil)

    Returns a hash containing reason and message keys:

    { "reason"=>"notFound", "message"=>"Not found: Table bigquery-public-data:samples.BAD_ID" }

# File 'lib/google/cloud/bigquery/job.rb', line 344

def error
  status["errorResult"]
end

#errors ⇒ Array<Hash>?

The errors for the job, if any errors have occurred. Returns an array of hash objects. See #error.

Returns:

  • (Array<Hash>, nil)

    Returns an array of hashes containing reason and message keys:

    [{ "reason"=>"notFound", "message"=>"Not found: Table bigquery-public-data:samples.BAD_ID" }]



# File 'lib/google/cloud/bigquery/job.rb', line 360

def errors
  Array status["errors"]
end
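
The method body above wraps the status value with Kernel#Array, which turns nil into an empty array. A minimal sketch of that behavior, using a hypothetical `errors_from` helper and plain hashes in place of a real job status:

```ruby
# Hypothetical helper repeating the expression used in #errors.
def errors_from(status)
  Array status["errors"]
end

no_errors = errors_from({ "state" => "DONE" })
some_errors = errors_from({ "errors" => [{ "reason" => "notFound", "message" => "Not found" }] })

puts no_errors.inspect
puts some_errors.length
```

Callers can therefore iterate over the result without a nil check, judging by the source shown here.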

#failed? ⇒ Boolean

Checks if an error is present. Use #error to access the error object.

Returns:

  • (Boolean)

    true when there is an error, false otherwise.



# File 'lib/google/cloud/bigquery/job.rb', line 166

def failed?
  !error.nil?
end

#job_id ⇒ String

The ID of the job.

Returns:

  • (String)

    The ID must contain only letters ([A-Za-z]), numbers ([0-9]), underscores (_), or dashes (-). The maximum length is 1,024 characters.



# File 'lib/google/cloud/bigquery/job.rb', line 81

def job_id
  @gapi.job_reference.job_id
end
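
The documented constraints on job IDs can be checked on the client before a request is made. This is a minimal sketch, not part of the library; `valid_job_id?` is a hypothetical helper, and the minimum length of one character is an assumption, since the text above states only the maximum.

```ruby
# Hypothetical client-side check of the documented job ID rules:
# letters, digits, underscores, or dashes, at most 1,024 characters.
def valid_job_id?(id)
  id.is_a?(String) && /\A[A-Za-z0-9_-]{1,1024}\z/.match?(id)
end

puts valid_job_id?("my_job-2024")
puts valid_job_id?("my job!")
puts valid_job_id?("a" * 1025)
```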

#labels ⇒ Hash

A hash of user-provided labels associated with this job. Labels can be provided when the job is created, and used to organize and group jobs.

The returned hash is frozen and changes are not allowed. Use CopyJob::Updater#labels=, ExtractJob::Updater#labels=, LoadJob::Updater#labels=, or QueryJob::Updater#labels= to replace the entire hash.

Returns:

  • (Hash)

    The job labels.



# File 'lib/google/cloud/bigquery/job.rb', line 377

def labels
  m = @gapi.configuration.labels
  m = m.to_h if m.respond_to? :to_h
  m.dup.freeze
end
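
Since the returned hash is a frozen copy, in-place modification raises an error, while non-mutating methods still work. A minimal sketch with a hypothetical `frozen_labels` helper standing in for the value returned by #labels:

```ruby
# Hypothetical stand-in for Job#labels: a frozen copy of a labels hash.
def frozen_labels
  { "env" => "prod" }.dup.freeze
end

labels = frozen_labels
begin
  labels["team"] = "data" # in-place writes are rejected
rescue FrozenError => e
  puts "cannot modify: #{e.class}"
end

# Hash#merge returns a new, unfrozen hash and leaves the original untouched.
updated = labels.merge "team" => "data"
puts updated.inspect
```

A merged copy like `updated` is a convenient value to pass to one of the updater setters named above when creating a new job.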

#location ⇒ String

The geographic location where the job runs.

Returns:

  • (String)

    A geographic location, such as "US", "EU" or "asia-northeast1".



# File 'lib/google/cloud/bigquery/job.rb', line 101

def location
  @gapi.job_reference.location
end

#num_child_jobs ⇒ Integer

The number of child jobs executed.

Returns:

  • (Integer)

    The number of child jobs executed.



# File 'lib/google/cloud/bigquery/job.rb', line 205

def num_child_jobs
  @gapi.statistics.num_child_jobs || 0
end

#parent_job_id ⇒ String?

If this is a child job, the ID of the parent job.

Returns:

  • (String, nil)

    The ID of the parent job, or nil if not a child job.



# File 'lib/google/cloud/bigquery/job.rb', line 214

def parent_job_id
  @gapi.statistics.parent_job_id
end

#pending? ⇒ Boolean

Checks if the job's state is PENDING.

Returns:

  • (Boolean)

    true when PENDING, false otherwise.



# File 'lib/google/cloud/bigquery/job.rb', line 142

def pending?
  return false if state.nil?
  "pending".casecmp(state).zero?
end

#project_id ⇒ String

The ID of the project containing the job.

Returns:

  • (String)

    The project ID.



# File 'lib/google/cloud/bigquery/job.rb', line 90

def project_id
  @gapi.job_reference.project_id
end

#reload! ⇒ Object Also known as: refresh!

Reloads the job with current data from the BigQuery service.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

query = "SELECT COUNT(word) as count FROM " \
        "`bigquery-public-data.samples.shakespeare`"

job = bigquery.query_job query

job.done?
job.reload!
job.done? #=> true


# File 'lib/google/cloud/bigquery/job.rb', line 466

def reload!
  ensure_service!
  gapi = service.get_job job_id, location: location
  @gapi = gapi
end

#rerun! ⇒ Object

Creates a new job with the current configuration.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

query = "SELECT COUNT(word) as count FROM " \
        "`bigquery-public-data.samples.shakespeare`"

job = bigquery.query_job query

job.wait_until_done!
job.rerun!


# File 'lib/google/cloud/bigquery/job.rb', line 443

def rerun!
  ensure_service!
  gapi = service.insert_job @gapi.configuration, location: location
  Job.from_gapi gapi, service
end

#reservation_usage ⇒ Array<Google::Cloud::Bigquery::Job::ReservationUsage>?

An array containing the job resource usage breakdown by reservation, if present. Reservation usage statistics are only reported for jobs that are executed within reservations. On-demand jobs do not report this data.

Returns:

  • (Array<Google::Cloud::Bigquery::Job::ReservationUsage>, nil)

    The reservation usage, if present.



# File 'lib/google/cloud/bigquery/job.rb', line 224

def reservation_usage
  return nil unless @gapi.statistics.reservation_usage
  Array(@gapi.statistics.reservation_usage).map { |g| ReservationUsage.from_gapi g }
end

#running? ⇒ Boolean

Checks if the job's state is RUNNING.

Returns:

  • (Boolean)

    true when RUNNING, false otherwise.



# File 'lib/google/cloud/bigquery/job.rb', line 132

def running?
  return false if state.nil?
  "running".casecmp(state).zero?
end

#script_statistics ⇒ Google::Cloud::Bigquery::Job::ScriptStatistics?

The statistics including stack frames for a child job of a script.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

multi_statement_sql = <<~SQL
  -- Declare a variable to hold names as an array.
  DECLARE top_names ARRAY<STRING>;
  -- Build an array of the top 100 names from the year 2017.
  SET top_names = (
  SELECT ARRAY_AGG(name ORDER BY number DESC LIMIT 100)
  FROM `bigquery-public-data.usa_names.usa_1910_current`
  WHERE year = 2017
  );
  -- Which names appear as words in Shakespeare's plays?
  SELECT
  name AS shakespeare_name
  FROM UNNEST(top_names) AS name
  WHERE name IN (
  SELECT word
  FROM `bigquery-public-data.samples.shakespeare`
  );
SQL

job = bigquery.query_job multi_statement_sql

job.wait_until_done!

child_jobs = bigquery.jobs parent_job: job

child_jobs.each do |child_job|
  script_statistics = child_job.script_statistics
  puts script_statistics.evaluation_kind
  script_statistics.stack_frames.each do |stack_frame|
    puts stack_frame.text
  end
end

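Returns:

  • (Google::Cloud::Bigquery::Job::ScriptStatistics, nil)

    The script statistics, or nil if none are present.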



# File 'lib/google/cloud/bigquery/job.rb', line 292

def script_statistics
  ScriptStatistics.from_gapi @gapi.statistics.script_statistics if @gapi.statistics.script_statistics
end

#session_id ⇒ String?

The ID of the session if this job is part of one. See the create_session param in Project#query_job and Dataset#query_job.

Returns:

  • (String, nil)

    The session ID, or nil if not associated with a session.



# File 'lib/google/cloud/bigquery/job.rb', line 235

def session_id
  @gapi.statistics.session_info&.session_id
end
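
The safe navigation operator (`&.`) in the method body above is what produces nil for jobs without session information. A minimal sketch of that pattern; `build_statistics` and `session_id_of` are hypothetical helpers, and anonymous structs stand in for the API statistics objects:

```ruby
# Hypothetical stand-in for the API statistics object: session_info is either
# nil or an object that responds to session_id.
def build_statistics(session_id = nil)
  info = session_id && Struct.new(:session_id).new(session_id)
  Struct.new(:session_info).new(info)
end

# Same expression as the method body above.
def session_id_of(statistics)
  statistics.session_info&.session_id
end

puts session_id_of(build_statistics("my_session")).inspect
puts session_id_of(build_statistics).inspect
```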

#started_at ⇒ Time?

The time when the job was started. This field is present after the job's state changes from PENDING to either RUNNING or DONE.

Returns:

  • (Time, nil)

    The start time from the job statistics.



# File 'lib/google/cloud/bigquery/job.rb', line 186

def started_at
  Convert.millis_to_time @gapi.statistics.start_time
end

#state ⇒ String

The current state of the job. A DONE state does not mean that the job completed successfully. Use #failed? to determine whether an error occurred or the job succeeded.

Returns:

  • (String)

    The state code. The possible values are PENDING, RUNNING, and DONE.



# File 'lib/google/cloud/bigquery/job.rb', line 122

def state
  return nil if @gapi.status.nil?
  @gapi.status.state
end

#statistics ⇒ Hash Also known as: stats

The statistics for the job. Returns a hash.

Returns:

  • (Hash)

    The job statistics.

# File 'lib/google/cloud/bigquery/job.rb', line 314

def statistics
  JSON.parse @gapi.statistics.to_json
end

#status ⇒ Hash

The job's status. Returns a hash. The values contained in the hash are also exposed by #state, #error, and #errors.

Returns:

  • (Hash)

    The job status.




def status
  JSON.parse @gapi.status.to_json
end

#transaction_id ⇒ String?

The ID of a multi-statement transaction.

Returns:

  • (String, nil)

    The transaction ID, or nil if not associated with a transaction.



# File 'lib/google/cloud/bigquery/job.rb', line 244

def transaction_id
  @gapi.statistics.transaction_info&.transaction_id
end

#user_email ⇒ String

The email address of the user who ran the job.

Returns:

  • (String)

    The email address.



# File 'lib/google/cloud/bigquery/job.rb', line 110

def user_email
  @gapi.user_email
end

#wait_until_done! ⇒ Object

Refreshes the job until the job is DONE. The delay between refreshes starts at 5 seconds and grows with the square of the retry count, up to a maximum of 60 seconds.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

extract_job = table.extract_job "gs://my-bucket/file-name.json",
                                format: "json"
extract_job.wait_until_done!
extract_job.done? #=> true


# File 'lib/google/cloud/bigquery/job.rb', line 490

def wait_until_done!
  backoff = lambda do |retries|
    delay = [(retries**2) + 5, 60].min # Maximum delay is 60
    sleep delay
  end
  retries = 0
  until done?
    backoff.call retries
    retries += 1
    reload!
  end
end
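
The schedule produced by the backoff lambda above can be tabulated without sleeping. This is a minimal sketch; `backoff_delay` is a hypothetical name that repeats the same arithmetic as the source shown here.

```ruby
# Delay in seconds before the given retry, using the same formula as the
# backoff lambda in wait_until_done!.
def backoff_delay(retries)
  [(retries**2) + 5, 60].min
end

delays = (0..9).map { |retries| backoff_delay retries }
puts delays.inspect
puts "total wait across the first 10 delays: #{delays.sum}s"
```

The delay grows with the square of the retry count and reaches the 60-second cap on the ninth wait, so a long-running job is polled roughly once a minute after the first few minutes.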