Class: Google::Cloud::Bigquery::ExtractJob

Inherits:
Job
  • Object
Defined in:
lib/google/cloud/bigquery/extract_job.rb

Overview

ExtractJob

A Job subclass representing an export operation that may be performed on a Table or Model. An ExtractJob instance is returned when you call Project#extract_job, Table#extract_job or Model#extract_job.

Examples:

Export table data

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

extract_job = table.extract_job "gs://my-bucket/file-name.json",
                                format: "json"
extract_job.wait_until_done!
extract_job.done? #=> true

Export a model

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
model = dataset.model "my_model"

extract_job = model.extract_job "gs://my-bucket/#{model.model_id}"

extract_job.wait_until_done!
extract_job.done? #=> true

Direct Known Subclasses

Updater

Defined Under Namespace

Classes: Updater

Instance Method Summary

Methods inherited from Job

#cancel, #configuration, #created_at, #delete, #done?, #ended_at, #error, #errors, #failed?, #job_id, #labels, #location, #num_child_jobs, #parent_job_id, #pending?, #project_id, #reload!, #rerun!, #reservation_usage, #running?, #script_statistics, #session_id, #started_at, #state, #statistics, #status, #transaction_id, #user_email, #wait_until_done!

Instance Method Details

#avro? ⇒ Boolean

Checks if the destination format for the table data is Avro. The default is false. Not applicable when extracting models.

Returns:

  • (Boolean)

    true when AVRO, false if not AVRO or not a table extraction.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 151

def avro?
  return false unless table?
  @gapi.configuration.extract.destination_format == "AVRO"
end

#compression? ⇒ Boolean

Checks if the export operation compresses the data using gzip. The default is false. Not applicable when extracting models.

Returns:

  • (Boolean)

    true when GZIP, false if not GZIP or not a table extraction.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 110

def compression?
  return false unless table?
  @gapi.configuration.extract.compression == "GZIP"
end

#csv? ⇒ Boolean

Checks if the destination format for the table data is CSV. Tables with nested or repeated fields cannot be exported as CSV. The default is true for tables. Not applicable when extracting models.

Returns:

  • (Boolean)

    true when CSV, or false if not CSV or not a table extraction.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 136

def csv?
  return false unless table?
  val = @gapi.configuration.extract.destination_format
  return true if val.nil?
  val == "CSV"
end
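
The nil handling above means an unset destination format counts as CSV, but only for table extractions. A minimal sketch of that behavior, using a hypothetical stand-in struct (ExtractConfigSketch is not part of the gem) in place of the real API configuration object:

```ruby
# Hypothetical stand-in for the extract configuration; it only mirrors the
# predicate logic shown in this class and is not part of google-cloud-bigquery.
ExtractConfigSketch = Struct.new(:source_table, :destination_format, keyword_init: true) do
  # Mirrors #table?: a table extraction has a source table set.
  def table?
    !source_table.nil?
  end

  # Mirrors #csv?: an unset format is treated as CSV for table extractions.
  def csv?
    return false unless table?
    destination_format.nil? || destination_format == "CSV"
  end

  # Mirrors #json?: only an explicit NEWLINE_DELIMITED_JSON format matches.
  def json?
    return false unless table?
    destination_format == "NEWLINE_DELIMITED_JSON"
  end
end

unset_format = ExtractConfigSketch.new(source_table: "my_table")
json_format  = ExtractConfigSketch.new(source_table: "my_table",
                                       destination_format: "NEWLINE_DELIMITED_JSON")
no_table     = ExtractConfigSketch.new(destination_format: "CSV")

puts unset_format.csv? # unset format on a table extraction
puts json_format.csv?  # explicit JSON format
puts no_table.csv?     # no source table, so never CSV
```

Under these assumptions, only the first case reports CSV. The other two return false because the format is explicitly JSON or because there is no source table.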

#delimiter ⇒ String?

The character or symbol the operation uses to delimit fields in the exported data. The default is a comma (,) for tables. Not applicable when extracting models.

Returns:

  • (String, nil)

    A string containing the character, such as ",", or nil if not a table extraction.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 190

def delimiter
  return unless table?
  val = @gapi.configuration.extract.field_delimiter
  val = "," if val.nil?
  val
end

#destinations ⇒ Object

The URI or URIs representing the Google Cloud Storage files to which the data is exported.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 61

def destinations
  Array @gapi.configuration.extract.destination_uris
end

#destinations_counts ⇒ Hash<String, Integer>

A hash containing the URI or URI pattern specified in #destinations mapped to the counts of files per destination.

Returns:

  • (Hash<String, Integer>)

    A Hash with the URI patterns as keys and the counts as values.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 229

def destinations_counts
  destinations.zip(destinations_file_counts).to_h
end
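
The pairing performed by this method can be tried in isolation. The sketch below uses made-up URIs and counts, and a hypothetical helper named pair_counts, in place of values read from a real job:

```ruby
# Hypothetical helper mirroring the zip/to_h pairing in #destinations_counts.
# The arguments stand in for #destinations and #destinations_file_counts.
def pair_counts destinations, file_counts
  destinations.zip(file_counts).to_h
end

# Made-up sample values; a real job reads these from the API response.
sample_destinations = ["gs://my-bucket/table-*.json", "gs://my-bucket/summary.json"]
sample_file_counts  = [3, 1]

pair_counts(sample_destinations, sample_file_counts).each do |uri, count|
  puts "#{uri}: #{count} file(s)"
end
```

Because Array#zip pads with nil, a destination with no corresponding count would map to nil rather than raise an error.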

#destinations_file_counts ⇒ Array<Integer>

The number of files per destination URI or URI pattern specified in #destinations.

Returns:

  • (Array<Integer>)

    An array of values in the same order as the URI patterns.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 218

def destinations_file_counts
  Array @gapi.statistics.extract.destination_uri_file_counts
end

#json? ⇒ Boolean

Checks if the destination format for the table data is newline-delimited JSON. The default is false. Not applicable when extracting models.

Returns:

  • (Boolean)

    true when NEWLINE_DELIMITED_JSON, false if not NEWLINE_DELIMITED_JSON or not a table extraction.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 123

def json?
  return false unless table?
  @gapi.configuration.extract.destination_format == "NEWLINE_DELIMITED_JSON"
end

#ml_tf_saved_model? ⇒ Boolean

Checks if the destination format for the model is TensorFlow SavedModel. The default is true for models. Not applicable when extracting tables.

Returns:

  • (Boolean)

    true when ML_TF_SAVED_MODEL, false if not ML_TF_SAVED_MODEL or not a model extraction.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 163

def ml_tf_saved_model?
  return false unless model?
  val = @gapi.configuration.extract.destination_format
  return true if val.nil?
  val == "ML_TF_SAVED_MODEL"
end

#ml_xgboost_booster? ⇒ Boolean

Checks if the destination format for the model is XGBoost. The default is false. Not applicable when extracting tables.

Returns:

  • (Boolean)

    true when ML_XGBOOST_BOOSTER, false if not ML_XGBOOST_BOOSTER or not a model extraction.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 177

def ml_xgboost_booster?
  return false unless model?
  @gapi.configuration.extract.destination_format == "ML_XGBOOST_BOOSTER"
end

#model? ⇒ Boolean

Whether the source of the export job is a model. See #source.

Returns:

  • (Boolean)

    true when the source is a model, false otherwise.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 100

def model?
  !@gapi.configuration.extract.source_model.nil?
end

#print_header? ⇒ Boolean

Checks if the exported data contains a header row. The default is true for tables. Not applicable when extracting models.

Returns:

  • (Boolean)

    true when the print header configuration is present or nil, false if disabled or not a table extraction.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 204

def print_header?
  return false unless table?
  val = @gapi.configuration.extract.print_header
  val = true if val.nil?
  val
end

#source(view: nil) ⇒ Table, ...

The table or model which is exported.

Parameters:

  • view (String) (defaults to: nil)

    Specifies the view that determines which table information is returned. By default, basic table information and storage statistics (STORAGE_STATS) are returned. Accepted values include :unspecified, :basic, :storage, and :full. For more information, see BigQuery Classes. The default value is the :unspecified view type.

Returns:

  • (Table, Model, nil)

    A table or model instance, or nil.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 76

def source view: nil
  if (table = @gapi.configuration.extract.source_table)
    retrieve_table table.project_id, table.dataset_id, table.table_id, metadata_view: view
  elsif (model = @gapi.configuration.extract.source_model)
    retrieve_model model.project_id, model.dataset_id, model.model_id
  end
end

#table? ⇒ Boolean

Whether the source of the export job is a table. See #source.

Returns:

  • (Boolean)

    true when the source is a table, false otherwise.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 90

def table?
  !@gapi.configuration.extract.source_table.nil?
end

#use_avro_logical_types? ⇒ Boolean

If #avro? (#format is set to "AVRO"), this flag indicates whether to enable extracting applicable column types (such as TIMESTAMP) to their corresponding AVRO logical types (timestamp-micros), instead of only using their raw types (avro-long). Not applicable when extracting models.

Returns:

  • (Boolean)

    true when applicable column types will use their corresponding AVRO logical types, false if not enabled or not a table extraction.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 244

def use_avro_logical_types?
  return false unless table?
  @gapi.configuration.extract.use_avro_logical_types
end