Class: Google::Cloud::Bigquery::ExtractJob

Inherits:
Job
  • Object
Defined in:
lib/google/cloud/bigquery/extract_job.rb

Overview

ExtractJob

A Job subclass representing an export operation that may be performed on a Table or Model. An ExtractJob instance is returned when you call Project#extract_job, Table#extract_job or Model#extract_job.

Examples:

Export table data

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

extract_job = table.extract_job "gs://my-bucket/file-name.json",
                                format: "json"
extract_job.wait_until_done!
extract_job.done? #=> true

Export a model

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
model = dataset.model "my_model"

extract_job = model.extract_job "gs://my-bucket/#{model.model_id}"

extract_job.wait_until_done!
extract_job.done? #=> true

Direct Known Subclasses

Updater

Defined Under Namespace

Classes: Updater

Instance Method Summary

Methods inherited from Job

#cancel, #configuration, #created_at, #delete, #done?, #ended_at, #error, #errors, #failed?, #job_id, #labels, #location, #num_child_jobs, #parent_job_id, #pending?, #project_id, #reload!, #rerun!, #reservation_usage, #running?, #script_statistics, #session_id, #started_at, #state, #statistics, #status, #transaction_id, #user_email, #wait_until_done!

Instance Method Details

#avro? ⇒ Boolean

Checks if the destination format for the table data is Avro. The default is false. Not applicable when extracting models.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 151

def avro?
  return false unless table?
  @gapi.configuration.extract.destination_format == "AVRO"
end

#compression? ⇒ Boolean

Checks if the export operation compresses the data using gzip. The default is false. Not applicable when extracting models.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 110

def compression?
  return false unless table?
  @gapi.configuration.extract.compression == "GZIP"
end

#csv? ⇒ Boolean

Checks if the destination format for the table data is CSV. Tables with nested or repeated fields cannot be exported as CSV. The default is true for tables. Not applicable when extracting models.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 136

def csv?
  return false unless table?
  val = @gapi.configuration.extract.destination_format
  return true if val.nil?
  val == "CSV"
end

#delimiter ⇒ String?

The character or symbol the operation uses to delimit fields in the exported data. The default is a comma (,) for tables. Not applicable when extracting models.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 190

def delimiter
  return unless table?
  val = @gapi.configuration.extract.field_delimiter
  val = "," if val.nil?
  val
end

#destinations ⇒ Object

The URI or URIs representing the Google Cloud Storage files to which the data is exported.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 61

def destinations
  Array @gapi.configuration.extract.destination_uris
end
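
The "URI or URIs" wording follows from the Array(...) call in the method body. As a minimal sketch in plain Ruby, with no BigQuery client involved, this is how Kernel#Array normalizes the three shapes a destination value could take. The URIs are made-up placeholders, and the variable names are illustrative only.

```ruby
# Kernel#Array turns nil into [], wraps a single value, and leaves an array as is.
# The URIs below are placeholders, not real export destinations.
none   = Array(nil)
single = Array("gs://my-bucket/file-name.json")
many   = Array(["gs://my-bucket/a-*.json", "gs://my-bucket/b-*.json"])

p none
p single
p many
```

Because the result is always an array, callers can iterate over it without first checking whether one or several URIs were configured.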

#destinations_counts ⇒ Hash<String, Integer>

A hash containing the URI or URI pattern specified in #destinations mapped to the counts of files per destination.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 229

def destinations_counts
  destinations.zip(destinations_file_counts).to_h
end
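
The pairing done by #destinations_counts can be sketched with plain arrays standing in for a live job. The URI patterns and counts below are invented for illustration and are not output from a real export.

```ruby
# Hypothetical stand-ins for the values a finished job would report.
destinations = ["gs://my-bucket/part-a-*.json", "gs://my-bucket/part-b-*.json"]
file_counts  = [3, 1]

# Pair each URI pattern with its file count, mirroring destinations_counts.
counts_by_destination = destinations.zip(file_counts).to_h

counts_by_destination.each do |uri, count|
  puts "#{uri}: #{count} file(s)"
end
```

In plain Ruby, zip pads with nil when the second array is shorter than the first, so a URI without a matching count would map to nil rather than raise.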

#destinations_file_counts ⇒ Array<Integer>

The number of files per destination URI or URI pattern specified in #destinations.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 218

def destinations_file_counts
  Array @gapi.statistics.extract.destination_uri_file_counts
end

#json? ⇒ Boolean

Checks if the destination format for the table data is newline-delimited JSON. The default is false. Not applicable when extracting models.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 123

def json?
  return false unless table?
  @gapi.configuration.extract.destination_format == "NEWLINE_DELIMITED_JSON"
end

#ml_tf_saved_model? ⇒ Boolean

Checks if the destination format for the model is TensorFlow SavedModel. The default is true for models. Not applicable when extracting tables.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 163

def ml_tf_saved_model?
  return false unless model?
  val = @gapi.configuration.extract.destination_format
  return true if val.nil?
  val == "ML_TF_SAVED_MODEL"
end

#ml_xgboost_booster? ⇒ Boolean

Checks if the destination format for the model is XGBoost. The default is false. Not applicable when extracting tables.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 177

def ml_xgboost_booster?
  return false unless model?
  @gapi.configuration.extract.destination_format == "ML_XGBOOST_BOOSTER"
end
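
Taken together, the format predicates (#avro?, #csv?, #json?, #ml_tf_saved_model?, #ml_xgboost_booster?) imply a default when no destination format is set: CSV for tables and ML_TF_SAVED_MODEL for models. The following is a rough sketch of that resolution. The helper effective_format and its source_kind argument are hypothetical names introduced here and are not part of the library.

```ruby
# Hypothetical helper, not a library method.
# source_kind is :table or :model; destination_format is the raw string or nil.
def effective_format source_kind, destination_format
  # An explicit format always wins.
  return destination_format unless destination_format.nil?
  # Otherwise fall back to the default implied by csv? and ml_tf_saved_model?.
  source_kind == :table ? "CSV" : "ML_TF_SAVED_MODEL"
end

puts effective_format(:table, nil)
puts effective_format(:model, nil)
puts effective_format(:table, "NEWLINE_DELIMITED_JSON")
```

This explains why #csv? and #ml_tf_saved_model? are the only format predicates that can return true when the format is unset.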

#model? ⇒ Boolean

Whether the source of the export job is a model. See #source.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 100

def model?
  !@gapi.configuration.extract.source_model.nil?
end

#print_header? ⇒ Boolean

Checks if the exported data contains a header row. The default is true for tables. Not applicable when extracting models.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 204

def print_header?
  return false unless table?
  val = @gapi.configuration.extract.print_header
  val = true if val.nil?
  val
end
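
To make the effect of the #delimiter and #print_header? defaults concrete, here is a simplified illustration that renders rows the way a delimited export is laid out. The helper render_rows is a hypothetical stand-in written for this example; it ignores the quoting and escaping that a real export performs.

```ruby
# Hypothetical helper for illustration only; no quoting or escaping is applied.
def render_rows columns, rows, delimiter: nil, print_header: nil
  delimiter = "," if delimiter.nil?         # same fallback as #delimiter
  print_header = true if print_header.nil?  # same fallback as #print_header?
  lines = rows.map { |row| row.join(delimiter) }
  lines.unshift columns.join(delimiter) if print_header
  lines.join "\n"
end

# Defaults: comma-delimited with a header row.
puts render_rows(["name", "age"], [["Ada", 36], ["Lin", 41]])

# Explicit settings: pipe-delimited without a header row.
puts render_rows(["name", "age"], [["Ada", 36]], delimiter: "|", print_header: false)
```

Note that an explicit false for print_header is preserved, because only nil triggers the fallback.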

#source(view: nil) ⇒ Table, ...

The table or model which is exported.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 76

def source view: nil
  if (table = @gapi.configuration.extract.source_table)
    retrieve_table table.project_id, table.dataset_id, table.table_id, metadata_view: view
  elsif (model = @gapi.configuration.extract.source_model)
    retrieve_model model.project_id, model.dataset_id, model.model_id
  end
end

#table? ⇒ Boolean

Whether the source of the export job is a table. See #source.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 90

def table?
  !@gapi.configuration.extract.source_table.nil?
end

#use_avro_logical_types? ⇒ Boolean

If #avro? is true (the destination format is "AVRO"), this flag indicates whether applicable column types (such as TIMESTAMP) are extracted to their corresponding Avro logical types (such as timestamp-micros) instead of only their raw types (such as Avro long). Not applicable when extracting models.



# File 'lib/google/cloud/bigquery/extract_job.rb', line 244

def use_avro_logical_types?
  return false unless table?
  @gapi.configuration.extract.use_avro_logical_types
end
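
To show what the logical type adds, here is a small sketch in plain Ruby with no Avro library involved. It assumes the Avro convention that timestamp-micros annotates a long counting microseconds since the Unix epoch. The value of raw_micros is made up for illustration.

```ruby
require "time"

# A raw Avro long: microseconds since the Unix epoch (made-up value).
raw_micros = 1_700_000_000_123_456

# Without the logical type, a reader sees only the integer above.
# With timestamp-micros, a reader knows to interpret it as a point in time.
seconds, micros = raw_micros.divmod 1_000_000
timestamp = Time.at(seconds, micros, :usec).utc

puts timestamp.iso8601(6)
```

The stored bytes are the same in both cases. The logical type is schema metadata that tells readers how to interpret the long.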