Class: Google::Cloud::Bigquery::External::JsonSource

Inherits:
DataSource
  • Object
Defined in:
lib/google/cloud/bigquery/external/json_source.rb

Overview

JsonSource

JsonSource is a subclass of DataSource and represents a JSON external data source, such as a file in Google Cloud Storage or Google Drive, that can be queried directly even though the data is not stored in BigQuery. Instead of loading or streaming the data, this object references the external data source.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

json_url = "gs://bucket/path/to/data.json"
json_table = bigquery.external json_url do |json|
  json.schema do |schema|
    schema.string "name", mode: :required
    schema.string "email", mode: :required
    schema.integer "age", mode: :required
    schema.boolean "active", mode: :required
  end
end

data = bigquery.query "SELECT * FROM my_ext_table",
                      external: { my_ext_table: json_table }

# Iterate over the first page of results
data.each do |row|
  puts row[:name]
end
# Retrieve the next page of results
data = data.next if data.next?

Instance Method Summary

Methods inherited from DataSource

#autodetect, #autodetect=, #avro?, #backup?, #bigtable?, #compression, #compression=, #csv?, #format, #hive_partitioning?, #hive_partitioning_mode, #hive_partitioning_mode=, #hive_partitioning_require_partition_filter=, #hive_partitioning_require_partition_filter?, #hive_partitioning_source_uri_prefix, #hive_partitioning_source_uri_prefix=, #ignore_unknown, #ignore_unknown=, #json?, #max_bad_records, #max_bad_records=, #orc?, #parquet?, #sheets?, #urls

Instance Method Details

#fields ⇒ Array<Schema::Field>

The fields of the schema.

Returns:

  • (Array<Schema::Field>)

    An array of field objects.

# File 'lib/google/cloud/bigquery/external/json_source.rb', line 128

def fields
  schema.fields
end

#headers ⇒ Array<Symbol>

The names of the columns in the schema.

Returns:

  • (Array<Symbol>)

    An array of column names.



# File 'lib/google/cloud/bigquery/external/json_source.rb', line 137

def headers
  schema.headers
end

#param_types ⇒ Hash

The types of the fields in the schema, in the same format as the optional query parameter types.

Returns:

  • (Hash)

    A hash with field names as keys, and types as values.



# File 'lib/google/cloud/bigquery/external/json_source.rb', line 147

def param_types
  schema.param_types
end
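
The readers #fields, #headers, and #param_types all delegate to the schema, as the source listings show. The sketch below mimics that delegation with stand-in classes so it can run without the gem or credentials. `ReaderField`, `ReaderSchema`, and `ReaderSource` are hypothetical names, and the uppercase type symbols are an assumption about how types are reported, not something this page states.

```ruby
# Hypothetical stand-ins; these are NOT the google-cloud-bigquery classes.
ReaderField = Struct.new(:name, :type)

class ReaderSchema
  attr_reader :fields

  def initialize fields
    @fields = fields
  end

  # Column names as symbols, in schema order.
  def headers
    fields.map { |field| field.name.to_sym }
  end

  # Field name => type symbol (assumed shape, see the note above).
  def param_types
    fields.map { |field| [field.name.to_sym, field.type.to_sym] }.to_h
  end
end

class ReaderSource
  attr_reader :schema

  def initialize schema
    @schema = schema
  end

  # Each reader forwards to the schema, as in the listings above.
  def fields
    schema.fields
  end

  def headers
    schema.headers
  end

  def param_types
    schema.param_types
  end
end

schema = ReaderSchema.new [
  ReaderField.new("name", "STRING"),
  ReaderField.new("age", "INTEGER")
]
source = ReaderSource.new schema

source.fields.map(&:name) # => ["name", "age"]
source.headers            # => [:name, :age]
source.param_types        # => {:name=>:STRING, :age=>:INTEGER}
```

Because all three readers are views over one schema object, any change made through #schema is visible through each of them.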

#schema(replace: false) {|schema| ... } ⇒ Google::Cloud::Bigquery::Schema

The schema for the data.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

json_url = "gs://bucket/path/to/data.json"
json_table = bigquery.external json_url do |json|
  json.schema do |schema|
    schema.string "name", mode: :required
    schema.string "email", mode: :required
    schema.integer "age", mode: :required
    schema.boolean "active", mode: :required
  end
end

Parameters:

  • replace (Boolean) (defaults to: false)

    Whether to replace the existing schema with the new schema. If true, the fields defined in the block replace the existing schema. If false, the fields are added to the existing schema. The default value is false.

Yields:

  • (schema)

    a block for setting the schema

Yield Parameters:

  • schema (Schema)

    the object accepting the schema

Returns:

  • (Google::Cloud::Bigquery::Schema)


# File 'lib/google/cloud/bigquery/external/json_source.rb', line 86

def schema replace: false
  @schema ||= Schema.from_gapi @gapi.schema
  if replace
    frozen_check!
    @schema = Schema.from_gapi
  end
  @schema.freeze if frozen?
  yield @schema if block_given?
  @schema
end
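
The difference between appending and replacing can be seen in a minimal sketch that copies the memoize-then-reset shape of the method above. `ToyField`, `ToySchema`, and `ToySource` are hypothetical stand-ins, not library classes, and the frozen handling is left out to keep the example short.

```ruby
# Hypothetical stand-ins; these are NOT the google-cloud-bigquery classes.
ToyField = Struct.new(:name, :type, :mode)

class ToySchema
  attr_reader :fields

  def initialize
    @fields = []
  end

  def string name, mode: :nullable
    @fields << ToyField.new(name, "STRING", mode)
  end

  def integer name, mode: :nullable
    @fields << ToyField.new(name, "INTEGER", mode)
  end

  def headers
    @fields.map { |field| field.name.to_sym }
  end
end

class ToySource
  # Same control flow as the listing above, minus the frozen checks.
  def schema replace: false
    @schema ||= ToySchema.new          # reuse the schema across calls
    @schema = ToySchema.new if replace # start from an empty schema
    yield @schema if block_given?
    @schema
  end
end

source = ToySource.new
source.schema { |s| s.string "name", mode: :required }
source.schema { |s| s.integer "age" } # replace: false, so "age" is added
appended = source.schema.headers      # => [:name, :age]

source.schema(replace: true) { |s| s.string "email" }
replaced = source.schema.headers      # => [:email]
```

With the default `replace: false`, repeated calls accumulate fields on the same schema object; passing `replace: true` discards the earlier fields before the block runs.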

#schema=(new_schema) ⇒ Object

Set the schema for the data.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

json_schema = bigquery.schema do |schema|
  schema.string "name", mode: :required
  schema.string "email", mode: :required
  schema.integer "age", mode: :required
  schema.boolean "active", mode: :required
end

json_url = "gs://bucket/path/to/data.json"
json_table = bigquery.external json_url
json_table.schema = json_schema

Parameters:

  • new_schema (Schema)

    The schema object.



# File 'lib/google/cloud/bigquery/external/json_source.rb', line 118

def schema= new_schema
  frozen_check!
  @schema = new_schema
end
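
The setter calls `frozen_check!` before assigning, which suggests that a frozen data source rejects a new schema. The sketch below illustrates that guard with a hypothetical `FreezableSource` class. The body of `frozen_check!` is not shown on this page, so the error class and message here are assumptions.

```ruby
# Hypothetical stand-in; this is NOT the google-cloud-bigquery class.
class FreezableSource
  attr_reader :schema

  def schema= new_schema
    frozen_check!
    @schema = new_schema
  end

  private

  # Assumed behavior: refuse modification once the object is frozen.
  def frozen_check!
    return unless frozen?
    raise ArgumentError, "Cannot modify external data source when frozen"
  end
end

source = FreezableSource.new
source.schema = [:name, :email] # allowed while the source is mutable
source.freeze

outcome =
  begin
    source.schema = [:age]
    :assigned
  rescue ArgumentError
    :rejected
  end

outcome       # => :rejected
source.schema # => [:name, :email]
```

Under this assumption, any schema changes need to happen before the data source is frozen.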