Class: Google::Cloud::Bigquery::External::BigtableSource

Inherits:
DataSource
  • Object
Defined in:
lib/google/cloud/bigquery/external/bigtable_source.rb,
lib/google/cloud/bigquery/external/bigtable_source/column.rb,
lib/google/cloud/bigquery/external/bigtable_source/column_family.rb

Overview

BigtableSource

BigtableSource is a subclass of DataSource and represents a Bigtable external data source that can be queried directly, even though the data is not stored in BigQuery. Instead of loading or streaming the data, this object references the external data source.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

bigtable_url = "https://googleapis.com/bigtable/projects/..."
bigtable_table = bigquery.external bigtable_url do |bt|
  bt.rowkey_as_string = true
  bt.add_family "user" do |u|
    u.add_string "name"
    u.add_string "email"
    u.add_integer "age"
    u.add_boolean "active"
  end
end

data = bigquery.query "SELECT * FROM my_ext_table",
                      external: { my_ext_table: bigtable_table }

# Iterate over the first page of results
data.each do |row|
  puts row[:name]
end
# Retrieve the next page of results
data = data.next if data.next?

Defined Under Namespace

Classes: Column, ColumnFamily

Instance Method Summary

Methods inherited from DataSource

#autodetect, #autodetect=, #avro?, #backup?, #bigtable?, #compression, #compression=, #csv?, #format, #hive_partitioning?, #hive_partitioning_mode, #hive_partitioning_mode=, #hive_partitioning_require_partition_filter=, #hive_partitioning_require_partition_filter?, #hive_partitioning_source_uri_prefix, #hive_partitioning_source_uri_prefix=, #ignore_unknown, #ignore_unknown=, #json?, #max_bad_records, #max_bad_records=, #orc?, #parquet?, #sheets?, #urls

Instance Method Details

#add_family(family_id, encoding: nil, latest: nil, type: nil) {|family| ... } ⇒ BigtableSource::ColumnFamily

Add a column family to expose in the table schema along with its types. Columns belonging to the column family may also be exposed.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

bigtable_url = "https://googleapis.com/bigtable/projects/..."
bigtable_table = bigquery.external bigtable_url do |bt|
  bt.rowkey_as_string = true
  bt.add_family "user" do |u|
    u.add_string "name"
    u.add_string "email"
    u.add_integer "age"
    u.add_boolean "active"
  end
end

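Parameters:

  • family_id (String)

    Identifier of the column family.

  • encoding (String) (defaults to: nil)

    The encoding of the values when the type is not STRING.

  • latest (Boolean) (defaults to: nil)

    Whether only the latest version of each value is exposed for all columns in this column family.

  • type (String) (defaults to: nil)

    The type to convert the values in cells of this column family to.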

Yields:

  • (family)

    a block for setting the family

Yield Parameters:

  • family (BigtableSource::ColumnFamily)

    the family object

Returns:

  • (BigtableSource::ColumnFamily)

# File 'lib/google/cloud/bigquery/external/bigtable_source.rb', line 136

def add_family family_id, encoding: nil, latest: nil, type: nil
  frozen_check!
  fam = BigtableSource::ColumnFamily.new
  fam.family_id = family_id
  fam.encoding = encoding if encoding
  fam.latest = latest if latest
  fam.type = type if type
  yield fam if block_given?
  @families << fam
  fam
end
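
The control flow of the method above can be followed without a BigQuery project using the minimal sketch below. FamilySketch and SourceSketch are hypothetical stand-ins, not library classes, and frozen_check! is omitted. The sketch mirrors two details visible in the source: optional keyword arguments are applied only when truthy, so passing latest: false leaves the attribute unset, and the method returns the family it just appended.

```ruby
# Hypothetical stand-ins for illustration only; not the library's classes.
FamilySketch = Struct.new(:family_id, :encoding, :latest, :type)

class SourceSketch
  attr_reader :families

  def initialize
    @families = []
  end

  # Same shape as add_family in the listing above, minus frozen_check!.
  def add_family family_id, encoding: nil, latest: nil, type: nil
    fam = FamilySketch.new
    fam.family_id = family_id
    fam.encoding = encoding if encoding
    fam.latest = latest if latest # a false value is skipped here
    fam.type = type if type
    yield fam if block_given?
    @families << fam
    fam
  end
end

source = SourceSketch.new
user = source.add_family "user", type: "STRING"
stats = source.add_family("stats", latest: false) { |f| f.encoding = "BINARY" }

puts source.families.map(&:family_id).inspect #=> ["user", "stats"]
puts user.type.inspect                        #=> "STRING"
puts stats.latest.inspect                     #=> nil
puts stats.encoding.inspect                   #=> "BINARY"
```

If the real classes behave like this sketch, setting latest to false would have to be done inside the block or on the returned family rather than through the keyword argument.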

#families ⇒ Array<BigtableSource::ColumnFamily>

List of column families to expose in the table schema along with their types. This list restricts the column families that can be referenced in queries and specifies their value types. You can use this list to do type conversions - see Google::Cloud::Bigquery::External::BigtableSource::ColumnFamily#type for more details. If you leave this list empty, all column families are present in the table schema and their values are read as BYTES. During a query only the column families referenced in that query are read from Bigtable.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

bigtable_url = "https://googleapis.com/bigtable/projects/..."
bigtable_table = bigquery.external bigtable_url do |bt|
  bt.rowkey_as_string = true
  bt.add_family "user" do |u|
    u.add_string "name"
    u.add_string "email"
    u.add_integer "age"
    u.add_boolean "active"
  end
end

bigtable_table.families.count #=> 1

Returns:

  • (Array<BigtableSource::ColumnFamily>)

# File 'lib/google/cloud/bigquery/external/bigtable_source.rb', line 97

def families
  @families
end

#rowkey_as_string ⇒ Boolean

Whether the rowkey column families will be read and converted to string. Otherwise they are read with BYTES type values and users need to manually cast them with CAST if necessary. The default value is false.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

bigtable_url = "https://googleapis.com/bigtable/projects/..."
bigtable_table = bigquery.external bigtable_url do |bt|
  bt.rowkey_as_string = true
end

bigtable_table.rowkey_as_string #=> true
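
When rowkey_as_string is left at its default, the description above says the row key is read as BYTES and has to be cast in SQL. The sketch below shows how the query text might differ between the two settings. rowkey_select is a hypothetical helper, not part of the library, and it assumes the row key is exposed as a column named rowkey and that the table is referenced as my_ext_table, as in the class-level example.

```ruby
# Hypothetical helper for illustration; not part of google-cloud-bigquery.
# Assumes the row key is exposed as a column named `rowkey`.
def rowkey_select(table_name, rowkey_as_string:)
  # With rowkey_as_string enabled the column is already a STRING;
  # otherwise it is BYTES and needs an explicit CAST.
  key_expr = rowkey_as_string ? "rowkey" : "CAST(rowkey AS STRING) AS rowkey"
  "SELECT #{key_expr} FROM #{table_name}"
end

puts rowkey_select("my_ext_table", rowkey_as_string: true)
puts rowkey_select("my_ext_table", rowkey_as_string: false)
```

Either string could then be passed to bigquery.query together with the external: option shown in the class-level example.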

Returns:

  • (Boolean)


# File 'lib/google/cloud/bigquery/external/bigtable_source.rb', line 168

def rowkey_as_string
  @gapi.bigtable_options.read_rowkey_as_string
end

#rowkey_as_string=(row_rowkey) ⇒ Object

Set whether the rowkey column families will be read and converted to string. If false, they are read as BYTES values.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

bigtable_url = "https://googleapis.com/bigtable/projects/..."
bigtable_table = bigquery.external bigtable_url do |bt|
  bt.rowkey_as_string = true
end

bigtable_table.rowkey_as_string #=> true

Parameters:

  • row_rowkey (Boolean)

    New rowkey_as_string value



# File 'lib/google/cloud/bigquery/external/bigtable_source.rb', line 190

def rowkey_as_string= row_rowkey
  frozen_check!
  @gapi.bigtable_options.read_rowkey_as_string = row_rowkey
end