Class: Google::Cloud::Bigquery::External::SheetsSource

Inherits:
DataSource
  • Object
show all
Defined in:
lib/google/cloud/bigquery/external/sheets_source.rb

Overview

SheetsSource

SheetsSource is a subclass of DataSource and represents a Google Sheets external data source that can be queried from directly, even though the data is not stored in BigQuery. Instead of loading or streaming the data, this object references the external data source.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sheets_url = "https://docs.google.com/spreadsheets/d/1234567980"
sheets_table = bigquery.external sheets_url do |sheets|
  sheets.skip_leading_rows = 1
end

data = bigquery.query "SELECT * FROM my_ext_table",
                      external: { my_ext_table: sheets_table }

# Iterate over the first page of results
data.each do |row|
  puts row[:name]
end
# Retrieve the next page of results
data = data.next if data.next?

Instance Method Summary collapse

Methods inherited from DataSource

#autodetect, #autodetect=, #avro?, #backup?, #bigtable?, #compression, #compression=, #csv?, #format, #hive_partitioning?, #hive_partitioning_mode, #hive_partitioning_mode=, #hive_partitioning_require_partition_filter=, #hive_partitioning_require_partition_filter?, #hive_partitioning_source_uri_prefix, #hive_partitioning_source_uri_prefix=, #ignore_unknown, #ignore_unknown=, #json?, #max_bad_records, #max_bad_records=, #orc?, #parquet?, #sheets?, #urls

Instance Method Details

#rangeString

Range of a sheet to query from. Only used when non-empty. Typical format: {sheet_name}!{top_left_cell_id}:{bottom_right_cell_id}.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sheets_url = "https://docs.google.com/spreadsheets/d/1234567980"
sheets_table = bigquery.external sheets_url do |sheets|
  sheets.range = "sheet1!A1:B20"
end

sheets_table.range #=> "sheet1!A1:B20"

Returns:

  • (String)

    Range of a sheet to query from.



135
136
137
# File 'lib/google/cloud/bigquery/external/sheets_source.rb', line 135

def range
  @gapi.google_sheets_options.range
end

#range=(new_range) ⇒ Object

Set the range of a sheet to query from. Only used when non-empty. Typical format: {sheet_name}!{top_left_cell_id}:{bottom_right_cell_id}.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sheets_url = "https://docs.google.com/spreadsheets/d/1234567980"
sheets_table = bigquery.external sheets_url do |sheets|
  sheets.range = "sheet1!A1:B20"
end

sheets_table.range #=> "sheet1!A1:B20"

Parameters:

  • new_range (String)

    New range of a sheet to query from.



158
159
160
161
# File 'lib/google/cloud/bigquery/external/sheets_source.rb', line 158

def range= new_range
  frozen_check!
  @gapi.google_sheets_options.range = new_range
end

#skip_leading_rowsInteger

The number of rows at the top of a sheet that BigQuery will skip when reading the data. The default value is 0.

This property is useful if you have header rows that should be skipped. When autodetect is on, behavior is the following:

  • nil - Autodetect tries to detect headers in the first row. If they are not detected, the row is read as data. Otherwise data is read starting from the second row.
  • 0 - Instructs autodetect that there are no headers and data should be read starting from the first row.
  • N > 0 - Autodetect skips N-1 rows and tries to detect headers in row N. If headers are not detected, row N is just skipped. Otherwise row N is used to extract column names for the detected schema.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sheets_url = "https://docs.google.com/spreadsheets/d/1234567980"
sheets_table = bigquery.external sheets_url do |sheets|
  sheets.skip_leading_rows = 1
end

sheets_table.skip_leading_rows #=> 1

Returns:

  • (Integer)


90
91
92
# File 'lib/google/cloud/bigquery/external/sheets_source.rb', line 90

def skip_leading_rows
  @gapi.google_sheets_options.skip_leading_rows
end

#skip_leading_rows=(row_count) ⇒ Object

Set the number of rows at the top of a sheet that BigQuery will skip when reading the data.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sheets_url = "https://docs.google.com/spreadsheets/d/1234567980"
sheets_table = bigquery.external sheets_url do |sheets|
  sheets.skip_leading_rows = 1
end

sheets_table.skip_leading_rows #=> 1

Parameters:

  • row_count (Integer)

    New skip_leading_rows value



112
113
114
115
# File 'lib/google/cloud/bigquery/external/sheets_source.rb', line 112

def skip_leading_rows= row_count
  frozen_check!
  @gapi.google_sheets_options.skip_leading_rows = row_count
end