Class: ETL::Control::DatabaseSource
- Defined in:
- lib/etl/control/source/database_source.rb
Overview
Source object which extracts data from a database using ActiveRecord.
Instance Attribute Summary
-
#table ⇒ Object
Returns the value of attribute table.
-
#target ⇒ Object
Returns the value of attribute target.
Attributes inherited from Source
#configuration, #control, #definition, #local_base, #store_locally
Instance Method Summary
-
#columns ⇒ Object
Get the list of columns to read.
-
#count(use_cache = true) ⇒ Object
Get the number of rows in the source.
-
#each(&block) ⇒ Object
Returns each row from the source.
-
#group ⇒ Object
Get the group by part of the query, defaults to nil.
-
#initialize(control, configuration, definition) ⇒ DatabaseSource
constructor
Initialize the source.
-
#join ⇒ Object
Get the join part of the query, defaults to nil.
-
#local_directory ⇒ Object
Get the local directory to use, which is a combination of the local_base, the database hostname, the database name, and the table name.
-
#new_records_only ⇒ Object
Return the column used in the WHERE clause to identify new rows.
-
#order ⇒ Object
Get the order for the query, defaults to nil.
-
#select ⇒ Object
Get the select part of the query, defaults to '*'.
-
#to_s ⇒ Object
Get a String identifier for the source.
Methods inherited from Source
class_for_name, #errors, #last_local_file, #last_local_file_trigger, #local_file, #local_file_trigger, #read_locally, #timestamp
Constructor Details
#initialize(control, configuration, definition) ⇒ DatabaseSource
Initialize the source.
Arguments:
- control: The ETL::Control::Control instance
- configuration: The configuration Hash
- definition: The source definition
Required configuration options:
- :target: The target connection
- :table: The source table name
- :database: The database name
Other options:
- :join: Optional join part for the query (ignored unless specified)
- :select: Optional select part for the query (defaults to '*')
- :group: Optional group by part for the query (ignored unless specified)
- :order: Optional order part for the query (ignored unless specified)
- :new_records_only: Specify the column to use when comparing timestamps against the last successful ETL job execution for the current control file.
- :store_locally: Set to false to not store a copy of the source data locally in a flat file (defaults to true)
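As a rough illustration of how these options fit together, the sketch below builds a configuration Hash of the shape described above and applies the documented defaults. The connection name, database, table, and column values are hypothetical, and the code is plain Ruby rather than a call into the library.

```ruby
# Hypothetical configuration for a database source. Every value here is
# an illustrative assumption, not taken from a real control file.
configuration = {
  target: :operational_database,        # required: the target connection
  table: "people",                      # required: the source table name
  database: "crm_production",           # required: the database name
  select: "id, first_name, last_name",  # optional
  order: "id",                          # optional
  new_records_only: :updated_at,        # optional
  store_locally: true                   # optional
}

# Mirror the documented defaults for options that may be omitted.
select_part = configuration[:select] || "*"
join_part = configuration[:join] # nil unless specified

puts select_part
puts join_part.inspect
```

Options that are left out simply read back as nil, which is why the join, group, and order parts are described as "ignored unless specified".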
    # File 'lib/etl/control/source/database_source.rb', line 40

    def initialize(control, configuration, definition)
      super
      @target = configuration[:target]
      @table = configuration[:table]
    end
Instance Attribute Details
#table ⇒ Object
Returns the value of attribute table.
    # File 'lib/etl/control/source/database_source.rb', line 12

    def table
      @table
    end
#target ⇒ Object
Returns the value of attribute target.
    # File 'lib/etl/control/source/database_source.rb', line 11

    def target
      @target
    end
Instance Method Details
#columns ⇒ Object
Get the list of columns to read. The original comment says this is defined in the source definition as either an Array or Hash; the implementation shown here takes the keys of the first queried row and falls back to [''] when the query returns no rows.

    # File 'lib/etl/control/source/database_source.rb', line 95

    def columns
      # weird default is required for writing to cache correctly
      @columns ||= query_rows.any? ? query_rows.first.keys : ['']
    end
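The column lookup can be tried in isolation. The sketch below is a minimal stand-in, assuming query_rows returns an Array of Hashes (one Hash per row); the helper name columns_for and the sample rows are hypothetical.

```ruby
# Stand-in for the lookup in #columns. `rows` plays the role of
# query_rows and is assumed to be an Array of Hashes.
def columns_for(rows)
  rows.any? ? rows.first.keys : ['']
end

rows = [
  { "id" => 1, "name" => "Ada" },
  { "id" => 2, "name" => "Grace" }
]

with_rows = columns_for(rows)  # keys of the first row, in insertion order
no_rows = columns_for([])      # single empty-string column as the fallback

p with_rows
p no_rows
```

Only the first row is inspected, so the result reflects whatever keys that row carries.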
#count(use_cache = true) ⇒ Object
Get the number of rows in the source
    # File 'lib/etl/control/source/database_source.rb', line 84

    def count(use_cache=true)
      return @count if @count && use_cache
      if store_locally || read_locally
        @count = count_locally
      else
        @count = connection.select_value(query.gsub(/SELECT .* FROM/, 'SELECT count(1) FROM'))
      end
    end
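When no local copy is involved, the count is obtained by rewriting the SELECT list of the query with a regular expression. The sketch below applies the same substitution to two hypothetical SQL strings; the helper name count_query and both queries are assumptions made for illustration.

```ruby
# Same pattern and replacement as used in #count.
COUNT_REWRITE = /SELECT .* FROM/

def count_query(sql)
  sql.gsub(COUNT_REWRITE, 'SELECT count(1) FROM')
end

# A flat query is rewritten as intended.
simple = count_query("SELECT id, name FROM people")

# `.*` is greedy, so with a subquery in the WHERE clause the match runs
# through to the last " FROM" in the string.
nested = count_query(
  "SELECT * FROM people WHERE id IN (SELECT person_id FROM orders)"
)

puts simple
puts nested
```

The second result is no longer valid SQL for the original table, which suggests the rewrite is best suited to queries with a single FROM keyword on one line.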
#each(&block) ⇒ Object
Returns each row from the source. If read_locally is specified then this method will attempt to read from the last stored local file. If no locally stored file exists or if the trigger file for the last locally stored file does not exist then this method will raise an error.
    # File 'lib/etl/control/source/database_source.rb', line 105

    def each(&block)
      if read_locally # Read from the last stored source
        ETL::Engine.logger.debug "Reading from local cache"
        read_rows(last_local_file, &block)
      else # Read from the original source
        if store_locally
          file = local_file
          write_local(file)
          read_rows(file, &block)
        else
          query_rows.each do |row|
            row = ETL::Row.new(row.symbolize_keys)
            row.source = self
            yield row
          end
        end
      end
    end
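The branch order in #each can be summarized with a small stand-alone sketch. The helper names row_origin and each_streamed are hypothetical, and Hash#transform_keys (plain Ruby 2.5+) stands in for ActiveSupport's symbolize_keys so the example runs without the library.

```ruby
# Mirrors the order in which #each checks its two flags.
def row_origin(read_locally:, store_locally:)
  if read_locally
    :last_local_file    # replay the most recently stored local copy
  elsif store_locally
    :fresh_local_file   # write a new local copy, then read it back
  else
    :database           # stream rows straight from the query
  end
end

# Streaming branch: symbolize each row's keys before yielding it.
def each_streamed(rows)
  rows.each { |row| yield row.transform_keys(&:to_sym) }
end

seen = []
each_streamed([{ "id" => 1, "name" => "Ada" }]) { |row| seen << row }

p row_origin(read_locally: true, store_locally: true)
p row_origin(read_locally: false, store_locally: false)
p seen
```

Note that read_locally takes precedence: when both flags are set, rows come from the last stored file and no new copy is written.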
#group ⇒ Object
Get the group by part of the query, defaults to nil
    # File 'lib/etl/control/source/database_source.rb', line 68

    def group
      configuration[:group]
    end
#join ⇒ Object
Get the join part of the query, defaults to nil
    # File 'lib/etl/control/source/database_source.rb', line 58

    def join
      configuration[:join]
    end
#local_directory ⇒ Object
Get the local directory to use, which is a combination of the local_base, the database hostname, the database name, and the table name.

    # File 'lib/etl/control/source/database_source.rb', line 53

    def local_directory
      File.join(local_base, host, database, configuration[:table])
    end
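To make the resulting layout concrete, the sketch below joins four hypothetical values the same way #local_directory does. The base directory, host, database, and table names are all assumptions.

```ruby
# Hypothetical inputs standing in for local_base, host, database and
# configuration[:table].
local_base = "source_data"
host = "localhost"
database = "crm_production"
table = "people"

dir = File.join(local_base, host, database, table)
puts dir
```

Each source table therefore gets its own directory, nested under its host and database.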
#new_records_only ⇒ Object
Return the column used in the WHERE clause to identify new rows.

    # File 'lib/etl/control/source/database_source.rb', line 79

    def new_records_only
      configuration[:new_records_only]
    end
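The query-building code that consumes this column is not shown on this page, so the following is only a sketch of the idea: compare the configured column against the time of the last successful run. The helper name, the timestamp format, and the hand-written quoting are assumptions; real code would normally quote values through the database connection.

```ruby
# Hypothetical helper: builds a WHERE condition selecting rows newer
# than the last completed run. Returns nil when either input is missing.
def new_records_condition(column, last_completed)
  return nil if column.nil? || last_completed.nil?
  stamp = last_completed.strftime("%Y-%m-%d %H:%M:%S")
  "#{column} > '#{stamp}'"
end

last_run = Time.utc(2024, 1, 15, 8, 30, 0)  # illustrative timestamp
cond = new_records_condition(:updated_at, last_run)
first_run = new_records_condition(:updated_at, nil)

puts cond
p first_run
```

Returning nil when there is no previous run means the first execution reads every row.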
#order ⇒ Object
Get the order for the query, defaults to nil
    # File 'lib/etl/control/source/database_source.rb', line 73

    def order
      configuration[:order]
    end
#select ⇒ Object
Get the select part of the query, defaults to '*'

    # File 'lib/etl/control/source/database_source.rb', line 63

    def select
      configuration[:select] || '*'
    end
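The select, join, group, and order parts are presumably combined into a single SQL statement by a query method that this page does not list. The sketch below shows one plausible way to assemble them; the helper name build_query, the clause order, and the GROUP BY / ORDER BY prefixes are assumptions, not the library's confirmed behavior.

```ruby
# Hypothetical assembly of the query parts described above.
def build_query(table, configuration)
  parts = ["SELECT #{configuration[:select] || '*'} FROM #{table}"]
  parts << configuration[:join] if configuration[:join]
  parts << "GROUP BY #{configuration[:group]}" if configuration[:group]
  parts << "ORDER BY #{configuration[:order]}" if configuration[:order]
  parts.join(" ")
end

default_query = build_query("people", {})
grouped_query = build_query(
  "people",
  { select: "city, count(*)", group: "city", order: "city" }
)

puts default_query
puts grouped_query
```

With an empty configuration only the '*' default applies, and every optional part is skipped.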
#to_s ⇒ Object
Get a String identifier for the source
    # File 'lib/etl/control/source/database_source.rb', line 47

    def to_s
      "#{host}/#{database}/#{table}"
    end