Class: ETL::Control::DatabaseSource
Defined in: lib/etl/control/source/database_source.rb
Overview
Source object which extracts data from a database using ActiveRecord.
Instance Attribute Summary
-
#table ⇒ Object
Returns the value of attribute table.
-
#target ⇒ Object
Returns the value of attribute target.
Attributes inherited from Source
#configuration, #control, #definition, #local_base, #store_locally
Instance Method Summary
-
#columns ⇒ Object
Get the list of columns to read.
-
#count(use_cache = true) ⇒ Object
Get the number of rows in the source.
-
#each(&block) ⇒ Object
Returns each row from the source.
-
#group ⇒ Object
Get the group by part of the query, defaults to nil.
-
#initialize(control, configuration, definition) ⇒ DatabaseSource
constructor
Initialize the source.
-
#join ⇒ Object
Get the join part of the query, defaults to nil.
-
#local_directory ⇒ Object
Get the local directory to use, which is a combination of the local_base, the database hostname, the database name, and the table name.
-
#new_records_only ⇒ Object
Return the column used in the where clause to identify new rows.
-
#order ⇒ Object
Get the order for the query, defaults to nil.
-
#select ⇒ Object
Get the select part of the query, defaults to '*'.
-
#to_s ⇒ Object
Get a String identifier for the source.
Methods inherited from Source
class_for_name, #errors, #last_local_file, #last_local_file_trigger, #local_file, #local_file_trigger, #read_locally, #timestamp
Constructor Details
#initialize(control, configuration, definition) ⇒ DatabaseSource
Initialize the source.
Arguments:
- control: The ETL::Control::Control instance
- configuration: The configuration Hash
- definition: The source definition
Required configuration options:
- :target: The target connection
- :table: The source table name
- :database: The database name
Other options:
- :join: Optional join part for the query (ignored unless specified)
- :select: Optional select part for the query (defaults to '*')
- :group: Optional group by part for the query (ignored unless specified)
- :order: Optional order part for the query (ignored unless specified)
- :new_records_only: The column to use when comparing timestamps against the last successful ETL job execution for the current control file
- :store_locally: Set to false to skip storing a copy of the source data locally in a flat file (defaults to true)
# File 'lib/etl/control/source/database_source.rb', line 42
def initialize(control, configuration, definition)
  super
  @target = configuration[:target]
  @table = configuration[:table]
  @query = configuration[:query]
end
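The configuration options listed above can be pictured as a plain Ruby Hash. The sketch below is standalone and does not load the ETL library. The option values (`:operational_database`, `"people"`, `"updated_at"`) and the helper `resolve_options` are hypothetical; the defaults simply mirror the accessor methods documented on this page (`select` falls back to '*', the other query parts stay nil).

```ruby
# Hypothetical configuration Hash for a database source.
# The keys come from the option list above; the values are made up.
configuration = {
  :target           => :operational_database, # assumed connection name
  :table            => "people",
  :new_records_only => "updated_at",
  :store_locally    => false
}

# Hypothetical helper mirroring the documented accessor defaults:
# :select falls back to '*'; :join, :group and :order stay nil when omitted.
def resolve_options(configuration)
  {
    :select => configuration[:select] || '*',
    :join   => configuration[:join],
    :group  => configuration[:group],
    :order  => configuration[:order]
  }
end

resolved = resolve_options(configuration)
puts resolved[:select]        # prints *
puts resolved[:order].inspect # prints nil
```

Only the query parts that are actually supplied end up non-nil, which matches the "ignored unless specified" wording in the option list.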
Instance Attribute Details
#table ⇒ Object
Returns the value of attribute table.
# File 'lib/etl/control/source/database_source.rb', line 14
def table
  @table
end
#target ⇒ Object
Returns the value of attribute target.
# File 'lib/etl/control/source/database_source.rb', line 13
def target
  @target
end
Instance Method Details
#columns ⇒ Object
Get the list of columns to read. This is defined in the source definition as either an Array or Hash. In the implementation shown here, the list is taken from the keys of the first queried row and falls back to [''] when the query returns no rows.
# File 'lib/etl/control/source/database_source.rb', line 98
def columns
  # weird default is required for writing to cache correctly
  @columns ||= query_rows.any? ? query_rows.first.keys : ['']
end
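The column lookup in the listing above can be tried out on plain Hashes. This is a minimal sketch, not library code: `columns_for` is a hypothetical helper that repeats the same expression, and the sample rows are invented.

```ruby
# Standalone sketch of the column-derivation logic; `columns_for` is a
# hypothetical helper, not part of the library.
def columns_for(rows)
  # Use the keys of the first row, or a single empty name when there are no rows.
  rows.any? ? rows.first.keys : ['']
end

rows = [{ "id" => 1, "name" => "Ada" }, { "id" => 2, "name" => "Grace" }]
p columns_for(rows) # prints ["id", "name"]
p columns_for([])   # prints [""]
```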
#count(use_cache = true) ⇒ Object
Get the number of rows in the source.
# File 'lib/etl/control/source/database_source.rb', line 87
def count(use_cache=true)
  return @count if @count && use_cache
  if @store_locally || read_locally
    @count = count_locally
  else
    @count = connection.select_value(query.gsub(/SELECT .* FROM/, 'SELECT count(1) FROM'))
  end
end
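When the data is not stored or read locally, the listing above derives a count query by rewriting the text of the original query. The sketch below applies the same substitution to plain strings so its effect is visible; `count_query` and the sample SQL are hypothetical and no database is involved.

```ruby
# The same pattern and replacement that #count applies to the query text.
COUNT_PATTERN = /SELECT .* FROM/

# Hypothetical helper: rewrite a SELECT statement into a count statement.
def count_query(query)
  query.gsub(COUNT_PATTERN, 'SELECT count(1) FROM')
end

simple = count_query("SELECT id, name FROM people WHERE active = 1")
puts simple # prints SELECT count(1) FROM people WHERE active = 1

# `.*` is greedy, so with a nested SELECT the match runs to the last FROM.
nested = count_query("SELECT id FROM (SELECT id FROM people) p")
puts nested # prints SELECT count(1) FROM people) p
```

The second call suggests that queries containing a subquery may not survive the rewrite intact. The pattern is also case-sensitive, so a lowercase `select ... from` would be left unchanged.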
#each(&block) ⇒ Object
Returns each row from the source. If read_locally is specified then this method will attempt to read from the last stored local file. If no locally stored file exists or if the trigger file for the last locally stored file does not exist then this method will raise an error.
# File 'lib/etl/control/source/database_source.rb', line 108
def each(&block)
  if read_locally # Read from the last stored source
    ETL::Engine.logger.debug "Reading from local cache"
    read_rows(last_local_file, &block)
  else # Read from the original source
    if @store_locally
      file = local_file
      write_local(file)
      read_rows(file, &block)
    else
      query_rows.each do |r|
        row = ETL::Row.new()
        r.symbolize_keys.each_pair { |key, value| row[key] = value }
        row.source = self
        yield row
      end
    end
  end
end
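The direct-query branch of the listing above (neither reading from nor writing to a local file) converts each record into a row with symbol keys and tags it with its source. A rough standalone sketch of that step follows. `SketchRow` is a hypothetical stand-in for ETL::Row, `each_row` is a hypothetical helper, and plain `to_sym` replaces ActiveSupport's `symbolize_keys` so that nothing outside core Ruby is needed.

```ruby
# Minimal stand-in for ETL::Row: a Hash that also remembers its source.
# (Hypothetical; the real class is not loaded here.)
class SketchRow < Hash
  attr_accessor :source
end

# Yields one SketchRow per record, with keys converted to symbols.
def each_row(records, source)
  records.each do |record|
    row = SketchRow.new
    record.each_pair { |key, value| row[key.to_sym] = value }
    row.source = source
    yield row
  end
end

records = [{ "id" => 1, "name" => "Ada" }]
each_row(records, "localhost/warehouse/people") do |row|
  puts row[:name] # prints Ada
  puts row.source # prints localhost/warehouse/people
end
```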
#group ⇒ Object
Get the group by part of the query, defaults to nil.
# File 'lib/etl/control/source/database_source.rb', line 71
def group
  configuration[:group]
end
#join ⇒ Object
Get the join part of the query, defaults to nil.
# File 'lib/etl/control/source/database_source.rb', line 61
def join
  configuration[:join]
end
#local_directory ⇒ Object
Get the local directory to use, which is a combination of the local_base, the database hostname, the database name, and the table name.
# File 'lib/etl/control/source/database_source.rb', line 56
def local_directory
  File.join(local_base, to_s)
end
#new_records_only ⇒ Object
Return the column used in the where clause to identify new rows.
# File 'lib/etl/control/source/database_source.rb', line 82
def new_records_only
  configuration[:new_records_only]
end
#order ⇒ Object
Get the order for the query, defaults to nil.
# File 'lib/etl/control/source/database_source.rb', line 76
def order
  configuration[:order]
end
#select ⇒ Object
Get the select part of the query, defaults to '*'.
# File 'lib/etl/control/source/database_source.rb', line 66
def select
  configuration[:select] || '*'
end
#to_s ⇒ Object
Get a String identifier for the source.
# File 'lib/etl/control/source/database_source.rb', line 50
def to_s
  "#{host}/#{database}/#{@table}"
end
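The identifier built by to_s is also the tail of the path returned by local_directory. The sketch below composes both from plain strings. All four values are hypothetical; in the class, `host` and `database` are helper methods that are not shown on this page.

```ruby
# Hypothetical values standing in for the source's host, database, table
# and local_base.
host       = "localhost"
database   = "warehouse"
table      = "people"
local_base = "source_data"

identifier = "#{host}/#{database}/#{table}"   # the String that to_s builds
directory  = File.join(local_base, identifier) # the path local_directory builds

puts identifier # prints localhost/warehouse/people
puts directory  # prints source_data/localhost/warehouse/people
```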