Class: ETL::Control::DatabaseSource
- Defined in:
- lib/etl/control/source/database_source.rb
Overview
Source object which extracts data from a database using ActiveRecord.
Instance Attribute Summary
-
#table ⇒ Object
Returns the value of attribute table.
-
#target ⇒ Object
Returns the value of attribute target.
Attributes inherited from Source
#configuration, #control, #definition, #local_base, #store_locally
Instance Method Summary
-
#columns ⇒ Object
Get the list of columns to read.
-
#count(use_cache = true) ⇒ Object
Get the number of rows in the source.
-
#each(&block) ⇒ Object
Returns each row from the source.
-
#group ⇒ Object
Get the group by part of the query, defaults to nil.
-
#initialize(control, configuration, definition) ⇒ DatabaseSource
constructor
Initialize the source.
-
#join ⇒ Object
Get the join part of the query, defaults to nil.
-
#local_directory ⇒ Object
Get the local directory to use, which is a combination of the local_base, the db hostname, the db database name, and the db table.
-
#new_records_only ⇒ Object
Return the column used in the where clause to identify new rows.
-
#order ⇒ Object
Get the order for the query, defaults to nil.
-
#select ⇒ Object
Get the select part of the query, defaults to '*'.
-
#to_s ⇒ Object
Get a String identifier for the source.
Methods inherited from Source
class_for_name, #errors, #last_local_file, #last_local_file_trigger, #local_file, #local_file_trigger, #read_locally, #timestamp
Constructor Details
#initialize(control, configuration, definition) ⇒ DatabaseSource
Initialize the source.
Arguments:
- control: The ETL::Control::Control instance
- configuration: The configuration Hash
- definition: The source definition
Required configuration options:
- :target: The target connection
- :table: The source table name
- :database: The database name
Other options:
- :join: Optional join part for the query (ignored unless specified)
- :select: Optional select part for the query (defaults to '*')
- :group: Optional group by part for the query (ignored unless specified)
- :order: Optional order part for the query (ignored unless specified)
- :new_records_only: Specify the column to use when comparing timestamps against the last successful ETL job execution for the current control file.
- :store_locally: Set to false to not store a copy of the source data locally in a flat file (defaults to true)
    # File 'lib/etl/control/source/database_source.rb', line 40
    def initialize(control, configuration, definition)
      super
      @target = configuration[:target]
      @table = configuration[:table]
      @query = configuration[:query]
    end
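To illustrate how the options above resolve, here is a minimal sketch that builds a configuration Hash and reads it back through accessors shaped like the ones documented on this page. `SourceConfigSketch` is a hypothetical stand-in, not part of the library, and the connection, table, and column names are invented for the example. The `:store_locally` default is implemented with `fetch`, which is an assumption based only on the "defaults to true" note above.

```ruby
# Hypothetical stand-in that mirrors the accessor bodies shown on this page.
# It never opens a database connection and is not part of the ETL library.
class SourceConfigSketch
  attr_reader :configuration, :target, :table

  def initialize(configuration)
    @configuration = configuration
    @target = configuration[:target]
    @table  = configuration[:table]
  end

  # Same expressions as the documented accessors.
  def select; configuration[:select] || '*'; end
  def join;   configuration[:join];  end
  def group;  configuration[:group]; end
  def order;  configuration[:order]; end

  def new_records_only
    configuration[:new_records_only]
  end

  # Assumption: the documented default of true is modelled with fetch.
  def store_locally
    configuration.fetch(:store_locally, true)
  end
end

# Invented example values.
source = SourceConfigSketch.new(
  :target           => :operational_database,
  :table            => 'orders',
  :database         => 'sales',
  :order            => 'created_at',
  :new_records_only => :updated_at
)

puts source.select        # '*' because :select was not given
puts source.join.inspect  # nil because :join was not given
puts source.store_locally # true by default
```

Options that are left out simply come back as nil, which matches the "ignored unless specified" wording in the list above.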
Instance Attribute Details
#table ⇒ Object
Returns the value of attribute table.
    # File 'lib/etl/control/source/database_source.rb', line 12
    def table
      @table
    end
#target ⇒ Object
Returns the value of attribute target.
    # File 'lib/etl/control/source/database_source.rb', line 11
    def target
      @target
    end
Instance Method Details
#columns ⇒ Object
Get the list of columns to read. This is defined in the source definition as either an Array or Hash.
    # File 'lib/etl/control/source/database_source.rb', line 96
    def columns
      # weird default is required for writing to cache correctly
      @columns ||= query_rows.any? ? query_rows.first.keys : ['']
    end
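The expression in the method body takes the column names from the keys of the first result row and falls back to a single empty string when there are no rows. A small sketch of that behaviour, assuming `query_rows` returns an Array of Hashes (the helper name `columns_for` and the sample rows are hypothetical):

```ruby
# Hypothetical helper mirroring the expression used in #columns.
# `rows` stands in for whatever query_rows returns, assumed here to be
# an Array of Hashes keyed by column name.
def columns_for(rows)
  rows.any? ? rows.first.keys : ['']
end

p columns_for([{ 'id' => 1, 'name' => 'Ada' }, { 'id' => 2, 'name' => 'Grace' }])
# => ["id", "name"]

p columns_for([])
# => [""]
```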
#count(use_cache = true) ⇒ Object
Get the number of rows in the source.
    # File 'lib/etl/control/source/database_source.rb', line 85
    def count(use_cache=true)
      return @count if @count && use_cache
      if @store_locally || read_locally
        @count = count_locally
      else
        @count = connection.select_value(query.gsub(/SELECT .* FROM/, 'SELECT count(1) FROM'))
      end
    end
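When the data is not read from a local file, the count query is derived from the source query by a regular-expression substitution. The sketch below applies the same pattern and replacement string to a few sample queries; the helper name `count_query` and the SQL strings are hypothetical. It also shows two properties of the pattern that are worth knowing: `.*` is greedy, so a query with a subquery is rewritten up to the last ` FROM` on the line, and the match is case-sensitive.

```ruby
# Same pattern and replacement as in #count; the helper itself is hypothetical.
COUNT_PATTERN = /SELECT .* FROM/

def count_query(sql)
  sql.gsub(COUNT_PATTERN, 'SELECT count(1) FROM')
end

puts count_query('SELECT id, name FROM people WHERE active = 1')
# => SELECT count(1) FROM people WHERE active = 1

# Greedy match: everything up to the last " FROM" is replaced,
# which leaves an unbalanced parenthesis here.
puts count_query('SELECT id FROM (SELECT id FROM people) AS p')
# => SELECT count(1) FROM people) AS p

# Case-sensitive: lowercase keywords are left untouched.
puts count_query('select * from people')
# => select * from people
```

For simple single-table queries written with uppercase keywords the rewrite behaves as expected; queries outside that shape may need a different way of counting.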
#each(&block) ⇒ Object
Returns each row from the source. If read_locally is specified then this method will attempt to read from the last stored local file. If no locally stored file exists or if the trigger file for the last locally stored file does not exist then this method will raise an error.
    # File 'lib/etl/control/source/database_source.rb', line 106
    def each(&block)
      if read_locally # Read from the last stored source
        ETL::Engine.logger.debug "Reading from local cache"
        read_rows(last_local_file, &block)
      else # Read from the original source
        if @store_locally
          file = local_file
          write_local(file)
          read_rows(file, &block)
        else
          query_rows.each do |r|
            row = ETL::Row.new()
            r.symbolize_keys.each_pair { |key, value| row[key] = value }
            row.source = self
            yield row
          end
        end
      end
    end
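The innermost branch of the method converts each raw result row into a row object with symbol keys and tags it with its source. The sketch below reproduces that conversion in plain Ruby. `RowSketch` is a hypothetical stand-in for `ETL::Row` (assumed here to behave like a Hash with a `source` attribute), `each_row` is a hypothetical helper, and `transform_keys(&:to_sym)` (Ruby 2.5+) is swapped in for ActiveSupport's `symbolize_keys` so the example needs no dependencies.

```ruby
# Hypothetical stand-in for ETL::Row: a Hash that remembers where it came from.
class RowSketch < Hash
  attr_accessor :source
end

# Hypothetical helper mirroring the direct-query branch of #each.
# raw_rows is assumed to be an Array of Hashes with String keys.
def each_row(raw_rows, source)
  raw_rows.each do |r|
    row = RowSketch.new
    # Plain-Ruby replacement for ActiveSupport's symbolize_keys.
    r.transform_keys(&:to_sym).each_pair { |key, value| row[key] = value }
    row.source = source
    yield row
  end
end

raw = [{ 'id' => 1, 'name' => 'Ada' }, { 'id' => 2, 'name' => 'Grace' }]
each_row(raw, 'localhost/sales/people') do |row|
  puts "#{row[:id]} #{row[:name]} (from #{row.source})"
end
```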
#group ⇒ Object
Get the group by part of the query, defaults to nil.
    # File 'lib/etl/control/source/database_source.rb', line 69
    def group
      configuration[:group]
    end
#join ⇒ Object
Get the join part of the query, defaults to nil.
    # File 'lib/etl/control/source/database_source.rb', line 59
    def join
      configuration[:join]
    end
#local_directory ⇒ Object
Get the local directory to use, which is a combination of the local_base, the db hostname, the db database name, and the db table.
    # File 'lib/etl/control/source/database_source.rb', line 54
    def local_directory
      File.join(local_base, to_s)
    end
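Because the method joins `local_base` with the source's string identifier (`host/database/table`, as produced by #to_s), the resulting path nests one directory per component. A minimal sketch with invented values; both helper names are hypothetical:

```ruby
# Hypothetical helper mirroring the string built by #to_s.
def source_identifier(host, database, table)
  "#{host}/#{database}/#{table}"
end

# Hypothetical helper mirroring #local_directory.
def local_directory_for(local_base, host, database, table)
  File.join(local_base, source_identifier(host, database, table))
end

puts local_directory_for('/var/etl/source_data', 'localhost', 'sales', 'orders')
# => /var/etl/source_data/localhost/sales/orders
```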
#new_records_only ⇒ Object
Return the column used in the where clause to identify new rows.
    # File 'lib/etl/control/source/database_source.rb', line 80
    def new_records_only
      configuration[:new_records_only]
    end
#order ⇒ Object
Get the order for the query, defaults to nil.
    # File 'lib/etl/control/source/database_source.rb', line 74
    def order
      configuration[:order]
    end
#select ⇒ Object
Get the select part of the query, defaults to '*'.
    # File 'lib/etl/control/source/database_source.rb', line 64
    def select
      configuration[:select] || '*'
    end
#to_s ⇒ Object
Get a String identifier for the source.
    # File 'lib/etl/control/source/database_source.rb', line 48
    def to_s
      "#{host}/#{database}/#{@table}"
    end