Class: ETL::Control::DatabaseSource
Defined in: lib/etl/control/source/database_source.rb
Overview
Source object which extracts data from a database using ActiveRecord.
Instance Attribute Summary
Attributes inherited from Source
#configuration, #control, #definition, #local_base, #store_locally
Instance Method Summary
- #columns ⇒ Object
  Get the list of columns to read.
- #count(use_cache = true) ⇒ Object
  Get the number of rows in the source.
- #each(&block) ⇒ Object
  Returns each row from the source.
- #group ⇒ Object
  Get the group by part of the query, defaults to nil.
- #initialize(control, configuration, definition) ⇒ DatabaseSource (constructor)
  Initialize the source.
- #join ⇒ Object
  Get the join part of the query, defaults to nil.
- #local_directory ⇒ Object
  Get the local directory to use, which is a combination of local_base, the database host name, the database name, and the table name.
- #new_records_only ⇒ Object
  Return the column which is used in the where clause to identify new rows.
- #order ⇒ Object
  Get the order for the query, defaults to nil.
- #select ⇒ Object
  Get the select part of the query, defaults to ‘*’.
- #to_s ⇒ Object
  Get a String identifier for the source.
Methods inherited from Source
class_for_name, #errors, #last_local_file, #last_local_file_trigger, #local_file, #local_file_trigger, #read_locally, #timestamp
Constructor Details
#initialize(control, configuration, definition) ⇒ DatabaseSource
Initialize the source.
Arguments:
- control: The ETL::Control::Control instance
- configuration: The configuration Hash
- definition: The source definition
Required configuration options:
- :table: The source table name
- :database: The database name
Other options:
- :adapter: The adapter to use (defaults to :mysql)
- :username: The database username (defaults to ‘root’)
- :password: The password to the database (defaults to nothing)
- :host: The host for the database (defaults to ‘localhost’)
- :join: Optional join part for the query (ignored unless specified)
- :select: Optional select part for the query (defaults to ‘*’)
- :order: Optional order part for the query (ignored unless specified)
- :store_locally: Set to false to not store a copy of the source data locally in a flat file (defaults to true)
# File 'lib/etl/control/source/database_source.rb', line 37
def initialize(control, configuration, definition)
  super
  connect
end
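The option list above can be read as a set of defaults sitting underneath whatever the caller supplies. The sketch below only illustrates that reading. `DEFAULT_OPTIONS`, `REQUIRED_OPTIONS` and `effective_configuration` are hypothetical names, not part of ETL::Control::DatabaseSource, and the class itself may resolve its defaults differently.

```ruby
# Hypothetical illustration of the documented options; not the library's code.
# Default values are copied from the option list above.
DEFAULT_OPTIONS = {
  adapter: :mysql,
  username: 'root',
  password: nil,
  host: 'localhost',
  select: '*',
  store_locally: true
}.freeze

# The two options documented as required.
REQUIRED_OPTIONS = [:table, :database].freeze

# Returns the configuration with documented defaults filled in,
# raising if a required option is missing.
def effective_configuration(configuration)
  missing = REQUIRED_OPTIONS.reject { |key| configuration.key?(key) }
  unless missing.empty?
    raise ArgumentError, "missing required options: #{missing.join(', ')}"
  end
  DEFAULT_OPTIONS.merge(configuration)
end

config = effective_configuration(table: 'people', database: 'crm', host: 'db1.example.com')
puts config[:adapter] # prints: mysql
puts config[:host]    # prints: db1.example.com
```

Caller-supplied values win over defaults because they are merged last.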
Instance Method Details
#columns ⇒ Object
Get the list of columns to read. This is defined in the source definition as either an Array or a Hash.
# File 'lib/etl/control/source/database_source.rb', line 91
def columns
  case definition
  when Array
    definition.collect(&:to_sym)
  when Hash
    definition.keys.collect(&:to_sym)
  else
    raise "Definition must be either an Array or a Hash"
  end
end
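To make the two accepted definition shapes concrete, here is a standalone sketch of the same normalization. `columns_for` is a hypothetical free function that mirrors the method body above so it can run without the library, and the sample definitions are invented for illustration.

```ruby
# Standalone mirror of the #columns logic shown above (hypothetical helper).
def columns_for(definition)
  case definition
  when Array
    definition.collect(&:to_sym)
  when Hash
    definition.keys.collect(&:to_sym)
  else
    raise "Definition must be either an Array or a Hash"
  end
end

# An Array definition lists the column names directly.
p columns_for(['id', 'name'])                          # prints: [:id, :name]

# A Hash definition contributes only its keys; the values are ignored here.
p columns_for('id' => :integer, 'name' => :string)     # prints: [:id, :name]
```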
#count(use_cache = true) ⇒ Object
Get the number of rows in the source.
# File 'lib/etl/control/source/database_source.rb', line 80
def count(use_cache=true)
  return @count if @count && use_cache
  if store_locally || read_locally
    @count = count_locally
  else
    @count = connection.select_value(query.gsub(/SELECT .* FROM/, 'SELECT count(1) FROM'))
  end
end
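The database branch of #count derives its counting query by rewriting the select list with a regular expression. The sketch below applies the same substitution to two sample query strings. `count_query` is a hypothetical wrapper and the queries are invented. The second call shows that, because `.*` is greedy, a query containing a second `FROM` (for example in a subquery) is rewritten up to the last `FROM`, which is worth keeping in mind when supplying a custom :select or :join.

```ruby
# Hypothetical wrapper around the substitution used in #count above.
def count_query(query)
  query.gsub(/SELECT .* FROM/, 'SELECT count(1) FROM')
end

# A simple query: only the select list is replaced.
puts count_query('SELECT id, name FROM people ORDER BY name')
# prints: SELECT count(1) FROM people ORDER BY name

# A query with a subquery: the greedy match runs to the last FROM.
puts count_query('SELECT id FROM people WHERE id IN (SELECT person_id FROM orders)')
# prints: SELECT count(1) FROM orders)
```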
#each(&block) ⇒ Object
Returns each row from the source. If read_locally is specified then this method will attempt to read from the last stored local file. If no locally stored file exists or if the trigger file for the last locally stored file does not exist then this method will raise an error.
# File 'lib/etl/control/source/database_source.rb', line 107
def each(&block)
  if read_locally # Read from the last stored source
    ETL::Engine.logger.debug "Reading from local cache"
    read_rows(last_local_file, &block)
  else # Read from the original source
    if store_locally
      file = local_file
      write_local(file)
      read_rows(file, &block)
    else
      connection.select_all(query).each do |row|
        row = ETL::Row.new(row.symbolize_keys)
        row.source = self
        yield row
      end
    end
  end
end
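The branching in #each comes down to three read paths selected by two flags. As a reading aid, the sketch below reduces that decision to a function that only names the chosen path. `read_strategy` and the returned symbols are hypothetical and do not exist in the library.

```ruby
# Hypothetical summary of the decision made in #each; it reads nothing itself.
def read_strategy(read_locally:, store_locally:)
  if read_locally
    :last_local_file            # reuse the most recent local copy
  elsif store_locally
    :write_then_read_local_file # dump the query result locally, then read the file
  else
    :direct_query               # stream rows straight from the database
  end
end

p read_strategy(read_locally: true,  store_locally: true)   # prints: :last_local_file
p read_strategy(read_locally: false, store_locally: true)   # prints: :write_then_read_local_file
p read_strategy(read_locally: false, store_locally: false)  # prints: :direct_query
```

Note that read_locally takes precedence: when it is set, store_locally has no effect on which path is taken.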
#group ⇒ Object
Get the group by part of the query, defaults to nil.
# File 'lib/etl/control/source/database_source.rb', line 64
def group
  configuration[:group]
end
#join ⇒ Object
Get the join part of the query, defaults to nil.
# File 'lib/etl/control/source/database_source.rb', line 54
def join
  configuration[:join]
end
#local_directory ⇒ Object
Get the local directory to use, which is a combination of local_base, the database host name, the database name, and the table name.
# File 'lib/etl/control/source/database_source.rb', line 49
def local_directory
  File.join(local_base, host, configuration[:database], configuration[:table])
end
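As a quick illustration of the resulting path shape, the sketch below calls `File.join` with sample values. `local_directory_for` is a hypothetical standalone helper, and 'source_data', 'localhost', 'crm' and 'people' are example values rather than anything taken from a real configuration.

```ruby
# Hypothetical standalone version of the path construction shown above.
def local_directory_for(local_base, host, database, table)
  File.join(local_base, host, database, table)
end

puts local_directory_for('source_data', 'localhost', 'crm', 'people')
# prints: source_data/localhost/crm/people
```

Because the host, database, and table each get their own path segment, local copies from different sources stay in separate directories.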
#new_records_only ⇒ Object
Return the column which is used in the where clause to identify new rows.
# File 'lib/etl/control/source/database_source.rb', line 75
def new_records_only
  configuration[:new_records_only]
end
#order ⇒ Object
Get the order for the query, defaults to nil.
# File 'lib/etl/control/source/database_source.rb', line 69
def order
  configuration[:order]
end
#select ⇒ Object
Get the select part of the query, defaults to ‘*’.
# File 'lib/etl/control/source/database_source.rb', line 59
def select
  configuration[:select] || '*'
end
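The select, join, group, and order accessors each supply one fragment of the SQL that the source runs. This page does not show the method that assembles them, so the following is only a sketch of how such fragments could be combined. `build_query` is a hypothetical function, and it assumes that :join holds a complete JOIN clause while :group and :order hold bare column lists; the library's own query method may differ.

```ruby
# Hypothetical query assembly from the documented fragments; not the library's code.
# Assumes join is a full clause ("JOIN ... ON ...") and group/order are column lists.
def build_query(table:, select: '*', join: nil, group: nil, order: nil)
  parts = ["SELECT #{select} FROM #{table}"]
  parts << join if join
  parts << "GROUP BY #{group}" if group
  parts << "ORDER BY #{order}" if order
  parts.join(' ')
end

puts build_query(table: 'people')
# prints: SELECT * FROM people

puts build_query(
  table: 'people',
  select: 'people.id, count(orders.id) AS order_count',
  join: 'JOIN orders ON orders.person_id = people.id',
  group: 'people.id',
  order: 'people.id'
)
```

Fragments left at nil are skipped, which matches the documented "ignored unless specified" behavior of the optional parts.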
#to_s ⇒ Object
Get a String identifier for the source.
# File 'lib/etl/control/source/database_source.rb', line 43
def to_s
  "#{host}/#{configuration[:database]}/#{configuration[:table]}"
end