Class: InCSV::Database
- Inherits:
- Object
- Hierarchy: Object → InCSV::Database
- Defined in:
- lib/incsv/database.rb
Overview
Represents a database file, handling the creation of the database and of the table within the database, as well as the importing of data from a CSV file into the database.
Instance Attribute Summary
- #db ⇒ Object (readonly)
  Returns the value of attribute db.
Instance Method Summary
- #create_table ⇒ Object
  Creates a table in the database with one column for each column in the CSV; each column's type is the best guess for the data found in that CSV column.
- #db_path ⇒ Object
  Returns the path to the database file, generated based on the filename of the CSV passed to the class.
- #exists? ⇒ Boolean
  Returns true if the database file exists; makes no effort to check whether it is in fact a valid SQLite database.
- #import ⇒ Object
  Imports data from the CSV file into the database, applying any preprocessing specified by the column type (e.g. stripping currency prefixes).
- #imported? ⇒ Boolean
  Returns true if there is data in the primary table.
- #initialize(csv) ⇒ Database (constructor)
  Returns a new instance of Database.
- #table_created? ⇒ Boolean
  Returns true if the primary database table within the database has been created.
- #table_name ⇒ Object
  Returns the table name, by default generated based on the filename of the CSV.
Constructor Details
#initialize(csv) ⇒ Database
Returns a new instance of Database.
    # File 'lib/incsv/database.rb', line 10
    def initialize(csv)
      @csv = csv
      @db = Sequel.sqlite(db_path)
      # require "logger"
      # @db.loggers << Logger.new($stdout)
    end
Instance Attribute Details
#db ⇒ Object (readonly)
Returns the value of attribute db.
    # File 'lib/incsv/database.rb', line 18
    def db
      @db
    end
Instance Method Details
#create_table ⇒ Object
Creates a table in the database with one column for each column in the CSV; each column's type is the best guess for the data found in that CSV column.
    # File 'lib/incsv/database.rb', line 61
    def create_table
      @db.create_table!(table_name) do
        primary_key :_incsv_id
      end

      schema.columns.each do |c|
        @db.alter_table(table_name) do
          add_column c.name, c.type.for_database
        end
      end
    end
#db_path ⇒ Object
Returns the path to the database file, generated based on the filename of the CSV passed to the class. For example, a CSV called `products.csv` will be stored in a database called `products.db` in the same directory.
    # File 'lib/incsv/database.rb', line 44
    def db_path
      path = Pathname(csv)
      (path.dirname + (path.basename(".csv").to_s + ".db")).to_s
    end
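The path derivation can be tried in isolation. The following is a minimal sketch, assuming a hypothetical standalone helper `db_path_for` (not part of InCSV) that takes the CSV path as an argument instead of reading it from the instance:

```ruby
require "pathname"

# Hypothetical helper that mirrors the body of #db_path, but takes the
# CSV path as a parameter so it can run without the rest of the class.
def db_path_for(csv)
  path = Pathname(csv)
  # Strip the ".csv" suffix, append ".db", and keep the same directory.
  (path.dirname + (path.basename(".csv").to_s + ".db")).to_s
end

puts db_path_for("data/products.csv")
puts db_path_for("products.csv")
```

With these inputs the helper yields `data/products.db` and `products.db` respectively, so the database sits next to the CSV it was built from.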
#exists? ⇒ Boolean
Returns true if the database file exists; makes no effort to check whether it is in fact a valid SQLite database.
    # File 'lib/incsv/database.rb', line 36
    def exists?
      File.exist?(db_path)
    end
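To illustrate the caveat, here is a small sketch in which a plain-text file passes the same kind of existence check. The helper `database_file_exists?` is a hypothetical stand-in that takes the path as an argument; it is not part of InCSV:

```ruby
require "tempfile"

# Hypothetical stand-in for #exists?, taking the path as a parameter.
def database_file_exists?(db_path)
  File.exist?(db_path)
end

# Create a temporary file that has a .db extension but contains plain text.
Tempfile.create(["not_a_database", ".db"]) do |file|
  file.write("this is plain text, not SQLite")
  file.flush
  # The check only looks at the filesystem, so it reports true here.
  puts database_file_exists?(file.path)
end
```

Callers that need to know whether the file is usable would have to open it and query it, which this check deliberately does not do.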
#import ⇒ Object
Imports data from the CSV file into the database, applying any preprocessing specified by the column type (e.g. stripping currency prefixes).
Data is imported in transactions, in chunks of 200 rows at a time.
    # File 'lib/incsv/database.rb', line 78
    def import
      return if imported?

      create_table unless table_created?

      columns = schema.columns
      column_names = columns.map(&:name)

      chunks(200) do |chunk|
        rows = chunk.map do |row|
          row.to_hash.values.each_with_index.map do |column, n|
            columns[n].type.clean_value(column)
          end
        end

        @db[table_name].import(column_names, rows)
      end
    end
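The clean-then-batch pattern used by `import` can be sketched without a database. Everything named below (`CurrencyType`, `StringType`, `clean_in_chunks`) is hypothetical and only stands in for InCSV's own column types and its private `chunks` helper, which are not shown on this page; the sketch uses `each_slice` for chunking and a chunk size of 2 to keep the output short:

```ruby
# Hypothetical column type that strips a currency prefix from a value.
CurrencyType = Struct.new(:symbol) do
  def clean_value(value)
    value.to_s.delete_prefix(symbol)
  end
end

# Hypothetical column type that passes values through unchanged.
class StringType
  def clean_value(value)
    value
  end
end

# Split rows into chunks, then clean each value with the type object
# at the same column position. Returns an array of cleaned chunks.
def clean_in_chunks(rows, types, chunk_size)
  rows.each_slice(chunk_size).map do |chunk|
    chunk.map do |row|
      row.each_with_index.map { |value, n| types[n].clean_value(value) }
    end
  end
end

rows  = [["Widget", "$1.50"], ["Gadget", "$20.00"], ["Doohickey", "$3.25"]]
types = [StringType.new, CurrencyType.new("$")]

clean_in_chunks(rows, types, 2).each { |chunk| p chunk }
```

In the real method each cleaned chunk is handed to `@db[table_name].import(column_names, rows)` rather than printed, so only one chunk of rows is converted and inserted at a time.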
#imported? ⇒ Boolean
Returns true if there is data in the primary table. A more accurate check would have to compare samples from the CSV against the table; checking for any rows is faster and will in practice be accurate.
    # File 'lib/incsv/database.rb', line 30
    def imported?
      table_created? && @db[table_name].count > 0
    end
#table_created? ⇒ Boolean
Returns true if the primary database table within the database has been created.
    # File 'lib/incsv/database.rb', line 22
    def table_created?
      @db.table_exists?(table_name)
    end
#table_name ⇒ Object
Returns the table name, by default generated based on the filename of the CSV. For example, a CSV called `products.csv` will produce a table called `products`.
    # File 'lib/incsv/database.rb', line 52
    def table_name
      @table_name ||= begin
        File.basename(csv, ".csv").downcase.gsub(/[^a-z_]/, "").to_sym
      end
    end
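The effect of the sanitising expression is easiest to see on a few filenames. This is a minimal sketch, assuming a hypothetical standalone helper `table_name_for` (not part of InCSV) that applies the same expression to a path passed as an argument:

```ruby
# Hypothetical helper that applies the same expression as #table_name:
# drop the directory and ".csv" suffix, lowercase the rest, and remove
# every character that is not a-z or an underscore.
def table_name_for(csv)
  File.basename(csv, ".csv").downcase.gsub(/[^a-z_]/, "").to_sym
end

p table_name_for("products.csv")
p table_name_for("data/Sales Report 2016.csv")
p table_name_for("order_items.csv")
```

These calls produce `:products`, `:salesreport`, and `:order_items`. Digits, spaces, and punctuation are dropped instead of being replaced, so a filename with no letters or underscores would yield an empty symbol.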