Module: ScraperUtils::DbUtils
- Defined in: lib/scraper_utils/db_utils.rb
Overview
Utilities for database operations in scrapers
Class Method Summary
- .collect_saves! ⇒ Object
  Enable in-memory collection mode instead of saving to SQLite.
- .collected_saves ⇒ Array<Array>
  Get all collected save calls.
- .save_immediately! ⇒ Object
  Save to disk rather than collect.
- .save_record(record) ⇒ void
  Saves a record to the SQLite database with validation and logging.
Class Method Details
.collect_saves! ⇒ Object
Enable in-memory collection mode instead of saving to SQLite
# File 'lib/scraper_utils/db_utils.rb', line 9
def self.collect_saves!
  @collected_saves = []
end
.collected_saves ⇒ Array<Array>
Get all collected save calls
# File 'lib/scraper_utils/db_utils.rb', line 20
def self.collected_saves
  @collected_saves
end
.save_immediately! ⇒ Object
Save to disk rather than collect
# File 'lib/scraper_utils/db_utils.rb', line 14
def self.save_immediately!
  @collected_saves = nil
end
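The three methods above toggle a single module-level variable: an empty array means "collect", nil means "write to disk". The sketch below illustrates that toggle in isolation. DemoDbUtils is a hypothetical stand-in that copies the listed method bodies so the snippet runs without the scraper_utils gem or ScraperWiki installed; its save_record models only the branch on @collected_saves and returns a placeholder symbol where the real method would write to SQLite.

```ruby
# Hypothetical stand-in, NOT the real ScraperUtils::DbUtils. It copies the
# three toggling methods listed above so the example is self-contained.
module DemoDbUtils
  def self.collect_saves!
    @collected_saves = []
  end

  def self.save_immediately!
    @collected_saves = nil
  end

  def self.collected_saves
    @collected_saves
  end

  # Simplified save: only the collect-or-write decision is modelled.
  def self.save_record(record)
    if @collected_saves
      @collected_saves << record
    else
      # Placeholder (assumption): the real method writes to SQLite here.
      :would_write_to_sqlite
    end
  end
end

# Switch to collection mode, for example at the start of a spec.
DemoDbUtils.collect_saves!
DemoDbUtils.save_record("council_reference" => "DA-1")
collected_count = DemoDbUtils.collected_saves.length

# Switch back: the collected records are discarded along with the array.
DemoDbUtils.save_immediately!
after_reset = DemoDbUtils.collected_saves
```

Because save_immediately! replaces the array with nil, anything collected so far has to be read from collected_saves before switching back.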
.save_record(record) ⇒ void
This method returns an undefined value.
Saves a record to the SQLite database with validation and logging
# File 'lib/scraper_utils/db_utils.rb', line 29
def self.save_record(record)
  # Validate required fields
  required_fields = %w[council_reference address description info_url date_scraped]
  required_fields.each do |field|
    if record[field].to_s.empty?
      raise ScraperUtils::UnprocessableRecord, "Missing required field: #{field}"
    end
  end

  # Validate date formats
  %w[date_scraped date_received on_notice_from on_notice_to].each do |date_field|
    Date.parse(record[date_field]) unless record[date_field].to_s.empty?
  rescue ArgumentError
    raise ScraperUtils::UnprocessableRecord,
          "Invalid date format for #{date_field}: #{record[date_field].inspect}"
  end

  # Determine primary key based on presence of authority_label
  primary_key = if record.key?("authority_label")
                  %w[authority_label council_reference]
                else
                  ["council_reference"]
                end
  if @collected_saves
    @collected_saves << record
  else
    ScraperWiki.save_sqlite(primary_key, record)
    ScraperUtils::DataQualityMonitor.log_saved_record(record)
  end
end
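To see which records save_record would reject without touching the database, the same checks can be run standalone. The sketch below is not part of ScraperUtils: record_problems and primary_key_for are hypothetical helpers that mirror the required-field check, the date check, and the primary-key choice in the listing above. Unlike save_record, record_problems returns a list of messages instead of raising on the first failure. The sample record values are made up for illustration.

```ruby
require "date"

# Field lists copied from the save_record listing above.
REQUIRED_FIELDS = %w[council_reference address description info_url date_scraped].freeze
DATE_FIELDS = %w[date_scraped date_received on_notice_from on_notice_to].freeze

# Hypothetical helper: collects every problem instead of raising on the first.
def record_problems(record)
  problems = REQUIRED_FIELDS
             .select { |field| record[field].to_s.empty? }
             .map { |field| "Missing required field: #{field}" }

  DATE_FIELDS.each do |field|
    value = record[field]
    next if value.to_s.empty? # empty date fields are skipped, as in save_record

    begin
      Date.parse(value)
    rescue ArgumentError
      problems << "Invalid date format for #{field}: #{value.inspect}"
    end
  end
  problems
end

# Hypothetical helper: same primary-key rule as save_record.
def primary_key_for(record)
  if record.key?("authority_label")
    %w[authority_label council_reference]
  else
    ["council_reference"]
  end
end

# Illustrative records (made-up values).
good = {
  "council_reference" => "DA-1",
  "address" => "1 Example St",
  "description" => "Garden shed",
  "info_url" => "https://example.com/da-1",
  "date_scraped" => "2024-05-01"
}
bad = good.merge("address" => "", "date_received" => "not a date")

good_problems = record_problems(good)
bad_problems = record_problems(bad)
```

Records carrying an authority_label get a composite key, which presumably lets several authorities share one table without their council_reference values colliding.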