Class: ModsulatorSheet
- Inherits: Object
  - Object
  - ModsulatorSheet
- Defined in: lib/modsulator/modsulator_sheet.rb
Overview
This class provides methods to parse Stanford’s MODS spreadsheets into either an array of hashes or a JSON string.
Instance Attribute Summary
- #file ⇒ Object (readonly): Returns the value of attribute file.
- #filename ⇒ Object (readonly): Returns the value of attribute filename.
Instance Method Summary
- #headers ⇒ Object: Get the headers used in the spreadsheet.
- #initialize(file, filename) ⇒ ModsulatorSheet (constructor): A new instance of ModsulatorSheet.
- #rows ⇒ Array<Hash>: Loads the input spreadsheet into an array of hashes.
- #spreadsheet ⇒ Roo::CSV, ...: Opens a spreadsheet based on its filename extension.
- #to_json ⇒ String: Convert the loaded spreadsheet to a JSON string.
Constructor Details
#initialize(file, filename) ⇒ ModsulatorSheet
Returns a new instance of ModsulatorSheet.
```ruby
# File 'lib/modsulator/modsulator_sheet.rb', line 13
def initialize file, filename
  @file = file
  @filename = filename
end
```
Instance Attribute Details
#file ⇒ Object (readonly)
Returns the value of attribute file.
```ruby
# File 'lib/modsulator/modsulator_sheet.rb', line 9
def file
  @file
end
```
#filename ⇒ Object (readonly)
Returns the value of attribute filename.
```ruby
# File 'lib/modsulator/modsulator_sheet.rb', line 9
def filename
  @filename
end
```
Instance Method Details
#headers ⇒ Object
Get the headers used in the spreadsheet.
```ruby
# File 'lib/modsulator/modsulator_sheet.rb', line 47
def headers
  rows.first.keys
end
```
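As a rough illustration of what `headers` returns, the sketch below applies the same `rows.first.keys` expression to a hand-built array of hashes. The sample data, the `title` column, and the `safe_headers` helper are all hypothetical and not part of ModsulatorSheet; the helper only shows one possible way to avoid a `NoMethodError` when there are no data rows, since `[].first` is `nil`.

```ruby
# Hypothetical sample of what #rows might return: one hash per data row,
# keyed by the column names taken from the header row.
SAMPLE_ROWS = [
  { "druid" => "aa111bb2222", "sourceId" => "src-1", "title" => "First item" },
  { "druid" => "cc333dd4444", "sourceId" => "src-2", "title" => "Second item" }
].freeze

# Same expression as ModsulatorSheet#headers, applied to the sample data.
def sample_headers(rows)
  rows.first.keys
end

# Hypothetical nil-safe variant: returns [] instead of raising when rows is empty.
def safe_headers(rows)
  first = rows.first
  first ? first.keys : []
end

puts sample_headers(SAMPLE_ROWS).inspect
puts safe_headers([]).inspect
```

Because the headers come from the first data row's keys, a sheet with a header row but no data rows yields no headers under this approach.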
#rows ⇒ Array<Hash>
Loads the input spreadsheet into an array of hashes. The spreadsheet should conform to the Stanford MODS template format, which has three header rows: a “super header” in the first row, an intermediate header in the second, and the field names in the third. Data rows start at the fourth row.
```ruby
# File 'lib/modsulator/modsulator_sheet.rb', line 25
def rows
  # Parse the spreadsheet, automatically finding the header row by looking for "druid" and "sourceId" and leave the
  # header row itself out of the resulting array. Everything preceding the header row is discarded. Would like to use
  # clean: true here, but the latest release of Roo 1.13.2 crashes. 2.0.0beta1 seems to work though.
  @rows ||= spreadsheet.parse(header_search: ["druid", "sourceId"]).drop(1)
end
```
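The header search described in the comment above can be sketched in plain Ruby, without Roo, to show the intended result: rows before the field-name header are discarded, the header row itself is left out, and each remaining row becomes a hash. `find_data_rows` and the `GRID` sample are hypothetical and only approximate what Roo does internally.

```ruby
# Plain-Ruby sketch of the header-search idea; it does not use Roo.
# `find_data_rows` is a hypothetical helper, not part of ModsulatorSheet.
def find_data_rows(grid, required = ["druid", "sourceId"])
  # Locate the first row that contains every required column name.
  header_index = grid.index { |row| (required - row).empty? }
  raise ArgumentError, "No header row containing #{required.join(', ')}" if header_index.nil?

  header = grid[header_index]
  # Everything up to and including the header row is skipped; each later
  # row becomes a hash keyed by the header names.
  grid[(header_index + 1)..].map { |row| header.zip(row).to_h }
end

# Hypothetical sheet with three header rows, following the template layout.
GRID = [
  ["Identifiers", nil, "Title"],           # super header
  [nil, nil, "Title 1"],                   # intermediate header
  ["druid", "sourceId", "title"],          # field-name header
  ["aa111bb2222", "src-1", "First item"],  # data row
  ["cc333dd4444", "src-2", "Second item"]  # data row
].freeze

find_data_rows(GRID).each { |row| puts row.inspect }
```

Searching for the header by content rather than by fixed position means the sketch tolerates a different number of rows above the field-name header.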
#spreadsheet ⇒ Roo::CSV, ...
Opens a spreadsheet based on its filename extension.
```ruby
# File 'lib/modsulator/modsulator_sheet.rb', line 36
def spreadsheet
  @spreadsheet ||= case File.extname(@filename)
                   when ".csv" then Roo::Spreadsheet.open(@file, extension: :csv)
                   when ".xls" then Roo::Spreadsheet.open(@file, extension: :xls)
                   when ".xlsx" then Roo::Spreadsheet.open(@file, extension: :xlsx)
                   else raise "Unknown file type: #{@filename}"
                   end
end
```
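The extension dispatch can be tried in isolation with a small sketch that maps a filename to the symbol passed as `extension:`, without opening any file. `SUPPORTED_EXTENSIONS` and `roo_extension_for` are hypothetical names introduced here for illustration; the original method uses a `case` statement instead of a lookup table.

```ruby
# Hypothetical lookup table mirroring the three branches of #spreadsheet.
SUPPORTED_EXTENSIONS = { ".csv" => :csv, ".xls" => :xls, ".xlsx" => :xlsx }.freeze

# Hypothetical helper: returns the extension symbol for a filename, or raises
# the same kind of error as #spreadsheet for anything unrecognized.
def roo_extension_for(filename)
  SUPPORTED_EXTENSIONS.fetch(File.extname(filename)) do
    raise "Unknown file type: #{filename}"
  end
end

puts roo_extension_for("manifest.xlsx")
puts roo_extension_for("/tmp/upload/manifest.csv")
```

`File.extname` preserves case, so a name such as `manifest.XLSX` does not match `".xlsx"` here, and the same applies to the string comparisons in the `case` statement above.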
#to_json ⇒ String
Convert the loaded spreadsheet to a JSON string.
```ruby
# File 'lib/modsulator/modsulator_sheet.rb', line 54
def to_json
  json_hash = Hash.new
  json_hash["filename"] = File.basename(filename)
  json_hash["rows"] = rows
  json_hash.to_json
end
```
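To make the shape of the resulting JSON concrete, the sketch below builds the same two-key hash from stand-in values. `sheet_json`, the path, and the sample row are hypothetical; only the `"filename"` and `"rows"` keys and the use of `File.basename` come from the method above.

```ruby
require "json"

# Hypothetical stand-in for ModsulatorSheet#to_json that takes the filename
# and rows as arguments instead of reading them from an instance.
def sheet_json(filename, rows)
  json_hash = {}
  json_hash["filename"] = File.basename(filename) # directory part is dropped
  json_hash["rows"] = rows
  json_hash.to_json
end

puts sheet_json(
  "/tmp/upload/manifest.xlsx",
  [{ "druid" => "aa111bb2222", "sourceId" => "src-1" }]
)
```

The output is a single JSON object whose `filename` is the base name only (`manifest.xlsx` in this sketch) and whose `rows` is an array with one object per data row.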