Class: IOStreams::Tabular::Header
- Inherits:
- Object
- Object
- IOStreams::Tabular::Header
- Defined in:
- lib/io_streams/tabular/header.rb
Overview
Process files / streams that start with a header.
Instance Attribute Summary
- #allowed_columns ⇒ Object
Returns the value of attribute allowed_columns.
- #columns ⇒ Object
Returns the value of attribute columns.
- #required_columns ⇒ Object
Returns the value of attribute required_columns.
- #skip_unknown ⇒ Object
Returns the value of attribute skip_unknown.
Instance Method Summary
- #cleanse! ⇒ Object
Returns [Array<String>] list of columns that were ignored during cleansing.
- #initialize(columns: nil, allowed_columns: nil, required_columns: nil, skip_unknown: true) ⇒ Header
Constructor. Creates a new Header.
- #to_array(row, cleanse = true) ⇒ Object
- #to_hash(row, cleanse = true) ⇒ Object
Marshal to Hash from Array or Hash by applying this header.
Constructor Details
#initialize(columns: nil, allowed_columns: nil, required_columns: nil, skip_unknown: true) ⇒ Header
Header
Parameters
columns [Array<String>]
Columns in this header.
Note:
It is recommended to keep all columns as strings to avoid issues when persisting
to MongoDB, which converts symbol keys to strings.
allowed_columns [Array<String>]
List of columns to allow.
Default: nil ( Allow all columns )
Note:
When supplied, any columns that are rejected are returned in the cleansed columns
as nil so that they can be ignored during processing.
required_columns [Array<String>]
List of columns that must be present, otherwise IOStreams::Errors::InvalidHeader is raised by #cleanse!.
skip_unknown [true|false]
true:
Skip columns not present in allowed_columns by cleansing them to nil.
#to_hash will skip these additional columns entirely as if they were not in the file at all.
false:
Raises IOStreams::Errors::InvalidHeader when a column is supplied that is not in allowed_columns.
# File 'lib/io_streams/tabular/header.rb', line 32
def initialize(columns: nil, allowed_columns: nil, required_columns: nil, skip_unknown: true)
  @columns          = columns
  @required_columns = required_columns
  @allowed_columns  = allowed_columns
  @skip_unknown     = skip_unknown
end
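To illustrate how allowed_columns and skip_unknown interact, here is a minimal standalone sketch. It does not use the library: filter_columns_sketch and UnknownColumnError are hypothetical names introduced only for this example, and the sketch skips the cleansing of column names.

```ruby
# Hypothetical stand-in for the allow-list step; not the library's code.
class UnknownColumnError < StandardError; end

# Returns [filtered_columns, ignored_columns].
def filter_columns_sketch(columns, allowed_columns: nil, skip_unknown: true)
  ignored = []
  filtered = columns.map do |column|
    if allowed_columns.nil? || allowed_columns.include?(column)
      column
    else
      ignored << column
      nil # a rejected column keeps its position but becomes nil
    end
  end

  # With skip_unknown: false, any rejected column is treated as an error.
  if !skip_unknown && !ignored.empty?
    raise UnknownColumnError, "Unknown columns: #{ignored.join(',')}"
  end

  [filtered, ignored]
end

filtered, ignored = filter_columns_sketch(%w[name zip ssn], allowed_columns: %w[name zip])
# filtered is ["name", "zip", nil] and ignored is ["ssn"]
```

Keeping the rejected column as nil, rather than removing it, preserves the positions of the remaining columns so that array rows still line up with the header.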
Instance Attribute Details
#allowed_columns ⇒ Object
Returns the value of attribute allowed_columns.
# File 'lib/io_streams/tabular/header.rb', line 5
def allowed_columns
  @allowed_columns
end
#columns ⇒ Object
Returns the value of attribute columns.
# File 'lib/io_streams/tabular/header.rb', line 5
def columns
  @columns
end
#required_columns ⇒ Object
Returns the value of attribute required_columns.
# File 'lib/io_streams/tabular/header.rb', line 5
def required_columns
  @required_columns
end
#skip_unknown ⇒ Object
Returns the value of attribute skip_unknown.
# File 'lib/io_streams/tabular/header.rb', line 5
def skip_unknown
  @skip_unknown
end
Instance Method Details
#cleanse! ⇒ Object
Returns [Array<String>] list of columns that were ignored during cleansing.
Each column is cleansed as follows:
- Leading and trailing whitespace is stripped.
- All characters are converted to lower case.
- Spaces and ‘-’ are converted to ‘_’.
- All characters except for letters, digits, and ‘_’ are stripped.
Notes
- Raises IOStreams::Errors::InvalidHeader when there are no non-nil columns left after cleansing.
# File 'lib/io_streams/tabular/header.rb', line 49
def cleanse!
  return [] if columns.nil? || columns.empty?

  ignored_columns = []
  self.columns    = columns.collect do |column|
    cleansed = cleanse_column(column)
    if allowed_columns.nil? || allowed_columns.include?(cleansed)
      cleansed
    else
      ignored_columns << column
      nil
    end
  end

  if !skip_unknown && !ignored_columns.empty?
    raise(IOStreams::Errors::InvalidHeader, "Unknown columns after cleansing: #{ignored_columns.join(',')}")
  end

  if ignored_columns.size == columns.size
    raise(IOStreams::Errors::InvalidHeader, "All columns are unknown after cleansing: #{ignored_columns.join(',')}")
  end

  if required_columns
    missing_columns = required_columns - columns
    unless missing_columns.empty?
      raise(IOStreams::Errors::InvalidHeader, "Missing columns after cleansing: #{missing_columns.join(',')}")
    end
  end

  ignored_columns
end
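The per-column cleansing rules listed for #cleanse! can be sketched as a small standalone function. The private cleanse_column helper called by #cleanse! is not shown on this page, so cleanse_column_sketch below is a hypothetical approximation of the documented rules, and it assumes ASCII column names.

```ruby
# Hypothetical approximation of the documented cleansing rules.
# Assumes ASCII names; the library's own helper may differ in detail.
def cleanse_column_sketch(name)
  name.to_s
      .strip                  # 1. strip leading and trailing whitespace
      .downcase               # 2. convert to lower case
      .gsub(/[ -]/, "_")      # 3. spaces and '-' become '_'
      .gsub(/[^a-z0-9_]/, "") # 4. drop everything except letters, digits, '_'
end

cleanse_column_sketch("  First-Name ") # => "first_name"
cleanse_column_sketch("Zip Code (US)") # => "zip_code_us"
```

The order of the steps matters: replacing spaces and hyphens before stripping other characters is what keeps word boundaries visible as underscores.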
#to_array(row, cleanse = true) ⇒ Object
# File 'lib/io_streams/tabular/header.rb', line 102
def to_array(row, cleanse = true)
  if row.is_a?(Hash) && columns
    row = cleanse_hash(row) if cleanse
    row = columns.collect { |column| row[column] }
  end
  raise(IOStreams::Errors::TypeMismatch, "Don't know how to convert #{row.class.name} to an Array without the header columns being set.") unless row.is_a?(Array)
  row
end
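As a rough illustration of the Hash branch of #to_array, the sketch below orders a hash's values by the header columns. hash_to_array_sketch is a hypothetical name, and the sketch omits the optional cleanse_hash step, whose implementation is not shown on this page.

```ruby
# Hypothetical sketch: pick values out of a Hash in header-column order.
# Keys absent from the row yield nil; keys not in the header are dropped.
def hash_to_array_sketch(columns, row)
  columns.collect { |column| row[column] }
end

columns = %w[name zip]
row     = { "zip" => "12345", "name" => "Jack", "extra" => "ignored" }
hash_to_array_sketch(columns, row)
# => ["Jack", "12345"]
```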
#to_hash(row, cleanse = true) ⇒ Object
Marshal to Hash from Array or Hash by applying this header
Parameters:
cleanse [true|false]
Whether to cleanse and narrow the supplied hash to just those columns in this header.
Only applies when the row is already a Hash.
Useful to turn off narrowing when the input data is already trusted.
# File 'lib/io_streams/tabular/header.rb', line 88
def to_hash(row, cleanse = true)
  return if IOStreams.blank?(row)

  case row
  when Array
    raise(IOStreams::Errors::InvalidHeader, "Missing mandatory header when trying to convert a row into a hash") unless columns
    array_to_hash(row)
  when Hash
    cleanse && columns ? cleanse_hash(row) : row
  else
    raise(IOStreams::Errors::TypeMismatch, "Don't know how to convert #{row.class.name} to a Hash")
  end
end
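The Array branch of #to_hash relies on a private array_to_hash helper that is not shown on this page. The sketch below is one plausible reading of it, based only on the documented behaviour that columns cleansed to nil are skipped as if they were not in the file. array_to_hash_sketch is a hypothetical name, not the library's implementation.

```ruby
# Hypothetical sketch: pair each header column with the value at the same
# position in the row, skipping columns that were cleansed to nil.
def array_to_hash_sketch(columns, row)
  hash = {}
  columns.each_with_index do |column, index|
    next if column.nil? # ignored column: leave its value out of the result

    hash[column] = row[index]
  end
  hash
end

array_to_hash_sketch(["name", nil, "zip"], ["Jack", "secret", "12345"])
# => {"name"=>"Jack", "zip"=>"12345"}
```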