Class: IOStreams::Tabular
- Inherits:
- Object
- Defined in:
- lib/io_streams/tabular.rb,
lib/io_streams/tabular/header.rb,
lib/io_streams/tabular/parser/csv.rb,
lib/io_streams/tabular/parser/psv.rb,
lib/io_streams/tabular/parser/base.rb,
lib/io_streams/tabular/parser/hash.rb,
lib/io_streams/tabular/parser/json.rb,
lib/io_streams/tabular/parser/array.rb,
lib/io_streams/tabular/parser/fixed.rb,
lib/io_streams/tabular/utility/csv_row.rb
Overview
Common handling for efficiently processing tabular data such as CSV, spreadsheet or other tabular files on a line by line basis.
Tabular consists of a table of data where the first row is usually the header, and subsequent rows are the data elements.
Tabular applies the header information to every row of data when #record_parse is called.
Example using the default CSV parser:
tabular = Tabular.new
tabular.parse_header("first field,Second,thirD")
# => ["first field", "Second", "thirD"]
tabular.cleanse_header!
# => ["first_field", "second", "third"]
tabular.record_parse("1,2,3")
# => {"first_field"=>"1", "second"=>"2", "third"=>"3"}
tabular.record_parse([1,2,3])
# => {"first_field"=>1, "second"=>2, "third"=>3}
tabular.render([5,6,9])
# => "5,6,9"
tabular.render({"third"=>"3", "first_field"=>"1" })
# => "1,,3"
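When a file is streamed line by line, the calls above are typically made in a fixed order: the first non-blank line becomes the header, and every later line is converted with it. The following is a minimal, self-contained sketch of that flow. `MiniTabular` is a hypothetical stand-in written for illustration; it splits on commas only (no quoting, no cleansing) and is not the IOStreams implementation.

```ruby
# Hypothetical stand-in for illustration only; not the IOStreams implementation.
class MiniTabular
  attr_reader :columns

  def initialize
    @columns = nil
  end

  # True while no header columns have been captured yet.
  def header?
    @columns.nil? || @columns.empty?
  end

  # Naive comma split; a real CSV parser also handles quoting.
  def parse_header(line)
    return if line.strip.empty?

    @columns = line.strip.split(",")
  end

  # Pair each value with its column name; blank lines yield nil.
  def record_parse(line)
    return if line.strip.empty?

    @columns.zip(line.strip.split(",")).to_h
  end
end

lines   = ["name,city", "Ann,Oslo", "", "Bob,Rome"]
tabular = MiniTabular.new
records = []

lines.each do |line|
  if tabular.header?
    tabular.parse_header(line)
  elsif (record = tabular.record_parse(line))
    records << record
  end
end

# records => [{"name"=>"Ann", "city"=>"Oslo"}, {"name"=>"Bob", "city"=>"Rome"}]
```

Checking `header?` on every line keeps the loop stateless from the caller's point of view: the object itself knows whether it still needs a header.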
Defined Under Namespace
Modules: Parser, Utility
Classes: Header
Instance Attribute Summary
-
#format ⇒ Object
readonly
Returns the value of attribute format.
-
#header ⇒ Object
readonly
Returns the value of attribute header.
-
#parser ⇒ Object
readonly
Returns the value of attribute parser.
Class Method Summary
-
.deregister_format(format) ⇒ Object
De-Register a file format.
-
.register_format(format, parser) ⇒ Object
Register a format and the parser class for it.
-
.registered_formats ⇒ Object
Returns [Array<Symbol>] the list of registered formats.
Instance Method Summary
-
#cleanse_header! ⇒ Object
Returns [Array<String>] the cleansed columns.
-
#header? ⇒ Boolean
Returns [true|false] whether a header is still required in order to parse or render the current format.
-
#initialize(format: nil, file_name: nil, format_options: nil, **args) ⇒ Tabular
constructor
Parse a delimited data source.
-
#parse_header(line) ⇒ Object
Returns [Array] the header row/line after parsing (the header is not cleansed).
-
#record_parse(line) ⇒ Object
Returns [Hash<String,Object>] the line as a hash.
-
#render(row) ⇒ Object
Renders the output row.
-
#render_header ⇒ Object
Returns [String] the header rendered for the output format. Returns nil if no header is required.
-
#requires_header? ⇒ Boolean
Returns [true|false] whether a header row should be rendered on output.
-
#row_parse(line) ⇒ Object
Returns [Array] the row/line as a parsed Array of values.
Constructor Details
#initialize(format: nil, file_name: nil, format_options: nil, **args) ⇒ Tabular
Parse a delimited data source.
Parameters
format: [Symbol]
:csv, :hash, :array, :json, :psv, :fixed
For all other parameters, see Tabular::Header.new
# File 'lib/io_streams/tabular.rb', line 56
def initialize(format: nil, file_name: nil, format_options: nil, **args)
  @header = Header.new(**args)
  klass   =
    if file_name && format.nil?
      self.class.parser_class_for_file_name(file_name)
    else
      self.class.parser_class(format)
    end
  @parser = format_options ? klass.new(**format_options) : klass.new
end
Instance Attribute Details
#format ⇒ Object (readonly)
Returns the value of attribute format.
# File 'lib/io_streams/tabular.rb', line 47
def format
  @format
end
#header ⇒ Object (readonly)
Returns the value of attribute header.
# File 'lib/io_streams/tabular.rb', line 47
def header
  @header
end
#parser ⇒ Object (readonly)
Returns the value of attribute parser.
# File 'lib/io_streams/tabular.rb', line 47
def parser
  @parser
end
Class Method Details
.deregister_format(format) ⇒ Object
De-Register a file format
Returns the parser that was registered for the format, or nil if the format was not registered.
Example:
deregister_format(:xls)
# File 'lib/io_streams/tabular.rb', line 144
def self.deregister_format(format)
  raise(ArgumentError, "Invalid format #{format.inspect}") unless format.to_s =~ /\A\w+\Z/

  @formats.delete(format.to_sym)
end
.register_format(format, parser) ⇒ Object
# File 'lib/io_streams/tabular.rb', line 133
def self.register_format(format, parser)
  raise(ArgumentError, "Invalid format #{format.inspect}") unless format.nil? || format.to_s =~ /\A\w+\Z/

  @formats[format.nil? ? nil : format.to_sym] = parser
end
.registered_formats ⇒ Object
Returns [Array<Symbol>] the list of registered formats
# File 'lib/io_streams/tabular.rb', line 150
def self.registered_formats
  @formats.keys
end
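The three class methods above amount to a small hash-backed registry keyed by format symbol. The sketch below reproduces that pattern in isolation so the validation and return values can be seen without the gem. `FormatRegistry` and `XlsParser` are hypothetical names introduced only for this illustration.

```ruby
# Hypothetical registry mirroring the validation shown above; not the library's code.
module FormatRegistry
  @formats = {}

  # Store the parser under a symbol key; nil is allowed as a key here.
  def self.register_format(format, parser)
    raise(ArgumentError, "Invalid format #{format.inspect}") unless format.nil? || format.to_s =~ /\A\w+\Z/

    @formats[format.nil? ? nil : format.to_sym] = parser
  end

  # Hash#delete returns the stored value, or nil when the key is absent.
  def self.deregister_format(format)
    raise(ArgumentError, "Invalid format #{format.inspect}") unless format.to_s =~ /\A\w+\Z/

    @formats.delete(format.to_sym)
  end

  def self.registered_formats
    @formats.keys
  end
end

XlsParser = Class.new # placeholder parser class, assumed for the example

FormatRegistry.register_format("xls", XlsParser)
formats_after_register = FormatRegistry.registered_formats # => [:xls]
removed = FormatRegistry.deregister_format(:xls)           # => XlsParser
missing = FormatRegistry.deregister_format(:xls)           # => nil

# Names containing anything other than word characters are rejected.
rejected =
  begin
    FormatRegistry.register_format("x-ls", XlsParser)
    false
  rescue ArgumentError
    true
  end
```

In this sketch, strings and symbols are interchangeable as format names because both are normalized with `to_sym` before being used as keys.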
Instance Method Details
#cleanse_header! ⇒ Object
Returns [Array<String>] the cleansed columns
# File 'lib/io_streams/tabular.rb', line 124
def cleanse_header!
  header.cleanse!
  header.columns
end
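The cleansing itself is delegated to Header#cleanse!, whose rules are not shown on this page. As a rough illustration of the kind of normalization seen in the overview example ("first field" becoming "first_field"), the sketch below uses an assumed rule: trim, lower-case, and replace runs of non-word characters with an underscore. `cleanse_columns` is a hypothetical helper and may differ from what the library actually does.

```ruby
# Hypothetical cleansing rule, assumed for illustration only:
# trim whitespace, lower-case, and turn runs of non-word characters into "_".
def cleanse_columns(columns)
  columns.map { |name| name.to_s.strip.downcase.gsub(/\W+/, "_") }
end

cleansed = cleanse_columns(["first field", " Second ", "thirD"])
# cleansed => ["first_field", "second", "third"]
```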
#header? ⇒ Boolean
Returns [true|false] whether a header is still required in order to parse or render the current format.
# File 'lib/io_streams/tabular.rb', line 68
def header?
  parser.requires_header? && IOStreams.blank?(header.columns)
end
#parse_header(line) ⇒ Object
Returns [Array] the header row/line after parsing. Returns `nil` if the row/line is blank, or a header is not required for the supplied format (:json, :hash).
Notes:
-
Call `header?` first to determine if the header should be parsed first.
-
The header columns are set after parsing the row, but the header is not cleansed.
# File 'lib/io_streams/tabular.rb', line 83
def parse_header(line)
  return if IOStreams.blank?(line) || !parser.requires_header?

  header.columns = parser.parse(line)
end
#record_parse(line) ⇒ Object
Returns [Hash<String,Object>] the line as a hash. Returns nil if the line is blank.
# File 'lib/io_streams/tabular.rb', line 91
def record_parse(line)
  line = row_parse(line)
  header.to_hash(line) if line
end
#render(row) ⇒ Object
Renders the output row
# File 'lib/io_streams/tabular.rb', line 105
def render(row)
  return if IOStreams.blank?(row)

  parser.render(row, header)
end
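The overview example shows that a Hash row is rendered in header column order, with an empty value for any missing column, while an Array row is rendered as given. A minimal sketch of that behavior, assuming a comma delimiter and no quoting or escaping, might look like the following. `render_row` is a hypothetical helper and not the library's renderer.

```ruby
# Hypothetical helper, not the library's renderer: no quoting or escaping.
def render_row(row, columns)
  # For a Hash, pick values in header order; missing keys become nil.
  values = row.is_a?(Hash) ? columns.map { |name| row[name] } : row
  # Array#join renders nil as an empty string.
  values.join(",")
end

columns    = ["first_field", "second", "third"]
from_hash  = render_row({ "third" => "3", "first_field" => "1" }, columns)
from_array = render_row([5, 6, 9], columns)
# from_hash  => "1,,3"
# from_array => "5,6,9"
```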
#render_header ⇒ Object
Returns [String] the header rendered for the output format. Returns nil if no header is required.
# File 'lib/io_streams/tabular.rb', line 113
def render_header
  return unless requires_header?

  if IOStreams.blank?(header.columns)
    raise(Errors::MissingHeader, "Header columns must be set before attempting to render a header for format: #{format.inspect}")
  end

  parser.render(header.columns, header)
end
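The method above has three outcomes: nil when the format needs no header, an error when a header is needed but no columns are set, and the rendered header otherwise. The sketch below isolates that guard logic. `MissingHeader` here is a local stand-in for Errors::MissingHeader, and `render_header_sketch` with its `requires_header:` keyword is a hypothetical function for illustration.

```ruby
# Local stand-in for Errors::MissingHeader, defined so the sketch runs on its own.
class MissingHeader < StandardError; end

# Hypothetical function isolating the guard logic; comma-joined output is assumed.
def render_header_sketch(columns, requires_header:)
  return unless requires_header

  if columns.nil? || columns.empty?
    raise(MissingHeader, "Header columns must be set before attempting to render a header")
  end

  columns.join(",")
end

not_needed = render_header_sketch(nil, requires_header: false)           # => nil
rendered   = render_header_sketch(["name", "city"], requires_header: true) # => "name,city"

missing_raised =
  begin
    render_header_sketch([], requires_header: true)
    false
  rescue MissingHeader
    true
  end
```

Raising rather than returning nil in the missing-columns case keeps the two nil-like situations distinct: nil always means "no header needed", never "header forgotten".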
#requires_header? ⇒ Boolean
Returns [true|false] whether a header row should be rendered on output.
# File 'lib/io_streams/tabular.rb', line 73
def requires_header?
  parser.requires_header?
end