Class: CSVH::Reader
- Inherits: Object
- Extended by: Forwardable
- Defined in: lib/csvh/reader.rb
Overview
Sequentially and lazily reads CSV-formatted data that has a header row. The headers can be accessed before any data rows are read, and even when the data contains no data rows at all.
Constant Summary
- DEFAULT_CSV_OPTS =
{ headers: :first_row, return_headers: true }.freeze
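As a rough illustration of what these defaults mean, the following sketch uses only the standard library `CSV` class (not `CSVH::Reader` itself) with the same two option values. The sample data and variable names are made up for the example. It suggests why headers can be available even when no data rows follow: the header row is delivered as the first item and is flagged as a header row.

```ruby
require 'csv'

# Same option values as DEFAULT_CSV_OPTS, written out so the sketch stands alone.
opts = { headers: :first_row, return_headers: true }

# Sample input (an assumption for this example): a header row and nothing else.
csv = CSV.new("id,name\n", **opts)

first_item = csv.shift               # the header row, returned as a CSV::Row
is_header  = first_item.header_row?  # flag that marks it as the header row
columns    = first_item.headers      # the column names
next_item  = csv.shift               # nil, because there are no data rows
```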
Class Method Summary
-
.from_file(file_path, **opts) {|the| ... } ⇒ Reader, object
(also: foreach)
When called without a block argument, returns an open reader for data from the file at the given file_path.
-
.from_string_or_io(data, **opts) ⇒ Reader
(also: parse)
Returns an open reader for data from the given string or readable IO stream.
Instance Method Summary
-
#each {|| ... } ⇒ Object
When given a block, yields each remaining data row of the data source in turn as a `CSV::Row` instance.
-
#headers ⇒ Array<String>
Returns the list of column header values from the CSV data.
-
#initialize(csv) ⇒ Reader
constructor
Returns a new reader based on the given CSV object.
-
#read ⇒ CSV::Table
(also: #readlines)
Slurps the remaining data rows and returns a `CSV::Table`.
-
#shift ⇒ CSV::Row
(also: #gets, #readline)
A single data row is pulled from the data source, parsed and returned as a CSV::Row.
-
#to_csvh_reader ⇒ Reader
Returns the target of the method call (the reader itself).
Constructor Details
#initialize(csv) ⇒ Reader
Returns a new reader based on the given CSV object. The CSV object must be configured to return a header row (a `CSV::Row` that returns true from its `#header_row?` method) as its first item. The header row must also not have been read yet.
# File 'lib/csvh/reader.rb', line 116
def initialize(csv)
  unless csv.return_headers?
    raise \
      InappropreateCsvInstanceError,
      "%{self.class} requires a CSV instance that returns headers." \
      " It needs to have been initialized with non-false/nil values" \
      " for :headers and :return_headers options."
  end
  @csv = csv
end
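The guard in the constructor relies on `CSV#return_headers?`. A minimal sketch, using only the standard library and made-up sample data, of which `CSV` objects would be expected to pass or fail that check:

```ruby
require 'csv'

data = "id,name\n1,Ada\n" # sample data, assumed for illustration

# Both :headers and :return_headers are set, so the guard should accept this object.
accepted = CSV.new(data, headers: :first_row, return_headers: true)

# :return_headers is left at its default, so the guard should reject this object.
rejected = CSV.new(data, headers: :first_row)

accepted_flag = accepted.return_headers?
rejected_flag = rejected.return_headers?
```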
Class Method Details
.from_file(file_path, **opts) {|the| ... } ⇒ Reader, object Also known as: foreach
When called without a block argument, returns an open reader for data from the file at the given file_path.
When called with a block argument, passes an open reader for data from the file to the given block, closes the reader (and its underlying file IO channel) before returning, and then returns the value that was returned by the block.
By default, the underlying CSV object is initialized with default options for data with a header row and to return the header row. Any additional options you supply will be added to those defaults or override them.
A `Reader` created using this method delegates the same IO methods that a `CSV` created using `CSV.open` does, except `close_write`, `flush`, `fsync`, `sync`, `sync=`, and `truncate`. You may call:
- binmode()
- binmode?()
- close()
- close_read()
- closed?()
- eof()
- eof?()
- external_encoding()
- fcntl()
- fileno()
- flock()
- flush()
- internal_encoding()
- ioctl()
- isatty()
- path()
- pid()
- pos()
- pos=()
- reopen()
- seek()
- fstat()
- tell()
- to_i()
- to_io()
- tty?()
# File 'lib/csvh/reader.rb', line 71
def from_file(file_path, **opts)
  opts = default_csv_opts.merge(opts)
  io = File.open(file_path, 'r')
  csv = CSV.new(io, **opts)
  instance = new(csv)
  if block_given?
    begin
      yield instance
    ensure
      instance.close unless instance.closed?
    end
  else
    instance
  end
end
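The open, yield, and ensure-close pattern in the source above can be tried out with the standard library alone. In this sketch, `with_header_csv` is a hypothetical helper written for the example (it is not part of CSVH), and the temporary file and its contents are assumptions. It yields a plain `CSV` object rather than a `Reader`, so it only approximates the behavior described here.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper that mirrors the pattern: merge caller options over the
# defaults, open the file, then either return the open object or yield it and
# close it afterwards.
def with_header_csv(path, **opts)
  defaults = { headers: :first_row, return_headers: true }
  csv = CSV.new(File.open(path, 'r'), **defaults.merge(opts))
  return csv unless block_given?

  begin
    yield csv
  ensure
    csv.close unless csv.closed?
  end
end

# Sample file, created only for this example.
file = Tempfile.new(['people', '.csv'])
file.write("id,name\n1,Ada\n")
file.close

# Block form: the block's value is returned and the file is closed afterwards.
seen = nil
block_result = with_header_csv(file.path) do |csv|
  seen = csv
  csv.shift.headers
end
closed_after_block = seen.closed?

# Non-block form: the caller gets an open object and must close it.
open_csv = with_header_csv(file.path)
open_before_close = !open_csv.closed?
open_csv.close

file.unlink
```

The `ensure` clause is what guarantees the file is closed even if the block raises.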
.from_string_or_io(data, **opts) ⇒ Reader Also known as: parse
Returns an open reader for data from the given string or readable IO stream.
# File 'lib/csvh/reader.rb', line 94
def from_string_or_io(data, **opts)
  opts = default_csv_opts.merge(opts)
  csv = CSV.new(data, **opts)
  new(csv)
end
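Because the data argument is handed straight to `CSV.new`, a string and a readable IO object should behave the same way. A small sketch with the standard library only (sample text and variable names are assumptions for the example):

```ruby
require 'csv'
require 'stringio'

opts = { headers: :first_row, return_headers: true }
text = "id,name\n1,Ada\n" # sample data, assumed for illustration

# CSV.new accepts either a String or an IO-like object as its data source.
from_string = CSV.new(text, **opts)
from_io     = CSV.new(StringIO.new(text), **opts)

# Collect the field values of every row (header row included) from each source.
string_fields = from_string.map(&:fields)
io_fields     = from_io.map(&:fields)
```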
Instance Method Details
#each {|| ... } ⇒ Object
When given a block, yields each remaining data row of the data source in turn as a `CSV::Row` instance. When called without a block, returns an Enumerator over those rows.
The header row is never yielded; however, the headers are available via the #headers method of either the reader or the row object.
# File 'lib/csvh/reader.rb', line 210
def each
  headers
  if block_given?
    @csv.each { |row| yield row }
  else
    @csv.each
  end
end
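The block and Enumerator forms can be approximated with a plain `CSV` object once its header row has been consumed, which is what the call to `headers` in the source above arranges. This is a sketch with assumed sample data, not the reader itself:

```ruby
require 'csv'

data = "id,name\n1,Ada\n2,Grace\n" # sample data, assumed for illustration
opts = { headers: :first_row, return_headers: true }

# Block form: consume the header row first, then iterate the data rows.
csv = CSV.new(data, **opts)
csv.shift
names = []
row_headers = nil
csv.each do |row|
  names << row['name']
  row_headers = row.headers # each data row still knows the column names
end

# Enumerator form: calling #each without a block returns an Enumerator.
csv2 = CSV.new(data, **opts)
csv2.shift
enum = csv2.each
ids = enum.map { |row| row['id'] }
```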
#headers ⇒ Array<String>
Returns the list of column header values from the CSV data.
If any rows have already been read, then the result is immediately returned, having been recorded when the header row was initially encountered.
If no rows have been read yet, then the first row is read from the data in order to return the result.
# File 'lib/csvh/reader.rb', line 142
def headers
  @headers ||= begin
    row = @csv.readline
    unless row.header_row?
      raise \
        CsvPrematurelyShiftedError,
        "the header row was prematurely read from the underlying CSV object."
    end
    row.headers
  end
end
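The `header_row?` check in the source above distinguishes a fresh `CSV` object from one whose header row was already consumed elsewhere. A sketch of that distinction using only the standard library (sample data and variable names are assumptions):

```ruby
require 'csv'

opts = { headers: :first_row, return_headers: true }
data = "id,name\n1,Ada\n" # sample data, assumed for illustration

# Fresh object: the first row read is the header row.
fresh = CSV.new(data, **opts)
fresh_first = fresh.readline
fresh_is_header = fresh_first.header_row?

# Already-shifted object: the header row is gone, so the next row is a data row.
shifted = CSV.new(data, **opts)
shifted.shift
shifted_first = shifted.readline
shifted_is_header = shifted_first.header_row?
```

In the second case the method shown above would raise, since the row it reads is not the header row.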
#read ⇒ CSV::Table Also known as: readlines
Slurps the remaining data rows and returns a `CSV::Table`.
This is essentially the same behavior as `CSV#read`, but ensures that the header info has been fetched first, and the resulting table will never include the header row.
Note that the Ruby documentation for `CSV#read` (at least as of 2.2.2) is incomplete and simply says that it returns “an Array of Arrays”, but it actually returns a table if a truthy `:headers` option was used when creating the `CSV` object.
# File 'lib/csvh/reader.rb', line 254
def read
  headers
  @csv.read
end
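The difference in return type noted above can be checked directly against the standard library. A sketch with assumed sample data; the header row is shifted off by hand here, standing in for the `headers` call in the source:

```ruby
require 'csv'

data = "a,b\n1,2\n" # sample data, assumed for illustration

# Without a :headers option, CSV#read returns an Array of Arrays.
plain_result = CSV.new(data).read

# With a truthy :headers option, CSV#read returns a CSV::Table.
csv = CSV.new(data, headers: :first_row, return_headers: true)
csv.shift                 # consume the header row before reading the rest
table = csv.read

table_headers = table.headers
table_size    = table.size
first_b       = table[0]['b']
```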
#shift ⇒ CSV::Row Also known as: gets, readline
A single data row is pulled from the data source, parsed and returned as a CSV::Row.
This is essentially the same behavior as `CSV#shift`, but ensures that the header info has been fetched first, and #shift will never return the header row.
# File 'lib/csvh/reader.rb', line 269
def shift
  headers
  @csv.shift
end
#to_csvh_reader ⇒ Reader
Returns the target of the method call.
# File 'lib/csvh/reader.rb', line 128
def to_csvh_reader
  self
end