Class: Jinx::CsvIO
Overview
CsvIO reads or writes CSV records. This class wraps a FasterCSV with the following modifications:
-
relax the date parser to allow dd/mm/yyyy dates
-
don’t convert integer text with a leading zero to an octal number
-
allow one custom converter with different semantics: if the converter block call returns nil, then continue conversion, otherwise return the converter result. This differs from FasterCSV converter semantics, which call converters as long as the result equals the input field value. The CsvIO converter semantics support converters that intend a String result to be the converted result.
CsvIO is Enumerable, but does not implement the complete Ruby IO interface.
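The leading-zero rule matters because Ruby's `Kernel#Integer` reads a leading `0` as an octal prefix. The following is a minimal sketch of the guard, using the same integer pattern that appears in the private `#convert` listing; the helper name `to_int_or_text` is hypothetical and not part of Jinx.

```ruby
# Hypothetical helper (not part of Jinx): convert only text that
# looks like a decimal integer without a leading zero.
def to_int_or_text(field)
  field =~ /^[1-9]\d*$/ ? Integer(field) : field
end

octal   = Integer('010')          # Ruby parses this as octal
guarded = to_int_or_text('010')   # the guard leaves the text untouched
plain   = to_int_or_text('10')    # ordinary decimal text is converted
```

With the guard in place, identifiers such as zip codes or zero-padded record numbers keep their original text.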
Constant Summary
- MMM_MM_MAP =
3-letter months => month sequence hash.
['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec'].to_compact_hash_with_index do |mmm, index|
  index < 9 ? ('0' + index.succ.to_s) : index.succ.to_s
end
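`to_compact_hash_with_index` is a Jinx collection extension. Assuming it maps each item to the block result, an equivalent hash can be sketched in plain Ruby as follows; the names `MONTHS` and `mmm_mm_map` are illustrative only.

```ruby
# Plain-Ruby sketch of the month lookup, assuming the Jinx helper
# builds an item => block-result hash.
MONTHS = %w[jan feb mar apr may jun jul aug sep oct nov dec]

mmm_mm_map = MONTHS.each_with_index.map do |mmm, index|
  # zero-pad the one-based month number to two digits
  [mmm, format('%02d', index + 1)]
end.to_h
```

Here `mmm_mm_map['sep']` is `'09'` and `mmm_mm_map['oct']` is `'10'`, matching the two branches of the ternary in the constant definition.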
- DateMatcher =
DateMatcher relaxes the FasterCSV DateMatcher to allow dd/mm/yyyy dates.
/
  \A(?: (\w+,?\s+)?\w+\s+\d{1,2},?\s+\d{2,4} |
        \d{1,2}-\w{3}-\d{2,4}                |
        \d{4}[-\/]\d{1,2}[-\/]\d{1,2}        |
        \d{1,2}[-\/]\d{1,2}[-\/]\d{2,4}      )\z
/x
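To see which field shapes the pattern accepts, the sketch below copies the expression into a local constant (`DATE_MATCHER`, a name chosen for this example) and filters a few sample strings. A match only means the text looks like a date; it does not validate the day or month values.

```ruby
# Local copy of the pattern, for illustration only.
DATE_MATCHER = /
  \A(?: (\w+,?\s+)?\w+\s+\d{1,2},?\s+\d{2,4} |
        \d{1,2}-\w{3}-\d{2,4}                |
        \d{4}[-\/]\d{1,2}[-\/]\d{1,2}        |
        \d{1,2}[-\/]\d{1,2}[-\/]\d{2,4}      )\z
/x

samples = ['Jan 5, 2010', '05-Jan-2010', '2010-01-05', '25/12/2010',
           '25.12.2010', 'hello']

# Keep the samples that have a date-like shape.
matches = samples.select { |s| s =~ DATE_MATCHER }
```

The first four samples match, one per alternative; the dotted date and the plain word do not.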
- DD_MMM_YYYY_RE =
/^(\d{1,2})-([[:alpha:]]{3})-(\d{2,4})$/
Instance Attribute Summary
-
#accessors ⇒ <Symbol>
(also: #headers)
readonly
The CSV field value accessors.
-
#field_names ⇒ <String>
readonly
The CSV field names.
Class Method Summary
-
.foreach(file, opts = nil) {|row| ... } ⇒ Object
Opens the given CSV file and calls #each with the given block.
-
.join(source, opts) {|rec| ... } ⇒ Object
Joins the source to the target and writes the output.
-
.open(dev, opts = nil) {|csvio| ... } ⇒ Object
Opens the CSV file and calls the given block with this CsvIO as the argument.
Instance Method Summary
- #accessor(name) ⇒ Object
-
#close ⇒ Object
Closes the CSV file.
-
#convert(f, info) ⇒ Object
private
The converted value.
-
#convert_date(f) ⇒ Date
private
The converted date.
-
#each {|row| ... } ⇒ Object
Iterates over each CSV row, yielding a row for each iteration.
-
#initialize(dev, opts = nil) {|value, info| ... } ⇒ CsvIO
constructor
Creates a new CsvIO for the specified source file.
-
#readline ⇒ Object
(also: #shift, #next)
Reads the next CSV row.
-
#reformat_dd_mmm_yy_date(f) ⇒ String
private
The reformatted date String in mm/dd/yy format.
-
#write(row) ⇒ Object
(also: #<<)
Writes the given row to the CSV file.
Constructor Details
#initialize(dev, opts = nil) {|value, info| ... } ⇒ CsvIO
Creates a new CsvIO for the specified source file. If a converter block is given, then it is added to the CSV converters list.
# File 'lib/jinx/csv/csvio.rb', line 84

def initialize(dev, opts=nil, &converter)
  raise ArgumentError.new("CSV input argument is missing") if dev.nil?
  # the CSV file open mode
  mode = Options.get(:mode, opts, 'r')
  # the CSV headers option; can be boolean or array
  hdr_opt = Options.get(:headers, opts)
  # there is a header record by default for an input CSV file
  hdr_opt ||= true if mode =~ /^r/
  # make parent directories if necessary for an output CSV file
  File.makedirs(File.dirname(dev)) if String == dev and mode =~ /^w/
  # if headers aren't given, then convert the input CSV header record names to underscore symbols
  hdr_cvtr = :symbol unless Enumerable === hdr_opt
  # make a custom converter
  custom = Proc.new { |value, info| convert(value, info, &converter) }
  # collect the options
  csv_opts = {:headers => hdr_opt, :header_converters => hdr_cvtr, :return_headers => true, :write_headers => true, :converters => custom}
  # Make the parent directory if necessary.
  FileUtils.mkdir_p(File.dirname(dev)) if String === dev and mode !~ /^r/
  # open the CSV file
  @csv = String === dev ? FasterCSV.open(dev, mode, csv_opts) : FasterCSV.new(dev, csv_opts)
  # the header => field name hash:
  # if the header option is set to true, then read the input header line.
  # otherwise, parse an empty string which mimics an input header line.
  hdr_row = case hdr_opt
  when true then @csv.shift
  when Enumerable then ''.parse_csv(:headers => hdr_opt, :header_converters => :symbol, :return_headers => true)
  else raise ArgumentError.new("CSV headers option value not supported: #{hdr_opt}")
  end
  # The field value accessors consist of the header row headers converted to a symbol.
  @accessors = hdr_row.headers
  # The field names consist of the header row values.
  @field_names = @accessors.map { |sym| hdr_row[sym] }
  # the header name => symbol map
  @hdr_sym_hash = hdr_row.to_hash.invert
end
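The `:symbol` header converter used by the constructor is also available in Ruby's standard `csv` library, which is the successor to FasterCSV. The sketch below uses that library directly, not CsvIO, to show how header text becomes the underscore symbols exposed as accessors; the sample data is invented for the example.

```ruby
require 'csv'

# Invented sample input with a header record.
data = "First Name,Birth Date\nJoe,1980-01-02\n"

# Parse with the built-in :symbol header converter.
table = CSV.parse(data, headers: true, header_converters: :symbol)

accessors  = table.headers           # header text converted to symbols
first_name = table[0][:first_name]   # field lookup by accessor symbol
```

In this sketch `accessors` is `[:first_name, :birth_date]`, the same form CsvIO stores in `@accessors`.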
Instance Attribute Details
#accessors ⇒ <Symbol> (readonly) Also known as: headers
Returns the CSV field value accessors.

# File 'lib/jinx/csv/csvio.rb', line 26

def accessors
  @accessors
end
#field_names ⇒ <String> (readonly)
Returns the CSV field names.
# File 'lib/jinx/csv/csvio.rb', line 23

def field_names
  @field_names
end
Class Method Details
.foreach(file, opts = nil) {|row| ... } ⇒ Object
Opens the given CSV file and calls #each with the given block.
# File 'lib/jinx/csv/csvio.rb', line 52

def self.foreach(file, opts=nil, &block)
  open(file, opts) { |csvio| csvio.each(&block) }
end
.join(source, opts) {|rec| ... } ⇒ Object
Joins the source to the target and writes the output. The match is on all fields held in common. If there is more than one match, then all but the first match have empty values for the merged fields. Both files must be sorted in order of the common fields, sequenced by their occurrence in the source header.

# File 'lib/jinx/csv/csvio.rb', line 68

def self.join(source, opts, &block)
  flds = opts[:for] || Array::EMPTY_ARRAY
  Csv::Joiner.new(source, opts[:to], opts[:as]).join(*flds, &block)
end
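The sorted-input requirement suggests a merge-style join. The sketch below is not `Csv::Joiner`; it is a simplified inner join over in-memory hashes, written under the assumptions that the common keys are unique in the source, that both inputs are non-empty, and that both are already sorted on those keys. The name `merge_join` is hypothetical, and the sketch does not reproduce the empty-value handling for repeated matches.

```ruby
# Hypothetical helper: simplified sort-merge inner join of two arrays
# of hashes. Assumes both arrays are sorted on the common keys and
# that each key combination occurs at most once in the source.
def merge_join(source, target)
  return [] if source.empty? || target.empty?
  # Common fields, in the order they occur in the source header.
  common = source.first.keys & target.first.keys
  key = ->(rec) { rec.values_at(*common) }
  result = []
  i = j = 0
  while i < source.size && j < target.size
    cmp = key.call(source[i]) <=> key.call(target[j])
    if cmp < 0
      i += 1                                 # source is behind: advance it
    elsif cmp > 0
      j += 1                                 # target is behind: advance it
    else
      result << source[i].merge(target[j])   # keys match: emit joined record
      j += 1                                 # the target may repeat the key
    end
  end
  result
end

source = [{ id: 1, name: 'a' }, { id: 2, name: 'b' }, { id: 3, name: 'c' }]
target = [{ id: 1, x: 10 }, { id: 3, x: 30 }, { id: 3, x: 31 }]
joined = merge_join(source, target)
```

Because each input is scanned once from the top, unsorted input would silently drop matches, which is why the sort order is a precondition.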
.open(dev, opts = nil) {|csvio| ... } ⇒ Object
Opens the CSV file and calls the given block with this CsvIO as the argument.
# File 'lib/jinx/csv/csvio.rb', line 35

def self.open(dev, opts=nil)
  csvio = new(dev, opts)
  if block_given? then
    begin
      yield csvio
    ensure
      csvio.close
    end
  end
end
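The block form follows the usual Ruby resource idiom: yield the object, then close it in an `ensure` clause so the file is released even if the block raises. The sketch below mirrors the `.open` and `.foreach` listings with a hypothetical `LineSource` class over a `StringIO`, so it runs without Jinx or a file on disk.

```ruby
require 'stringio'

# Hypothetical stand-in for CsvIO, used only to show the open/ensure idiom.
class LineSource
  def self.open(io)
    source = new(io)
    if block_given?
      begin
        yield source
      ensure
        source.close   # runs whether or not the block raised
      end
    end
  end

  def self.foreach(io, &block)
    open(io) { |source| source.each(&block) }
  end

  def initialize(io)
    @io = io
  end

  def each
    @io.each_line { |line| yield line.chomp }
  end

  def close
    @io.close
  end
end

io = StringIO.new("a\nb\n")
lines = []
LineSource.foreach(io) { |line| lines << line }

# The stream is closed even when the block fails.
failing = StringIO.new("x\n")
begin
  LineSource.open(failing) { |_source| raise 'boom' }
rescue RuntimeError
  # error swallowed for the demonstration
end
```

After this runs, both `io.closed?` and `failing.closed?` are true.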
Instance Method Details
#accessor(name) ⇒ Object
# File 'lib/jinx/csv/csvio.rb', line 130

def accessor(name)
  @hdr_sym_hash[name]
end
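The lookup table behind `#accessor` is built in the constructor by inverting the header row hash. A minimal sketch with an ordinary Hash standing in for the header row (the field names are invented):

```ruby
# Stand-in for hdr_row.to_hash: accessor symbol => header text.
header_row = { first_name: 'First Name', birth_date: 'Birth Date' }

# Inverting gives header text => accessor symbol.
hdr_sym_hash = header_row.invert

found   = hdr_sym_hash['First Name']   # the matching accessor symbol
missing = hdr_sym_hash['Missing']      # nil for an unknown header
```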
#close ⇒ Object
Closes the CSV file.
# File 'lib/jinx/csv/csvio.rb', line 124

def close
  @csv.close
end
#convert(f, info) ⇒ Object (private)
Returns the converted value.
# File 'lib/jinx/csv/csvio.rb', line 179

def convert(f, info)
  return if f.nil?
  # the block has precedence
  value = yield(f, info) if block_given?
  # integer conversion
  value ||= Integer(f) if f =~ /^[1-9]\d*$/
  # date conversion
  value ||= convert_date(f) if f =~ CsvIO::DateMatcher
  # float conversion
  value ||= (Float(f) rescue f) if f =~ /^\d+\.\d*$/ or f =~ /^\d*\.\d+$/
  # return converted value or the input field if there was no conversion
  value || f
end
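The precedence rule can be tried in isolation with a reduced version of this method. The sketch below drops the date step and the `info` argument, so it is an approximation for illustration rather than the Jinx implementation; `convert_field` and `state_code` are hypothetical names.

```ruby
# Reduced sketch of the conversion chain (no date step, no field info).
def convert_field(f)
  return if f.nil?
  # a non-nil block result wins, even when it is a String
  value = yield(f) if block_given?
  # decimal integers without a leading zero
  value ||= Integer(f) if f =~ /^[1-9]\d*$/
  # floats with digits on at least one side of the point
  value ||= (Float(f) rescue f) if f =~ /^\d+\.\d*$/ || f =~ /^\d*\.\d+$/
  # fall back to the unconverted text
  value || f
end

# Hypothetical custom converter: upcase two-letter codes, else decline.
state_code = ->(f) { f.upcase if f =~ /\A[a-z]{2}\z/ }

custom  = convert_field('ny', &state_code)    # block result is kept
integer = convert_field('42', &state_code)    # block returns nil, chain continues
padded  = convert_field('007', &state_code)   # no rule applies, text is kept
float   = convert_field('3.5', &state_code)
```

The first call shows the case the overview describes: the converter returns a String and that String is the final value.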
#convert_date(f) ⇒ Date (private)
Returns the converted date.
# File 'lib/jinx/csv/csvio.rb', line 195

def convert_date(f)
  # If input value is in dd-mmm-yy format, then reformat.
  # Otherwise, parse as a Date if possible.
  if f =~ DD_MMM_YYYY_RE then
    ddmmyy = reformat_dd_mmm_yy_date(f) || return
    convert_date(ddmmyy)
  else
    Date.parse(f, true) rescue nil
  end
end
#each {|row| ... } ⇒ Object
Iterates over each CSV row, yielding a row for each iteration.
# File 'lib/jinx/csv/csvio.rb', line 138

def each(&block)
  @csv.each(&block)
end
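Because CsvIO is Enumerable, defining `#each` is what makes methods such as `map` and `select` available on it. The sketch below shows the mechanism with a hypothetical `RowList` class holding rows in memory; it does not use CsvIO itself.

```ruby
# Hypothetical class: Enumerable builds its methods on top of #each.
class RowList
  include Enumerable

  def initialize(rows)
    @rows = rows
  end

  # Delegate iteration, as CsvIO#each delegates to the wrapped CSV.
  def each(&block)
    @rows.each(&block)
  end
end

rows   = RowList.new([{ id: 1, name: 'a' }, { id: 2, name: 'b' }])
names  = rows.map { |row| row[:name] }
second = rows.select { |row| row[:id] > 1 }
```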
#readline ⇒ Object Also known as: shift, next
Reads the next CSV row.
# File 'lib/jinx/csv/csvio.rb', line 146

def readline
  @csv.shift
end
#reformat_dd_mmm_yy_date(f) ⇒ String (private)
Returns the reformatted date String in mm/dd/yy format.
# File 'lib/jinx/csv/csvio.rb', line 208

def reformat_dd_mmm_yy_date(f)
  dd, mmm, yy = DD_MMM_YYYY_RE.match(f).captures
  mm = MMM_MM_MAP[mmm.downcase] || return
  "#{mm}/#{dd}/#{yy}"
end
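The reformatting step can be sketched without Jinx as follows. The constant and method names (`MONTH_NAMES`, `DD_MMM_RE`, `reformat_date`) are chosen for this example, and unlike the listing the sketch returns nil when the input does not match the pattern at all.

```ruby
MONTH_NAMES = %w[jan feb mar apr may jun jul aug sep oct nov dec]
DD_MMM_RE   = /^(\d{1,2})-([[:alpha:]]{3})-(\d{2,4})$/

# Hypothetical stand-alone version of the dd-mmm-yy reformatting.
def reformat_date(f)
  match = DD_MMM_RE.match(f)
  return unless match
  dd, mmm, yy = match.captures
  # look up the month by its lowercase 3-letter name
  index = MONTH_NAMES.index(mmm.downcase)
  return unless index
  # emit month/day/year with a zero-padded month
  "#{format('%02d', index + 1)}/#{dd}/#{yy}"
end

long_year  = reformat_date('05-Jan-2010')
short_year = reformat_date('5-dec-99')
bad_month  = reformat_date('05-Xyz-2010')
```

An unknown month abbreviation yields nil, which is the case `#convert_date` guards against before parsing the reformatted text.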
#write(row) ⇒ Object Also known as: <<
Writes the given row to the CSV file.
# File 'lib/jinx/csv/csvio.rb', line 157

def write(row)
  @csv << row
  @csv.flush
end