Method: CSV#initialize
- Defined in: lib/csv.rb
#initialize(data, options = Hash.new) ⇒ CSV
This constructor will wrap either a String or IO object passed in data for reading and/or writing. In addition to the CSV instance methods, several IO methods are delegated. (See CSV::open() for a complete list.) If you pass a String for data, you can later retrieve it (after writing to it, for example) with CSV#string().
Note that a wrapped String will be positioned at the beginning (for reading). If you want it at the end (for writing), use CSV::generate(). If you want any other positioning, pass a preset StringIO object instead.
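The three positioning cases described above can be sketched as follows. This is an illustrative example rather than part of the library's documentation; the sample data and variable names are made up, and the `+"..."` form is used only to obtain an unfrozen String that can be written to.

```ruby
require "csv"
require "stringio"

# Reading: a wrapped String is positioned at the beginning.
reader = CSV.new("name,qty\nwidget,3\n")
rows = reader.read

# Writing after existing content: CSV.generate positions at the end.
appended = CSV.generate(+"name,qty\n") do |csv|
  csv << ["widget", 3]
end

# Any other positioning: pass a StringIO that you positioned yourself.
io = StringIO.new(+"name,qty\n")
io.seek(0, IO::SEEK_END)
writer = CSV.new(io)
writer << ["gadget", 5]
result = writer.string # delegated to the underlying StringIO
```

Here `rows` holds the parsed Arrays, while `appended` and `result` are the Strings with the new row added after the existing header line.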
You may set any reading and/or writing preferences in the options Hash. Available options are:
:col_sep-
The String placed between each field. This String will be transcoded into the data’s Encoding before parsing.
:row_sep-
The String appended to the end of each row. This can be set to the special :auto setting, which requests that CSV automatically discover this from the data. Auto-discovery reads ahead in the data looking for the next "\r\n", "\n", or "\r" sequence. A sequence will be selected even if it occurs in a quoted field, assuming that you would have the same line endings there. If none of those sequences is found, data is ARGF, STDIN, STDOUT, or STDERR, or the stream is only available for output, the default $INPUT_RECORD_SEPARATOR ($/) is used. Discovery takes a little time, so set :row_sep manually if speed is important. Also note that IO objects should be opened in binary mode on Windows if this feature will be used, as the line-ending translation can cause problems with resetting the document position to where it was before the read ahead. This String will be transcoded into the data’s Encoding before parsing.
:quote_char-
The character used to quote fields. This has to be a single character String. This is useful for applications that incorrectly use ' as the quote character instead of the correct ". CSV will always consider a double sequence of this character to be an escaped quote. This String will be transcoded into the data’s Encoding before parsing.
:field_size_limit-
This is a maximum size CSV will read ahead looking for the closing quote for a field. (In truth, it reads to the first line ending beyond this size.) If a quote cannot be found within the limit, CSV will raise a MalformedCSVError, assuming the data is faulty. You can use this limit to prevent what are effectively DoS attacks on the parser. However, this limit can cause a legitimate parse to fail and thus is set to nil, or off, by default.
:converters-
An Array of names from the Converters Hash and/or lambdas that handle custom conversion. A single converter doesn’t have to be in an Array. All built-in converters try to transcode fields to UTF-8 before converting. The conversion will fail if the data cannot be transcoded, leaving the field unchanged.
:unconverted_fields-
If set to true, an unconverted_fields() method will be added to all returned rows (Array or CSV::Row) that will return the fields as they were before conversion. Note that :headers supplied by Array or String were not fields of the document and thus will have an empty Array attached.
:headers-
If set to :first_row or true, the initial row of the CSV file will be treated as a row of headers. If set to an Array, the contents will be used as the headers. If set to a String, the String is run through a call of CSV::parse_line() with the same :col_sep, :row_sep, and :quote_char as this instance to produce an Array of headers. This setting causes CSV#shift() to return rows as CSV::Row objects instead of Arrays and CSV#read() to return CSV::Table objects instead of an Array of Arrays.
:return_headers-
When false, header rows are silently swallowed. If set to true, header rows are returned in a CSV::Row object with identical headers and fields (save that the fields do not go through the converters).
:write_headers-
When true and :headers is set, a header row will be added to the output.
:header_converters-
Identical in functionality to :converters save that the conversions are only made to header rows. All built-in converters try to transcode headers to UTF-8 before converting. The conversion will fail if the data cannot be transcoded, leaving the header unchanged.
:skip_blanks-
When set to a true value, CSV will skip over any empty rows. Note that this setting will not skip rows that contain column separators, even if the rows contain no actual data. If you want to skip rows that contain separators but no content, consider using :skip_lines, or inspecting fields.compact.empty? on each row.
:force_quotes-
When set to a true value, CSV will quote all CSV fields it creates.
:skip_lines-
When set to an object responding to match, every line matching it is considered a comment and ignored during parsing. When set to a String, it is first converted to a Regexp. When set to nil, no line is considered a comment. If the passed object does not respond to match, ArgumentError is raised.
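As a rough illustration of how several of the reading options listed above combine, the sketch below parses a small semicolon-separated document. The sample data, the choice of the built-in :numeric and :symbol converters, and the comment pattern are assumptions made for this example only.

```ruby
require "csv"

# Hypothetical input: semicolon-separated, single-quoted fields,
# one comment line and one blank line.
data = <<~DATA
  id;name;price
  # a comment line
  1;'Widget; large';9.5

  2;Gadget;12
DATA

csv = CSV.new(
  data,
  col_sep: ";",               # fields are separated by semicolons
  quote_char: "'",            # fields are quoted with ' instead of "
  headers: true,              # first row supplies the headers
  header_converters: :symbol, # "id" becomes :id
  converters: :numeric,       # "1" becomes 1, "9.5" becomes 9.5
  skip_blanks: true,          # drop the empty row
  skip_lines: /\A#/           # treat lines starting with # as comments
)

table = csv.read # a CSV::Table, because :headers is set
first = table[0] # a CSV::Row
```

With :headers set, `csv.read` returns a CSV::Table whose rows can be indexed by header, for example `first[:name]`.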
See CSV::DEFAULT_OPTIONS for the default settings.
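The defaults can also be inspected directly, and the writing options override them per instance. The following is a minimal sketch, assuming an empty unfrozen String as the write target; the header names and row values are invented for the example.

```ruby
require "csv"

# Defaults that apply when an option is not given.
default_col_sep = CSV::DEFAULT_OPTIONS[:col_sep]
default_force_quotes = CSV::DEFAULT_OPTIONS[:force_quotes]

# :headers together with :write_headers emits a header row first.
with_headers = CSV.new(+"", headers: ["id", "name"], write_headers: true)
with_headers << [1, "Widget"]
header_output = with_headers.string

# :force_quotes quotes every field that is written.
quoted = CSV.new(+"", force_quotes: true)
quoted << [1, "Widget"]
quoted_output = quoted.string
```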
Options cannot be overridden in the instance methods for performance reasons, so be sure to set what you want here.
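In practice this means a different configuration calls for a new CSV instance. The sketch below also shows the two ArgumentError cases visible in the source listing for this method (nil data and unrecognized options); `no_such_option` is a deliberately invalid, hypothetical option name.

```ruby
require "csv"

data = "a;b\n1;2\n"

# Options are chosen once, when the instance is built.
semicolon = CSV.new(data, col_sep: ";")
rows = semicolon.read

# A different configuration needs a new instance; the default
# col_sep is ",", so each line parses as a single field here.
comma = CSV.new(data)
comma_rows = comma.read

# Unrecognized options are rejected at construction time.
unknown_error = begin
  CSV.new(data, no_such_option: true)
  nil
rescue ArgumentError => e
  e
end

# nil data is rejected as well.
nil_error = begin
  CSV.new(nil)
  nil
rescue ArgumentError => e
  e
end
```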
# File 'lib/csv.rb', line 1499

def initialize(data, options = Hash.new)
  if data.nil?
    raise ArgumentError.new("Cannot parse nil as CSV")
  end

  # build the options for this read/write
  options = DEFAULT_OPTIONS.merge(options)

  # create the IO object we will read from
  @io       = data.is_a?(String) ? StringIO.new(data) : data
  # honor the IO encoding if we can, otherwise default to ASCII-8BIT
  @encoding = raw_encoding(nil) ||
              ( if encoding = options.delete(:internal_encoding)
                  case encoding
                  when Encoding; encoding
                  else Encoding.find(encoding)
                  end
                end ) ||
              ( case encoding = options.delete(:encoding)
                when Encoding; encoding
                when /\A[^:]+/; Encoding.find($&)
                end ) ||
              Encoding.default_internal || Encoding.default_external
  #
  # prepare for building safe regular expressions in the target encoding,
  # if we can transcode the needed characters
  #
  @re_esc   = "\\".encode(@encoding) rescue ""
  @re_chars = /#{%"[-\\]\\[\\.^$?*+{}()|# \r\n\t\f\v]".encode(@encoding)}/

  init_separators(options)
  init_parsers(options)
  init_converters(options)
  init_headers(options)
  init_comments(options)

  @force_encoding = !!(encoding || options.delete(:encoding))
  options.delete(:internal_encoding)
  options.delete(:external_encoding)
  unless options.empty?
    raise ArgumentError, "Unknown options: #{options.keys.join(', ')}."
  end

  # track our own lineno since IO gets confused about line-ends is CSV fields
  @lineno = 0
end