Class: Traject::DelimitedWriter
- Inherits: LineWriter
  - Object
  - LineWriter
  - Traject::DelimitedWriter
- Defined in: lib/traject/delimited_writer.rb
Overview
A simple line writer that uses configuration settings to determine how to produce a delimited file (tab-delimited by default).
Relevant settings:
- output_file -- the file to write to
- output_stream -- the stream to write to, if defined and output_file is not
- delimited_writer.delimiter -- What to separate fields with; default is tab
- delimited_writer.internal_delimiter -- Delimiter within a field, for multiple values. Default is pipe ( | )
- delimited_writer.fields -- comma-separated list of the fields to output
- delimited_writer.header (true/false) -- whether to output a header row. Default is true
- delimited_writer.escape -- what to do if a value actually contains the delimiter or internal_delimiter. If unset, the writer follows the procedure below. If set, each occurrence is replaced with the character(s) given
If delimited_writer.escape is not set, the writer automatically escapes delimiters and internal delimiters found in values as follows:
- If the delimiter is a tab, replace tabs in values with a single space
- If the delimiter is anything else, prefix each occurrence in the value with a backslash
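The default escaping rule above can be sketched in a few lines of standalone Ruby. This is a minimal illustration, not the writer's own code: `default_escape` is a hypothetical helper name, and the block form of `gsub` is used here so the backslash in the replacement is taken literally.

```ruby
# Hypothetical helper (not part of Traject) mirroring the default rule:
# tabs become a single space, any other delimiter gets a backslash prefix.
def default_escape(value, delimiter)
  replacement = delimiter == "\t" ? ' ' : '\\' + delimiter
  # Block form: the replacement is inserted as-is, with no backreference parsing.
  value.to_s.gsub(delimiter) { replacement }
end

puts default_escape("a\tb", "\t") # tab delimiter: the tab is replaced by a space
puts default_escape("a|b", "|")   # pipe delimiter: the pipe is prefixed with a backslash
```

The same rule applies independently to the field delimiter and to the internal delimiter.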
Direct Known Subclasses
Instance Attribute Summary
- #delimiter ⇒ Object
  Returns the value of attribute delimiter.
- #edelim ⇒ Object (readonly)
  Returns the value of attribute edelim.
- #eidelim ⇒ Object (readonly)
  Returns the value of attribute eidelim.
- #header ⇒ Object
  Returns the value of attribute header.
- #internal_delimiter ⇒ Object
  Returns the value of attribute internal_delimiter.
Attributes inherited from LineWriter
#output_file, #settings, #write_mutex
Instance Method Summary
- #_write(data) ⇒ Object
- #escape(x) ⇒ Object
  Escape the delimiters in whatever way has been defined.
- #escaped_delimiter(d) ⇒ Object
- #initialize(settings) ⇒ DelimitedWriter (constructor)
  A new instance of DelimitedWriter.
- #output_values(raw) ⇒ Object
  Derive actual output field values from the raw values.
- #raw_output_values(context) ⇒ Object
  Get the output values out of the context.
- #serialize(context) ⇒ Object
  Spit out the escaped values joined by the delimiter.
- #write_header ⇒ Object
Methods inherited from LineWriter
#close, #open_output_file, #put, #should_close_stream?
Constructor Details
#initialize(settings) ⇒ DelimitedWriter
Returns a new instance of DelimitedWriter.
# File 'lib/traject/delimited_writer.rb', line 29
def initialize(settings)
  super

  # fields to output
  begin
    @fields = settings['delimited_writer.fields'].split(",")
  rescue NoMethodError => e
  end

  if e or @fields.empty?
    raise ArgumentError.new("#{self.class.name} must have a comma-delimited list of field names to output set in setting 'delimited_writer.fields'")
  end

  self.delimiter = settings['delimited_writer.delimiter'] || "\t"
  self.internal_delimiter = settings['delimited_writer.internal_delimiter'] || '|'
  self.header = settings['delimited_writer.header'].to_s != 'false'

  # Output the header if need be
  write_header if @header
end
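To make the defaults and the required-field check easier to see, here is a standalone sketch of the same settings handling. `parse_delimited_settings` is a hypothetical function written for illustration; it takes a plain Hash rather than a Traject settings object and is not part of the library.

```ruby
# Hypothetical standalone mirror of the constructor's settings handling.
def parse_delimited_settings(settings)
  # nil.to_s is "", which splits to an empty array, so a missing setting
  # and an empty one are both rejected below.
  fields = settings['delimited_writer.fields'].to_s.split(',')
  if fields.empty?
    raise ArgumentError, "setting 'delimited_writer.fields' must list at least one field"
  end

  {
    fields: fields,
    delimiter: settings['delimited_writer.delimiter'] || "\t",
    internal_delimiter: settings['delimited_writer.internal_delimiter'] || '|',
    # Anything other than false / "false" (including an unset value) enables the header.
    header: settings['delimited_writer.header'].to_s != 'false'
  }
end

config = parse_delimited_settings({ 'delimited_writer.fields' => 'id,title' })
p config
```

Note that the field list is split on bare commas without trimming, so a value such as "id, title" would yield a second field name with a leading space.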
Instance Attribute Details
#delimiter ⇒ Object
Returns the value of attribute delimiter.
# File 'lib/traject/delimited_writer.rb', line 26
def delimiter
  @delimiter
end
#edelim ⇒ Object (readonly)
Returns the value of attribute edelim.
# File 'lib/traject/delimited_writer.rb', line 26
def edelim
  @edelim
end
#eidelim ⇒ Object (readonly)
Returns the value of attribute eidelim.
# File 'lib/traject/delimited_writer.rb', line 26
def eidelim
  @eidelim
end
#header ⇒ Object
Returns the value of attribute header.
# File 'lib/traject/delimited_writer.rb', line 27
def header
  @header
end
#internal_delimiter ⇒ Object
Returns the value of attribute internal_delimiter.
# File 'lib/traject/delimited_writer.rb', line 26
def internal_delimiter
  @internal_delimiter
end
Instance Method Details
#_write(data) ⇒ Object
# File 'lib/traject/delimited_writer.rb', line 74
def _write(data)
  output_file.puts(data.join(delimiter))
end
#escape(x) ⇒ Object
Escape the delimiters in whatever way has been defined
# File 'lib/traject/delimited_writer.rb', line 84
def escape(x)
  x = x.to_s
  x.gsub! @delimiter, @edelim if @delimiter
  x.gsub! @internal_delimiter, @eidelim
  x
end
#escaped_delimiter(d) ⇒ Object
# File 'lib/traject/delimited_writer.rb', line 51
def escaped_delimiter(d)
  return nil if d.nil?
  d == "\t" ? ' ' : '\\' + d
end
#output_values(raw) ⇒ Object
Derive actual output field values from the raw values
# File 'lib/traject/delimited_writer.rb', line 93
def output_values(raw)
  raw.map do |x|
    if x.is_a? Array
      x.map!{|s| escape(s)}
      x.join(@internal_delimiter)
    else
      escape(x)
    end
  end
end
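As a rough illustration of how multi-valued fields collapse into a single cell, the sketch below reimplements the same flow outside the class. `DELIM`, `INTERNAL`, `escape_value`, and `flatten_values` are hypothetical names, and the default tab and pipe delimiters are assumed.

```ruby
DELIM = "\t"    # assumed field delimiter (the documented default)
INTERNAL = '|'  # assumed internal delimiter (the documented default)

# Hypothetical stand-in for #escape under the default rules.
def escape_value(x)
  x.to_s.gsub(DELIM, ' ').gsub(INTERNAL) { '\\' + INTERNAL }
end

# Arrays are escaped element by element and joined; scalars are just escaped.
def flatten_values(raw)
  raw.map do |x|
    x.is_a?(Array) ? x.map { |s| escape_value(s) }.join(INTERNAL) : escape_value(x)
  end
end

row = flatten_values([['Smith, J.', 'Doe|Roe'], "tab\there", nil])
p row
```

Unlike the method shown above, which uses `map!` and `gsub!` and therefore modifies the values it is given, this sketch leaves its input untouched. A nil value ends up as an empty string because of `to_s`.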
#raw_output_values(context) ⇒ Object
Get the output values out of the context
# File 'lib/traject/delimited_writer.rb', line 79
def raw_output_values(context)
  context.output_hash.values_at(*@fields)
end
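The lookup relies on Ruby's `Hash#values_at`, which returns values in the order of the requested keys and yields nil for keys that are absent. A small sketch with a plain Hash standing in for `context.output_hash` (the sample data and the assumption that values are arrays of strings are illustrative only):

```ruby
# Plain Hash standing in for context.output_hash (illustrative data).
output_hash = { 'title' => ['A Title'], 'id' => ['001'] }
fields = %w[id author title]

# Values come back in field order; the missing 'author' key yields nil.
raw = output_hash.values_at(*fields)
p raw
```

This is why column order in the output follows delimited_writer.fields rather than the order in which fields were added to the record, and why a field with no value still occupies its column.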
#serialize(context) ⇒ Object
Spit out the escaped values joined by the delimiter
# File 'lib/traject/delimited_writer.rb', line 105
def serialize(context)
  output_values(raw_output_values(context))
end
#write_header ⇒ Object
# File 'lib/traject/delimited_writer.rb', line 70
def write_header
  _write(@fields)
end
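Because the header is written through the same path as data rows, it is simply the field names joined by the delimiter. The sketch below shows that with a `StringIO` standing in for the output file; the `write` lambda and sample values are hypothetical and only mimic what `_write` does.

```ruby
require 'stringio'

out = StringIO.new       # stands in for the writer's output_file
fields = %w[id title]
delimiter = "\t"

# Mimics _write: join the values with the delimiter and emit one line.
write = ->(values) { out.puts(values.join(delimiter)) }

write.call(fields)             # header row: the field names themselves
write.call(['001', 'A Title']) # a data row, already escaped and flattened
print out.string
```

One consequence of this design is that header names go through no escaping step, so a field name containing the delimiter would produce a misaligned header.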