Class: Traject::LineWriter

Inherits:
Object
Defined in:
lib/traject/line_writer.rb

Overview

A writer for Traject::Indexer that simply writes all output as serialized text using #puts.

Should be thread-safe (i.e., multiple worker threads can call #put concurrently), because the write to the actual output file is wrapped in a mutex synchronize block. This does not seem to affect performance much, as far as I could tell from benchmarking.

Output is sent to the file path given as a string in settings["output_file"], or else to settings["output_stream"] (a Ruby IO object), or else to stdout.
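The lookup order can be sketched outside of Traject as follows. This is a minimal illustration that mirrors the branch order shown in #output_file on this page; `resolve_output` is a hypothetical helper, not part of the library.

```ruby
require "stringio"
require "tempfile"

# Hypothetical helper mirroring the destination lookup in #output_file.
# It is a stand-in for illustration, not the Traject class itself.
def resolve_output(settings)
  if settings["output_file"]
    File.open(settings["output_file"], "w:UTF-8")
  elsif settings["output_stream"]
    settings["output_stream"]
  else
    $stdout
  end
end

stream = StringIO.new
tmp    = Tempfile.new("line_writer_demo")

# Both keys set: the file path takes precedence over the stream.
both = resolve_output({ "output_file" => tmp.path, "output_stream" => stream })
# Only the stream set: the object is used as given.
only_stream = resolve_output({ "output_stream" => stream })
# Neither set: fall back to $stdout.
neither = resolve_output({})

both_is_file = both.is_a?(File)

# Clean up the temporary file used for the demonstration.
both.close
tmp.close!
```

Note that when both keys are present, the stream is silently ignored in favor of the file path.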

This class can be subclassed to write out different serialized representations; subclasses just override the #serialize method. For instance, see JsonWriter.
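A rough sketch of such a subclass is shown here. To keep it runnable without the traject gem, it uses `SketchLineWriter`, a condensed copy of the methods listed on this page; `TabWriter` and the `Context` struct are hypothetical names invented for the example. With the real gem, the subclass would presumably inherit from Traject::LineWriter instead.

```ruby
require "stringio"

# Condensed copy of the LineWriter methods documented on this page,
# so the example runs standalone. Not the Traject class itself.
class SketchLineWriter
  attr_reader :settings, :write_mutex

  def initialize(arg_settings)
    @settings    = arg_settings
    @write_mutex = Mutex.new
    output_file # resolve the destination eagerly, as the original does
  end

  def serialize(context)
    context.output_hash
  end

  def put(context)
    serialized = serialize(context)
    write_mutex.synchronize { output_file.puts(serialized) }
  end

  # Simplified: only the stream and stdout branches are kept here.
  def output_file
    @output_file ||= settings["output_stream"] || $stdout
  end
end

# Hypothetical subclass: only #serialize changes.
class TabWriter < SketchLineWriter
  def serialize(context)
    context.output_hash.map do |field, values|
      "#{field}=#{Array(values).join('|')}"
    end.join("\t")
  end
end

# Hypothetical stand-in for the context object; only #output_hash is needed.
Context = Struct.new(:output_hash)

out    = StringIO.new
writer = TabWriter.new({ "output_stream" => out })
writer.put(Context.new({ "id" => ["1"], "title" => ["A", "B"] }))
```

Because #put calls #serialize before taking the lock, a subclass gets thread-safe writing without touching the mutex itself.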

Direct Known Subclasses

DebugWriter, JsonWriter, YamlWriter

Constructor Details

#initialize(argSettings) ⇒ LineWriter

Returns a new instance of LineWriter.



# File 'lib/traject/line_writer.rb', line 21

def initialize(argSettings)
  @settings     = argSettings
  @write_mutex  = Mutex.new

  # trigger lazy loading now for thread-safety
  output_file
end

Instance Attribute Details

#settings ⇒ Object (readonly)

Returns the value of attribute settings.



# File 'lib/traject/line_writer.rb', line 18

def settings
  @settings
end

#write_mutex ⇒ Object (readonly)

Returns the value of attribute write_mutex.



# File 'lib/traject/line_writer.rb', line 19

def write_mutex
  @write_mutex
end

Instance Method Details

#close ⇒ Object



# File 'lib/traject/line_writer.rb', line 55

def close
  @output_file.close unless (@output_file.nil? || @output_file.tty?)
end

#output_file ⇒ Object



# File 'lib/traject/line_writer.rb', line 41

def output_file
  unless defined? @output_file
    @output_file =
      if settings["output_file"]
        File.open(settings["output_file"], 'w:UTF-8')
      elsif settings["output_stream"]
        settings["output_stream"]
      else
        $stdout
      end
  end
  return @output_file
end

#put(context) ⇒ Object



# File 'lib/traject/line_writer.rb', line 34

def put(context)
  serialized = serialize(context)
  write_mutex.synchronize do
    output_file.puts(serialized)
  end
end
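The pattern in #put (serialize outside the lock, write inside it) can be exercised with plain threads. The following is a minimal sketch of that pattern using a StringIO and a lambda in place of the writer; the thread and record counts are arbitrary values chosen for the example.

```ruby
require "stringio"

out   = StringIO.new
mutex = Mutex.new

# Same shape as #put: build the string outside the lock, write inside it.
put = lambda do |record|
  serialized = record.to_s
  mutex.synchronize { out.puts(serialized) }
end

# Four worker threads, each writing fifty records concurrently.
threads = 4.times.map do |t|
  Thread.new { 50.times { |i| put.call("thread-#{t} record-#{i}") } }
end
threads.each(&:join)

lines = out.string.lines
```

Keeping serialization outside the synchronized block means threads only contend for the lock during the write itself, which may be part of why the mutex has little measurable cost.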

#serialize(context) ⇒ Object



# File 'lib/traject/line_writer.rb', line 30

def serialize(context)
  context.output_hash
end