Class: BioDSL::PlotResidueDistribution

Inherits:
Object
  • Object
Includes:
AuxHelper
Defined in:
lib/BioDSL/commands/plot_residue_distribution.rb

Overview

Plot the residue distribution of sequences in the stream.

plot_residue_distribution creates a plot of the residue distribution at each sequence position for the sequences in the stream. Plotting is done using GNUplot, which supports several output types; the default is crude ASCII graphics.

If you are plotting distributions from sequences of variable length, you can use the count option to co-plot the relative count at each base position. This allows you to explain areas with a skewed distribution, since positions covered by only a few sequences are more easily dominated by one residue.
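The counting behind such a plot can be sketched in a few lines of plain Ruby. This is a minimal illustration, not BioDSL's own code: the helper names residue_distribution and relative_count are hypothetical, and the relative count is assumed here to be the fraction of sequences that reach a given position.

```ruby
# Hypothetical helper, not part of BioDSL: tally residues per position.
def residue_distribution(seqs)
  counts = Hash.new { |h, k| h[k] = Hash.new(0) } # position => residue => count
  total  = Hash.new(0)                            # position => sequences covering it

  seqs.each do |seq|
    seq.upcase.each_char.with_index do |residue, pos|
      counts[pos][residue] += 1
      total[pos] += 1
    end
  end

  [counts, total]
end

# Hypothetical helper: coverage of each position relative to the best covered
# position (an assumption about what "relative count" means).
def relative_count(total)
  max = total.values.max.to_f
  total.transform_values { |n| n / max }
end

# Three sequences of different length: position 3 is covered by only one.
counts, total = residue_distribution(%w[ACGT ACG AT])
relative = relative_count(total)
```

With these inputs the last position is covered by a single sequence, so its distribution is 100% one residue; the relative count makes that low coverage visible next to the distribution.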

GNUplot must be installed for plot_residue_distribution to work. Read more here:

www.gnuplot.info/

Usage

plot_residue_distribution([count: <bool>[, output: <file>
                          [, force: <bool> [, terminal: <string>
                          [, title: <string>[, xlabel: <string>
                          [, ylabel: <string>[, test: <bool>]]]]]]])

Options

  • count: <bool> - Plot relative count (default=false).

  • output: <file> - Output file.

  • force: <bool> - Force overwrite existing output file.

  • terminal: <string> - Terminal for output: dumb|post|svg|x11|aqua|png|pdf (default=dumb).

  • title: <string> - Plot title (default=“Heatmap”).

  • xlabel: <string> - X-axis label (default=“x”).

  • ylabel: <string> - Y-axis label (default=“y”).

  • test: <bool> - Output Gnuplot script instead of plot.

Examples

Here we plot a residue distribution of a FASTA file:

BD.new.read_fasta(input: "test.fna").plot_residue_distribution.run


Constant Summary

STATS =
%i(records_in records_out sequences_in sequences_out residues_in
   residues_out)

Instance Method Summary

Methods included from AuxHelper

#aux_exist

Constructor Details

#initialize(options) ⇒ PlotResidueDistribution

Constructor for PlotResidueDistribution.

Parameters:

  • options (Hash)

    Options hash.

Options Hash (options):

  • :count (Boolean)
  • :output (String)
  • :force (Boolean)
  • :terminal (Symbol)
  • :title (String)
  • :xlabel (String)
  • :ylabel (String)
  • :test (Boolean)


# File 'lib/BioDSL/commands/plot_residue_distribution.rb', line 94

def initialize(options)
  @options  = options
  @counts   = Hash.new { |h, k| h[k] = Hash.new(0) }
  @total    = Hash.new(0)
  @residues = Set.new
  @gp       = nil
  @offset   = Set.new # Hackery thing to offset datasets 1 position.

  aux_exist('gnuplot')
  check_options
  defaults
end
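
The constructor calls aux_exist('gnuplot') before anything else, which matches the requirement that GNUplot be installed. AuxHelper's implementation is not shown on this page; the following is only a sketch of one way such a check could work, using a hypothetical method executable_on_path? that searches the directories in PATH.

```ruby
# Hypothetical helper, not AuxHelper's actual code: report whether an
# executable with the given name exists in any directory listed in PATH.
def executable_on_path?(name, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# True only if a gnuplot executable is reachable through PATH.
gnuplot_available = executable_on_path?('gnuplot')
```

Checking up front lets a command fail at construction time rather than after the whole stream has been read.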

Instance Method Details

#lmb ⇒ Proc

Return command lambda for PlotResidueDistribution.

Returns:

  • (Proc)

    Command lambda.



# File 'lib/BioDSL/commands/plot_residue_distribution.rb', line 110

def lmb
  lambda do |input, output, status|
    status_init(status, STATS)

    input.each do |record|
      @status[:records_in] += 1

      count_residues(record) if record.key? :SEQ

      next unless output
      output << record
      @status[:records_out] += 1

      if record.key? :SEQ
        @status[:sequences_out] += 1
        @status[:residues_out] += record[:SEQ].length
      end
    end

    plot_create
    plot_output
  end
end
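
The lambda above follows a pass-through pattern: every record is inspected, then forwarded unchanged if an output stream exists. A minimal sketch of that pattern, assuming plain arrays and a plain hash stand in for BioDSL's stream and status objects (passthrough_command is a hypothetical name, and the plotting step is left out):

```ruby
# Hypothetical stand-in for a BioDSL command: counts records and passes them on.
def passthrough_command
  lambda do |input, output, status|
    keys = %i[records_in records_out sequences_out residues_out]
    keys.each { |key| status[key] = 0 }

    input.each do |record|
      status[:records_in] += 1

      next unless output # last command in a pipeline may have no output
      output << record
      status[:records_out] += 1

      next unless record.key?(:SEQ)
      status[:sequences_out] += 1
      status[:residues_out] += record[:SEQ].length
    end
  end
end

input  = [{ SEQ_NAME: 'a', SEQ: 'ACGT' }, { FOO: 'bar' }]
output = []
status = {}
passthrough_command.call(input, output, status)
```

Records without a :SEQ key are still forwarded; they only skip the sequence and residue counters.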