Class: Bio::Big::ShortFrameState

Inherits:
Object
Includes:
FrameCodonHelpers
Defined in:
lib/bigbio/db/emitters/orf_emitter.rb

Overview

ShortFrameState takes the simplest approach to finding ORFs. The sequence is immutable, always read forward, and always in frame 0, which makes the logic easy to reason about. It also returns all ORFs in one go, together with their left and right locations.
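
To make "forward, frame 0, with left/right locations" concrete, here is a minimal sketch in plain Ruby. The helpers `frame0_codons` and `nt_location` are hypothetical and not part of bigbio; the 0-based, half-open coordinates are an assumption, since the coordinate convention is not shown on this page.

```ruby
# Hypothetical helpers for illustration only; not part of bigbio.

# Split a nucleotide string into frame-0 codons.
# A trailing partial codon (1 or 2 nucleotides) is ignored.
def frame0_codons(seq)
  seq.scan(/[A-Za-z]{3}/)
end

# Map a codon offset and a codon count to [left, right] nucleotide offsets.
# Assumption: 0-based, half-open coordinates.
def nt_location(codon_pos, codon_count)
  left = codon_pos * 3
  [left, left + codon_count * 3]
end

seq = "ATGAAATAACC".freeze      # frozen: the sketch never mutates its input
codons = frame0_codons(seq)
p codons                        # => ["ATG", "AAA", "TAA"]
p nt_location(0, codons.size)   # => [0, 9]
```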

Direct Known Subclasses

ShortReversedFrameState

Constant Summary

Constants included from FrameCodonHelpers

FrameCodonHelpers::START_CODONS, FrameCodonHelpers::STOP_CODONS

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(seq, ntseq_pos, ntmin_size) ⇒ ShortFrameState

Returns a new instance of ShortFrameState.



# File 'lib/bigbio/db/emitters/orf_emitter.rb', line 116

def initialize seq, ntseq_pos, ntmin_size
  @reversed = nil
  # @seq = seq.upcase  
  @seq = seq
  @min_size_codons = if ntmin_size > 3
                       (ntmin_size/3).to_i
                     else
                       2  # otherwise we get single STOP codons
                     end
 
  @codons = FrameCodonSequence.new(seq,ntseq_pos)
  @ntseq_pos = ntseq_pos # nucleotides
  # @codons.track_sequence_pos = seq_pos
end
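
The minimum-size rule in the constructor can be tried in isolation. The sketch below copies only that expression into a hypothetical standalone method, `min_size_codons_for`, which is not part of bigbio.

```ruby
# Hypothetical standalone copy of the constructor's size rule.
def min_size_codons_for(ntmin_size)
  if ntmin_size > 3
    (ntmin_size / 3).to_i   # integer division for Integer input
  else
    2                       # fallback, so single STOP codons are not reported
  end
end

p min_size_codons_for(30)  # => 10
p min_size_codons_for(3)   # => 2
p min_size_codons_for(4)   # => 1
```

Note that with Integer input, values of 4 and 5 yield 1 codon, which is lower than the fallback of 2 used for values of 3 or less.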

Instance Attribute Details

#codons ⇒ Object (readonly)

Returns the value of attribute codons.



# File 'lib/bigbio/db/emitters/orf_emitter.rb', line 114

def codons
  @codons
end

#min_size_codons ⇒ Object (readonly)

Returns the value of attribute min_size_codons.



# File 'lib/bigbio/db/emitters/orf_emitter.rb', line 114

def min_size_codons
  @min_size_codons
end

#ntseq_pos ⇒ Object (readonly)

Returns the value of attribute ntseq_pos.



# File 'lib/bigbio/db/emitters/orf_emitter.rb', line 114

def ntseq_pos
  @ntseq_pos
end

#seq ⇒ Object (readonly)

Returns the value of attribute seq.



# File 'lib/bigbio/db/emitters/orf_emitter.rb', line 114

def seq
  @seq
end

Instance Method Details

#get_codon_orfs1(splitter_func, do_include_leftmost_orf, do_strip_leading_codon) ⇒ Object

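Splitter for a single delimiter function. do_include_leftmost_orf decides whether the first sequence is returned when it is incomplete, that is, when it does not begin with a delimiter codon. do_strip_leading_codon removes the leading delimiter codon that each sequence shares with the end of the preceding sequence.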



# File 'lib/bigbio/db/emitters/orf_emitter.rb', line 147

def get_codon_orfs1 splitter_func,do_include_leftmost_orf,do_strip_leading_codon
  orfs = split(@codons,splitter_func)
  return [] if orfs.size == 0
  # Drop the first sequence, if there is no match on the first position
  orfs.shift if !do_include_leftmost_orf and !splitter_func.call(orfs.first[0])
  orfs = orfs.map { |codons| 
    codons = codons.shift if do_strip_leading_codon and splitter_func.call(codons[0])
    codons
  }
  if @reversed == nil
    TrackSequenceTrait.update_sequence_pos(orfs,@ntseq_pos) # nail against parent
  else
    TrackSequenceTrait.update_reversed_sequence_pos(orfs,@ntseq_pos) # nail against parent
  end
end
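
The two flags can be illustrated on plain arrays of codon strings. This is a sketch under stated assumptions, not the library's code: `select_fragments` is a hypothetical name, `Array#drop` stands in for removing the leading codon, the stop codons are assumed to be TAG, TAA and TGA, and the position update done through TrackSequenceTrait is left out.

```ruby
# Hypothetical stand-in for the flag handling; plain arrays instead of
# FrameCodonSequence objects, and no position tracking.
def select_fragments(fragments, include_leftmost, strip_leading, &is_splitter)
  return [] if fragments.empty?
  # Drop an incomplete leftmost fragment unless the caller wants to keep it.
  if !include_leftmost && !is_splitter.call(fragments.first[0])
    fragments = fragments.drop(1)
  end
  # Remove the boundary codon shared with the preceding fragment.
  fragments.map do |codons|
    strip_leading && is_splitter.call(codons[0]) ? codons.drop(1) : codons
  end
end

stop_codons = %w[TAG TAA TGA]   # assumption: standard stop codons
is_stop = ->(codon) { stop_codons.include?(codon) }
fragments = [%w[ATG AAA TAA], %w[TAA CCC GGG TTT TGA]]

p select_fragments(fragments, false, true, &is_stop)
# => [["CCC", "GGG", "TTT", "TGA"]]
p select_fragments(fragments, true, true, &is_stop)
# => [["ATG", "AAA", "TAA"], ["CCC", "GGG", "TTT", "TGA"]]
```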

#get_codon_orfs2(splitter_func, start_func) ⇒ Object

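Splitter for two delimiter functions. Sequences are first split on splitter_func, and only those whose first codon satisfies start_func are returned.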



# File 'lib/bigbio/db/emitters/orf_emitter.rb', line 164

def get_codon_orfs2 splitter_func, start_func
  orfs = get_codon_orfs1(splitter_func,true,true)
  orfs.find_all { | orf | start_func.call(orf[0]) }
end
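
Because the filter inspects only `orf[0]`, a start codon that sits in the middle of a fragment does not qualify it. The sketch below shows this on plain arrays; `start_filtered` is a hypothetical name and ATG as the only start codon is an assumption, since the contents of START_CODONS are not shown on this page.

```ruby
# Hypothetical stand-in for the start-codon filter, using plain arrays.
def start_filtered(orfs, &is_start)
  orfs.find_all { |orf| is_start.call(orf[0]) }
end

is_start = ->(codon) { codon == "ATG" }   # assumption: ATG is the start codon
orfs = [
  %w[ATG AAA TAA],       # begins with a start codon: kept
  %w[CCC ATG TTT TGA],   # start codon is internal: rejected
  %w[ATG CCC TAG]        # begins with a start codon: kept
]

p start_filtered(orfs, &is_start)
# => [["ATG", "AAA", "TAA"], ["ATG", "CCC", "TAG"]]
```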

#get_startstop_orfs ⇒ Object

Return a list of ORFs delimited by START-STOP codons.



# File 'lib/bigbio/db/emitters/orf_emitter.rb', line 137

def get_startstop_orfs 
  get_codon_orfs2(
           Proc.new { | codon | STOP_CODONS.include?(codon) },
           Proc.new { | codon | START_CODONS.include?(codon) })
end

#get_stopstop_orfs ⇒ Object

Return a list of ORFs delimited by STOP codons.



# File 'lib/bigbio/db/emitters/orf_emitter.rb', line 132

def get_stopstop_orfs 
  get_codon_orfs1(Proc.new { | codon | STOP_CODONS.include?(codon) },false,true)
end

#split(codons, is_splitter_func) ⇒ Object

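Return a list of codon sequences, split on is_splitter_func. A delimiter codon closes one sequence and also opens the next, and only sequences longer than min_size_codons are kept.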



# File 'lib/bigbio/db/emitters/orf_emitter.rb', line 172

def split codons, is_splitter_func
  list = []
  node = []
  codons.each_with_index do | c, pos |
    # p [c,pos]
    if is_splitter_func.call(c)
      node.push c
      size = node.size
      # p node
      list.push FrameCodonSequence.new(node,pos+1-size) if size > @min_size_codons
      node = []
    end
    node.push c  # always push boundary codon
  end
  list
end
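
The same splitting loop can be traced on plain arrays. In this sketch, `split_codons` is a hypothetical standalone version that returns `[codon_offset, codons]` pairs instead of FrameCodonSequence objects, and the stop codons are assumed to be TAG, TAA and TGA.

```ruby
# Hypothetical standalone version of the splitting loop.
# Returns [codon_offset, codons] pairs instead of FrameCodonSequence objects.
def split_codons(codons, min_size_codons, &is_splitter)
  list = []
  node = []
  codons.each_with_index do |c, pos|
    if is_splitter.call(c)
      node.push c                      # the delimiter closes the fragment
      size = node.size
      list.push([pos + 1 - size, node]) if size > min_size_codons
      node = []
    end
    node.push c                        # the delimiter also opens the next one
  end
  list                                 # a trailing unclosed fragment is dropped
end

stop_codons = %w[TAG TAA TGA]          # assumption: standard stop codons
codons = %w[ATG AAA TAA CCC GGG TTT TGA CCC]

p split_codons(codons, 2) { |c| stop_codons.include?(c) }
# => [[0, ["ATG", "AAA", "TAA"]], [2, ["TAA", "CCC", "GGG", "TTT", "TGA"]]]
```

The second fragment starts at codon offset 2 with the TAA that ended the first one, and the final CCC is never emitted because no delimiter follows it.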