Module: Bio::Alignment::SiteMethods

Includes:
PropertyMethods
Included in:
Site
Defined in:
lib/bio/alignment.rb

Overview

Bio::Alignment::SiteMethods is a set of methods for Bio::Alignment::Site. It can also be used for extending an array of single-letter strings.

Constant Summary

IUPAC_NUC =

IUPAC nucleotide groups. Internal use only.

[
%w( t           u ),
%w( m   a c       ),
%w( r   a   g     ),
%w( w   a     t u ),
%w( s     c g     ),
%w( y     c   t u ),
%w( k       g t u ),
%w( v   a c g     m r   s     ),
%w( h   a c   t u m   w   y   ),
%w( d   a   g t u   r w     k ),
%w( b     c g t u       s y k ),
%w( n   a c g t u m r w s y k v h d b )
]

StrongConservationGroups =

Table of strongly conserved amino-acid groups.

The values in the tables are taken from BioPerl (Bio/SimpleAlign.pm in BioPerl 1.0). The BioPerl documentation says that they are taken from the Clustalw documentation, and that:

These are all the positively scoring groups that occur in the 
Gonnet Pam250 matrix. The strong and weak groups are 
defined as strong score >0.5 and weak score =<0.5 respectively.
%w(STA NEQK NHQK NDEQ QHRK MILV MILF
HY FYW).collect { |x| x.split('').sort }
WeakConservationGroups =

Table of weakly conserved amino-acid groups.

Please refer to the StrongConservationGroups documentation for the origin of the table.

%w(CSA ATV SAG STNK STPA SGND SNDEQK
NDEQHK NEQHRK FVLIM HFY).collect { |x| x.split('').sort }

Constants included from PropertyMethods

PropertyMethods::GAP_CHAR, PropertyMethods::GAP_REGEXP, PropertyMethods::MISSING_CHAR

Instance Attribute Summary

Attributes included from PropertyMethods

#gap_char, #gap_regexp, #missing_char, #seqclass

Instance Method Summary

Methods included from PropertyMethods

#get_all_property, #is_gap?, #set_all_property

Instance Method Details

#consensus_iupac ⇒ Object

Returns an IUPAC consensus base for the site. If a consensus is found, returns a single-letter string. If not, returns nil.



# File 'lib/bio/alignment.rb', line 218

def consensus_iupac
  a = self.collect { |x| x.downcase }.sort.uniq
  if a.size == 1 then
    case a[0]
    when 'a', 'c', 'g', 't'
      a[0]
    when 'u'
      't'
    else
      IUPAC_NUC.find { |x| a[0] == x[0] } ? a[0] : nil
    end
  elsif r = IUPAC_NUC.find { |x| (a - x).size <= 0 } then
    r[0]
  else
    nil
  end
end
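
The lookup above can be tried outside the library with a minimal standalone sketch. The names `IUPAC_GROUPS` and `iupac_consensus` are hypothetical and are not part of BioRuby; the table is copied from IUPAC_NUC, and the function takes the site as an argument instead of being mixed into it.

```ruby
# Standalone sketch (not the BioRuby API): same table as IUPAC_NUC.
# Each row starts with the IUPAC code, followed by the letters it covers.
IUPAC_GROUPS = [
  %w(t u),
  %w(m a c),
  %w(r a g),
  %w(w a t u),
  %w(s c g),
  %w(y c t u),
  %w(k g t u),
  %w(v a c g m r s),
  %w(h a c t u m w y),
  %w(d a g t u r w k),
  %w(b c g t u s y k),
  %w(n a c g t u m r w s y k v h d b)
]

# Hypothetical helper mirroring the logic of consensus_iupac.
def iupac_consensus(site)
  bases = site.map(&:downcase).uniq
  if bases.size == 1
    base = bases[0]
    return base if %w(a c g t).include?(base)
    return 't' if base == 'u'
    # A lone ambiguity code is kept only if it is a known IUPAC code.
    return IUPAC_GROUPS.any? { |g| g[0] == base } ? base : nil
  end
  # Otherwise pick the first (smallest) group that covers every letter.
  group = IUPAC_GROUPS.find { |g| (bases - g).empty? }
  group && group[0]
end

puts iupac_consensus(%w(a g)).inspect  # "r"
puts iupac_consensus(%w(c t)).inspect  # "y"
puts iupac_consensus(%w(a -)).inspect  # nil
```

Because the groups are ordered from most to least specific, the first covering group gives the narrowest ambiguity code.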

#consensus_string(threshold = 1.0) ⇒ Object

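Returns the consensus character of the site. A consensus is found when the most frequent character accounts for at least the fraction threshold (default 1.0) of the site. If a consensus is found, returns a single-letter string. If not, returns nil.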



# File 'lib/bio/alignment.rb', line 181

def consensus_string(threshold = 1.0)
  return nil if self.size <= 0
  return self[0] if self.sort.uniq.size == 1
  h = Hash.new(0)
  self.each { |x| h[x] += 1 }
  total = self.size
  b = h.to_a.sort do |x,y|
    z = (y[1] <=> x[1])
    z = (self.index(x[0]) <=> self.index(y[0])) if z == 0
    z
  end
  if total * threshold <= b[0][1] then
    b[0][0]
  else
    nil
  end
end
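
To illustrate how the threshold behaves, here is a minimal standalone sketch of the same counting rule. The function name `consensus_char` is hypothetical and is not part of BioRuby; it takes the site as a plain array argument.

```ruby
# Standalone sketch (not the BioRuby API) of the threshold rule used by
# consensus_string: the most frequent character wins if it covers at
# least `threshold` of the site; ties go to the character seen first.
def consensus_char(site, threshold = 1.0)
  return nil if site.empty?
  counts = Hash.new(0)
  site.each { |c| counts[c] += 1 }
  # Highest count first; on equal counts prefer the earlier first index.
  char, count = counts.max_by { |c, n| [n, -site.index(c)] }
  count >= site.size * threshold ? char : nil
end

puts consensus_char(%w(a a a t)).inspect        # nil (3 of 4 is below 1.0)
puts consensus_char(%w(a a a t), 0.75).inspect  # "a"
puts consensus_char(%w(a t), 0.5).inspect       # "a" (tie, first seen wins)
```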

#has_gap? ⇒ Boolean

If there are gaps, returns true. Otherwise, returns false.

Returns:

  • (Boolean)


# File 'lib/bio/alignment.rb', line 164

def has_gap?
  (find { |x| is_gap?(x) }) ? true : false
end

#match_line_amino(opt = {}) ⇒ Object

Returns the match-line character for the site. This is the amino-acid version.



# File 'lib/bio/alignment.rb', line 258

def match_line_amino(opt = {})
  # opt[:match_line_char]   ==> 100% equal    default: '*'
  # opt[:strong_match_char] ==> strong match  default: ':'
  # opt[:weak_match_char]   ==> weak match    default: '.'
  # opt[:mismatch_char]     ==> mismatch      default: ' '
  mlc = (opt[:match_line_char]   or '*')
  smc = (opt[:strong_match_char] or ':')
  wmc = (opt[:weak_match_char]   or '.')
  mmc = (opt[:mismatch_char]     or ' ')
  a = self.collect { |c| c.upcase }.sort.uniq
  a.extend(SiteMethods)
  if a.has_gap? then
    mmc
  elsif a.size == 1 then
    mlc
  elsif StrongConservationGroups.find { |x| (a - x).empty? } then
    smc
  elsif WeakConservationGroups.find { |x| (a - x).empty? } then
    wmc
  else
    mmc
  end
end
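
The decision order above (gap, identity, strong group, weak group, mismatch) can be sketched as a standalone function. The names `STRONG_GROUPS`, `WEAK_GROUPS`, and `amino_match_char` are hypothetical and are not part of BioRuby; the tables are copied from the constants documented above, and the gap character is assumed to be '-'.

```ruby
# Standalone sketch (not the BioRuby API); tables copied from
# StrongConservationGroups and WeakConservationGroups.
STRONG_GROUPS = %w(STA NEQK NHQK NDEQ QHRK MILV MILF
                   HY FYW).map(&:chars)
WEAK_GROUPS   = %w(CSA ATV SAG STNK STPA SGND SNDEQK
                   NDEQHK NEQHRK FVLIM HFY).map(&:chars)

# Hypothetical helper; the gap character '-' is an assumption.
def amino_match_char(site, gap = '-')
  residues = site.map(&:upcase).uniq
  return ' ' if residues.include?(gap)   # any gap counts as a mismatch
  return '*' if residues.size == 1       # fully conserved
  return ':' if STRONG_GROUPS.any? { |g| (residues - g).empty? }
  return '.' if WEAK_GROUPS.any?   { |g| (residues - g).empty? }
  ' '
end

puts amino_match_char(%w(L L L)).inspect  # "*"
puts amino_match_char(%w(S T A)).inspect  # ":"
puts amino_match_char(%w(C S)).inspect    # "."
puts amino_match_char(%w(W C)).inspect    # " "
```

A site matches a group when every distinct residue in the site belongs to that group, which is what the `(residues - g).empty?` test expresses.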

#match_line_nuc(opt = {}) ⇒ Object

Returns the match-line character for the site. This is the nucleic-acid version.



# File 'lib/bio/alignment.rb', line 284

def match_line_nuc(opt = {})
  # opt[:match_line_char]   ==> 100% equal    default: '*'
  # opt[:mismatch_char]     ==> mismatch      default: ' '
  mlc = (opt[:match_line_char]   or '*')
  mmc = (opt[:mismatch_char]     or ' ')
  a = self.collect { |c| c.upcase }.sort.uniq
  a.extend(SiteMethods)
  if a.has_gap? then
    mmc
  elsif a.size == 1 then
    mlc
  else
    mmc
  end
end
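
As an illustration of how per-site characters combine into a full match line, the following standalone sketch applies the same rule column by column. The function name `nuc_match_line` is hypothetical and is not part of BioRuby; it assumes equal-length sequences and '-' as the gap character.

```ruby
# Standalone sketch (not the BioRuby API): builds a match line for
# equal-length aligned nucleotide sequences, one character per column.
# The gap character '-' is an assumption.
def nuc_match_line(seqs, gap = '-')
  columns = seqs.map(&:chars).transpose   # one array per alignment site
  columns.map do |site|
    bases = site.map(&:upcase).uniq
    # '*' only when the column is identical and contains no gap.
    (bases.size == 1 && !bases.include?(gap)) ? '*' : ' '
  end.join
end

puts nuc_match_line(%w(ACGT- ACGA-)).inspect  # "***  "
```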

#remove_gaps! ⇒ Object

Removes gaps in the site. (destructive method) Returns self if any gap was removed, otherwise nil.



# File 'lib/bio/alignment.rb', line 169

def remove_gaps!
  flag = nil
  self.collect! do |x|
    if is_gap?(x) then flag = self; nil; else x; end
  end
  self.compact!
  flag
end
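
The return convention (self when something was removed, nil otherwise) can be seen in a minimal standalone sketch. The names `GAP_PATTERN`, `site_has_gap?`, and `strip_gaps!` are hypothetical and are not part of BioRuby; in the library the gap test is is_gap? from PropertyMethods, and the pattern used here (a literal '-') is an assumption.

```ruby
# Standalone sketch (not the BioRuby API).
# Assumed gap pattern; the library takes its pattern from PropertyMethods.
GAP_PATTERN = /-/

# Hypothetical counterpart of has_gap?.
def site_has_gap?(site)
  site.any? { |x| x =~ GAP_PATTERN }
end

# Hypothetical counterpart of remove_gaps!. Array#reject! already returns
# nil when nothing was removed and the array itself otherwise.
def strip_gaps!(site)
  site.reject! { |x| x =~ GAP_PATTERN }
end

site = %w(a - t -)
puts site_has_gap?(site)        # true
puts strip_gaps!(site).inspect  # ["a", "t"]
puts strip_gaps!(site).inspect  # nil (no gaps left)
```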