Module: Bio::Alignment::SiteMethods
Overview
Bio::Alignment::SiteMethods is a set of methods for Bio::Alignment::Site. It can also be used for extending an array of single-letter strings.
Constant Summary
- IUPAC_NUC =
IUPAC nucleotide groups. Internal use only.
[ %w( t u ), %w( m a c ), %w( r a g ), %w( w a t u ), %w( s c g ), %w( y c t u ), %w( k g t u ), %w( v a c g m r s ), %w( h a c t u m w y ), %w( d a g t u r w k ), %w( b c g t u s y k ), %w( n a c g t u m r w s y k v h d b ) ]
- StrongConservationGroups =
Table of strongly conserved amino-acid groups.
The values in the tables are taken from BioPerl (Bio/SimpleAlign.pm in BioPerl 1.0). The BioPerl documentation says that they are in turn taken from the Clustalw documentation, which states:
"These are all the positively scoring groups that occur in the Gonnet Pam250 matrix. The strong and weak groups are defined as strong score >0.5 and weak score =<0.5 respectively."
%w(STA NEQK NHQK NDEQ QHRK MILV MILF HY FYW).collect { |x| x.split('').sort }
- WeakConservationGroups =
Table of weakly conserved amino-acid groups.
Please refer to the StrongConservationGroups documentation for the origin of the table.
%w(CSA ATV SAG STNK STPA SGND SNDEQK NDEQHK NEQHRK FVLIM HFY).collect { |x| x.split('').sort }
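As a rough illustration of how these two tables can be used, the sketch below copies them into standalone constants and checks whether every residue in a column falls inside a single group. The names `STRONG_GROUPS`, `WEAK_GROUPS` and `conservation_level` are hypothetical and not part of BioRuby, and gap handling is deliberately left out.

```ruby
# Standalone copies of the two tables shown above (names are hypothetical).
STRONG_GROUPS = %w(STA NEQK NHQK NDEQ QHRK MILV MILF HY FYW).collect { |x| x.split('').sort }
WEAK_GROUPS = %w(CSA ATV SAG STNK STPA SGND SNDEQK NDEQHK NEQHRK FVLIM HFY).collect { |x| x.split('').sort }

# Hypothetical helper: classify a column of residues.
# A column matches a group when no residue is left after subtracting the group.
def conservation_level(residues)
  a = residues.map(&:upcase).sort.uniq
  return :identical if a.size == 1
  return :strong if STRONG_GROUPS.any? { |g| (a - g).empty? }
  return :weak if WEAK_GROUPS.any? { |g| (a - g).empty? }
  :none
end

STRONG_GROUPS.first            # => ["A", "S", "T"]
conservation_level(%w( S T ))  # => :strong
conservation_level(%w( C S ))  # => :weak
conservation_level(%w( W G ))  # => :none
```

Each table entry is a sorted array of single letters, which is why array subtraction is enough to test membership.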
Constants included from PropertyMethods
PropertyMethods::GAP_CHAR, PropertyMethods::GAP_REGEXP, PropertyMethods::MISSING_CHAR
Instance Attribute Summary
Attributes included from PropertyMethods
#gap_char, #gap_regexp, #missing_char, #seqclass
Instance Method Summary
- #consensus_iupac ⇒ Object
Returns an IUPAC consensus base for the site.
- #consensus_string(threshold = 1.0) ⇒ Object
Returns the consensus character of the site.
- #has_gap? ⇒ Boolean
Returns true if the site contains gaps.
- #match_line_amino(opt = {}) ⇒ Object
Returns the match-line character for the site (amino-acid version).
- #match_line_nuc(opt = {}) ⇒ Object
Returns the match-line character for the site (nucleic-acid version).
- #remove_gaps! ⇒ Object
Removes gaps in the site.
Methods included from PropertyMethods
#get_all_property, #is_gap?, #set_all_property
Instance Method Details
#consensus_iupac ⇒ Object
Returns an IUPAC consensus base for the site. If a consensus is found, returns a single-letter string. If not, returns nil.
# File 'lib/bio/alignment.rb', line 218
def consensus_iupac
  a = self.collect { |x| x.downcase }.sort.uniq
  if a.size == 1 then
    case a[0]
    when 'a', 'c', 'g', 't'
      a[0]
    when 'u'
      't'
    else
      IUPAC_NUC.find { |x| a[0] == x[0] } ? a[0] : nil
    end
  elsif r = IUPAC_NUC.find { |x| (a - x).size <= 0 } then
    r[0]
  else
    nil
  end
end
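The lookup can be tried without loading BioRuby. The following is a minimal sketch, assuming a copy of the IUPAC_NUC table under the hypothetical name `IUPAC_TABLE` and a hypothetical helper `iupac_consensus` that takes the site as an argument instead of being mixed in.

```ruby
# Copy of the IUPAC_NUC table: the first letter of each entry is the
# ambiguity code, the remaining letters are the bases it covers.
IUPAC_TABLE = [
  %w( t u ), %w( m a c ), %w( r a g ), %w( w a t u ), %w( s c g ),
  %w( y c t u ), %w( k g t u ), %w( v a c g m r s ),
  %w( h a c t u m w y ), %w( d a g t u r w k ), %w( b c g t u s y k ),
  %w( n a c g t u m r w s y k v h d b )
].freeze

# Hypothetical helper that follows the same steps as #consensus_iupac.
def iupac_consensus(site)
  bases = site.map(&:downcase).sort.uniq
  if bases.size == 1
    base = bases[0]
    return base if %w( a c g t ).include?(base)
    return 't' if base == 'u'
    # A lone ambiguity code is returned unchanged; anything else gives nil.
    return IUPAC_TABLE.any? { |entry| entry[0] == base } ? base : nil
  end
  # The first entry that covers every observed base wins.
  entry = IUPAC_TABLE.find { |e| (bases - e).empty? }
  entry && entry[0]
end

iupac_consensus(%w( a a a ))  # => "a"
iupac_consensus(%w( A G a ))  # => "r"
iupac_consensus(%w( c t ))    # => "y"
iupac_consensus(%w( a - ))    # => nil
```

Because the table is ordered from the narrowest to the widest group, the first match is also the most specific code.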
#consensus_string(threshold = 1.0) ⇒ Object
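Returns the consensus character of the site. If a consensus is found, returns a single-letter string. If not, returns nil. A character is the consensus when its count is at least threshold times the number of elements in the site; ties are broken in favor of the character that appears first.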
# File 'lib/bio/alignment.rb', line 181
def consensus_string(threshold = 1.0)
  return nil if self.size <= 0
  return self[0] if self.sort.uniq.size == 1
  h = Hash.new(0)
  self.each { |x| h[x] += 1 }
  total = self.size
  b = h.to_a.sort do |x,y|
    z = (y[1] <=> x[1])
    z = (self.index(x[0]) <=> self.index(y[0])) if z == 0
    z
  end
  if total * threshold <= b[0][1] then
    b[0][0]
  else
    nil
  end
end
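To see how the threshold interacts with the counts, here is a small standalone sketch. The helper name `threshold_consensus` is hypothetical, and it takes the site as a plain array argument rather than relying on the mixin.

```ruby
# Hypothetical standalone helper that mirrors the logic of #consensus_string.
def threshold_consensus(site, threshold = 1.0)
  return nil if site.empty?
  return site[0] if site.uniq.size == 1
  counts = Hash.new(0)
  site.each { |x| counts[x] += 1 }
  # Highest count first; ties go to the character seen earliest in the site.
  top, count = counts.min_by { |char, n| [-n, site.index(char)] }
  site.size * threshold <= count ? top : nil
end

site = %w( a a a g )
threshold_consensus(site)            # => nil  (3 of 4 is below 100%)
threshold_consensus(site, 0.75)      # => "a"  (4 * 0.75 = 3.0 <= 3)
threshold_consensus(site, 0.8)       # => nil  (4 * 0.8 = 3.2 > 3)
threshold_consensus(%w( a g ), 0.5)  # => "a"  (tie, first occurrence wins)
```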
#has_gap? ⇒ Boolean
If there are gaps, returns true. Otherwise, returns false.
# File 'lib/bio/alignment.rb', line 164
def has_gap?
  (find { |x| is_gap?(x) }) ? true : false
end
#match_line_amino(opt = {}) ⇒ Object
Returns the match-line character for the site. This is amino-acid version.
# File 'lib/bio/alignment.rb', line 258
def match_line_amino(opt = {})
  # opt[:match_line_char]   ==> 100% equal    default: '*'
  # opt[:strong_match_char] ==> strong match  default: ':'
  # opt[:weak_match_char]   ==> weak match    default: '.'
  # opt[:mismatch_char]     ==> mismatch      default: ' '
  mlc = (opt[:match_line_char]   or '*')
  smc = (opt[:strong_match_char] or ':')
  wmc = (opt[:weak_match_char]   or '.')
  mmc = (opt[:mismatch_char]     or ' ')
  a = self.collect { |c| c.upcase }.sort.uniq
  a.extend(SiteMethods)
  if a.has_gap? then
    mmc
  elsif a.size == 1 then
    mlc
  elsif StrongConservationGroups.find { |x| (a - x).empty? } then
    smc
  elsif WeakConservationGroups.find { |x| (a - x).empty? } then
    wmc
  else
    mmc
  end
end
#match_line_nuc(opt = {}) ⇒ Object
Returns the match-line character for the site. This is nucleic-acid version.
# File 'lib/bio/alignment.rb', line 284
def match_line_nuc(opt = {})
  # opt[:match_line_char] ==> 100% equal  default: '*'
  # opt[:mismatch_char]   ==> mismatch    default: ' '
  mlc = (opt[:match_line_char] or '*')
  mmc = (opt[:mismatch_char]   or ' ')
  a = self.collect { |c| c.upcase }.sort.uniq
  a.extend(SiteMethods)
  if a.has_gap? then
    mmc
  elsif a.size == 1 then
    mlc
  else
    mmc
  end
end
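The per-site result is easiest to read when it is computed for every column of a small alignment. The sketch below is only an illustration: `nuc_match_char` is a hypothetical stand-in for #match_line_nuc, the gap character is assumed to be '-', and the columns are built by transposing plain strings rather than by using any BioRuby alignment class.

```ruby
# Hypothetical stand-in for #match_line_nuc, with '-' assumed as the gap character.
def nuc_match_char(site, opt = {})
  match    = opt[:match_line_char] || '*'
  mismatch = opt[:mismatch_char]   || ' '
  a = site.map(&:upcase).uniq
  return mismatch if a.include?('-')  # any gap in the column counts as a mismatch
  a.size == 1 ? match : mismatch
end

seqs = %w( ACG-T acGAT ACTAT )
columns = seqs.map(&:chars).transpose  # one array of characters per column

columns.map { |col| nuc_match_char(col) }.join
# => "**  *"
columns.map { |col| nuc_match_char(col, :mismatch_char => '.') }.join
# => "**..*"
```

Comparison is case-insensitive, so the lowercase letters in the second sequence still count as matches.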
#remove_gaps! ⇒ Object
Removes gaps in the site. (destructive method) Returns self if any gap was removed; otherwise returns nil.
# File 'lib/bio/alignment.rb', line 169
def remove_gaps!
  flag = nil
  self.collect! do |x|
    if is_gap?(x) then flag = self; nil; else x; end
  end
  self.compact!
  flag
end
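The return value follows the usual Ruby convention for bang methods: the receiver when something changed, nil otherwise. A minimal sketch of the same behavior on a plain array, assuming '-' as the gap character and using the hypothetical helper name `strip_gaps!`:

```ruby
# Hypothetical helper; Array#reject! already returns nil when nothing was removed
# and the array itself when at least one element was removed.
def strip_gaps!(site, gap_char = '-')
  site.reject! { |x| x == gap_char }
end

site = %w( a - c - )
strip_gaps!(site)  # => ["a", "c"]  (the same array object, now without gaps)
strip_gaps!(site)  # => nil         (no gaps left to remove)
```

This makes it possible to write `if site.remove_gaps!` style checks, as long as the caller remembers that nil means "unchanged", not "failed".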