Module: RGFA::Sequence

Included in:
String
Defined in:
lib/rgfa/sequence.rb

Overview

Extensions of the String class to handle nucleotidic sequences

Constant Summary

WCC =

Watson-Crick Complements

{"a"=>"t","t"=>"a","A"=>"T","T"=>"A",
"c"=>"g","g"=>"c","C"=>"G","G"=>"C",
"b"=>"v","B"=>"V","v"=>"b","V"=>"B",
"h"=>"d","H"=>"D","d"=>"h","D"=>"H",
"R"=>"Y","Y"=>"R","r"=>"y","y"=>"r",
"K"=>"M","M"=>"K","k"=>"m","m"=>"k",
"S"=>"S","s"=>"s","w"=>"w","W"=>"W",
"n"=>"n","N"=>"N","u"=>"a","U"=>"A",
"-"=>"-","."=>".","="=>"=",
" "=>"","\n"=>""}
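
The table is consulted one character at a time. A minimal sketch of such a lookup, using a reduced stand-in table (`MINI_WCC` and `complement_char` are hypothetical names, not part of RGFA) and Ruby's `Hash#fetch` with a default value:

```ruby
# MINI_WCC is a hypothetical, reduced stand-in for the WCC table above.
MINI_WCC = {
  "A" => "T", "T" => "A", "C" => "G", "G" => "C",
  "R" => "Y", "Y" => "R", "N" => "N",
  " " => "", "\n" => ""
}.freeze

# Look up one character. With tolerant: true an unknown character maps to
# itself; otherwise nil is returned so the caller can decide how to report it.
def complement_char(char, tolerant: false)
  MINI_WCC.fetch(char, tolerant ? char : nil)
end

complement_char("R")                  # => "Y"
complement_char(" ")                  # => "" (whitespace is dropped)
complement_char("x")                  # => nil
complement_char("x", tolerant: true)  # => "x"
```

Mapping spaces and newlines to the empty string is what lets the result come out without whitespace.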

Instance Method Summary

• #rc(tolerant: false, rnasequence: false) ⇒ String: Computes the reverse complement of a nucleotidic sequence.

Instance Method Details

#rc(tolerant: false, rnasequence: false) ⇒ String

Computes the reverse complement of a nucleotidic sequence

Examples:

"ACTG".rc  # => "CAGT"
"acGT".rc  # => "ACgt"

Undefined sequence is represented by “*”:

"*".rc     # => "*"

Extended IUPAC Alphabet:

"ARBN".rc  # => "NVYT"

Usage with RNA sequences:

"ACUG".rc                    # => "CAGU"
"ACG".rc(rnasequence: true)  # => "CGU"
"ACUT".rc                    # (raises RuntimeError, both U and T)

Parameters:

• tolerant (Boolean) (defaults to: false)

if true, any character that is not a sequence symbol is complemented to itself

• rnasequence (Boolean) (defaults to: false)

if true, A and a are complemented to U and u; if false, this is done only when a U or u is found in the sequence, otherwise DNA is assumed
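
How the two options interact can be sketched with a standalone function. `revcomp` and `BASES` below are hypothetical names used for illustration; the logic follows the source listing of #rc but uses a reduced, uppercase-only table, so it is not the library itself:

```ruby
# BASES is a reduced, hypothetical complement table (uppercase only).
BASES = {
  "A" => "T", "T" => "A", "C" => "G", "G" => "C", "U" => "A", "N" => "N"
}.freeze

# Hypothetical standalone counterpart of String#rc, for illustration only.
def revcomp(seq, tolerant: false, rnasequence: false)
  complemented = seq.each_char.map do |c|
    rnasequence = true if c == "U"             # a U switches to RNA mode
    raise "both U and T" if rnasequence && c == "T"
    comp = BASES.fetch(c, tolerant ? c : nil)  # tolerant: unknown maps to itself
    raise "no complement for #{c}" if comp.nil?
    comp
  end
  result = complemented.reverse.join
  rnasequence ? result.tr("T", "U") : result   # RNA output uses U instead of T
end

revcomp("ACG")                     # => "CGT"
revcomp("ACG", rnasequence: true)  # => "CGU"
revcomp("AC?G", tolerant: true)    # => "C?GT"
```

Without `tolerant: true`, the last call would raise, since "?" has no entry in the table.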

Returns:

• (String)

reverse complement, without newlines and spaces

• (String)

“*” if string is “*”

Raises:

• (RuntimeError)

if not tolerant and chars are found for which no Watson-Crick complement is defined

• (RuntimeError)

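if the sequence contains both U and T; according to the source listing, the check is triggered by a T or t that follows a U or u, or by any T or t when rnasequence is true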
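
Going by the source listing, the U/T check runs during a single left-to-right scan, so a T is rejected only once RNA mode is active. A minimal sketch of just that scan (the name `mixed_ut_error?` is hypothetical and not part of RGFA) shows the consequence:

```ruby
# Hypothetical helper that mirrors only the U/T scan from the source listing.
# Returns true if the scan would raise, false otherwise.
def mixed_ut_error?(seq, rnasequence: false)
  seq.each_char do |c|
    if c == "U" || c == "u"
      rnasequence = true               # RNA mode is switched on by the first U
    elsif rnasequence && (c == "T" || c == "t")
      return true                      # T seen while in RNA mode
    end
  end
  false
end

mixed_ut_error?("ACUT")                    # => true  (T after U)
mixed_ut_error?("ACTU")                    # => false (T seen before any U)
mixed_ut_error?("ACT", rnasequence: true)  # => true
```

Callers that need a strict check regardless of order may want to validate the input themselves before calling #rc.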

# File 'lib/rgfa/sequence.rb', line 32

def rc(tolerant: false, rnasequence: false)
  return "*" if self == "*"
  retval = each_char.map do |c|
    if c == "U" or c == "u"
      rnasequence = true
    elsif rnasequence and (c == "T" or c == "t")
      raise "String contains both U/u and T/t"
    end
    wcc = WCC.fetch(c, tolerant ? c : nil)
    raise "#{self}: no Watson-Crick complement for #{c}" if wcc.nil?
    wcc
  end.reverse.join
  if rnasequence
    retval.tr!("tT","uU")
  end
  retval
end