Class: String

Inherits:
Object
Defined in:
lib/viral_seq/string.rb

Overview

Methods added to the core String class so that nucleotide sequences can be operated on directly as String objects.

Instance Method Details

#compare_with(seq2) ⇒ Integer

Compares two sequences given as String objects and returns the number of positions at which they differ. The two sequence strings need to be aligned first.

Examples:

compare two sequence strings, without alignment and with alignment

seq1 = 'AAGGCGTAGGAC'
seq2 = 'AAGCTTAGGACG'
seq1.compare_with(seq2) # no alignment
=> 8
aligned_seqs = ViralSeq::Muscle.align(seq1,seq2) # align using MUSCLE
aligned_seqs[0].compare_with(aligned_seqs[1])
=> 4


# File 'lib/viral_seq/string.rb', line 108

def compare_with(seq2)
  seq1 = self
  length = seq1.size
  diff = 0
  (0..(length-1)).each do |position|
    nt1 = seq1[position]
    nt2 = seq2[position]
    diff += 1 unless nt1 == nt2
  end
  return diff
end
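
The listing above walks the receiver position by position, so when seq2 is shorter than the receiver the missing positions (nil) are counted as differences. As a minimal sketch of the same position-wise count, assuming a hypothetical standalone helper named hamming_distance (not part of ViralSeq) that rejects unequal lengths instead:

```ruby
# Hypothetical helper, not part of ViralSeq: count mismatched positions
# between two already-aligned sequences of equal length.
def hamming_distance(seq1, seq2)
  unless seq1.size == seq2.size
    raise ArgumentError, "sequences must be the same length (align them first)"
  end
  # Pair up the characters and count the pairs that differ.
  seq1.each_char.zip(seq2.chars).count { |nt1, nt2| nt1 != nt2 }
end

hamming_distance("AAGGCGTAGGAC", "AAGCTTAGGACG") # => 8
```

Raising on a length mismatch is a design choice of this sketch; it makes an unaligned input visible rather than folding it into the count.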

#mutation(error_rate = 0.01) ⇒ String

Randomly mutates a nucleotide sequence (a String). Each base is replaced by one of the other three bases with probability error_rate.

Examples:

mutate a sequence at an error rate of 0.05

seq = "TGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTG"
seq.mutation(0.05)
=> "TGGAAGGGCTAATGCACTCCCAACGAAGACACGATATCCTTGATCTGTGGATCTACGACACACAAGGCTGCTTCCCTG"


# File 'lib/viral_seq/string.rb', line 23

def mutation(error_rate = 0.01)
  new_string = ""
  self.split("").each do |nt|
    pool = ["A","C","T","G"]
    pool.delete(nt)
    s = error_rate * 10000
    r = rand(10000)
    if r < s
      nt = pool.sample
    end
    new_string << nt
  end
  return new_string
end
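
Because the listing above draws from the global random number generator, its output changes from run to run. The following is a sketch of the same per-base substitution, assuming a hypothetical helper named mutate_with_rng (not part of ViralSeq) that takes an explicit Random so a run can be repeated from a seed:

```ruby
# Hypothetical helper, not part of ViralSeq: substitute each base with
# probability error_rate, drawing from an explicit Random for repeatability.
def mutate_with_rng(seq, error_rate = 0.01, rng = Random.new)
  seq.each_char.map do |nt|
    if rng.rand < error_rate
      # Choose a replacement that differs from the current base.
      (%w[A C T G] - [nt]).sample(random: rng)
    else
      nt
    end
  end.join
end

seq = "TGGAAGGGCTAATTCACTCCCAACGAAGAC"
first  = mutate_with_rng(seq, 0.05, Random.new(42))
second = mutate_with_rng(seq, 0.05, Random.new(42))
first == second # => true, the same seed gives the same mutated sequence
```

The sketch compares a float from rng.rand against error_rate directly, whereas the listing above compares rand(10000) against error_rate * 10000, which resolves error rates in steps of 0.0001.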

#nt_parser ⇒ Regexp

Parses a nucleotide sequence (a String) that may contain IUPAC ambiguity codes and returns a Regexp object for the possible matches.

Examples:

parse a sequence with ambiguities

"ATRWCG".nt_parser
=> /AT[A|G][A|T]CG/


# File 'lib/viral_seq/string.rb', line 45

def nt_parser
  match = ""
  self.each_char.each do |base|
    base_array = base.to_list
    if base_array.size == 1
      match += base_array[0]
    else
      pattern = "[" + base_array.join("|") + "]"
      match += pattern
    end
  end
  Regexp.new match
end
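
The returned pattern joins the candidate bases with "|" inside square brackets. In a Ruby character class "|" is a literal character rather than alternation, so the pattern also accepts a pipe at an ambiguous position. A short sketch of that behaviour, where without_pipes is a hypothetical variant written by hand for comparison (not something ViralSeq produces):

```ruby
# The pattern shown in the example above, and a hypothetical variant
# (not produced by ViralSeq) whose character classes omit the "|".
with_pipes    = /AT[A|G][A|T]CG/
without_pipes = /AT[AG][AT]CG/

# Both match every sequence that "ATRWCG" can stand for.
%w[ATAACG ATATCG ATGACG ATGTCG].each do |seq|
  unless with_pipes.match?(seq) && without_pipes.match?(seq)
    raise "expected a match for #{seq}"
  end
end

# Inside a character class "|" is a literal, so only the first pattern
# accepts a pipe in the ambiguous positions.
pipe_hit_with    = with_pipes.match?("AT|ACG")     # true
pipe_hit_without = without_pipes.match?("AT|ACG")  # false

# Locate a match inside a longer sequence; =~ returns the start index.
hit_index = "GGATGTCGAA" =~ with_pipes             # 2
```

For sequences made only of nucleotide letters the two patterns behave the same, so the difference matters only if the searched text can contain "|".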

#rc ⇒ String

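Returns the reverse complement of the sequence. Only uppercase A, C, T and G are complemented; any other character is kept as is in the reversed string.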

Examples:

Reverse complement

"ACAGA".rc
=> "TCTGT"


# File 'lib/viral_seq/string.rb', line 11

def rc
  self.reverse.tr("ACTG","TGAC")
end
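
Since the listing above complements only uppercase A, C, T and G, a sequence with ambiguity codes or lowercase bases is reversed but only partly complemented. A sketch of one way to extend the same String#tr approach, assuming a hypothetical helper named reverse_complement_iupac (not part of ViralSeq) and the standard IUPAC complements (R/Y, K/M, B/V, D/H; W, S and N are their own complements):

```ruby
# Hypothetical helper, not part of ViralSeq: reverse complement that also
# handles IUPAC ambiguity codes and lowercase bases.
IUPAC_FROM = "ACGTRYKMBVDHacgtrykmbvdh".freeze
IUPAC_TO   = "TGCAYRMKVBHDtgcayrmkvbhd".freeze

def reverse_complement_iupac(seq)
  # W, S and N are self-complementary, so they are left out of the
  # translation tables and pass through unchanged.
  seq.reverse.tr(IUPAC_FROM, IUPAC_TO)
end

reverse_complement_iupac("ACAGA")  # => "TCTGT"
reverse_complement_iupac("ATRWCG") # => "CGWYAT"
```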

#to_list ⇒ Array

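Expands a single-character String holding an IUPAC nucleotide ambiguity code (W S M K R Y B D H V N) into an Array of the bases it represents. A, T, C and G return themselves, and any other input returns an empty Array. Intended for Strings with String.size == 1.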

Examples:

parse IUPAC R

'R'.to_list
=> ["A", "G"]


# File 'lib/viral_seq/string.rb', line 65

def to_list
  list = []
  case self.upcase
  when /[A|T|C|G]/
    list << self
  when "W"
    list = ['A','T']
  when "S"
    list = ['C','G']
  when "M"
    list = ['A','C']
  when 'K'
    # IUPAC K (keto) stands for G or T
    list = ['G','T']
  when 'R'
    list = ['A','G']
  when 'Y'
    list = ['C','T']
  when 'B'
    list = ['C','G','T']
  when 'D'
    list = ['A','G','T']
  when 'H'
    list = ['A','C','T']
  when 'V'
    list = ['A','C','G']
  when 'N'
    list = ['A','T','C','G']
  end
  return list
end
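
The same expansion can be expressed as a table lookup, which keeps all the code-to-bases pairs in one place. This is only a sketch: the constant IUPAC_CODES and the helper expand_iupac are hypothetical names, not part of ViralSeq, and the table follows the IUPAC nomenclature, in which K is G or T.

```ruby
# Hypothetical table and helper, not part of ViralSeq.
IUPAC_CODES = {
  "A" => %w[A],     "C" => %w[C],     "G" => %w[G],     "T" => %w[T],
  "W" => %w[A T],   "S" => %w[C G],   "M" => %w[A C],   "K" => %w[G T],
  "R" => %w[A G],   "Y" => %w[C T],
  "B" => %w[C G T], "D" => %w[A G T], "H" => %w[A C T], "V" => %w[A C G],
  "N" => %w[A C G T]
}.freeze

def expand_iupac(code)
  # Unknown codes give an empty Array, mirroring the listing above.
  # dup keeps callers from modifying the arrays stored in the table.
  IUPAC_CODES.fetch(code.upcase, []).dup
end

expand_iupac("R") # => ["A", "G"]
expand_iupac("K") # => ["G", "T"]
expand_iupac("X") # => []
```

Unlike the listing above, this sketch returns uppercase bases even for lowercase input, because the lookup key is upcased before the table is consulted.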