Module: Bio::Sequence::SequenceMasker
- Included in: Bio::Sequence
- Defined in: lib/bio/sequence/sequence_masker.rb
Overview
Bio::Sequence::SequenceMasker is a mix-in module to provide helpful methods for masking a sequence.
It is only expected to be included in Bio::Sequence. In the future, the methods in this module might be moved to Bio::Sequence or another module, and this module might be removed. Please do not depend on this module.
Instance Method Summary
- #mask_with_enumerator(enum, mask_char) ⇒ Object
  Masks the sequence with each value in the enum.
- #mask_with_error_probability(threshold, mask_char) ⇒ Object
  Masks high error-probability sequence regions.
- #mask_with_quality_score(threshold, mask_char) ⇒ Object
  Masks low quality sequence regions.
Instance Method Details
#mask_with_enumerator(enum, mask_char) ⇒ Object
Masks the sequence with each value in the enum. The enum should be an array or enumerator. A block must be given. The block is called with each value; when it returns true, the sequence position at that value's index is replaced with mask_char.
Arguments:
- (required) enum : Enumerator
- (required) mask_char : (String) character used for masking
Returns:
- Bio::Sequence object
# File 'lib/bio/sequence/sequence_masker.rb', line 39
def mask_with_enumerator(enum, mask_char)
  offset = 0
  # Each replacement grows the string by this many characters.
  unit = mask_char.length - 1
  s = self.seq.class.new(self.seq)
  j = 0
  enum.each_with_index do |item, index|
    if yield item then
      # Shift the index by the growth caused by earlier replacements.
      j = index + offset
      if j < s.length then
        s[j, 1] = mask_char
        offset += unit
      end
    end
  end
  # Return a masked copy; the receiver is left unchanged.
  newseq = self.dup
  newseq.seq = s
  newseq
end
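The offset bookkeeping in this method matters only when mask_char is longer than one character. The following is a minimal stand-alone sketch of the same loop on a plain String, so it runs without BioRuby. The helper name mask_positions and the sample data are assumptions made for illustration and are not part of the library.

```ruby
# Hypothetical helper (not part of BioRuby): applies the same masking loop
# to a plain String instead of a Bio::Sequence object.
def mask_positions(str, enum, mask_char)
  offset = 0
  unit = mask_char.length - 1        # growth of the string per replacement
  s = str.dup
  enum.each_with_index do |item, index|
    next unless yield(item)          # mask only where the block returns true
    j = index + offset               # account for earlier, wider replacements
    if j < s.length
      s[j, 1] = mask_char
      offset += unit
    end
  end
  s
end

scores = [30, 5, 30, 8, 30, 30, 2, 30]   # made-up per-position values

masked = mask_positions("atgcatgc", scores, "n")  { |q| q < 10 }
wide   = mask_positions("atgcatgc", scores, "NN") { |q| q < 10 }

puts masked   # angnatnc
puts wide     # aNNgNNatNNc
```

With a two-character mask, the positions at original indexes 1, 3 and 6 are still the ones replaced, because each earlier replacement shifts the later target indexes by one.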
#mask_with_error_probability(threshold, mask_char) ⇒ Object
Masks high error-probability sequence regions. For each sequence position, if the error probability is larger than the threshold, the residue at that position is replaced with mask_char. If the sequence has no error probabilities, nothing is masked.
Arguments:
- (required) threshold : (Numeric) threshold
- (required) mask_char : (String) character used for masking
Returns:
- Bio::Sequence object
# File 'lib/bio/sequence/sequence_masker.rb', line 86
def mask_with_error_probability(threshold, mask_char)
  values = self.error_probabilities || []
  mask_with_enumerator(values, mask_char) do |item|
    item > threshold
  end
end
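The comparison here is strict: a position whose error probability equals the threshold is not masked. As a rough illustration, the same rule can be applied to plain Ruby values. The variable names and data below are made up, and the sketch assumes a one-character mask and one probability per position.

```ruby
# Illustration only: plain String and Array, not Bio::Sequence.
seq       = "atgcatgc"
probs     = [0.001, 0.2, 0.04, 0.5, 0.001, 0.001, 0.05, 0.001]  # made-up values
threshold = 0.04

# Replace a base only when its error probability is strictly above the threshold.
masked_ep = seq.chars.zip(probs).map { |base, prob| prob > threshold ? "n" : base }.join

puts masked_ep   # angnatnc
```

The third position (probability 0.04) is kept because 0.04 is not larger than the threshold.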
#mask_with_quality_score(threshold, mask_char) ⇒ Object
Masks low quality sequence regions. For each sequence position, if the quality score is smaller than the threshold, the residue at that position is replaced with mask_char. If the sequence has no quality scores, nothing is masked.
Note: This method does not take quality_score_type into account.
Arguments:
- (required) threshold : (Numeric) threshold
- (required) mask_char : (String) character used for masking
Returns:
- Bio::Sequence object
# File 'lib/bio/sequence/sequence_masker.rb', line 69
def mask_with_quality_score(threshold, mask_char)
  scores = self.quality_scores || []
  mask_with_enumerator(scores, mask_char) do |item|
    item < threshold
  end
end
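Two details of this method are easy to miss: missing quality scores are treated as an empty list, and the result is a new object rather than a modified receiver. The sketch below mimics both with a small stand-in class. Read is a hypothetical Struct invented for this example, not the BioRuby class, and it assumes a one-character mask.

```ruby
# Hypothetical stand-in for a sequence object; not Bio::Sequence.
Read = Struct.new(:seq, :quality_scores) do
  def mask_with_quality_score(threshold, mask_char)
    scores = quality_scores || []      # no scores: nothing is masked
    out = seq.dup                      # work on a copy, keep the receiver intact
    scores.each_with_index do |score, i|
      out[i, 1] = mask_char if score < threshold && i < out.length
    end
    self.class.new(out, quality_scores)
  end
end

read        = Read.new("atgcatgc", [30, 5, 30, 10, 30, 30, 2, 30])  # made-up scores
masked_read = read.mask_with_quality_score(10, "n")
unscored    = Read.new("atgc", nil).mask_with_quality_score(10, "n")

puts masked_read.seq   # angcatnc
puts read.seq          # atgcatgc
puts unscored.seq      # atgc
```

The score of 10 at the fourth position is not masked, since only scores strictly below the threshold qualify.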