Class: BioDSL::ClipPrimer
- Inherits: Object
- Defined in: lib/BioDSL/commands/clip_primer.rb
Overview
Clip sequences in the stream at a specified primer location.
clip_primer locates a specified primer in sequences in the stream and clips each sequence at the match. If the direction is forward, the match and everything before it are removed, so the sequence after the match is kept. If the direction is reverse, the match and everything after it are removed, so the sequence before the match is kept. With the reverse_complement option, the primer sequence is reverse complemented prior to matching. The search_distance option limits the primer search to the beginning of the sequence if the direction is forward and to the end of the sequence if the direction is reverse.
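The search behaviour described above can be sketched in plain Ruby. This is an illustration only, not BioDSL's implementation: the helpers revcomp and locate_primer are hypothetical names, matching is exact and case-insensitive by assumption, and search_distance is assumed to mean that the whole match must lie within that many residues of the relevant end.

```ruby
# Hypothetical helpers for illustration; these are not part of BioDSL.

# Reverse complement a DNA string (IUPAC ambiguity codes are not handled).
def revcomp(dna)
  dna.reverse.tr('ACGTacgt', 'TGCAtgca')
end

# Return the 0-based position of an exact, case-insensitive primer match,
# or nil if the primer does not occur inside the search window.
def locate_primer(seq, primer, direction:, reverse_complement: false, search_distance: nil)
  primer   = revcomp(primer) if reverse_complement
  haystack = seq.upcase
  needle   = primer.upcase
  distance = search_distance || seq.length

  case direction
  when :forward
    # Assumption: the match must lie entirely within the first `distance` residues.
    haystack[0, distance].index(needle)
  when :reverse
    # Assumption: the match must lie entirely within the last `distance` residues.
    start = [seq.length - distance, 0].max
    haystack.index(needle, start)
  else
    raise ArgumentError, "unknown direction: #{direction.inspect}"
  end
end
```

Narrowing search_distance until the primer no longer fits inside the window makes the lookup return nil, which corresponds to a record that would be left unclipped.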
Non-perfect matching can be allowed by setting mismatch_percent, insertion_percent and deletion_percent to the allowed percentage of mismatches, insertions and deletions.
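A percentage has to be turned into an absolute number of allowed edits before matching. The sketch below assumes the percentage is taken relative to the primer length and rounded to the nearest integer; the source does not state either detail, and allowed_count is a hypothetical helper name.

```ruby
# Hypothetical helper: convert an allowed percentage into an absolute count.
# Assumption: the percentage is relative to the primer length and is rounded
# to the nearest integer.
def allowed_count(primer, percent)
  (primer.length * percent * 0.01).round
end

primer = 'TCGTATGCCGTCTTCTGCTT'       # 20 residues
mismatches = allowed_count(primer, 10) # 10 percent of 20 residues
```

Under these assumptions a 20-residue primer with mismatch_percent: 10 tolerates two mismatches, while small percentages on short primers round down to zero and behave like exact matching.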
The following keys are added to clipped records:
- CLIP_PRIMER_DIR - Direction of clip.
- CLIP_PRIMER_POS - Sequence position of clip (0 based).
- CLIP_PRIMER_LEN - Length of clip match.
- CLIP_PRIMER_PAT - Clip match pattern.
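As a rough sketch of how a record might look after clipping, the hypothetical helper below takes a record hash plus an already located match (position and length) and returns the clipped, annotated record. It is not BioDSL code. Storing the matched stretch of the sequence in CLIP_PRIMER_PAT is an assumption; the source only calls it the clip match pattern.

```ruby
# Hypothetical helper: clip a record at a known match and add the CLIP_PRIMER_* keys.
def clip_record(record, pos, len, direction)
  seq = record[:SEQ]

  clipped =
    case direction
    when :forward then seq[(pos + len)..-1] # keep what follows the match
    when :reverse then seq[0, pos]          # keep what precedes the match
    else raise ArgumentError, "unknown direction: #{direction.inspect}"
    end

  record.merge(
    SEQ:             clipped,
    SEQ_LEN:         clipped.length,
    CLIP_PRIMER_DIR: direction.to_s.upcase,
    CLIP_PRIMER_POS: pos,
    CLIP_PRIMER_LEN: len,
    CLIP_PRIMER_PAT: seq[pos, len] # assumption: the matched text is stored
  )
end
```

Because the helper uses merge, the input record is left untouched and a new hash is returned.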
Usage
clip_primer(<primer: <string>>, <direction: <:forward|:reverse>>
            [, reverse_complement: <bool>[, search_distance: <uint>
            [, mismatch_percent: <uint>
            [, insertion_percent: <uint>
            [, deletion_percent: <uint>]]]]])
Options
- primer: <string> - Primer sequence to search for.
- direction: <:forward|:reverse> - Clip direction.
- reverse_complement: <bool> - Reverse complement primer (default=false).
- search_distance: <uint> - Search distance from forward or reverse end.
- mismatch_percent: <uint> - Allowed percent mismatches (default=0).
- insertion_percent: <uint> - Allowed percent insertions (default=0).
- deletion_percent: <uint> - Allowed percent deletions (default=0).
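The documented defaults and value constraints can be pictured as a small normalisation step over the options hash. This is only a sketch: normalize_options is a hypothetical helper rather than a BioDSL method, and the 0..100 upper bound on the percentages is an assumption that the source does not state.

```ruby
# Hypothetical helper: apply the documented defaults and check the option values.
def normalize_options(options)
  defaults = {
    reverse_complement: false,
    mismatch_percent:   0,
    insertion_percent:  0,
    deletion_percent:   0
  }
  opts = defaults.merge(options)

  unless opts[:primer].is_a?(String) && !opts[:primer].empty?
    raise ArgumentError, 'primer must be a non-empty string'
  end
  unless %i[forward reverse].include?(opts[:direction])
    raise ArgumentError, 'direction must be :forward or :reverse'
  end

  %i[mismatch_percent insertion_percent deletion_percent].each do |key|
    value = opts[key]
    # Assumption: percentages are integers between 0 and 100.
    unless value.is_a?(Integer) && (0..100).cover?(value)
      raise ArgumentError, "#{key} must be an integer in 0..100"
    end
  end

  opts
end
```

Calling normalize_options(primer: "ACGT", direction: :forward) returns the options with reverse_complement set to false and all three percentages set to 0.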
Examples
Consider the following FASTA entry in the file test.fna:
>test
actgactgaTCGTATGCCGTCTTCTGCTTactacgt
To clip this sequence in the forward direction with the primer ‘TCGTATGCCGTCTTCTGCTT’ do:
BD.new.
read_fasta(input: "test.fna").
clip_primer(primer: "TCGTATGCCGTCTTCTGCTT", direction: :forward).
dump.
run
{:SEQ_NAME=>"test",
 :SEQ=>"actacgt",
 :SEQ_LEN=>7,
 :CLIP_PRIMER_DIR=>"FORWARD",
 :CLIP_PRIMER_POS=>9,
 :CLIP_PRIMER_LEN=>20,
 :CLIP_PRIMER_PAT=>"TCGTATGCCGTCTTCTGCTT"}
Or in the reverse direction:
BD.new.
read_fasta(input: "test.fna").
clip_primer(primer: "TCGTATGCCGTCTTCTGCTT", direction: :reverse).
dump.
run
{:SEQ_NAME=>"test",
 :SEQ=>"actgactga",
 :SEQ_LEN=>9,
 :CLIP_PRIMER_DIR=>"REVERSE",
 :CLIP_PRIMER_POS=>9,
 :CLIP_PRIMER_LEN=>20,
 :CLIP_PRIMER_PAT=>"TCGTATGCCGTCTTCTGCTT"}
Constant Summary
- STATS = %i(records_in records_out sequences_in sequences_out residues_in residues_out pattern_hits pattern_misses)
Instance Method Summary
- #initialize(options) ⇒ ClipPrimer (constructor)
  Constructor for ClipPrimer.
- #lmb ⇒ Proc
  Lambda for ClipPrimer command.
Constructor Details
#initialize(options) ⇒ ClipPrimer
Constructor for ClipPrimer.
# File 'lib/BioDSL/commands/clip_primer.rb', line 122
def initialize(options)
  @options = options
  defaults
  @primer  = primer
  @mis     = calc_mis
  @ins     = calc_ins
  @del     = calc_del
end
Instance Method Details
#lmb ⇒ Proc
Lambda for ClipPrimer command.
# File 'lib/BioDSL/commands/clip_primer.rb', line 136
def lmb
  lambda do |input, output, status|
    status_init(status, STATS)

    input.each do |record|
      @status[:records_in] += 1

      clip_primer(record) if record[:SEQ] && record[:SEQ].length > 0

      output << record
      @status[:records_out] += 1
    end
  end
end
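The shape of this lambda can be tried out without BioDSL by feeding it plain Ruby collections. The sketch below is a stand-in, not the library's code: build_lmb is a hypothetical name, the counters are zeroed inline on the assumption that this is what status_init does, and the clipping step is passed in as a block.

```ruby
# Hypothetical stand-in for the command lambda, runnable on plain arrays and hashes.
def build_lmb(stats, &clip)
  lambda do |input, output, status|
    # Assumed role of status_init: start every counter at zero.
    stats.each { |key| status[key] = 0 }

    input.each do |record|
      status[:records_in] += 1

      # Only records carrying a non-empty sequence are processed.
      clip.call(record) if record[:SEQ] && !record[:SEQ].empty?

      # Every record is passed downstream, processed or not.
      output << record
      status[:records_out] += 1
    end
  end
end

input  = [{ SEQ_NAME: 'a', SEQ: 'acgt' }, { FOO: 'bar' }]
output = []
status = {}

lmb = build_lmb(%i[records_in records_out]) { |record| record[:SEQ] = record[:SEQ].upcase }
lmb.call(input, output, status)
```

Records without a :SEQ key pass through unchanged, which mirrors the guard in front of clip_primer(record) in the method above.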