Class: Molecules::Libraries::Polypeptide
- Inherits: EmpiricalFormula
  - Object
  - EmpiricalFormula
  - Molecules::Libraries::Polypeptide
- Defined in: lib/molecules/libraries/polypeptide.rb
Overview
Represents a polypeptide as a sequence of residues. For convenience, a polypeptide sequence may contain whitespace, which allows direct use with peptide sequences parsed from FASTA-formatted input.
Currently, Polypeptide only handles sequences made up of common residues; any other non-whitespace character raises UnknownResidueError.
Defined Under Namespace
Classes: UnknownResidueError
Constant Summary
- SEQUENCE_TOKENS =
An array of tokens that may occur in a sequence, grouped as patterns (i.e. one token for all whitespace characters, and one token for each residue). Used to count the number of each type of residue in a sequence.
["\s\t\r\n"] + Residue.common.collect {|r| r.letter}
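The counting idea behind SEQUENCE_TOKENS can be sketched with plain Ruby. This is only an illustration: `LETTERS` is a hypothetical stand-in for the letters of `Residue.common`, and `count_tokens` assumes that `Utils.count` behaves like `String#count` applied once per token, which this page does not confirm.

```ruby
# Hypothetical stand-in for Residue.common.collect {|r| r.letter}.
LETTERS = %w[A C D E F G H I K L M N P Q R S T V W Y].freeze

# One token for all whitespace characters, then one token per residue letter.
TOKENS = (["\s\t\r\n"] + LETTERS).freeze

# Assumed behaviour of Utils.count: one String#count call per token.
def count_tokens(sequence, tokens)
  tokens.map { |token| sequence.count(token) }
end

counts = count_tokens("ACD A\nC", TOKENS)
whitespace = counts.shift # the first entry is the whitespace count

puts whitespace      # 2 (one space, one newline)
puts counts.take(3).inspect # counts for A, C and D: [2, 2, 1]
```

Because the whitespace token comes first, shifting it off leaves the remaining counts aligned with the residue order.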
Constants inherited from EmpiricalFormula
EmpiricalFormula::ELEMENT_INDEX, EmpiricalFormula::ELEMENT_INDEX_ORDER
Instance Attribute Summary
- #length ⇒ Object (readonly)
  The number of residues in self (may differ from sequence.length if sequence contains whitespace).
- #residue_composition ⇒ Object (readonly)
  A hash of (Residue, Integer) pairs defining the number of a given residue in self.
- #sequence ⇒ Object (readonly)
  The sequence of self (including whitespace).
Attributes inherited from EmpiricalFormula
Class Method Summary
- .normalize(sequence) ⇒ Object
  Normalizes the input sequence by removing whitespace and capitalizing.
Instance Method Summary
- #each_residue ⇒ Object
  Sequentially passes each residue in sequence to the block.
- #initialize(sequence) ⇒ Polypeptide (constructor)
  A new instance of Polypeptide.
Methods inherited from EmpiricalFormula
#*, #+, #-, #==, #each, mass, #mass, parse, parse_simple, #to_s
Methods included from Utils
Constructor Details
#initialize(sequence) ⇒ Polypeptide
Returns a new instance of Polypeptide.
# File 'lib/molecules/libraries/polypeptide.rb', line 36
def initialize(sequence)
  @sequence = sequence
  @length = 0
  @residue_composition = {}
  @formula = Array.new(5, 0)

  # count up the number of whitespaces and residues in self
  tokens = Utils.count(sequence, SEQUENCE_TOKENS)
  whitespace = tokens.shift

  if whitespace == sequence.length
    # as per the Base specification, factors
    # should have no trailing zeros
    @formula.clear
    return
  end

  # add the residue masses and factors
  Residue.common.each do |residue|
    # benchmarks indicated that counting for each residue
    # is quicker than trying anything like:
    #
    #   sequence.each_byte {|b| bytes[b] += 1}
    #
    # This is particularly an issue for long sequences. The
    # count operation could be optimized for isobaric residues
    n = tokens.shift
    next if n == 0

    @length += n
    @residue_composition[residue] = n
    Utils.add(@formula, residue.formula, n)
  end

  if @length + whitespace != sequence.length
    # raise an error if there are unaccounted characters
    raise UnknownResidueError, "unknown characters in sequence: #{sequence}"
  end
end
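The validation step in the constructor relies on simple arithmetic: if the residue count plus the whitespace count does not equal the sequence length, some character was not accounted for. The following is a minimal standalone sketch of that check, not the library's implementation. `LETTERS`, `residue_counts`, and the locally defined `UnknownResidueError` are hypothetical stand-ins, and element formulas are omitted.

```ruby
# Hypothetical stand-ins; the real class uses Residue.common and its own error class.
class UnknownResidueError < StandardError; end
LETTERS = %w[A C D E F G H I K L M N P Q R S T V W Y].freeze

# Returns [length, composition], where composition maps letter => count.
def residue_counts(sequence)
  whitespace = sequence.count("\s\t\r\n")
  length = 0
  composition = {}

  LETTERS.each do |letter|
    n = sequence.count(letter)
    next if n == 0

    length += n
    composition[letter] = n
  end

  # Any character that is neither whitespace nor a known letter
  # leaves a gap between the two totals.
  if length + whitespace != sequence.length
    raise UnknownResidueError, "unknown characters in sequence: #{sequence}"
  end

  [length, composition]
end

length, composition = residue_counts("AC A\n")
puts length              # 3
puts composition.inspect # counts for A (2) and C (1)
```

In this sketch the counts are case-sensitive, so a lowercase sequence would raise the error unless it is upcased first.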
Instance Attribute Details
#length ⇒ Object (readonly)
The number of residues in self (may differ from sequence.length if sequence contains whitespace).
# File 'lib/molecules/libraries/polypeptide.rb', line 28
def length
  @length
end
#residue_composition ⇒ Object (readonly)
A hash of (Residue, Integer) pairs defining the number of a given residue in self.
# File 'lib/molecules/libraries/polypeptide.rb', line 24
def residue_composition
  @residue_composition
end
#sequence ⇒ Object (readonly)
The sequence of self (including whitespace).
# File 'lib/molecules/libraries/polypeptide.rb', line 21
def sequence
  @sequence
end
Class Method Details
.normalize(sequence) ⇒ Object
Normalizes the input sequence by removing whitespace and capitalizing.
# File 'lib/molecules/libraries/polypeptide.rb', line 15
def normalize(sequence)
  sequence.gsub(/\s/, "").upcase
end
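As a quick illustration of what normalization does, the method body above can be tried as a standalone function. The top-level `normalize` defined here is a copy for demonstration only; in the library it is a class method on Polypeptide.

```ruby
# Standalone copy of the method body shown above, for demonstration only.
def normalize(sequence)
  sequence.gsub(/\s/, "").upcase
end

# A FASTA-style fragment with line breaks, spaces, and mixed case.
raw = "acde fg\nHIK l\r\n"
puts normalize(raw) # ACDEFGHIKL
```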
Instance Method Details
#each_residue ⇒ Object
Sequentially passes each residue in sequence to the block. Bytes with no entry in the residue index, such as whitespace, are skipped.
# File 'lib/molecules/libraries/polypeptide.rb', line 78
def each_residue
  residues = Residue.residue_index
  sequence.each_byte do |byte|
    residue = residues[byte]
    yield(residue) if residue
  end
end
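The byte-indexed lookup used by each_residue can be sketched without the library. In this illustration `SketchResidue`, `RESIDUE_INDEX`, and the two sample residues are hypothetical; the actual structure returned by `Residue.residue_index` is not shown on this page and is assumed here to be indexable by byte value.

```ruby
# Hypothetical residue type and index, for illustration only.
SketchResidue = Struct.new(:letter, :name)

# Array indexed by byte value; unassigned slots are nil.
RESIDUE_INDEX = []
[["A", "Alanine"], ["G", "Glycine"]].each do |letter, name|
  RESIDUE_INDEX[letter.ord] = SketchResidue.new(letter, name)
end

# Yields the residue for each byte that has an index entry; others are skipped.
def each_residue(sequence)
  sequence.each_byte do |byte|
    residue = RESIDUE_INDEX[byte]
    yield(residue) if residue
  end
end

names = []
each_residue("AG A") { |residue| names << residue.name }
puts names.inspect # three names: Alanine, Glycine, Alanine
```

Looking up by byte avoids allocating a one-character string per position, and whitespace is skipped because it has no index entry.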