Class: Molecules::EmpiricalFormula
- Inherits:
-
Object
- Object
- Molecules::EmpiricalFormula
- Includes:
- Enumerable, Utils
- Defined in:
- lib/molecules/empirical_formula.rb
Overview
EmpiricalFormula represents the empirical formula (ex ‘H(2)0’) for a molecule. The formula is stored as an array of integers aligned to the elements in EmpiricalFormula::ELEMENT_INDEX. Hence:
EmpiricalFormula::ELEMENT_INDEX[0].name # => "Hydrogen"
EmpiricalFormula::ELEMENT_INDEX[1].name # => "Oxygen"
water = EmpiricalFormula.new [2,1]
water.to_s # => 'H(2)O'
water.mass # => 18.0105646863
EmpiricalFormula may be added, subtracted, and multiplied to perform the expected operations:
alanine = EmpiricalFormula.new [5,1,3,1]
(alanine - water).formula # => [3,0,3,1]
Direct Known Subclasses
Defined Under Namespace
Classes: ParseError, UnknownElementError
Constant Summary collapse
- ELEMENT_INDEX_ORDER =
An array of all element symbols ordered roughly by their occurence in common biological molecules (ex water, carbohydrates, proteins).
['H', 'O', 'C', 'N', 'S', 'P', 'Fe', 'Ni', 'Se']
- ELEMENT_INDEX =
An array of all elements ordered as in ELEMENT_INDEX_ORDER
Element.library.collect :element_index do |e| unless ELEMENT_INDEX_ORDER.include?(e.symbol) ELEMENT_INDEX_ORDER << e.symbol end [e, ELEMENT_INDEX_ORDER.index(e.symbol)] end
Instance Attribute Summary collapse
-
#formula ⇒ Object
readonly
An array defining the number of a given element in the formula.
Class Method Summary collapse
-
.mass(formula, &block) ⇒ Object
Parses the input formula into an EmpiricalFormula and calculates the mass therefrom.
-
.parse(chemical_formula, &block) ⇒ Object
Parses a generalized chemical formula into an EmpiricalFormula.
-
.parse_simple(chemical_formula) ⇒ Object
Parses a simple formula (formatted like those returned by EmpiricalFormula#to_s) into a EmpiricalFormula.
Instance Method Summary collapse
-
#*(factor) ⇒ Object
Returns a new EmpiricalFormula multiplying the formula of self by factor.
-
#+(another) ⇒ Object
Returns a new EmpiricalFormula summing the formula of another and self.
-
#-(another) ⇒ Object
Returns a new EmpiricalFormula subtracting the formula of another from self.
-
#==(another) ⇒ Object
True if another is an EmpiricalFormula and the formula of another equals the formula of self.
-
#each ⇒ Object
Yields each element and the number of times that element occurs in self.
-
#initialize(formula = [], normalize = true) ⇒ EmpiricalFormula
constructor
A new instance of EmpiricalFormula.
-
#mass(&block) ⇒ Object
Calculates and returns the mass of self using the element masses returned by the block.
-
#to_s ⇒ Object
Returns a formula string formatted like ‘H(2)O’ with the elements are sorted alphabetically by symbol.
Methods included from Utils
Constructor Details
#initialize(formula = [], normalize = true) ⇒ EmpiricalFormula
Returns a new instance of EmpiricalFormula.
230 231 232 233 234 235 236 237 238 239 240 241 |
# File 'lib/molecules/empirical_formula.rb', line 230 def initialize(formula=[], normalize=true) @formula = formula if normalize # normalize by converting nils to zero and remove trailing zeros @formula.collect! {|factor| factor == nil ? 0 : factor} @formula.pop while @formula.last == 0 end # ensure the formula cannot be changed @formula.freeze end |
Instance Attribute Details
#formula ⇒ Object (readonly)
An array defining the number of a given element in the formula. The order of elements in ELEMENT_INDEX correspond to order of forumula, such that formula indicates the number of ELEMENT_INDEX elements in self.
228 229 230 |
# File 'lib/molecules/empirical_formula.rb', line 228 def formula @formula end |
Class Method Details
.mass(formula, &block) ⇒ Object
Parses the input formula into an EmpiricalFormula and calculates the mass therefrom. By default the mass will be the monoisotopic mass of the formula.
See EmpericalFormula#mass for more details.
191 192 193 |
# File 'lib/molecules/empirical_formula.rb', line 191 def mass(formula, &block) # :yields: element mass = parse(formula).mass(&block) end |
.parse(chemical_formula, &block) ⇒ Object
Parses a generalized chemical formula into an EmpiricalFormula. Formula sections can be nested with parenthesis, and multiple sections can be added or subtracted within the formula.
EmpiricalFormula.parse("H2O").to_s # => "H(2)O"
EmpiricalFormula.parse("CH3(CH2)50CH3").to_s # => "C(52)H(106)"
EmpiricalFormula.parse("C2H3NO - H2O + NH3").to_s # => "C(2)H(4)N(2)"
Note that the format for EmpiricalFormula#to_s differs from the format that parse utilizes.
To extend the functionality of parse, provide a block to receive formula sections with unexpected punctuation and calculate an EmpiricalFormula therefrom. If the block returns nil, then parse will raise an error.
block = lambda do |formula|
case formula
when /\[(.*)\]/
factors = $1.split(/,/).collect {|i| i.strip.to_i }
EmpiricalFormula.new(factors)
else nil
end
end
EmpiricalFormula.parse("H2O + [2, 1]", &block).to_s # => "H(4)O(2)"
EmpiricalFormula.parse("H2O + :not_expected", &block) # !> ParseError
89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 |
# File 'lib/molecules/empirical_formula.rb', line 89 def parse(chemical_formula, &block) # Remove whitespace formula = chemical_formula.to_s.gsub(/\s+/, "") # Split and handle multipart formulae case formula when /\+/ return formula.split(/\+/).inject(EmpiricalFormula.new) do |current, formula| current + parse(formula, &block) end when /-/ splits = formula.split(/-/) first = parse(splits.shift, &block) return splits.inject(first) do |current, formula| current - parse(formula, &block) end when /[^A-Za-z0-9\\(\\)]/ result = block_given? ? yield(formula) : nil return result unless result == nil raise ParseError.new("unexpected characters in formula: #{chemical_formula}") end # factor is the number following an element, as 6 and 12 in 'C6H12' # factor == -1 indicates that a number has not been read for the # next element. This state is used later to check for hanging # factors, as in '2C6' or (8OH) factor = nil # multiplier is the latest cumulative factor for a parenthesis # expression. A new multiplier is pushed on the stack for every new # parenthesis set, and popped off when the set terminates. # ex: for CH3(C(H)2)7CH # At the period Integer at the top of the stack equals # CH3(C(H)2)7.CH 1 # CH3(C(H)2.)7CH 7 # CH3(C(H.)2)7CH 14 # CH3(C.(H)2)7CH 7 # CH3.(CH)2)7CH 1 multiplier = [] multiplier << 1 # composition will store the formula composition as it is parsed composition = Hash.new(0) # Parse elements and factors out of the formula from right to left scanner = StringScanner.new(formula.reverse) while scanner.rest_size > 0 case when scanner.scan_full(/(\d+)/, true, false) # found a factor factor = scanner[1].reverse.to_i when scanner.scan_full(/([a-z]?[A-Z])/, true, false) # found an element # Adjust the factor by the multiplier. If factor == nil # then a factor has not been read for the element, as would # be seen in NaOH; use 1 in this case instead. factor = (factor.nil? ? 1 : factor) * multiplier.last # Add the current factor to composition, remembering to reverse the symbol composition[ scanner[1].reverse ] += factor # reset factor to nil factor = nil when scanner.scan_full(/\)/, true, false) # When a parenthesis ends, the current multiplier must be # adujusted by the current factor. If factor == nil then a # factor has not been read for the parenthesis, use 1 instead multiplier << (factor.nil? ? 1 : factor) * multiplier.last # reset factor to nil factor = nil when scanner.scan_full(/\(/, true, false) # When a parenthesis starts, the current multiplier is # popped off. Check for hanging factors and that after # popping a multiplier will remain. If no multiplier will # remain, then the parenthesis must be mismatched raise ParseError.new("the formula contains a hanging factor: #{chemical_formula}") unless factor.nil? raise ParseError.new("the formula contains mismatched parenthesis: #{chemical_formula}") unless multiplier.length > 1 multiplier.pop else raise ParseError.new("could not parse formula: #{chemical_formula}") end end # Check for hanging factors, that a multiplier remains, and that # elements were found during parsing raise ParseError.new("the formula contains a hanging factor: #{chemical_formula}") unless factor.nil? raise ParseError.new("the formula contains mismatched parenthesis: #{chemical_formula}") unless multiplier.length == 1 raise ParseError.new("no elements could be found in the formula: #{chemical_formula}") if composition.length == 0 && !formula.empty? EmpiricalFormula.new(composition_to_factors(composition)) end |
.parse_simple(chemical_formula) ⇒ Object
Parses a simple formula (formatted like those returned by EmpiricalFormula#to_s) into a EmpiricalFormula. Whitespace is allowed in the formula.
EmpiricalFormula.parse("H(2)O").to_s # => "H(2)O"
EmpiricalFormula.parse("H (2) O").to_s # => "H(2)O"
EmpiricalFormula.parse("HO(-1)O(2)H").to_s # => "H(2)O"
36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 |
# File 'lib/molecules/empirical_formula.rb', line 36 def parse_simple(chemical_formula) formula = chemical_formula.to_s.gsub(/\s+/, "") factor = nil composition = Hash.new(0) scanner = StringScanner.new(formula.reverse) while scanner.rest_size > 0 case when scanner.scan_full(/\)(\d+-?)\(/, true, false) # found a factor factor = scanner[1].reverse.to_i when scanner.scan_full(/([a-z]?[A-Z])/, true, false) # found an element composition[scanner[1].reverse] += (factor == nil ? 1 : factor) # reset factor to nil factor = nil else raise ParseError.new("could not parse formula: #{chemical_formula}") end end factors = composition_to_factors(composition) block_given? ? yield(factors) : EmpiricalFormula.new(factors) end |
Instance Method Details
#*(factor) ⇒ Object
Returns a new EmpiricalFormula multiplying the formula of self by factor.
254 255 256 |
# File 'lib/molecules/empirical_formula.rb', line 254 def *(factor) EmpiricalFormula.new(multiply(self.formula.dup, factor), false) end |
#+(another) ⇒ Object
Returns a new EmpiricalFormula summing the formula of another and self.
244 245 246 |
# File 'lib/molecules/empirical_formula.rb', line 244 def +(another) EmpiricalFormula.new(add(self.formula.dup, another.formula), false) end |
#-(another) ⇒ Object
Returns a new EmpiricalFormula subtracting the formula of another from self.
249 250 251 |
# File 'lib/molecules/empirical_formula.rb', line 249 def -(another) EmpiricalFormula.new(add(self.formula.dup, another.formula, -1), false) end |
#==(another) ⇒ Object
True if another is an EmpiricalFormula and the formula of another equals the formula of self.
259 260 261 |
# File 'lib/molecules/empirical_formula.rb', line 259 def ==(another) another.kind_of?(EmpiricalFormula) && self.formula == another.formula end |
#each ⇒ Object
Yields each element and the number of times that element occurs in self.
264 265 266 267 268 269 |
# File 'lib/molecules/empirical_formula.rb', line 264 def each # :yields: element, n formula.each_with_index do |n, index| next if n == 0 yield(ELEMENT_INDEX[index], n) end end |
#mass(&block) ⇒ Object
Calculates and returns the mass of self using the element masses returned by the block. Returns the monoisotopic mass for the formula (ie the mass calculated from the most abundant natural isotope of each element) if no block is given.
water = EmpiricalFormula.new [2,1]
# monoisotopic mass calculation
water.mass # => 18.0105646863
water.mass {|e| e.mass } # => 18.0105646863
# average mass calculation
water.mass {|e| e.std_atomic_weight.value } # => 18.01528
Notes
-
The definition of monoisotopic mass conforms to that presented in ‘Standard Definitions of Terms Relating to Mass Spectrometry, Phil. Price, J. Am. Soc. Mass Spectrom. (1991) 2 336-348’ (see Unimod Mass Help)
-
Masses are calculated such that mathematical operations are performed on the return of the block.
302 303 304 305 306 307 308 309 310 |
# File 'lib/molecules/empirical_formula.rb', line 302 def mass(&block) # :yields: element if block_given? mass = 0 each {|e, n| mass = (yield(e) * n) + mass } mass else @monoisotopic_mass ||= mass {|e| e.mass} end end |
#to_s ⇒ Object
Returns a formula string formatted like ‘H(2)O’ with the elements are sorted alphabetically by symbol.
273 274 275 276 277 |
# File 'lib/molecules/empirical_formula.rb', line 273 def to_s collect do |element, n| element.symbol + (n == 1 ? "" : "(#{n})") end.sort.join('') end |