ms-msrun
A library for working with LC/MS runs.
Examples
The following example works on ALL versions of mzXML and mzML (including support for compressed peak data).
require "ms/msrun"
file = "file.mzXML" # works identical for "file.mzML"
Ms::Msrun.open(file) do |ms|
# Run level information:
ms.start_time # in seconds, gives date for mzML
ms.end_time # in seconds, returns nil in mzML
ms.scan_count # number of scans
ms.scan_count(1) # number of MS scans
ms.scan_count(2) # number of MS/MS scans, etc.
ms.parent_basename_noext # "file" (as recorded _in the xml_)
ms.filename # "file.mzXML"
# Random scan access (blazing fast)
ms.scan(22) # a scan object
# Complete scan access
ms.each do |scan|
scan.num # scan number
scan.ms_level # ms_level
scan.time # retention time in seconds
scan.start_mz # the first m/z value, returns nil in mzML
scan.end_mz # the last m/z value, returns nil in mzML
# Precursor information
pr = scan.precursor # an Ms::Precursor object
pr.mz
pr.intensity # does fast binary search if info not already given
pr.parent # the parent scan
pr.charge_states # Array of possible charge states
# Spectral information
spectrum = scan.spectrum
spectrum.mzs # Array of m/z values
spectrum.intensities # Array of m/z values
spectrum.peaks do |mz, inten|
puts "#{mz} #{inten}" # print each peak on own line
end
end
# supports pre-filtering for faster access
## get just precursor info:
ms.each(:ms_level => 2, :spectrum => false) {|scan| scan.precursor }
## get just level one spectra:
ms.each(:ms_level => 1, :precursor => false) {|scan| scan.spectrum }
end
# Quicker way to get at the scans:
Ms::Msrun.foreach("file.mzXML") {|scan| scan <do something> }
Conversion
Can convert mzXML or mzML to mgf or ms2
Ms::Msrun.open(mzmlFile) do |ms|
mgfFile = mzmlFile.chomp(".mzML") + ".ms2"
ms.to_ms2(:output => mgfFile)
end
Or it can be done through the command line program ms_to_search.rb
"usage: <file>.mz[XML | ML] ... <type>"
Other output formats can be included in future versions.
Features
- Fast
-
Uses Nokogiri and a dash of regular expressions to achieve very fast random access of scans (also supports accessing all scans or subsets of scans).
- Unified
-
One interface for all formats.
- Lazy evaluation at scan and spectrum level
-
Scans are only read from IO when requested. Spectra are also decoded only when explicitly accessed.
- Extensively tested
-
To release, the parser must pass an extensive specification for each file version (a total of ~1500 tests).
- Long-term support
-
We will continue to support newer versions and fix any bugs or edge cases that are found. Please alert us of any mzXML or mzML file that is not parsed correctly.
Installation
gem install ms-msrun
Copying
See LICENSE