Class: FastqFile

Inherits:
File
  • Object
Defined in:
lib/parse_fasta/fastq_file.rb

Overview

Provides a simple interface for parsing four-line-per-record fastq files. Gzipped files are read transparently.

Instance Method Details

#each_record {|header, sequence, description, quality_string| ... } ⇒ Object

Analogous to IO#each_line, #each_record iterates through a fastq file one record at a time. It accepts gzipped files as well as plain-text ones.

Examples:

Parsing a fastq file

FastqFile.open('reads.fq').each_record do |head, seq, desc, qual|
  # do some fun stuff here!
end

Use the same syntax for gzipped files!

FastqFile.open('reads.fq.gz').each_record do |head, seq, desc, qual|
  # do some fun stuff here!
end

Yields:

  • The header, sequence, description and quality string for each record in the fastq file to the block

Yield Parameters:

  • header (String)

    The header of the fastq record without the leading ‘@’

  • sequence (Sequence)

    The sequence of the fastq record

  • description (String)

    The description line of the fastq record without the leading ‘+’

  • quality_string (Quality)

    The quality string of the fastq record
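As a rough illustration of how the four lines of one record map onto the yield parameters, here is a plain-Ruby sketch. The helper name parse_record is hypothetical and not part of parse_fasta, and plain Strings stand in for the Sequence and Quality objects that the real method yields.

```ruby
# Hypothetical helper for illustration only; not part of parse_fasta.
# Takes the four raw lines of one fastq record and returns
# [header, sequence, description, quality] as plain Strings.
def parse_record(lines)
  header, sequence, description, quality = lines.map(&:chomp)

  [
    header.sub(/^@/, ''),        # drop the leading '@'
    sequence,
    description.sub(/^\+/, ''),  # drop the leading '+'
    quality
  ]
end

record = ["@seq1 sample=A\n", "ACTG\n", "+\n", "II3*\n"]
head, seq, desc, qual = parse_record(record)
```

Only the first '@' and the first '+' are stripped, so a quality string that itself begins with '@' or '+' is left untouched.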



# File 'lib/parse_fasta/fastq_file.rb', line 65

def each_record
  count = 0
  header = ''
  sequence = ''
  description = ''
  quality = ''

  # Try to read the file as gzip; fall back to the plain file if it is
  # not in gzip format.
  begin
    f = Zlib::GzipReader.open(self)
  rescue Zlib::GzipFile::Error
    f = self
  end

  f.each_line do |line|
    line.chomp!

    case count % 4
    when 0
      header = line.sub(/^@/, '')
    when 1
      sequence = Sequence.new(line)
    when 2
      description = line.sub(/^\+/, '')
    when 3
      quality = Quality.new(line)
      yield(header, sequence, description, quality)
    end

    count += 1
  end

  f.close if f.instance_of?(Zlib::GzipReader)
  return f
end
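The gzip handling above relies on Zlib::GzipReader raising Zlib::GzipFile::Error when the input is not in gzip format. A minimal standalone sketch of that fallback pattern follows. The helper name each_line_maybe_gzipped and the temporary files are assumptions made for this example, not part of parse_fasta.

```ruby
require 'zlib'
require 'tempfile'

# Hypothetical helper for illustration only: tries gzip first, then
# falls back to reading the path as plain text.
def each_line_maybe_gzipped(path, &block)
  io = begin
    Zlib::GzipReader.open(path)
  rescue Zlib::GzipFile::Error
    File.open(path)
  end
  io.each_line(&block)
ensure
  io&.close
end

content = "@seq1\nACTG\n+\nII3*\n"

# A plain-text fastq file.
plain = Tempfile.new(['reads', '.fq'])
plain.write(content)
plain.close

# The same content, gzipped.
gzipped = Tempfile.new(['reads', '.fq.gz'])
gzipped.close
Zlib::GzipWriter.open(gzipped.path) { |gz| gz.write(content) }

plain_lines = []
each_line_maybe_gzipped(plain.path) { |line| plain_lines << line }

gzip_lines = []
each_line_maybe_gzipped(gzipped.path) { |line| gzip_lines << line }
```

Both calls should produce the same four lines, which is why the caller does not need to know whether the file is compressed.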

#to_hash ⇒ Hash

Returns the records in the fastq file as a hash keyed by header. Each value is itself a hash describing the record, like so: { "seq1" => { head: "seq1", seq: "ACTG", desc: "", qual: "II3*" } }

Examples:

Reading a fastq file into a hash

seqs = FastqFile.open('reads.fq.gz').to_hash

Returns:

  • (Hash)

    A hash with headers as keys, and a hash map as the value with keys :head, :seq, :desc, :qual, for header, sequence, description, and quality.



# File 'lib/parse_fasta/fastq_file.rb', line 35

def to_hash
  hash = {}
  self.each_record do |head, seq, desc, qual|
    hash[head] = { head: head, seq: seq, desc: desc, qual: qual }
  end

  hash
end
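Because the hash is keyed by header, a later record with the same header replaces an earlier one. The sketch below shows that behavior with stand-in data: the records array is hypothetical and takes the place of what #each_record would yield, with plain Strings instead of Sequence and Quality objects.

```ruby
# Stand-in records as [head, seq, desc, qual]; in the real class these
# values come from #each_record.
records = [
  ['seq1', 'ACTG', '', 'IIII'],
  ['seq2', 'GGCC', '', 'II3*'],
  ['seq1', 'TTTT', '', '####'] # same header as the first record
]

hash = {}
records.each do |head, seq, desc, qual|
  # Same assignment as in #to_hash: the last record for a header wins.
  hash[head] = { head: head, seq: seq, desc: desc, qual: qual }
end
```

If headers are not guaranteed to be unique, iterating with #each_record and collecting records into an array is one way to avoid losing entries.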