Class: EuPathDBGeneInformationTable

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/protk/eupathdb_gene_information_table.rb

Overview

A class for parsing the ‘gene information table’ files from EuPathDB, such as cryptodb.org/common/downloads/release-4.3/Cmuris/txt/CmurisGene_CryptoDB-4.3.txt

The usual way to work with these files is the #each method, which yields one EuPathDBGeneInformation object per gene, holding all of the information recorded for that gene.

Instance Method Summary

Constructor Details

#initialize(io) ⇒ EuPathDBGeneInformationTable

Initialise using an IO object, for example File.open('/path/to/CmurisGene_CryptoDB-4.3.txt'). Once the table has been created, the #each method can be used to iterate over the genes in the file.
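Judging from the source of #next_gene, the parser only calls readline and eof? on the object it is given, so any IO-like object providing those two methods should work. A minimal sketch of that interface, using StringIO from Ruby's standard library as a stand-in for an opened file (the sample text is made up for illustration):

```ruby
require 'stringio'

# Made-up sample text standing in for a gene information table file.
io = StringIO.new("Gene ID: example_1\n\nTABLE: Notes\n")

first_line = io.readline.strip   # "Gene ID: example_1"
blank_line = io.readline.strip   # "" (a blank separator line)
third_line = io.readline.strip   # "TABLE: Notes"
at_end     = io.eof?             # true once everything has been read

# IO#readline raises EOFError at end of input (IO#gets would return nil).
raised_eof = begin
  io.readline
  false
rescue EOFError
  true
end
```

Passing a StringIO in place of a File can be convenient when testing a parser like this against small hand-written records.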



# File 'lib/protk/eupathdb_gene_information_table.rb', line 54

def initialize(io)
  @io = io
end

Instance Method Details

#each ⇒ Object

Yields each gene in the file, one at a time, as a EuPathDBGeneInformation object.



# File 'lib/protk/eupathdb_gene_information_table.rb', line 60

def each
  while g = next_gene
    yield g
  end
end
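
Because the class includes Enumerable and defines #each, methods such as select, map and first become available on it. A minimal sketch of the same pattern, using a hypothetical toy class (CountdownTable and next_item are invented for illustration and are not part of protk):

```ruby
# Hypothetical toy class, not part of protk. It mirrors the pattern above:
# #each repeatedly calls a "next item" method until that method returns nil.
class CountdownTable
  include Enumerable

  def initialize(start)
    @remaining = start
  end

  # Returns the next number, or nil when the countdown is finished.
  def next_item
    return nil if @remaining.zero?
    value = @remaining
    @remaining -= 1
    value
  end

  def each
    while (item = next_item)
      yield item
    end
  end
end

evens     = CountdownTable.new(5).select(&:even?)   # => [4, 2]
first_two = CountdownTable.new(5).first(2)          # => [5, 4]
```

Note that EuPathDBGeneInformationTable#each reads forward through @io and never rewinds it, so a second iteration over the same object starts wherever the IO currently is.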

#next_gene ⇒ Object

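Reads the next gene record from the IO and returns it as a EuPathDBGeneInformation object containing the gene's attributes and tables. Returns nil if only blank lines remain before the end of the input. Because lines are read with readline, input that ends anywhere else raises EOFError.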



# File 'lib/protk/eupathdb_gene_information_table.rb', line 68

def next_gene
  info = EuPathDBGeneInformation.new
  
  # first, read the table, which should start with the ID column
  line = @io.readline.strip
  while line == ''
    return nil if @io.eof?
    line = @io.readline.strip
  end
  
  while line != ''
    if matches = line.match(/^(.*?)\: (.*)$/)
      info.add_information(matches[1], matches[2])
    else
      raise Exception, "EuPathDBGeneInformationTable Couldn't parse this line: #{line}"
    end
    
    line = @io.readline.strip
  end
  
  # now read each of the tables, which should start with the
  # 'TABLE: <name>' entry
  line = @io.readline.strip
  table_name = nil
  headers = nil
  data = []
  while line != '------------------------------------------------------------'
    if line == ''
      # add it to the stack unless we are just starting out
      info.add_table(table_name, headers, data) unless table_name.nil?
      
      # reset things
      table_name = nil
      headers = nil
      data = []
    elsif matches = line.match(/^TABLE\: (.*)$/)
      # name of a table
      table_name = matches[1]
    elsif line.match(/^\[.*\]/)
      # headings of the table
      headers = line.split("\t").collect do |header|
        header.gsub(/^\[/,'').gsub(/\]$/,'')
      end
    else
      # a proper data row
      data.push line.split("\t")
    end
    line = @io.readline.strip      
  end
          
  # return the object that has been created
  return info
end
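
The line-level patterns used by next_gene can be tried in isolation. The sketch below applies the same regular expressions and tab splitting to made-up sample lines; the sample values are assumptions for illustration, not real EuPathDB data:

```ruby
# Made-up sample lines in the shape next_gene expects.
attribute_line = "Gene ID: example_1"
nested_line    = "Product: kinase: putative"
table_line     = "TABLE: Notes"
header_line    = "[Name]\t[Value]"
data_line      = "colour\tblue"

# "key: value" attribute lines. The lazy (.*?) stops at the first ": ",
# so any later ": " stays inside the value.
m = attribute_line.match(/^(.*?)\: (.*)$/)
key, value = m[1], m[2]                      # "Gene ID", "example_1"
n = nested_line.match(/^(.*?)\: (.*)$/)
nested_key, nested_value = n[1], n[2]        # "Product", "kinase: putative"

# Table name line.
table_name = table_line.match(/^TABLE\: (.*)$/)[1]   # "Notes"

# Header row: tab-separated, each header wrapped in square brackets.
headers = header_line.split("\t").collect do |header|
  header.gsub(/^\[/, '').gsub(/\]$/, '')
end                                          # ["Name", "Value"]

# Data row: plain tab-separated values.
row = data_line.split("\t")                  # ["colour", "blue"]
```

In the method itself, a blank line closes the current table, and a line consisting of 60 hyphens ends the gene record.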