Class: Bio::Go

Inherits:
Object
Defined in:
lib/go.rb

Defined Under Namespace

Classes: SubsumeTester

Constructor Details

#initialize ⇒ Go

Returns a new instance of Go.



# File 'lib/go.rb', line 7

def initialize
  @r = RSRuby.instance
  @r.library('GO.db')
  #      @r.print('initing') #debug to test if this is being loaded twice
end

Instance Method Details

#biological_process_offspring(go_term) ⇒ Object

Return an array of GO identifiers that are the offspring (all the descendants) of the given GO term, which must be a biological process GO term.



# File 'lib/go.rb', line 47

def biological_process_offspring(go_term)
  go_get(go_term, 'GOBPOFFSPRING')
end

#cc_pdb_to_go(pdb_id) ⇒ Object

Retrieve the cellular component GO annotations associated with a PDB id, using Bio::Fetch to query PDB and UniProtKB at EBI. Only annotations whose description starts with "C:" are returned.



# File 'lib/go.rb', line 90

def cc_pdb_to_go(pdb_id)
  # retrieve the pdb file from EBI, to extract the UniprotKB Identifiers
  pdb = Bio::Fetch.new('http://www.ebi.ac.uk/cgi-bin/dbfetch').fetch('pdb', pdb_id)
  
  # parse the PDB and return the uniprot accessions (there may be >1 because of chains)
  uniprots = Bio::PDB.new(pdb).dbref.select{|s| s.database=='UNP'}.collect{|s| s.dbAccession}
  
  gos = []
  uniprots.uniq.each do |uniprot|
    u = Bio::Fetch.new('http://www.ebi.ac.uk/cgi-bin/dbfetch').fetch('uniprot', uniprot)
    
    unp = Bio::SPTR.new(u)
    
    gos.push unp.dr('GO').select{|a|
      a['Version'].match(/^C\:/)
    }.collect{ |g|
      g['Accession']
    }
  end
  
  return gos.flatten.uniq
end
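
The filtering step in the method above can be tried without any network access. The following is a minimal sketch, not part of lib/go.rb: the helper name cellular_component_accessions is hypothetical, and the records are hand-written sample data shaped like the hashes the method reads (with 'Accession' and 'Version' keys), not the result of a real fetch.

```ruby
# Hypothetical helper (not part of lib/go.rb): keeps only the records whose
# 'Version' field starts with "C:" (cellular component) and returns their
# accessions, mirroring the select/collect chain in cc_pdb_to_go.
def cellular_component_accessions(records)
  records.select { |r| r['Version'].start_with?('C:') }
         .map { |r| r['Accession'] }
end

# Hand-written sample records (illustrative only, not fetched from EBI).
sample_records = [
  { 'Accession' => 'GO:0005737', 'Version' => 'C:cytoplasm' },
  { 'Accession' => 'GO:0005524', 'Version' => 'F:ATP binding' },
  { 'Accession' => 'GO:0006468', 'Version' => 'P:protein phosphorylation' }
]

p cellular_component_accessions(sample_records)
```

Using start_with? instead of a regular expression is a stylistic choice for the sketch; the original match(/^C\:/) selects the same records.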

#cellular_component_offspring(go_term) ⇒ Object

Return an array of GO identifiers that are the offspring (all the descendants) of the given GO term, which must be a cellular component GO term.



# File 'lib/go.rb', line 33

def cellular_component_offspring(go_term)
  go_get(go_term, 'GOCCOFFSPRING')
end

#go_get(go_term, partition) ⇒ Object

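Generic method for retrieving the entry for a GO term from a named GO.db map, e.g. go_get(‘GO:0042717’, ‘GOCCCHILDREN’). Returns an empty array when the map has no entry for the term.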



# File 'lib/go.rb', line 53

def go_get(go_term, partition)
  answers = @r.eval_R("get('#{go_term}', #{partition})")
  # an integer (rather than an array) comes back when there are no children;
  # Integer is used because Bignum was merged into Integer in Ruby 2.4
  return [] if answers.kind_of?(Integer)
  return answers
end
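
Because go_get interpolates both arguments directly into a string of R code, it may be worth checking them before evaluation. The following is a minimal sketch under stated assumptions, not part of lib/go.rb: the helper name build_get_expression and both patterns are hypothetical. It relies only on GO identifiers having the form "GO:" followed by seven digits, and on the map names used in this class being upper-case names that start with "GO".

```ruby
# Hypothetical helper (not part of lib/go.rb): builds the R expression that
# go_get evaluates, after checking that both arguments are safe to interpolate.
GO_ID_PATTERN    = /\AGO:\d{7}\z/   # e.g. "GO:0042717"
MAP_NAME_PATTERN = /\AGO[A-Z]+\z/   # e.g. "GOCCCHILDREN", "GOBPOFFSPRING"

def build_get_expression(go_term, partition)
  unless go_term.to_s =~ GO_ID_PATTERN
    raise ArgumentError, "Not a GO identifier: #{go_term.inspect}"
  end
  unless partition.to_s =~ MAP_NAME_PATTERN
    raise ArgumentError, "Not a GO.db map name: #{partition.inspect}"
  end
  "get('#{go_term}', #{partition})"
end

puts build_get_expression('GO:0042717', 'GOCCCHILDREN')
```

The returned string has the same shape as the one passed to eval_R in go_get, so a check like this could sit in front of that call.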

#go_offspring(go_id) ⇒ Object

Return an array of GO identifiers that are the offspring (all the descendants) of the given GO term, which may come from any ontology (cellular component, biological process or molecular function).



# File 'lib/go.rb', line 16

def go_offspring(go_id)
  o = ontology_abbreviation(go_id)
  case o
  when 'MF'
    return molecular_function_offspring(go_id)
  when 'CC'
    return cellular_component_offspring(go_id)
  when 'BP'
    return biological_process_offspring(go_id)
  else
    raise Exception, "Unknown ontology abbreviation found: #{o} for go id: #{go_id}"
  end
end
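
The dispatch above maps each ontology abbreviation to one of three GO.db offspring maps. As a minimal sketch of the same mapping expressed as a lookup table (the constant OFFSPRING_MAPS and the helper offspring_map_for are hypothetical names, not part of lib/go.rb):

```ruby
# Hypothetical lookup table (not part of lib/go.rb): ontology abbreviation
# => name of the GO.db offspring map used by the methods in this class.
OFFSPRING_MAPS = {
  'MF' => 'GOMFOFFSPRING',
  'CC' => 'GOCCOFFSPRING',
  'BP' => 'GOBPOFFSPRING'
}.freeze

# Returns the map name for an abbreviation, or raises ArgumentError
# if the abbreviation is not one of MF, CC or BP.
def offspring_map_for(abbreviation)
  OFFSPRING_MAPS.fetch(abbreviation) do
    raise ArgumentError, "Unknown ontology abbreviation: #{abbreviation.inspect}"
  end
end

puts offspring_map_for('CC')
```

With such a table, the result could be passed straight to go_get as its partition argument. Note that the sketch raises ArgumentError, whereas the method above raises Exception.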

#molecular_function_offspring(go_term) ⇒ Object

Return an array of GO identifiers that are the offspring (all the descendants) of the given GO term, which must be a molecular function GO term.



# File 'lib/go.rb', line 40

def molecular_function_offspring(go_term)
  go_get(go_term, 'GOMFOFFSPRING')
end

#ontology_abbreviation(go_id) ⇒ Object

Return ‘MF’, ‘CC’ or ‘BP’ corresponding to the ontology of the given GO identifier.



# File 'lib/go.rb', line 137

def ontology_abbreviation(go_id)
  @r.eval_R("Ontology(get('#{go_id}', GOTERM))")
end

#primary_go_id(go_id_or_synonym_id) ⇒ Object

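Given a GO ID that may be a synonym (secondary ID), such as GO:0048253, return the corresponding primary GO ID (GO:0050333), so that the offspring functions can be used properly. If the given ID is already a primary ID, it is returned unchanged.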



# File 'lib/go.rb', line 61

def primary_go_id(go_id_or_synonym_id)
  # > get('GO:0048253', GOSYNONYM)
  #GOID: GO:0050333
  #Term: thiamin-triphosphatase activity
  #Ontology: MF
  #Definition: Catalysis of the reaction: thiamin triphosphate + H2O =
  #    thiamin diphosphate + phosphate.
  #Synonym: thiamine-triphosphatase activity
  #Synonym: thiamine-triphosphate phosphohydrolase activity
  #Synonym: ThTPase activity
  #Synonym: GO:0048253
  #Secondary: GO:0048253

  begin
    # try to find the synonym
    return @r.eval_R("GOID(get('#{go_id_or_synonym_id}', GOSYNONYM))")
  rescue RException
    # if no synonym is found, try to find the primary ID. raise RException if none is found
    return @r.eval_R("GOID(get('#{go_id_or_synonym_id}', GOTERM))")
  end
end
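
The lookup order used above (synonym map first, then the primary term map) can be illustrated without R. This is a minimal sketch under stated assumptions: the constants and the method primary_id_sketch are hypothetical, the two in-memory tables stand in for the GOSYNONYM and GOTERM maps, and KeyError stands in for RException.

```ruby
# Illustrative stand-ins for the GOSYNONYM and GOTERM maps, holding only
# the single example pair mentioned in the description above.
SYNONYM_TO_PRIMARY = { 'GO:0048253' => 'GO:0050333' }.freeze
KNOWN_PRIMARY_IDS  = ['GO:0050333'].freeze

# Hypothetical sketch (not part of lib/go.rb) of the two-step lookup.
def primary_id_sketch(go_id)
  # first, try to resolve the id as a synonym
  SYNONYM_TO_PRIMARY.fetch(go_id)
rescue KeyError
  # no synonym found: accept the id only if it is itself a primary id
  raise KeyError, "Unknown GO id: #{go_id}" unless KNOWN_PRIMARY_IDS.include?(go_id)
  go_id
end

puts primary_id_sketch('GO:0048253')
puts primary_id_sketch('GO:0050333')
```

As in the method above, an identifier found in neither table results in an exception reaching the caller.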

#subsume?(subsumer_go_id, subsumee_go_id) ⇒ Boolean

Does the subsumer subsume the subsumee? That is, is the subsumee either the same term as the subsumer or one of its descendants in the GO tree? Both IDs are first mapped to their primary IDs.

When repeatedly testing whether one GO term subsumes others, it might be faster to use subsume_tester.

Returns:

  • (Boolean)


# File 'lib/go.rb', line 118

def subsume?(subsumer_go_id, subsumee_go_id)
  # map both the subsumee and the subsumer to their primary (non-synonym) ids
  primaree = self.primary_go_id(subsumee_go_id)
  primarer = self.primary_go_id(subsumer_go_id)

  # return if they are the same - the obvious case
  return true if primaree == primarer

  # otherwise, test whether the subsumee is a descendant of the subsumer
  return go_offspring(primarer).include?(primaree)
end
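
The two checks performed above (identity, then membership in the offspring list) can be sketched against a small in-memory table instead of GO.db. Everything here is an assumption made for illustration: TOY_OFFSPRING is a hand-written, simplified table covering three cellular component terms (GO:0005575 cellular_component, GO:0005737 cytoplasm, GO:0005739 mitochondrion), toy_subsume? is a hypothetical name, and the synonym-mapping step is left out.

```ruby
# Hand-written toy table (illustrative only): term => all of its descendants.
TOY_OFFSPRING = {
  'GO:0005575' => ['GO:0005737', 'GO:0005739'],
  'GO:0005737' => ['GO:0005739'],
  'GO:0005739' => []
}.freeze

# Hypothetical sketch (not part of lib/go.rb) of the subsumption test.
def toy_subsume?(subsumer, subsumee)
  # a term subsumes itself
  return true if subsumer == subsumee

  # otherwise the subsumee must be among the subsumer's descendants
  TOY_OFFSPRING.fetch(subsumer, []).include?(subsumee)
end

puts toy_subsume?('GO:0005575', 'GO:0005739')
puts toy_subsume?('GO:0005739', 'GO:0005737')
```

Subsumption is directional: a parent subsumes its descendants, but a descendant does not subsume its ancestors.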

#subsume_tester(subsumer_go_id, check_for_synonym = true) ⇒ Object

Return a subsume tester for a given GO term. Using the tester is faster than repeatedly calling subsume? because the list of children is cached.



# File 'lib/go.rb', line 132

def subsume_tester(subsumer_go_id, check_for_synonym=true)
  Go::SubsumeTester.new(self, subsumer_go_id, check_for_synonym)
end
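
The source of SubsumeTester is not shown on this page, so the following is only a sketch of how such caching might work, not the actual implementation. The class names CachingSubsumeTesterSketch and CountingLookup are hypothetical, the check_for_synonym option is omitted, and the stand-in lookup returns a fixed, hand-written offspring list.

```ruby
require 'set'

# Hypothetical sketch; the real Bio::Go::SubsumeTester may differ.
class CachingSubsumeTesterSketch
  # lookup is any object that responds to go_offspring(go_id)
  def initialize(lookup, subsumer_go_id)
    @subsumer  = subsumer_go_id
    # the offspring list is fetched once and kept for all later tests
    @offspring = Set.new(lookup.go_offspring(subsumer_go_id))
  end

  def subsume?(subsumee_go_id)
    subsumee_go_id == @subsumer || @offspring.include?(subsumee_go_id)
  end
end

# Stand-in for Bio::Go that counts how many offspring lookups are made.
class CountingLookup
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def go_offspring(_go_id)
    @calls += 1
    ['GO:0005737', 'GO:0005739'] # hand-written sample data
  end
end

lookup = CountingLookup.new
tester = CachingSubsumeTesterSketch.new(lookup, 'GO:0005575')
puts tester.subsume?('GO:0005739')
puts tester.subsume?('GO:0005524')
puts lookup.calls
```

However many terms are tested, the sketch performs a single offspring lookup, and this is where the saving over repeated subsume? calls would come from.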

#term(go_id) ⇒ Object

Retrieve the string description (term name) of the given GO identifier.



# File 'lib/go.rb', line 84

def term(go_id)
  @r.eval_R("Term(get('#{go_id}', GOTERM))")
end