Class: Bio::Go
- Inherits: Object
- Inheritance chain: Object < Bio::Go
- Defined in: lib/go.rb
Defined Under Namespace
Classes: SubsumeTester
Instance Method Summary
-
#ancestors_cc(primary_go_id) ⇒ Object
Return an array of GO IDs for the ancestors of the given cellular component GO term.
-
#biological_process_offspring(go_term) ⇒ Object
Return an array of GO identifiers that are the offspring (all the descendants) of the given GO term, assuming it is a biological process GO term.
-
#cc_pdb_to_go(pdb_id) ⇒ Object
Retrieve the cellular component GO annotations associated with a PDB ID, using Bio::Fetch to query PDB and UniProtKB at EBI.
-
#cellular_component_offspring(go_term) ⇒ Object
Return an array of GO identifiers that are the offspring (all the descendants) of the given GO term, assuming it is a cellular component GO term.
-
#cordial_cc(primary_go_id) ⇒ Object
Return an array of the ancestors of the GO term and of any of its offspring, in no particular order.
-
#go_get(go_term, partition) ⇒ Object
Generic method for looking up a GO term in a GO.db map, e.g. go_get(‘GO:0042717’, ‘GOCCCHILDREN’).
-
#go_offspring(go_id) ⇒ Object
Return an array of GO identifiers that are the offspring (all the descendants) of the given GO term from any ontology (cellular component, biological process or molecular function).
-
#initialize ⇒ Go
constructor
A new instance of Go.
-
#molecular_function_offspring(go_term) ⇒ Object
Return an array of GO identifiers that are the offspring (all the descendants) of the given GO term, assuming it is a molecular function GO term.
-
#ontology_abbreviation(go_id) ⇒ Object
Return ‘MF’, ‘CC’ or ‘BP’ corresponding to the ontology of the given GO ID.
-
#primary_go_id(go_id_or_synonym_id) ⇒ Object
Given a GO ID such as GO:0048253, return the GO term that is the primary ID (GO:0050333), so that offspring functions can be used properly.
-
#subsume?(subsumer_go_id, subsumee_go_id) ⇒ Boolean
Does the subsumer subsume the subsumee? i.e. does it include the subsumee as one of its descendants in the GO tree?
-
#subsume_tester(subsumer_go_id, check_for_synonym = true) ⇒ Object
Return a subsume tester for a given GO term.
-
#term(go_id) ⇒ Object
Retrieve the string description of the given GO identifier.
Constructor Details
#initialize ⇒ Go
Returns a new instance of Go.
# File 'lib/go.rb', line 7
def initialize
  @r = RSRuby.instance
  @r.library('GO.db')
end
Instance Method Details
#ancestors_cc(primary_go_id) ⇒ Object
Return an array of GO IDs for the ancestors of the given cellular component GO term. This is not the most efficient implementation, because it probably fetches the parents of a single ID multiple times.
# File 'lib/go.rb', line 151
def ancestors_cc(primary_go_id)
  go_get(primary_go_id, 'GOCCANCESTOR')
end
#biological_process_offspring(go_term) ⇒ Object
Return an array of GO identifiers that are the offspring (all the descendants) of the given GO term, assuming it is a biological process GO term.
# File 'lib/go.rb', line 46
def biological_process_offspring(go_term)
  go_get(go_term, 'GOBPOFFSPRING')
end
#cc_pdb_to_go(pdb_id) ⇒ Object
Retrieve the cellular component GO annotations associated with a PDB ID, using Bio::Fetch to query PDB and UniProtKB at EBI.
# File 'lib/go.rb', line 97
def cc_pdb_to_go(pdb_id)
  # retrieve the pdb file from EBI, to extract the UniprotKB Identifiers
  pdb = Bio::Fetch.new('http://www.ebi.ac.uk/cgi-bin/dbfetch').fetch('pdb', pdb_id)

  # parse the PDB and return the uniprot accessions (there may be >1 because of chains)
  uniprots = Bio::PDB.new(pdb).dbref.select{|s| s.database=='UNP'}.collect{|s| s.dbAccession}

  gos = []
  uniprots.uniq.each do |uniprot|
    u = Bio::Fetch.new('http://www.ebi.ac.uk/cgi-bin/dbfetch').fetch('uniprot', uniprot)
    unp = Bio::SPTR.new(u)
    # keep only cellular component annotations (those whose 'Version' starts with 'C:')
    gos.push unp.dr('GO').select{|a| a['Version'].match(/^C\:/) }.collect{ |g| g['Accession'] }
  end

  return gos.flatten.uniq
end
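The filtering step at the end of cc_pdb_to_go can be tried without network access. The following is a minimal sketch, assuming the GO cross-references arrive as arrays of hashes with 'Accession' and 'Version' keys (the shape the method relies on). The sample records and the helper name cellular_component_accessions are made up for illustration and are not part of Bio::Go.

```ruby
# Hypothetical sample of the GO cross-references for two UniProt entries.
SAMPLE_GO_REFS = [
  [
    { 'Accession' => 'GO:0005634', 'Version' => 'C:nucleus' },
    { 'Accession' => 'GO:0003677', 'Version' => 'F:DNA binding' }
  ],
  [
    { 'Accession' => 'GO:0005634', 'Version' => 'C:nucleus' },
    { 'Accession' => 'GO:0005737', 'Version' => 'C:cytoplasm' }
  ]
].freeze

# Keep only cellular component ('C:') annotations and return unique accessions.
def cellular_component_accessions(per_entry_refs)
  per_entry_refs.flat_map do |refs|
    refs.select { |ref| ref['Version'].match?(/\AC:/) }
        .map { |ref| ref['Accession'] }
  end.uniq
end

p cellular_component_accessions(SAMPLE_GO_REFS)
# prints ["GO:0005634", "GO:0005737"]
```

The molecular function entry is dropped, and the accession shared by both entries appears only once.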
#cellular_component_offspring(go_term) ⇒ Object
Return an array of GO identifiers that are the offspring (all the descendants) of the given GO term, assuming it is a cellular component GO term.
# File 'lib/go.rb', line 32
def cellular_component_offspring(go_term)
  go_get(go_term, 'GOCCOFFSPRING')
end
#cordial_cc(primary_go_id) ⇒ Object
Return an array of the ancestors of the GO term and of any of its offspring, in no particular order. This is useful for finding out whether a term has an annotation that is non-overlapping with a particular GO term. For instance, ‘membrane’ is cordial with ‘nucleus’, since both are ancestors of ‘nuclear membrane’. However, ‘mitochondrion’ and ‘nucleus’ are not cordial, since they share no common offspring.
# File 'lib/go.rb', line 162
def cordial_cc(primary_go_id)
  # cordial can be direct ancestors of a term - then the common term
  # is this term itself
  cordial_ids = ancestors_cc(primary_go_id)

  # collect all ancestors of all offspring
  offspring = cellular_component_offspring(primary_go_id)
  offspring.each do |o|
    cordial_ids.push ancestors_cc(o)
    cordial_ids.push o
  end

  # remove the term itself and any children - they are not
  # merely cordial
  cordial_ids = cordial_ids.flatten.uniq.reject do |i|
    offspring.include?(i) or primary_go_id==i
  end

  # return a uniq array of cordial terms
  cordial_ids
end
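The ‘membrane’ / ‘nucleus’ / ‘mitochondrion’ example can be reproduced on a toy ontology without R or GO.db. This is only a sketch: the parent table below is a simplified, made-up graph that uses term names in place of GO IDs, and the toy_* helpers are hypothetical stand-ins for ancestors_cc and cellular_component_offspring.

```ruby
# Made-up ontology: each term maps to its direct parents.
TOY_PARENTS = {
  'cellular_component' => [],
  'membrane'           => ['cellular_component'],
  'nucleus'            => ['cellular_component'],
  'mitochondrion'      => ['cellular_component'],
  'nuclear membrane'   => ['membrane', 'nucleus']
}.freeze

# All ancestors of a term (parents, grandparents, ...).
def toy_ancestors(term)
  TOY_PARENTS.fetch(term).flat_map { |parent| [parent] + toy_ancestors(parent) }.uniq
end

# All terms that have the given term among their ancestors.
def toy_offspring(term)
  TOY_PARENTS.keys.select { |t| toy_ancestors(t).include?(term) }
end

# Ancestors of the term and of its offspring, minus the term and its offspring.
def toy_cordial(term)
  offspring = toy_offspring(term)
  candidates = toy_ancestors(term) + offspring.flat_map { |o| toy_ancestors(o) }
  candidates.uniq.reject { |t| offspring.include?(t) || t == term }
end

p toy_cordial('nucleus')
# prints ["cellular_component", "membrane"]
```

In this toy graph ‘membrane’ appears in the result for ‘nucleus’ because both are ancestors of ‘nuclear membrane’, while ‘mitochondrion’ does not appear.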
#go_get(go_term, partition) ⇒ Object
Generic method for looking up a GO term in a GO.db map, e.g. go_get(‘GO:0042717’, ‘GOCCCHILDREN’).
# File 'lib/go.rb', line 52
def go_get(go_term, partition)
  answers = @r.eval_R("get('#{go_term}', #{partition})")
  return [] if answers.kind_of?(Bignum) # returns this for some reason when there's no children
  return answers
end
#go_offspring(go_id) ⇒ Object
Return an array of GO identifiers that are the offspring (all the descendants) of the given GO term from any ontology (cellular component, biological process or molecular function).
# File 'lib/go.rb', line 15
def go_offspring(go_id)
  o = ontology_abbreviation(go_id)
  case o
  when 'MF'
    return molecular_function_offspring(go_id)
  when 'CC'
    return cellular_component_offspring(go_id)
  when 'BP'
    return biological_process_offspring(go_id)
  else
    raise Exception, "Unknown ontology abbreviation found: #{o} for go id: #{go_id}"
  end
end
#molecular_function_offspring(go_term) ⇒ Object
Return an array of GO identifiers that are the offspring (all the descendants) of the given GO term, assuming it is a molecular function GO term.
# File 'lib/go.rb', line 39
def molecular_function_offspring(go_term)
  go_get(go_term, 'GOMFOFFSPRING')
end
#ontology_abbreviation(go_id) ⇒ Object
Return ‘MF’, ‘CC’ or ‘BP’ corresponding to the ontology of the given GO ID.
# File 'lib/go.rb', line 144
def ontology_abbreviation(go_id)
  @r.eval_R("Ontology(get('#{go_id}', GOTERM))")
end
#primary_go_id(go_id_or_synonym_id) ⇒ Object
Given a GO ID such as GO:0048253, return the GO term that is the primary ID (GO:0050333), so that offspring functions can be used properly.
# File 'lib/go.rb', line 60
def primary_go_id(go_id_or_synonym_id)
  # > get('GO:0048253', GOSYNONYM)
  #GOID: GO:0050333
  #Term: thiamin-triphosphatase activity
  #Ontology: MF
  #Definition: Catalysis of the reaction: thiamin triphosphate + H2O =
  #    thiamin diphosphate + phosphate.
  #Synonym: thiamine-triphosphatase activity
  #Synonym: thiamine-triphosphate phosphohydrolase activity
  #Synonym: ThTPase activity
  #Synonym: GO:0048253
  #Secondary: GO:0048253

  # A performance note:
  # According to some tests that I ran, finding GOID by searching GOTERM
  # is much faster than by GOSYNONYM.

  begin
    # Assume it is a primary ID, as it likely will be most of the time.
    return @r.eval_R("GOID(get('#{go_id_or_synonym_id}', GOTERM))")
  rescue RException
    # If no primary ID is found, try finding it by synonym;
    # raise RException if none is found.
    begin
      return @r.eval_R("GOID(get('#{go_id_or_synonym_id}', GOSYNONYM))")
    rescue RException => e
      raise RException, "#{e.message}: GO Identifier '#{go_id_or_synonym_id}' does not appear to be a primary ID nor synonym. Is the GO.db database up to date?"
    end
  end
end
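The lookup order used here (primary ID first, synonym second, error otherwise) can be sketched with plain hashes. The two tables below are toy stand-ins for the GOTERM and GOSYNONYM maps. Only the GO:0048253 to GO:0050333 pair comes from the comment in the source, and toy_primary_go_id is a hypothetical name.

```ruby
# Toy stand-ins for the GOTERM and GOSYNONYM maps in GO.db.
TOY_GOTERM    = { 'GO:0050333' => 'GO:0050333' }.freeze
TOY_GOSYNONYM = { 'GO:0048253' => 'GO:0050333' }.freeze

# Try the primary table first (the common case), then fall back to synonyms.
def toy_primary_go_id(id)
  TOY_GOTERM.fetch(id) do
    TOY_GOSYNONYM.fetch(id) do
      raise KeyError, "GO identifier '#{id}' is neither a primary ID nor a synonym"
    end
  end
end

p toy_primary_go_id('GO:0050333') # prints "GO:0050333"
p toy_primary_go_id('GO:0048253') # prints "GO:0050333"
```

Checking the primary table first mirrors the performance note in the source: the cheaper lookup handles most inputs, and the synonym lookup runs only when it fails.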
#subsume?(subsumer_go_id, subsumee_go_id) ⇒ Boolean
Does the subsumer subsume the subsumee? i.e. does it include the subsumee as one of its descendants in the GO tree? A term also subsumes itself.
For repeatedly testing whether one GO term subsumes others, it may be faster to use subsume_tester.
# File 'lib/go.rb', line 125
def subsume?(subsumer_go_id, subsumee_go_id)
  # map both IDs to their primary (non-synonym) IDs
  primaree = self.primary_go_id(subsumee_go_id)
  primarer = self.primary_go_id(subsumer_go_id)

  # return if they are the same - the obvious case
  return true if primaree == primarer

  # return whether the subsumee is a descendant of the subsumer
  return go_offspring(primarer).include?(primaree)
end
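The two cases handled by subsume? (identical terms, then membership in the offspring list) can be illustrated on a toy offspring table. This is a sketch only: the table uses made-up term names rather than GO IDs, skips synonym resolution, and toy_subsume? is a hypothetical helper.

```ruby
# Made-up offspring table: each term maps to all of its descendants.
TOY_OFFSPRING = {
  'organelle'     => ['nucleus', 'mitochondrion'],
  'nucleus'       => [],
  'mitochondrion' => []
}.freeze

# A term subsumes itself and every term in its offspring list.
def toy_subsume?(subsumer, subsumee)
  return true if subsumer == subsumee

  TOY_OFFSPRING.fetch(subsumer, []).include?(subsumee)
end

p toy_subsume?('organelle', 'nucleus') # prints true
p toy_subsume?('nucleus', 'organelle') # prints false
p toy_subsume?('nucleus', 'nucleus')   # prints true
```

The relation is directional: a parent subsumes its descendants, but not the other way around.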
#subsume_tester(subsumer_go_id, check_for_synonym = true) ⇒ Object
Return a subsume tester for a given GO term. This is faster than repeatedly calling subsume? because the list of offspring is cached.
# File 'lib/go.rb', line 139
def subsume_tester(subsumer_go_id, check_for_synonym=true)
  Go::SubsumeTester.new(self, subsumer_go_id, check_for_synonym)
end
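The source of Go::SubsumeTester is not shown on this page, so the following is only a sketch of the caching idea under stated assumptions: the offspring list is fetched once at construction and every later query is answered from memory. ToySubsumeTester, TOY_LOOKUP and LOOKUP_CALLS are hypothetical names, and the lambda stands in for the call into GO.db.

```ruby
require 'set'

# Hypothetical stand-in for Go::SubsumeTester: fetch the offspring once,
# then answer every later query from an in-memory Set.
class ToySubsumeTester
  def initialize(subsumer_id, offspring_lookup)
    @subsumer_id = subsumer_id
    @offspring = Set.new(offspring_lookup.call(subsumer_id))
  end

  def subsume?(subsumee_id)
    subsumee_id == @subsumer_id || @offspring.include?(subsumee_id)
  end
end

# Records each lookup so the number of (expensive) calls can be inspected.
LOOKUP_CALLS = []
TOY_LOOKUP = lambda do |id|
  LOOKUP_CALLS << id
  { 'organelle' => %w[nucleus mitochondrion] }.fetch(id, [])
end

tester = ToySubsumeTester.new('organelle', TOY_LOOKUP)
p %w[nucleus mitochondrion membrane].map { |t| tester.subsume?(t) }
# prints [true, true, false]
p LOOKUP_CALLS.size
# prints 1
```

Three queries trigger a single lookup, which is the saving the method description refers to.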
#term(go_id) ⇒ Object
Retrieve the string description of the given GO identifier.
# File 'lib/go.rb', line 91
def term(go_id)
  @r.eval_R("Term(get('#{go_id}', GOTERM))")
end