Class: OBAClient
- Inherits:
- Object
- Defined in:
- lib/oba-client.rb
Overview
A class for interacting with the Open Biomedical Annotator. It does two things: it requests annotations for a text from the service, and it parses the XML response. The two steps can be run independently or in sequence.
Constant Summary
- VERSION =
"2.0.4"
- DEFAULT_TIMEOUT =
A high HTTP read timeout (in seconds), as the service sometimes takes a while to respond.
30
- DEFAULT_URI =
The endpoint URI for the production version of the Annotator service.
"http://rest.bioontology.org/obs/annotator"
- HEADER =
The header for every request. There’s no need to specify this per-instance.
{"Content-Type" => "application/x-www-form-urlencoded"}
- ANNOTATOR_PARAMETERS =
Parameters the annotator accepts. Any parameter not in this list (other than textToAnnotate) is invalid.
[
  :email, :filterNumber, :format, :isStopWordsCaseSensitive,
  :isVirtualOntologyID, :levelMax, :longestOnly, :ontologiesToExpand,
  :ontologiesToKeepInResult, :mappingTypes, :minTermSize, :scored,
  :semanticTypes, :stopWords, :wholeWordOnly, :withDefaultStopWords,
  :withSynonyms,
]
- STATISTICS_BEANS_XPATH =
"/success/data/annotatorResultBean/statistics/statisticsBean"
- ANNOTATION_BEANS_XPATH =
"/success/data/annotatorResultBean/annotations/annotationBean"
- ONTOLOGY_BEANS_XPATH =
"/success/data/annotatorResultBean/ontologies/ontologyUsedBean"
- CONCEPT_ATTRIBUTES =
Attributes for mapping concepts (only one type).
{
  :id => lambda {|c| c.xpath("id").text.to_i},
  :localConceptId => lambda {|c| c.xpath("localConceptId").text},
  :localOntologyId => lambda {|c| c.xpath("localOntologyId").text.to_i},
  :isTopLevel => lambda {|c| to_b(c.xpath("isTopLevel").text)},
  :fullId => lambda {|c| c.xpath("fullId").text},
  :preferredName => lambda {|c| c.xpath("preferredName").text},
  :synonyms => lambda do |c|
    c.xpath("synonyms/synonym").map do |s|
      s.xpath("string").text
    end
  end,
  :semanticTypes => lambda do |c|
    c.xpath("semanticTypes/semanticTypeBean").map do |s|
      {
        :id => s.xpath("id").text.to_i,
        :semanticType => s.xpath("semanticType").text,
        :description => s.xpath("description").text
      }
    end
  end
}
- CONTEXT_ATTRIBUTES =
Attributes for mapping and mgrep contexts (both will add additional attributes).
{
  :contextName => lambda {|c| c.xpath("contextName").text},
  :isDirect => lambda {|c| to_b(c.xpath("isDirect").text)},
  :from => lambda {|c| c.xpath("from").text.to_i},
  :to => lambda {|c| c.xpath("to").text.to_i},
}
- ANNOTATION_CONTEXT_ATTRIBUTES =
Attributes for annotation contexts.
{
  :score => lambda {|c| c.xpath("score").text.to_i},
  :concept => lambda {|c| parse_concept(c.xpath("concept").first)},
  :context => lambda {|c| parse_context(c.xpath("context").first)}
}
- MAPPED_CONTEXT_ATTRIBUTES =
Attributes for mapping contexts.
CONTEXT_ATTRIBUTES.merge(
  :mappingType => lambda {|c| c.xpath("mappingType").text},
  :mappedConcept => lambda {|c| parse_concept(c.xpath("mappedConcept").first)}
)
- MGREP_CONTEXT_ATTRIBUTES =
Attributes for mgrep contexts.
CONTEXT_ATTRIBUTES.merge(
  :name => lambda {|c| c.xpath("term/name").text},
  :localConceptId => lambda {|c| c.xpath("term/localConceptId").text},
  :isPreferred => lambda {|c| to_b(c.xpath("term/isPreferred").text)},
  :dictionaryId => lambda {|c| c.xpath("term/dictionaryId").text}
)
- CONTEXT_CLASSES =
Map the bean type to the set of attributes we parse from it.
{
  "annotationContextBean" => ANNOTATION_CONTEXT_ATTRIBUTES,
  "mgrepContextBean" => MGREP_CONTEXT_ATTRIBUTES,
  "mappingContextBean" => MAPPED_CONTEXT_ATTRIBUTES,
}
Class Method Summary
-
.parse(xml) ⇒ Hash<Symbol, Object>
Parse raw XML, returning a Hash with three elements: statistics, annotations, and ontologies.
-
.parse_concept(concept) ⇒ Hash<Symbol, Object>
Parse a concept: a top-level annotation concept, or an annotation’s mapping concept.
-
.parse_context(context) ⇒ Hash<Symbol, Object>
Parse a context: an annotation, or a mapping/mgrep context bean.
-
.to_b(value) ⇒ true, false
A little helper: convert a string true/false or 1/0 value to boolean.
Instance Method Summary
-
#execute(text) ⇒ Hash<Symbol, Array>, ...
Perform the annotation.
-
#initialize(options = {}) ⇒ OBAClient
constructor
Instantiate the class with a set of reused options.
Constructor Details
#initialize(options = {}) ⇒ OBAClient
Instantiate the class with a set of reused options. Options used by the method are:
* {String} **uri**: the URI of the annotator service (default:
{DEFAULT_URI}).
* {Fixnum} **timeout**: the length of the read timeout (default:
{DEFAULT_TIMEOUT}).
* {Boolean} **parse_xml**: whether to parse the received text (default:
false).
* {Array}<{String}> **ontologies**: a pseudo-parameter which sets both
ontologiesToExpand and ontologiesToKeepInResult.
# File 'lib/oba-client.rb', line 62
def initialize(options = {})
  @uri = URI.parse(options.delete(:uri) || DEFAULT_URI)
  @timeout = options.delete(:timeout) || DEFAULT_TIMEOUT
  @parse_xml = options.delete(:parse_xml)

  if ontologies = options.delete(:ontologies)
    [:ontologiesToExpand, :ontologiesToKeepInResult].each do |k|
      if options.include?(k)
        puts "WARNING: specified both :ontologies and #{k}, ignoring #{k}."
      end
      options[k] = ontologies
    end
  end

  @options = {}
  options.each do |k, v|
    if !ANNOTATOR_PARAMETERS.include?(k)
      puts "WARNING: #{k} is not a valid annotator parameter."
    end
    if v.is_a? Array
      @options[k] = v.join(",")
    else
      @options[k] = v
    end
  end

  if !@options.include?(:email)
    puts "TIP: as a courtesy, consider including your email in the " +
         "request (:email => '[email protected]')"
  end
end
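The option handling in the constructor can be tried without touching the network. The following is a minimal sketch, not the library's code: normalize_options is a hypothetical helper that mirrors how the :ontologies pseudo-parameter is expanded and how Array values are joined with commas, and the ontology IDs are made-up sample values.

```ruby
# Hypothetical helper mirroring the constructor's option handling.
# It is not part of OBAClient; it exists only for illustration.
def normalize_options(options)
  options = options.dup # work on a copy so the caller's Hash is untouched

  # :ontologies is a pseudo-parameter: it sets both real parameters.
  if (ontologies = options.delete(:ontologies))
    [:ontologiesToExpand, :ontologiesToKeepInResult].each do |key|
      options[key] = ontologies
    end
  end

  # Array values are sent to the service as comma-separated strings.
  options.each_with_object({}) do |(key, value), result|
    result[key] = value.is_a?(Array) ? value.join(",") : value
  end
end

# Sample (made-up) ontology IDs.
normalized = normalize_options(:ontologies => [1032, 1054], :longestOnly => true)
p normalized
```

Unlike the constructor, which calls delete on the Hash it is given, this sketch copies the Hash first; that is a choice made for the example, not a description of the library.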
Class Method Details
.parse(xml) ⇒ Hash<Symbol, Object>
Parse raw XML, returning a Hash with three elements: statistics, annotations, and ontologies. Respectively, these represent the annotation statistics (annotations by mapping type, etc., as a Hash), an Array of each annotation (as a Hash), and an Array of ontologies used (also as a Hash).
# File 'lib/oba-client.rb', line 238
def self.parse(xml)
  puts "WARNING: text is empty!" if (xml.gsub(/\n/, "") == "")
  doc = Nokogiri::XML.parse(xml)

  statistics = Hash[doc.xpath(STATISTICS_BEANS_XPATH).map do |sb|
    [sb.xpath("mapping").text, sb.xpath("nbAnnotation").text.to_i]
  end]

  annotations = doc.xpath(ANNOTATION_BEANS_XPATH).map do |annotation|
    parse_context(annotation)
  end

  ontologies = doc.xpath(ONTOLOGY_BEANS_XPATH).map do |ontology|
    {
      :localOntologyId => ontology.xpath("localOntologyId").text.to_i,
      :virtualOntologyId => ontology.xpath("virtualOntologyId").text.to_i,
      :name => ontology.xpath("name").text,
      :version => ontology.xpath("version").text.to_f,
      :nbAnnotation => ontology.xpath("nbAnnotation").text.to_i
    }
  end

  {
    :statistics => statistics,
    :annotations => annotations,
    :ontologies => ontologies
  }
end
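To make the return shape concrete, here is a hedged sketch of consuming such a Hash. The result literal is hand-written sample data, not real Annotator output; only the key names (:statistics, :annotations, :ontologies, :score, :concept, :context, and so on) are taken from the code in this class.

```ruby
# Hand-written sample in the shape .parse returns. All values are
# hypothetical and chosen only to illustrate the structure.
result = {
  :statistics  => {"MGREP" => 2, "ISA_CLOSURE" => 1},
  :annotations => [
    {:score => 10, :concept => {:preferredName => "Melanoma"},
     :context => {:from => 1, :to => 8}},
    {:score => 4, :concept => {:preferredName => "Neoplasm"},
     :context => {:from => 1, :to => 8}}
  ],
  :ontologies  => [
    {:localOntologyId => 40403, :name => "Sample Ontology", :nbAnnotation => 3}
  ]
}

# Statistics map a mapping type to its annotation count.
total = result[:statistics].values.sum

# Each annotation is a Hash, so ordinary Enumerable methods apply.
best = result[:annotations].max_by { |annotation| annotation[:score] }

# Ontologies are likewise plain Hashes.
names = result[:ontologies].map { |ontology| ontology[:name] }

puts "#{total} annotations; best match: #{best[:concept][:preferredName]}"
puts "Ontologies used: #{names.join(', ')}"
```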
.parse_concept(concept) ⇒ Hash<Symbol, Object>
Parse a concept: a top-level annotation concept, or an annotation’s mapping concept.
# File 'lib/oba-client.rb', line 221
def self.parse_concept(concept)
  Hash[CONCEPT_ATTRIBUTES.map do |k, v|
    [k, v.call(concept)]
  end]
end
.parse_context(context) ⇒ Hash<Symbol, Object>
Parse a context: an annotation, or a mapping/mgrep context bean.
# File 'lib/oba-client.rb', line 199
def self.parse_context(context)
  # Annotations (annotationBeans) do not have a class, so we'll refer to them
  # as annotationContextBeans. context_class will be one of the types in
  # {CONTEXT_CLASSES}.
  context_class = if context.attribute("class").nil?
    "annotationContextBean"
  else
    context.attribute("class").value
  end

  Hash[CONTEXT_CLASSES[context_class].map do |k, v|
    [k, v.call(context)]
  end]
end
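The dispatch in parse_context is table-driven: the bean's class selects a Hash of lambdas, and each lambda extracts one attribute. A minimal sketch of that pattern follows. It assumes plain Ruby Hashes as stand-ins for Nokogiri nodes, and base, extractors, and extract are hypothetical names that do not appear in the library.

```ruby
# Shared extractors, extended per bean type via Hash#merge, in the same
# spirit as CONTEXT_ATTRIBUTES and its merged variants.
base = {
  :from => lambda { |bean| bean["from"].to_i },
  :to   => lambda { |bean| bean["to"].to_i }
}

extractors = {
  "mgrepContextBean"   => base.merge(:name => lambda { |bean| bean["name"] }),
  "mappingContextBean" => base.merge(:mappingType => lambda { |bean| bean["mappingType"] })
}

# Look up the extractor table for the bean's class, then apply every
# lambda in it to build the result Hash.
def extract(bean, extractors)
  table = extractors.fetch(bean["class"])
  Hash[table.map { |key, extractor| [key, extractor.call(bean)] }]
end

# A plain Hash standing in for an XML node (sample values are made up).
bean = {"class" => "mgrepContextBean", "from" => "1", "to" => "8", "name" => "melanoma"}
parsed = extract(bean, extractors)
p parsed
```

The sketch uses fetch so that an unknown bean class fails with a KeyError naming the class; the library code indexes CONTEXT_CLASSES with [] instead.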
.to_b(value) ⇒ true, false
A small helper: convert a string "true"/"false" or "1"/"0" value to a boolean. Any other value yields nil. AFAIK, there’s no better way to do this.
# File 'lib/oba-client.rb', line 274
def self.to_b(value)
  case value
  when "0" then false
  when "1" then true
  when "false" then false
  when "true" then true
  end
end
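A standalone copy of the helper makes its edge cases easy to try, including the fact that matching is case-sensitive. This is a sketch defined outside the class so that it runs on its own; the logic is the same case statement with the branches grouped.

```ruby
# Standalone copy of the helper's logic, for illustration only.
def to_b(value)
  case value
  when "0", "false" then false
  when "1", "true"  then true
  end
end

p to_b("1")     # prints true
p to_b("false") # prints false
p to_b("TRUE")  # prints nil: matching is case-sensitive
```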
Instance Method Details
#execute(text) ⇒ Hash<Symbol, Array>, ...
Perform the annotation: POST the text, together with the instance’s options, to the service. Returns the parsed Hash if parse_xml was set, otherwise the raw response body. If the read times out, a message is printed and nil is returned.
# File 'lib/oba-client.rb', line 100
def execute(text)
  request = Net::HTTP::Post.new(@uri.path, initheader=HEADER)
  request.body = {:textToAnnotate => text}.merge(@options).map do |k, v|
    "#{CGI.escape(k.to_s)}=#{CGI.escape(v.to_s)}"
  end.join("&")
  puts request.body if $DEBUG

  begin
    response = Net::HTTP.new(@uri.host, @uri.port).start do |http|
      http.read_timeout = @timeout
      http.request(request)
    end
    @parse_xml ? self.class.parse(response.body) : response.body
  rescue Timeout::Error
    puts "Request for #{text[0..10]}... timed-out at #{@timeout} seconds."
  end
end
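The request body built in execute is an ordinary form-encoded string. The sketch below reproduces just that step so it can be run offline; build_body is a hypothetical name, and the sample text and email address are made up.

```ruby
require "cgi"

# Hypothetical helper: form-encode the text plus any annotator options,
# escaping keys and values with CGI.escape as execute does.
def build_body(text, options = {})
  {:textToAnnotate => text}.merge(options).map do |key, value|
    "#{CGI.escape(key.to_s)}=#{CGI.escape(value.to_s)}"
  end.join("&")
end

body = build_body("melanoma & skin", :email => "user@example.com", :longestOnly => true)
puts body
# prints textToAnnotate=melanoma+%26+skin&email=user%40example.com&longestOnly=true
```

Because the options are merged over the text, an option keyed :textToAnnotate would replace the text argument; the constructor only warns about such a key rather than rejecting it.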