Class: StanfordParser::StandoffParsedText
- Inherits: Array (Object > Array > StanfordParser::StandoffParsedText)
- Defined in: lib/stanfordparser.rb
Overview
Standoff syntactic annotation of natural language text which may contain multiple sentences.
This is an Array of StandoffNode objects, one for each sentence in the text.
Instance Method Summary
- #initialize(text, nodetype = StandoffNode, tokenizer = EN_PENN_TREEBANK_TOKENIZER, parser = DefaultParser.instance) ⇒ StandoffParsedText (constructor)
  Parse the text and create the standoff annotation.
- #inspect ⇒ Object
  Return a string giving the class name and the number of sentences.
- #to_s ⇒ Object
  Return the parses joined into a single string.
Constructor Details
#initialize(text, nodetype = StandoffNode, tokenizer = EN_PENN_TREEBANK_TOKENIZER, parser = DefaultParser.instance) ⇒ StandoffParsedText
Parse the text and create the standoff annotation.
The default parser is a singleton instance of the English-language Stanford Natural Language parser. Expect a delay of a few seconds while it loads the first time it is created.
    # File 'lib/stanfordparser.rb', line 323
    def initialize(text, nodetype = StandoffNode, tokenizer = EN_PENN_TREEBANK_TOKENIZER, parser = DefaultParser.instance)
      preprocessor = StandoffDocumentPreprocessor.new(tokenizer)
      # Segment the text into sentences. Parse each sentence, writing
      # standoff annotation information into the terminal nodes.
      preprocessor.getSentencesFromString(text).map do |sentence|
        parse = parser.apply(sentence.to_s)
        push(nodetype.new(parse, sentence))
      end
    end
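The constructor follows a simple pattern: split the text into sentences, parse each one, and push one node per sentence onto the array. The sketch below imitates that pattern without the Java-backed parser so it can run on its own. Every name in it (SketchNode, SketchSplitter, SketchParser, SketchParsedText) is a hypothetical stand-in and not part of stanfordparser.rb, and the regex-based splitter is far cruder than the real tokenizer.

```ruby
# Hypothetical stand-ins: none of these names exist in stanfordparser.rb.
SketchNode = Struct.new(:parse, :sentence)

class SketchSplitter
  # Naive segmentation: break after ".", "!" or "?" followed by whitespace.
  def sentences(text)
    text.strip.split(/(?<=[.!?])\s+/)
  end
end

class SketchParser
  # Stand-in for a real parser: wraps the sentence in a flat bracketed "parse".
  def apply(sentence)
    "(S #{sentence})"
  end
end

class SketchParsedText < Array
  def initialize(text, nodetype = SketchNode, splitter = SketchSplitter.new, parser = SketchParser.new)
    super()
    # One node per sentence, pairing the parse with the sentence it came from.
    splitter.sentences(text).each do |sentence|
      push(nodetype.new(parser.apply(sentence), sentence))
    end
  end
end

parsed = SketchParsedText.new("Dogs bark. Cats purr.")
puts parsed.length        # → 2
puts parsed.first.parse   # → (S Dogs bark.)
```

Passing the node type, splitter, and parser as defaulted arguments mirrors the real signature, which lets callers substitute their own node class or parser instance.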
Instance Method Details
#inspect ⇒ Object
Return a string giving the class name and the number of sentences.
    # File 'lib/stanfordparser.rb', line 336
    def inspect
      "<#{self.class.name}, #{length} sentences>"
    end
#to_s ⇒ Object
Return the parses joined into a single space-separated string.
    # File 'lib/stanfordparser.rb', line 341
    def to_s
      flatten.join(" ")
    end
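To see what these two methods produce, here is a minimal sketch that copies their bodies into a hypothetical Array subclass (SketchText, not part of the library) and fills it with plain strings instead of StandoffNode objects. With real nodes the output of to_s depends on how the node class flattens and stringifies, which this sketch does not model.

```ruby
# Hypothetical class used only to illustrate the two method bodies.
class SketchText < Array
  # Short summary instead of Array's default element dump.
  def inspect
    "<#{self.class.name}, #{length} sentences>"
  end

  # Flatten any nesting, then join the elements with single spaces.
  def to_s
    flatten.join(" ")
  end
end

text = SketchText.new
text.push("(S Dogs bark.)", "(S Cats purr.)")

puts text.inspect   # → <SketchText, 2 sentences>
puts text.to_s      # → (S Dogs bark.) (S Cats purr.)
```

Note that both methods return strings rather than printing anything; the output appears only because the sketch passes the results to puts.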