Class: StanfordParser::DocumentPreprocessor

Inherits:
Rjb::JavaObjectWrapper
Defined in:
lib/stanfordparser.rb

Overview

Tokenizes documents into words and sentences.

This is a wrapper for the edu.stanford.nlp.process.DocumentPreprocessor object.

Direct Known Subclasses

StandoffDocumentPreprocessor

Instance Attribute Summary

Attributes inherited from Rjb::JavaObjectWrapper

#java_object

Instance Method Summary

Methods inherited from Rjb::JavaObjectWrapper

#each, #method_missing

Constructor Details

#initialize(suppressEscaping = false) ⇒ DocumentPreprocessor

Returns a new instance of DocumentPreprocessor.



# File 'lib/stanfordparser.rb', line 221

def initialize(suppressEscaping = false)
  super("edu.stanford.nlp.process.DocumentPreprocessor", suppressEscaping)
end

Dynamic Method Handling

This class handles dynamic methods through the method_missing method of the class Rjb::JavaObjectWrapper.
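As a rough illustration of the method_missing pattern in general, the sketch below forwards unknown method calls to a wrapped object. It is a pure-Ruby stand-in and not the Rjb::JavaObjectWrapper implementation; the class name ForwardingWrapper and its exact behavior are assumptions made for illustration only.

```ruby
# Hypothetical stand-in for a wrapper class; NOT the real Rjb::JavaObjectWrapper.
class ForwardingWrapper
  attr_reader :target

  def initialize(target)
    @target = target
  end

  # Forward any method the wrapper does not define to the wrapped object.
  def method_missing(name, *args, &block)
    if @target.respond_to?(name)
      @target.public_send(name, *args, &block)
    else
      super
    end
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

wrapper = ForwardingWrapper.new("two words")
puts wrapper.upcase        # forwarded to String#upcase
puts wrapper.split.length  # forwarded to String#split
```

Calls that neither the wrapper nor the wrapped object understands still raise NoMethodError, because the sketch falls back to super.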

Instance Method Details

#getSentencesFromString(s) ⇒ Object

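Returns the list of sentences found in the string s. The string is wrapped in a java.io.StringReader and passed to the Java method getSentencesFromText.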



# File 'lib/stanfordparser.rb', line 226

def getSentencesFromString(s)
  s = Rjb::JavaObjectWrapper.new("java.io.StringReader", s)
  _invoke(:getSentencesFromText, "Ljava.io.Reader;", s.java_object)
end
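A minimal usage sketch follows. It assumes only what this page documents: that the value returned by getSentencesFromString responds to each (listed above as inherited from Rjb::JavaObjectWrapper) and that each element can be converted with to_s. The helper name sentence_strings and the FakePreprocessor stand-in are hypothetical; the stand-in exists only so that the sketch runs without a JVM.

```ruby
# Hypothetical helper: collect each sentence from a preprocessor as a String.
# Assumes the result of getSentencesFromString responds to #each.
def sentence_strings(preprocessor, text)
  strings = []
  preprocessor.getSentencesFromString(text).each do |sentence|
    strings << sentence.to_s
  end
  strings
end

# Stand-in used only so this sketch runs without Java. Its naive split on
# sentence-final punctuation is NOT how the Stanford tokenizer works.
class FakePreprocessor
  def getSentencesFromString(s)
    s.scan(/[^.!?]+[.!?]/).map(&:strip)
  end
end

puts sentence_strings(FakePreprocessor.new, "This is one. This is two.")

# With the real library loaded, the call would presumably look like:
#   preproc = StanfordParser::DocumentPreprocessor.new
#   sentence_strings(preproc, "This is one. This is two.")
```

Passing the preprocessor in as an argument keeps the helper independent of how the object was constructed, so the same code works with the real class or a test double.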

#inspect ⇒ Object

Returns the class name without its module prefix, wrapped in angle brackets.



# File 'lib/stanfordparser.rb', line 231

def inspect
  "<#{self.class.to_s.split('::').last}>"
end
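To show what this expression produces, the sketch below applies the same method body to classes inside a hypothetical Demo module (these are not the real StanfordParser classes). Only the last component of the qualified class name is kept.

```ruby
# Hypothetical module and classes, used only to show the string that the
# inspect body above builds; these are not the real StanfordParser classes.
module Demo
  class DocumentPreprocessor
    def inspect
      "<#{self.class.to_s.split('::').last}>"
    end
  end

  # A subclass that does not override inspect reports its own name.
  class StandoffDocumentPreprocessor < DocumentPreprocessor
  end
end

puts Demo::DocumentPreprocessor.new.inspect
puts Demo::StandoffDocumentPreprocessor.new.inspect
```

Because the method reads self.class at call time, a subclass inherits the behavior and reports its own name unless it overrides inspect.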

#to_s ⇒ Object

Returns the same string as #inspect.



# File 'lib/stanfordparser.rb', line 235

def to_s
  inspect
end