Class: BioInterchange::TextMining::PDFxXMLReader
- Defined in:
- lib/biointerchange/textmining/pdfx_xml_reader.rb
Defined Under Namespace
Classes: MyListener
Instance Method Summary
- #deserialize(inputstream) ⇒ Object
  Reads the input stream and returns the associated BioInterchange::TextMining::Document model.
Methods inherited from TMReader
Methods inherited from Reader
Constructor Details
This class inherits a constructor from BioInterchange::TextMining::TMReader
Instance Method Details
#deserialize(inputstream) ⇒ Object
Reads the input stream and returns the associated BioInterchange::TextMining::Document model.
Presently I assume a single document per XML file and that <section> tags cannot nest. I also assume that a Content::DOCUMENT type covers everything between the <article> tags.
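The assumptions above can be made concrete with a small stream listener. The following is only a sketch, assuming the reader is built on REXML's stream parser (the nested MyListener class hints at this, but the page does not confirm it). OffsetListener and scan_offsets are hypothetical names, not part of BioInterchange. The sketch records one character span for the single <article> and one span per non-nested <section>, and raises if either assumption is violated.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener (not the library's MyListener). It tracks character
# offsets of the single <article> and of each non-nested <section>.
class OffsetListener
  include REXML::StreamListener

  attr_reader :document_span, :section_spans

  def initialize
    @position = 0          # number of text characters seen so far
    @article_start = nil   # offset where the open <article> began
    @section_start = nil   # offset where the open <section> began
    @document_span = nil   # [start, length] of the <article> text
    @section_spans = []    # one [start, length] pair per <section>
  end

  def tag_start(name, _attributes)
    case name
    when 'article'
      # Assumption: a single document per XML file.
      raise 'more than one <article> in file' if @document_span || @article_start
      @article_start = @position
    when 'section'
      # Assumption: <section> tags cannot nest.
      raise 'nested <section> not supported' if @section_start
      @section_start = @position
    end
  end

  def text(data)
    @position += data.length
  end

  def tag_end(name)
    case name
    when 'article'
      @document_span = [@article_start, @position - @article_start]
      @article_start = nil
    when 'section'
      @section_spans << [@section_start, @position - @section_start]
      @section_start = nil
    end
  end
end

# Hypothetical helper: parses an XML string and returns the filled listener.
def scan_offsets(xml)
  listener = OffsetListener.new
  REXML::Document.parse_stream(xml, listener)
  listener
end
```

Calling scan_offsets on a small PDFx-like string and inspecting document_span and section_spans shows how everything between the <article> tags maps to one document-level span while each <section> gets its own span.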
inputstream - Input IO stream (or String) to deserialize
# File 'lib/biointerchange/textmining/pdfx_xml_reader.rb', line 37

def deserialize(inputstream)
  raise BioInterchange::Exceptions::ImplementationReaderError, 'InputStream not of type IO, cannot read.' unless inputstream.kind_of?(IO) or inputstream.kind_of?(String)
  @input = inputstream
  pdfx
end
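The type check in the listing accepts only IO and String objects. One consequence worth illustrating is that a StringIO is rejected, because StringIO does not inherit from IO. The sketch below reproduces just that guard in isolation. ReaderInputError, check_input, and accepted? are hypothetical stand-ins so the example runs without the BioInterchange gem.

```ruby
require 'stringio'

# Hypothetical stand-in for BioInterchange::Exceptions::ImplementationReaderError.
class ReaderInputError < StandardError; end

# Mirrors the type check in the listing: only IO and String instances pass.
def check_input(inputstream)
  unless inputstream.kind_of?(IO) || inputstream.kind_of?(String)
    raise ReaderInputError, 'InputStream not of type IO, cannot read.'
  end
  inputstream
end

# Hypothetical convenience wrapper: true if the guard accepts the object.
def accepted?(inputstream)
  check_input(inputstream)
  true
rescue ReaderInputError
  false
end

puts accepted?('<pdfx/>')                # String passes the guard
puts accepted?(STDIN)                    # IO passes the guard
puts accepted?(StringIO.new('<pdfx/>'))  # StringIO is not an IO subclass
```

Under this check, callers holding a StringIO would need to pass its underlying string (for example via StringIO#string) rather than the StringIO object itself.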