Class: Jrtika

Inherits:
Object
Defined in:
lib/jrtika.rb

Class Method Summary

.read(file_path) ⇒ Object: Read text and metadata from a file and return a hash containing both.

Class Method Details

.read(file_path) ⇒ Object

Reads the text and metadata from a file and returns a hash containing both: the extracted body text under :text and the metadata, rendered as a string, under :metadata.

Jrtika.read file_path


# File 'lib/jrtika.rb', line 10

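def self.read(file_path)
  # Metadata key under which Tika expects the resource (file) name;
  # "resourceName" is the value of Tika's RESOURCE_NAME_KEY constant.
  resource_name_key = "resourceName"

  input = java.io.FileInputStream.new(file_path)
  begin
    parser = org.apache.tika.parser.AutoDetectParser.new
    # -1 disables the handler's write limit so long documents are not cut off.
    handler = org.apache.tika.sax.BodyContentHandler.new(-1)
    meta = org.apache.tika.metadata.Metadata.new
    meta.set(resource_name_key, file_path)

    parser.parse(input, handler, meta)
  ensure
    # Close the stream even if parsing raises.
    input.close
  end

  { :text => handler.toString, :metadata => meta.toString }
end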
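
The hash returned by read can be consumed like any other Ruby hash. The following is a minimal sketch, not part of the library: describe_document is a hypothetical helper, "report.pdf" is a placeholder path, and the sample hash is hand-written to mirror the shape shown in the method source (string values under :text and :metadata), so the snippet runs without JRuby or the Tika jars.

```ruby
# Hypothetical helper (not part of Jrtika): summarizes the hash returned
# by Jrtika.read. Assumes :text and :metadata both hold strings.
def describe_document(result)
  text = result[:text].to_s.strip
  {
    :characters   => text.length,
    :words        => text.split.length,
    :has_metadata => !result[:metadata].to_s.strip.empty?
  }
end

# Under JRuby with the Tika jars on the classpath, the input would come from:
#   result = Jrtika.read("report.pdf")   # placeholder path
# A hand-written hash of the same shape stands in for it here.
sample = { :text => "Hello from Tika\n", :metadata => "Content-Type=text/plain" }

summary = describe_document(sample)
puts "words: #{summary[:words]}, characters: #{summary[:characters]}"
```

Because :metadata is produced by calling toString on the Tika Metadata object, it is a single string rather than a key/value structure; callers that need individual fields would have to parse it or work with the Metadata object directly.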