DocPdfToText

Author

Eric Silverberg (www.ericsilverberg.com)

Copyright

Copyright © 2009 Eric Silverberg

License

MIT (Go Beavers!)

Git

github.com/esilverberg/DocPdfToText/tree/master

This gem lets a Rails application call external document conversion tools to convert .doc, .docx and .pdf files into plain text.

Requirements

* Antiword: http://www.winfield.demon.nl/
* pdftotext: http://packages.ubuntu.com/hardy/poppler-utils
* OdfConverter: http://www.oooninja.com/2008/01/convert-openxml-docx-etc-in-linux-using.html
* Openoffice-headless: http://wiki.alfresco.com/wiki/Running_OpenOffice_From_Terminal
* DocumentConverter.py (included): http://artofsolving.com/opensource/pyodconverter
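
Since the gem relies on external programs, it can help to confirm they are installed before converting anything. The following is a minimal sketch, not part of DocPdfToText: executable_on_path? is a hypothetical helper, and the command names antiword and pdftotext are assumed to match what the packages above install on your system.

```ruby
# Hypothetical helper, not part of DocPdfToText: reports whether a command
# name resolves to an executable file in one of the PATH directories.
def executable_on_path?(name)
  ENV.fetch("PATH", "").split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Command names are assumptions based on the usual Linux packages.
%w[antiword pdftotext].each do |tool|
  status = executable_on_path?(tool) ? "found" : "missing"
  puts "#{tool}: #{status}"
end
```

The OdfConverter and OpenOffice requirements are left out of the check because their executable names vary between installations.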

Example Usage

DocPdfToText adds several methods to your model. The only one you will want to call is file_to_txt, shown below:

include DocPdfToText
...
puts file_to_txt(test_file)
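
To illustrate the kind of work a conversion call has to do, here is a rough sketch of choosing a command-line tool by file extension. It is an assumption about the general approach, not the gem's actual code: text_extraction_command and extract_text are hypothetical names, and the .docx path (which goes through OdfConverter and OpenOffice) is left out.

```ruby
require "open3"

# Hypothetical helper, not the gem's implementation: picks a command line
# for a file based on its extension.
def text_extraction_command(path)
  case File.extname(path).downcase
  when ".doc"
    ["antiword", path]         # antiword prints the extracted text to stdout
  when ".pdf"
    ["pdftotext", path, "-"]   # "-" tells pdftotext to write to stdout
  else
    raise ArgumentError, "no converter sketched for #{path}"
  end
end

# Hypothetical wrapper: runs the chosen tool and returns its stdout.
# Requires the tool to be installed; it is defined here but not called.
def extract_text(path)
  output, status = Open3.capture2(*text_extraction_command(path))
  raise "conversion failed for #{path}" unless status.success?
  output
end

# Print the command that would be run, without running it.
p text_extraction_command("resume.pdf")
```

Passing the command as an array keeps file names with spaces or shell metacharacters from being interpreted by a shell.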

Copyright © 2009 esilverberg. See LICENSE for details.