Class: Paperclip::Document::Processors::Reader
- Inherits:
  Paperclip::Document::Processor
  - Object
  - Processor
  - Paperclip::Document::Processor
  - Paperclip::Document::Processors::Reader
- Defined in:
  lib/paperclip/document/processors/reader.rb
Overview
This processor extracts the OCR text of the file.
Instance Attribute Summary
- #clean ⇒ Object
  Returns the value of attribute clean.
- #language ⇒ Object
  Returns the value of attribute language.
- #text_column ⇒ Object
  Returns the value of attribute text_column.
Attributes inherited from Paperclip::Document::Processor
Instance Method Summary
- #default_text_column ⇒ Object
  Returns the name of the default text column.
- #initialize(file, options = {}, attachment = nil) ⇒ Reader (constructor)
  Returns a new instance of Reader.
- #make ⇒ Object
  Extracts the text of the whole document.
- #text_column? ⇒ Boolean
  Checks whether the default text column is present.
Methods inherited from Paperclip::Document::Processor
Constructor Details
#initialize(file, options = {}, attachment = nil) ⇒ Reader
Returns a new instance of Reader.
# File 'lib/paperclip/document/processors/reader.rb', line 10
def initialize(file, options = {}, attachment = nil)
  super(file, options, attachment)
  if @options[:text_column].nil? and text_column?
    @options[:text_column] = default_text_column
  end
  @language = @options[:language]
  @text_column = @options[:text_column]
  unless @text_column
    raise Paperclip::Error, "No content text column given"
  end
  @clean = (RUBY_VERSION >= "2.0" ? false : options.has_key?(:clean) ? !!options[:clean] : true)
end
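The option handling in the constructor can be sketched outside of Paperclip as follows. This is only an illustration: `resolve_reader_options`, `ReaderOptionError`, and the explicit `ruby_version` parameter are hypothetical names that do not exist in the library. Passing `nil` as `default_column` models the case where `text_column?` finds no default column.

```ruby
# Hypothetical stand-alone sketch of the constructor's option handling.
# None of these names are part of paperclip-document itself.
ReaderOptionError = Class.new(StandardError)

def resolve_reader_options(options, default_column = nil, ruby_version = RUBY_VERSION)
  # Use the default column only when no :text_column was given explicitly.
  text_column = options[:text_column]
  text_column = default_column if text_column.nil?
  raise ReaderOptionError, "No content text column given" unless text_column

  # On Ruby 2.0 and later the flag is always false. On older Rubies it
  # defaults to true unless :clean is passed explicitly.
  clean = if ruby_version >= "2.0"
            false
          elsif options.key?(:clean)
            !!options[:clean]
          else
            true
          end

  { text_column: text_column, language: options[:language], clean: clean }
end

p resolve_reader_options({ language: "eng" }, "document_content_text")
p resolve_reader_options({ text_column: "body" }, nil, "1.9.3")
```

One consequence worth noting: under this logic a `:clean` option has no effect on Ruby 2.0 or later, since the version check comes first.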
Instance Attribute Details
#clean ⇒ Object
Returns the value of attribute clean.
# File 'lib/paperclip/document/processors/reader.rb', line 8
def clean
  @clean
end
#language ⇒ Object
Returns the value of attribute language.
# File 'lib/paperclip/document/processors/reader.rb', line 8
def language
  @language
end
#text_column ⇒ Object
Returns the value of attribute text_column.
# File 'lib/paperclip/document/processors/reader.rb', line 8
def text_column
  @text_column
end
Instance Method Details
#default_text_column ⇒ Object
Returns the name of the default text column
# File 'lib/paperclip/document/processors/reader.rb', line 49
def default_text_column
  @attachment.name.to_s + "_content_text"
end
#make ⇒ Object
Extracts the text of the whole document.
# File 'lib/paperclip/document/processors/reader.rb', line 24
def make
  destination_path = tmp_dir.to_s
  options = {output: destination_path, clean: @clean}
  options[:language] = (language.is_a?(Proc) ? language.call(attachment.instance) : language)
  Docsplit.extract_text(file_path.to_s, options)
  destination_file = File.join(destination_path, basename + ".txt")
  instance = @attachment.instance
  f = File.open(destination_file)
  instance[text_column] = f.read
  instance.run_callbacks(:save) { false }
  f.close
  return file
end
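Two steps of `#make` can be tried in isolation: resolving a language that may be either a plain value or a Proc, and reading the `<basename>.txt` file from the output directory. The sketch below is a rough illustration with hypothetical helper names (`resolve_language`, `read_extracted_text`, `Record`). It does not call Docsplit; a hand-written file stands in for the extraction output.

```ruby
require "tmpdir"

# Hypothetical helper: returns the language as given, or calls it with the
# record when it is a Proc, as #make does with attachment.instance.
def resolve_language(language, record)
  language.is_a?(Proc) ? language.call(record) : language
end

# Hypothetical helper: reads "<basename>.txt" from the output directory.
# File.read opens and closes the file in a single call.
def read_extracted_text(output_dir, basename)
  File.read(File.join(output_dir, basename + ".txt"))
end

# Stand-in for a model instance that knows its own language.
Record = Struct.new(:locale)

Dir.mktmpdir do |dir|
  # Stand-in for the text file that the extraction step would write.
  File.write(File.join(dir, "report.txt"), "extracted text")

  puts resolve_language("eng", Record.new("fra"))
  puts resolve_language(->(record) { record.locale }, Record.new("fra"))
  puts read_extracted_text(dir, "report")
end
```

A Proc is useful when the language depends on the record being saved, for example a per-document locale attribute.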
#text_column? ⇒ Boolean
Checks whether the default text column is present.
# File 'lib/paperclip/document/processors/reader.rb', line 41
def text_column?
  expected_column = default_text_column
  return instance.class.columns.detect do |column|
    column.name.to_s == expected_column
  end
end
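The lookup can be illustrated without a database, assuming column objects that respond to `name`. `Column`, `default_text_column_for`, and `find_text_column` are hypothetical names used only for this sketch. Since `detect` returns the matching element or `nil`, the result is truthy or falsy rather than strictly `true` or `false`.

```ruby
# Hypothetical stand-in for the column objects returned by a model class.
Column = Struct.new(:name)

# Mirrors the naming convention: "<attachment name>_content_text".
def default_text_column_for(attachment_name)
  attachment_name.to_s + "_content_text"
end

# Returns the matching column object, or nil when the column is absent.
def find_text_column(columns, attachment_name)
  expected = default_text_column_for(attachment_name)
  columns.detect { |column| column.name.to_s == expected }
end

columns = [
  Column.new("id"),
  Column.new("document_file_name"),
  Column.new("document_content_text")
]

p find_text_column(columns, :document) # the matching Column struct
p find_text_column(columns, :avatar)   # nil, no avatar_content_text column
```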