Class: Stevedore::StevedoreEmail
- Inherits: StevedoreBlob
  - Object
  - StevedoreBlob
  - Stevedore::StevedoreEmail
- Defined in: lib/parsers/stevedore_email.rb
Instance Attribute Summary
- #attachments ⇒ Object
  TODO write wrt other fields.
- #content_type ⇒ Object
  TODO write wrt other fields.
- #creation_date ⇒ Object
  TODO write wrt other fields.
- #dkim_verified ⇒ Object
  TODO write wrt other fields.
- #message_cc ⇒ Object
  TODO write wrt other fields.
- #message_from ⇒ Object
  TODO write wrt other fields.
- #message_to ⇒ Object
  TODO write wrt other fields.
- #subject ⇒ Object
  TODO write wrt other fields.
Attributes inherited from StevedoreBlob
#download_url, #extra, #text, #title
Class Method Summary
- .new_from_tika(content, metadata, download_url, filepath) ⇒ Object
Instance Method Summary
- #to_hash ⇒ Object
Methods inherited from StevedoreBlob
#analyze!, #clean_text, #initialize
Constructor Details
This class inherits a constructor from Stevedore::StevedoreBlob
Instance Attribute Details
#attachments ⇒ Object
TODO write wrt other fields. where do those go???
# File 'lib/parsers/stevedore_email.rb', line 13
def attachments
  @attachments
end
#content_type ⇒ Object
TODO write wrt other fields. where do those go???
# File 'lib/parsers/stevedore_email.rb', line 13
def content_type
  @content_type
end
#creation_date ⇒ Object
TODO write wrt other fields. where do those go???
# File 'lib/parsers/stevedore_email.rb', line 13
def creation_date
  @creation_date
end
#dkim_verified ⇒ Object
TODO write wrt other fields. where do those go???
# File 'lib/parsers/stevedore_email.rb', line 13
def dkim_verified
  @dkim_verified
end
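The value stored in `dkim_verified` comes from `.new_from_tika`, which calls a verifier's `verify!` and maps the raised error to `false`. The sketch below illustrates only that exception-to-value pattern. `StubDkimError`, `StubVerifier`, and `dkim_verified_for` are hypothetical stand-ins introduced so the example runs without the gem; they are not the real `Dkim` classes, and what the real `verify!` returns on success is not shown on this page.

```ruby
# Hypothetical stand-ins for Dkim::DkimError and Dkim::Verifier.
# They exist only so this sketch is self-contained.
class StubDkimError < StandardError; end

class StubVerifier
  def initialize(valid)
    @valid = valid
  end

  # Follows the "bang" convention assumed here: return true or raise.
  def verify!
    raise StubDkimError, "signature did not verify" unless @valid
    true
  end
end

# Returns whatever verify! returns on success, and false when the
# verifier raises its error class.
def dkim_verified_for(verifier)
  verifier.verify!
rescue StubDkimError
  false
end

verified = dkim_verified_for(StubVerifier.new(true))
rejected = dkim_verified_for(StubVerifier.new(false))
```

Only the verifier's own error class is rescued in this sketch, so an unrelated failure (for example, a missing file) would still propagate rather than be recorded as an unverified message.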
#message_cc ⇒ Object
TODO write wrt other fields. where do those go???
# File 'lib/parsers/stevedore_email.rb', line 13
def message_cc
  @message_cc
end
#message_from ⇒ Object
TODO write wrt other fields. where do those go???
# File 'lib/parsers/stevedore_email.rb', line 13
def message_from
  @message_from
end
#message_to ⇒ Object
TODO write wrt other fields. where do those go???
# File 'lib/parsers/stevedore_email.rb', line 13
def message_to
  @message_to
end
#subject ⇒ Object
TODO write wrt other fields. where do those go???
# File 'lib/parsers/stevedore_email.rb', line 13
def subject
  @subject
end
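All eight readers are reported at line 13 of the source file, and `.new_from_tika` assigns to them through setters, which is consistent with a single `attr_accessor` declaration. Assuming that is how they are defined, the sketch below shows the equivalent on a hypothetical `ExampleEmail` class; it is not the real `StevedoreEmail` and does not inherit from `StevedoreBlob`.

```ruby
# Hypothetical class used only to illustrate the accessor pattern.
class ExampleEmail
  attr_accessor :creation_date, :message_to, :message_from, :message_cc,
                :subject, :attachments, :dkim_verified, :content_type
end

email = ExampleEmail.new
email.subject = "Quarterly report"
email.message_to = ["a@example.com", "b@example.com"]

current_subject = email.subject       # reads back the assigned value
unset_attachments = email.attachments # nil until something assigns it
```

An unassigned attribute reads as `nil`, which is why `#to_hash` falls back to a default for `content_type`.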
Class Method Details
.new_from_tika(content, metadata, download_url, filepath) ⇒ Object
# File 'lib/parsers/stevedore_email.rb', line 15
def self.new_from_tika(content, metadata, download_url, filepath)
  t = super
  t.creation_date = metadata["Creation-Date"]
  t.message_to = metadata["Message-To"]
  t.message_from = metadata["Message-From"]
  t.message_cc = metadata["Message-Cc"]
  t.title = t.subject = metadata["subject"]
  t.dkim_verified = begin
    Dkim::Verifier.new(filepath).verify!
  rescue Dkim::DkimError
    false
  end
  # NOTE: raw_attachment_filename and attachment_filename are placeholder names;
  # the original block-local names are not shown on this page.
  t.attachments = metadata["X-Attachments"].to_s.split("|").map do |raw_attachment_filename|
    attachment_filename = CGI::unescape(raw_attachment_filename)
    possible_filename = File.join(File.dirname(filepath), attachment_filename)
    eml_filename = File.join(File.dirname(filepath), File.basename(filepath, '.eml') + '-' + attachment_filename)
    possible_s3_url = S3_BASEPATH + '/' + CGI::escape(File.basename(possible_filename))
    possible_eml_s3_url = S3_BASEPATH + '/' + CGI::escape(File.basename(eml_filename))

    # we might be uploading from the disk in which case we see if we can find an attachment on disk with the name from X-Attachments
    # or we might be uploading via S3, in which case we see if an object exists, accessible on S3, with the path from X-Attachments
    # TODO: support private S3 buckets
    # File.exist? replaces File.exists?, which was removed in Ruby 3.2.
    s3_url = if File.exist? possible_filename
      possible_s3_url
    elsif File.exist? eml_filename
      possible_eml_s3_url
    else
      nil
    end
    s3_url = begin
      if Manticore::Client.new.head(possible_s3_url).code == 200
        puts "found attachment: #{possible_s3_url}"
        possible_s3_url
      elsif Manticore::Client.new.head(possible_eml_s3_url).code == 200
        puts "found attachment: #{possible_eml_s3_url}"
        possible_eml_s3_url
      end
    rescue
      nil
    end if s3_url.nil?
    if s3_url.nil?
      STDERR.puts "Tika X-Attachments: " + metadata["X-Attachments"].to_s.inspect
      STDERR.puts "Couldn't find attachment '#{possible_s3_url}' aka '#{possible_eml_s3_url}' from '#{raw_attachment_filename}' from #{download_url}"
    end
    s3_url
  end.compact
  t
end
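For each name in the `X-Attachments` header, `.new_from_tika` derives two candidate file names (the bare name, and the name prefixed with the `.eml` file's basename) and the matching S3 URLs. The sketch below isolates that derivation so it can be run without Tika, S3, or the file system. `attachment_candidates` and `S3_BASEPATH_EXAMPLE` are hypothetical names introduced here, and the base URL is a made-up value standing in for the real `S3_BASEPATH` constant.

```ruby
require "cgi"

# Made-up value standing in for the S3_BASEPATH constant (an assumption).
S3_BASEPATH_EXAMPLE = "https://example-bucket.s3.amazonaws.com/emails"

# For each pipe-separated, URL-encoded name in an X-Attachments value,
# returns the two candidate local paths and the two candidate S3 URLs.
# Hypothetical helper; it performs no disk or network checks.
def attachment_candidates(x_attachments, filepath, basepath = S3_BASEPATH_EXAMPLE)
  dir = File.dirname(filepath)
  x_attachments.to_s.split("|").map do |raw_name|
    name = CGI.unescape(raw_name)
    plain = File.join(dir, name)
    prefixed = File.join(dir, File.basename(filepath, ".eml") + "-" + name)
    {
      local: [plain, prefixed],
      urls: [plain, prefixed].map { |path| basepath + "/" + CGI.escape(File.basename(path)) }
    }
  end
end

candidates = attachment_candidates("report%20final.pdf|notes.txt", "/mail/msg1.eml")
candidates.each { |c| puts c[:urls].join("  or  ") }
```

Note that `CGI.escape` encodes a space as `+`, so an attachment named `report final.pdf` is looked up under `report+final.pdf`. A `nil` header yields an empty list because `nil.to_s` is the empty string.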
Instance Method Details
#to_hash ⇒ Object
# File 'lib/parsers/stevedore_email.rb', line 65
def to_hash
  {
    "sha1" => Digest::SHA1.hexdigest(download_url),
    "title" => title.to_s,
    "source_url" => download_url.to_s,
    "file" => {
      "title" => title.to_s,
      "file" => text.to_s
    },
    "analyzed" => {
      "body" => text.to_s,
      "metadata" => {
        "Content-Type" => content_type || "message/rfc822",
        "Creation-Date" => creation_date,
        "Message-To" => message_to.is_a?(Enumerable) ? message_to : [ message_to ],
        "Message-From" => message_from.is_a?(Enumerable) ? message_from : [ message_from ],
        "Message-Cc" => message_cc.is_a?(Enumerable) ? message_cc : [ message_cc ],
        "subject" => subject,
        "attachments" => attachments,
        "dkim_verified" => dkim_verified
      }
    },
    "_updatedAt" => Time.now
  }
end
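Two small conventions in `#to_hash` are easy to miss: the address fields are always emitted as lists, and the `"sha1"` key is a digest of the download URL rather than of the message body. The sketch below shows both in isolation. `as_list` and `document_id` are hypothetical helper names introduced for illustration; they do not exist in the class.

```ruby
require "digest/sha1"

# Wraps a single value in an array and leaves enumerables untouched,
# mirroring the `x.is_a?(Enumerable) ? x : [x]` expressions in #to_hash.
def as_list(value)
  value.is_a?(Enumerable) ? value : [value]
end

# Derives the identifier the same way #to_hash does: SHA-1 of the URL string.
def document_id(download_url)
  Digest::SHA1.hexdigest(download_url)
end

single = as_list("alice@example.com")
many = as_list(["alice@example.com", "bob@example.com"])
missing = as_list(nil)
id = document_id("https://example.com/mail/msg1.eml")
```

Because a `String` is not `Enumerable` in Ruby, a single address is wrapped in a one-element array. A missing field becomes `[nil]` rather than `[]`, which consumers of the hash may need to account for. The identifier is stable for a given URL, so re-uploading the same document yields the same `"sha1"` value.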