Class: String
- Inherits: Object
  - Object
  - String
- Defined in: lib/smail/mime/coding_extensions.rb
Overview
Extensions to the String library for encoding and decoding of MIME data.
Instance Method Summary
- #best_mime_encoding ⇒ Object
  Returns the MIME encoding that is likely to produce the shortest encoded string, either :none, :base64, or :quoted_printable.
- #decode_base64 ⇒ Object
  Returns this string decoded from base64 as defined in RFC2045, section 6.8.
- #decode_base64! ⇒ Object
  Performs decode_base64 in place, and returns the string.
- #decode_mime(method = nil) ⇒ Object
  Decodes this string according to method, where method is :base64, :quoted_printable, or :none.
- #decode_mime!(method = nil) ⇒ Object
  Performs decode_mime in place, and returns the string.
- #decode_quoted_printable ⇒ Object
  Returns this string decoded from quoted-printable as defined in RFC2045, section 6.7.
- #decode_quoted_printable! ⇒ Object
  Performs decode_quoted_printable in place, and returns the string.
- #encode_base64 ⇒ Object
  Returns this string encoded as base64 as defined in RFC2045, section 6.8.
- #encode_base64! ⇒ Object
  Performs encode_base64 in place, and returns the string.
- #encode_quoted_printable ⇒ Object
  Returns this string encoded as quoted-printable as defined in RFC2045, section 6.7.
- #encode_quoted_printable! ⇒ Object
  Performs encode_quoted_printable in place, and returns the string.
- #guess_mime_encoding ⇒ Object
  Guesses whether this string is encoded in base64 or quoted-printable.
- #iconv(to_charset, from_charset) ⇒ Object
  Converts this string to to_charset from from_charset using Iconv.
- #iconv!(to_charset, from_charset) ⇒ Object
  Performs iconv in place, and returns the string.
- #is_ascii? ⇒ Boolean
  Returns true if the string contains only valid ASCII characters (i.e. nothing over ASCII 127).
- #is_space? ⇒ Boolean
  Returns true if the string consists entirely of whitespace.
Instance Method Details
#best_mime_encoding ⇒ Object
Returns the MIME encoding that is likely to produce the shortest encoded string, either :none, :base64, or :quoted_printable.
    # File 'lib/smail/mime/coding_extensions.rb', line 92
    def best_mime_encoding
      if self.is_ascii?
        :none
      elsif self.length > (self.mb_chars.length * 1.1)
        :base64
      else
        :quoted_printable
      end
    end
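The heuristic compares the byte length of the string with its character count: the more multi-byte characters there are, the more quoted-printable would expand the text, so base64 becomes the better choice. The sketch below restates that rule as a standalone function for Ruby 1.9 or later. The name best_mime_encoding_sketch is hypothetical, and bytesize and ascii_only? are assumed stand-ins for the Ruby 1.8 String#length and the is_ascii? method shown on this page.

```ruby
# Hypothetical standalone restatement of the heuristic (not part of smail).
# Assumes Ruby 1.9+, where String#length counts characters and
# String#bytesize counts bytes.
def best_mime_encoding_sketch(str)
  if str.ascii_only?
    :none                               # nothing needs encoding
  elsif str.bytesize > str.length * 1.1
    :base64                             # many multi-byte characters
  else
    :quoted_printable                   # mostly ASCII with a few exceptions
  end
end

best_mime_encoding_sketch("plain text")          # => :none
best_mime_encoding_sketch("caf\u00e9 au lait")   # => :quoted_printable
best_mime_encoding_sketch("\u65e5\u672c\u8a9e")  # => :base64
```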
#decode_base64 ⇒ Object
Returns this string decoded from base64 as defined in RFC2045, section 6.8.
    # File 'lib/smail/mime/coding_extensions.rb', line 30
    def decode_base64
      #self.unpack("m*").first
      # This should be the above line but due to a bug in the ruby base64 decoder
      # it will only decode base64 where the lines are in multiples of 4, this is
      # contrary to RFC2045 which says that all characters other than the 65 used
      # are to be ignored. Currently we remove all the other characters but it
      # might be better to use its advice to only remove line breaks and white
      # space
      self.tr("^A-Za-z0-9+/=", "").unpack("m*").first
    end
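As a small illustration of the cleanup step described in the comment above, the following sketch strips everything outside the base64 alphabet before decoding. It uses only core String#tr and String#unpack; the sample input is made up for this example.

```ruby
# Base64 text as it might appear in a message body: CRLF line endings and
# stray indentation inside the encoded data (sample input is illustrative).
encoded = "SGVsbG8s\r\n IFdvcmxkIQ==\r\n"

# Remove every character outside the base64 alphabet (plus "=" padding).
cleaned = encoded.tr("^A-Za-z0-9+/=", "")

decoded = cleaned.unpack("m*").first
# cleaned => "SGVsbG8sIFdvcmxkIQ=="
# decoded => "Hello, World!"
```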
#decode_base64! ⇒ Object
Performs decode_base64 in place, and returns the string.
    # File 'lib/smail/mime/coding_extensions.rb', line 42
    def decode_base64!
      self.replace(self.decode_base64)
    end
#decode_mime(method = nil) ⇒ Object
Decodes this string according to method, where method is :base64, :quoted_printable, or :none.
If method is not supplied or is nil, guess_mime_encoding is used to try to pick an appropriate method.
Method can also be a string: ‘q’, ‘quoted-printable’, ‘b’, or ‘base64’. This lets you pass in methods directly from Content-Transfer-Encoding headers, or from RFC2047 encoded words. Matching is case-insensitive.
    # File 'lib/smail/mime/coding_extensions.rb', line 111
    def decode_mime(method = nil)
      method ||= guess_mime_encoding
      method = method.downcase if method.kind_of?(String)
      case method
      when :none
        self
      when :base64, 'b', 'base64'
        self.decode_base64
      when :quoted_printable, 'q', 'quoted-printable'
        self.decode_quoted_printable
      else
        raise ArgumentError, "Bad MIME encoding"
      end
    end
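To show how the case-insensitive string matching plays out with header values, here is a minimal standalone sketch of the same dispatch. decode_mime_sketch is a hypothetical name; it takes the data as an argument instead of extending String, requires an explicit method (the guessing step is left out), and skips the extra character stripping that decode_base64 performs.

```ruby
# Hypothetical standalone version of the dispatch (not part of smail).
def decode_mime_sketch(data, method)
  method = method.downcase if method.is_a?(String)
  case method
  when :none
    data
  when :base64, 'b', 'base64'
    data.unpack("m*").first
  when :quoted_printable, 'q', 'quoted-printable'
    data.unpack("M*").first
  else
    raise ArgumentError, "Bad MIME encoding"
  end
end

# A Content-Transfer-Encoding value, in whatever case the sender used:
decode_mime_sketch("SGVsbG8=", "Base64")   # => "Hello"
# The one-letter encoding name from an RFC2047 encoded word:
decode_mime_sketch("caf=C3=A9", "Q")       # => the bytes of "café" in UTF-8
```

Any other value, such as "7bit", raises ArgumentError, so callers handling arbitrary headers may want to rescue it.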
#decode_mime!(method = nil) ⇒ Object
Performs decode_mime in place, and returns the string.
    # File 'lib/smail/mime/coding_extensions.rb', line 127
    def decode_mime!(method = nil)
      self.replace(self.decode_mime(method))
    end
#decode_quoted_printable ⇒ Object
Returns this string decoded from quoted-printable as defined in RFC2045, section 6.7.
    # File 'lib/smail/mime/coding_extensions.rb', line 65
    def decode_quoted_printable
      self.unpack("M*").first
    end
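A short example of what the underlying unpack("M*") call does with escaped bytes and a soft line break. The sample input is made up, and the force_encoding call is an assumption for Ruby 1.9 or later, where unpack returns a binary string that has to be relabeled before it compares equal to UTF-8 text.

```ruby
# "=C3=AF" is the UTF-8 byte pair for "ï"; "=\n" is a soft line break that
# the decoder removes.
encoded = "na=C3=AFve text=\n continues"

decoded = encoded.unpack("M*").first

# The result is binary (ASCII-8BIT); relabel it as UTF-8 to work with it
# as text.
text = decoded.force_encoding("UTF-8")
# text => "naïve text continues"
```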
#decode_quoted_printable! ⇒ Object
Performs decode_quoted_printable in place, and returns the string.
    # File 'lib/smail/mime/coding_extensions.rb', line 70
    def decode_quoted_printable!
      self.replace(self.decode_quoted_printable)
    end
#encode_base64 ⇒ Object
Returns this string encoded as base64 as defined in RFC2045, section 6.8.
    # File 'lib/smail/mime/coding_extensions.rb', line 20
    def encode_base64
      [self].pack("m*")
    end
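Two properties of the pack("m*") output are worth knowing when building message bodies: the result ends with a newline, and long input is wrapped into lines of 60 encoded characters, which is within the 76-character limit of RFC2045. The sketch below shows both using only core Array#pack.

```ruby
short = ["Hello"].pack("m*")
# short => "SGVsbG8=\n"   (note the trailing newline)

# 100 bytes of input become 136 base64 characters, wrapped at 60 per line.
long = ["x" * 100].pack("m*")
line_lengths = long.lines.map { |line| line.chomp.length }
# line_lengths => [60, 60, 16]
```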
#encode_base64! ⇒ Object
Performs encode_base64 in place, and returns the string.
    # File 'lib/smail/mime/coding_extensions.rb', line 25
    def encode_base64!
      self.replace(self.encode_base64)
    end
#encode_quoted_printable ⇒ Object
Returns this string encoded as quoted-printable as defined in RFC2045, section 6.7.
    # File 'lib/smail/mime/coding_extensions.rb', line 47
    def encode_quoted_printable
      result = [self].pack("M*")
      # Ruby's quoted printable encoding uses soft line breaks to buffer spaces
      # at the end of lines, rather than encoding them with =20. We fix this.
      result.gsub!(/( +)=\n\n/) { "=20" * $1.length + "\n" }
      # Ruby's quoted printable encode puts a soft line break on the end of any
      # string that doesn't already end in a hard line break, so we have to
      # clean it up.
      result.gsub!(/=\n\Z/, '')
      result
    end
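The two cleanup passes are easier to follow next to the raw pack("M*") output. The sketch below repeats the logic as a standalone function; encode_qp_sketch is a hypothetical name used only for this illustration.

```ruby
# Hypothetical standalone copy of the two cleanup passes (not part of smail).
def encode_qp_sketch(str)
  result = [str].pack("M*")
  # Trailing spaces: turn "  =\n\n" into "=20=20\n".
  result.gsub!(/( +)=\n\n/) { "=20" * $1.length + "\n" }
  # Drop the soft line break that pack appends to an unterminated last line.
  result.gsub!(/=\n\Z/, '')
  result
end

raw = ["caf\u00e9"].pack("M*")
# raw => "caf=C3=A9=\n"      (soft break appended by pack)
encode_qp_sketch("caf\u00e9")
# => "caf=C3=A9"
encode_qp_sketch("trailing \nspace\n")
# => "trailing=20\nspace\n"  (space before the line break is escaped)
```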
#encode_quoted_printable! ⇒ Object
Performs encode_quoted_printable in place, and returns the string.
    # File 'lib/smail/mime/coding_extensions.rb', line 60
    def encode_quoted_printable!
      self.replace(self.encode_quoted_printable)
    end
#guess_mime_encoding ⇒ Object
Guesses whether this string is encoded in base64 or quoted-printable.
Returns either :base64 or :quoted_printable.
    # File 'lib/smail/mime/coding_extensions.rb', line 77
    def guess_mime_encoding
      # Grab the first line and have a guess?
      # A multiple of 4 and no characters that aren't in base64 ?
      # Need to allow for = at end of base64 string
      squashed = self.tr("\r\n\s", '').strip.sub(/=*\Z/, '')
      if squashed.length.remainder(4) == 0 && squashed.count("^A-Za-z0-9+/") == 0
        :base64
      else
        :quoted_printable
      end
      # or should we just try both and see what works?
    end
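The following sketch copies the heuristic into a standalone function so its behavior can be tried on a few inputs; guess_mime_encoding_sketch is a hypothetical name. Tracing the listing by hand suggests one consequence worth noting: because the "=" padding is stripped before the length check, a padded base64 string no longer has a length that is a multiple of 4 and is reported as quoted-printable. Passing the method explicitly to decode_mime avoids relying on the guess.

```ruby
# Hypothetical standalone copy of the heuristic (not part of smail).
def guess_mime_encoding_sketch(str)
  # Drop CR, LF and spaces, then any trailing "=" padding.
  squashed = str.tr("\r\n\s", '').strip.sub(/=*\Z/, '')
  if squashed.length.remainder(4) == 0 && squashed.count("^A-Za-z0-9+/") == 0
    :base64
  else
    :quoted_printable
  end
end

guess_mime_encoding_sketch("SGVsbG8h")    # => :base64 (8 characters, no padding)
guess_mime_encoding_sketch("caf=C3=A9")   # => :quoted_printable ("=" inside the text)
guess_mime_encoding_sketch("SGVsbG8=")    # => :quoted_printable (7 characters once
                                          #    the padding has been removed)
```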
#iconv(to_charset, from_charset) ⇒ Object
Converts this string to to_charset from from_charset using Iconv.
Because charsets are sometimes declared incorrectly on the 'net, we also allow for some known mislabelings and attempt to fix them up here. If conversion ultimately fails, we replace every character 0x80 and above with a ! symbol, effectively making the result a US-ASCII (and therefore valid UTF-8) string.
    # File 'lib/smail/mime/coding_extensions.rb', line 138
    def iconv(to_charset, from_charset)
      failed = false
      begin
        converted = Iconv.new(to_charset, from_charset).iconv(self)
      rescue Iconv::IllegalSequence
        case from_charset.downcase
        when 'us-ascii'
          # Some mailers do not send a charset when it should be CP1252,
          # the default Windows Latin charset
          begin
            converted = Iconv.new(to_charset, 'cp1252').iconv(self)
          rescue Iconv::IllegalSequence
            failed = true
          end
        when 'ks_c_5601-1987'
          # Microsoft products erroneously use this for what should be CP949
          # see http://tagunov.tripod.com/cjk.html
          begin
            converted = Iconv.new(to_charset, 'cp949').iconv(self)
          rescue Iconv::IllegalSequence, Iconv::InvalidCharacter
            failed = true
          end
        else
          failed = true
        end
      rescue Iconv::InvalidCharacter
        if self =~ /\n$/
          # Some messages can come in with a superfluous new line on the end,
          # which screws up the encoding. (ISO-2022-JP for example.)
          begin
            converted = Iconv.new(to_charset, 'iso-2022-jp').iconv(self.chomp) + "\n"
          rescue Iconv::InvalidCharacter
            converted = self.tr("\200-\377", "\041")
          end
        else
          converted = self.tr("\200-\377", "\041")
        end
      end
      if failed
        begin
          converted = Iconv.new(to_charset + '//IGNORE', from_charset).iconv(self)
        rescue Iconv::InvalidCharacter
          converted = self.tr("\200-\377", "\041")
        end
      end
      converted
    end
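Iconv was removed from Ruby's standard library in Ruby 2.0, so the listing above only runs on older interpreters or with the separate iconv gem. As a rough illustration of the same fallback idea on a current Ruby, the sketch below swaps in the built-in String#encode. convert_charset_sketch is a hypothetical name; only the us-ascii to CP1252 retry and the final "!" substitution are modeled, while the ks_c_5601-1987, ISO-2022-JP and //IGNORE branches are left out.

```ruby
# Hypothetical sketch using String#encode in place of Iconv (not part of smail).
def convert_charset_sketch(str, to_charset, from_charset)
  str.encode(to_charset, from_charset)
rescue EncodingError
  if from_charset.downcase == 'us-ascii'
    # Mirror the original's guess: undeclared text is often really CP1252.
    begin
      return str.encode(to_charset, 'Windows-1252')
    rescue EncodingError
      # fall through to the last resort below
    end
  end
  # Last resort: replace every byte 0x80 and above with "!".
  str.bytes.map { |b| b >= 0x80 ? '!' : b.chr }.join
end

# "\xE9" is "é" in CP1252 but is not valid US-ASCII.
convert_charset_sketch("caf\xE9".b, 'UTF-8', 'us-ascii')    # => "café"
# "é" cannot be represented in US-ASCII, so its two UTF-8 bytes become "!!".
convert_charset_sketch("caf\u00e9", 'US-ASCII', 'UTF-8')    # => "caf!!"
```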
#iconv!(to_charset, from_charset) ⇒ Object
Performs iconv in place, and returns the string.
    # File 'lib/smail/mime/coding_extensions.rb', line 188
    def iconv!(to_charset, from_charset)
      self.replace(self.iconv(to_charset, from_charset))
    end
#is_ascii? ⇒ Boolean
Returns true if the string contains only valid ASCII characters (i.e. nothing over ASCII 127).
    # File 'lib/smail/mime/coding_extensions.rb', line 15
    def is_ascii?
      self.length == self.tr("\200-\377", '').length
    end
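The listing treats the string as a sequence of bytes, which is how strings behaved on Ruby 1.8. On Ruby 1.9 and later the same question can be asked of the bytes directly, or with the built-in String#ascii_only?. The sketch below shows both; ascii_bytes_only? is a hypothetical helper name.

```ruby
# Hypothetical byte-level equivalent of the check (not part of smail).
def ascii_bytes_only?(str)
  str.bytes.all? { |b| b < 0x80 }
end

ascii_bytes_only?("plain text")   # => true
ascii_bytes_only?("caf\u00e9")    # => false

# Built-in alternative on Ruby 1.9+:
"plain text".ascii_only?          # => true
"caf\u00e9".ascii_only?           # => false
```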
#is_space? ⇒ Boolean
Returns true if the string consists entirely of whitespace. (The empty string will return false.)
    # File 'lib/smail/mime/coding_extensions.rb', line 9
    def is_space?
      return Regexp.new('\A\s+\Z', Regexp::MULTILINE).match(self) != nil
    end
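A quick check of the pattern against a few inputs, including the empty-string case mentioned above. The constant and the function name space_only? are hypothetical; the regular expression is the one from the listing.

```ruby
# Same pattern as the listing, built once instead of on every call.
WHITESPACE_ONLY = Regexp.new('\A\s+\Z', Regexp::MULTILINE)

# Hypothetical standalone version of is_space? (not part of smail).
def space_only?(str)
  !WHITESPACE_ONLY.match(str).nil?
end

space_only?(" \t\n")   # => true
space_only?("")        # => false, because \s+ needs at least one character
space_only?(" a ")     # => false
```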