Class: String

Inherits:
Object
Defined in:
lib/smail/mime/coding_extensions.rb

Overview

Extensions to the String library for encoding and decoding of MIME data.

Instance Method Details

#best_mime_encoding ⇒ Object

Returns the MIME encoding that is likely to produce the shortest encoded string, either :none, :base64, or :quoted_printable.



# File 'lib/smail/mime/coding_extensions.rb', line 92

def best_mime_encoding
  if self.is_ascii?
    :none
  elsif self.length > (self.mb_chars.length * 1.1)
    :base64
  else
    :quoted_printable
  end
end
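
The heuristic can be tried outside the library with a small sketch. The helper name below is hypothetical, and the sketch assumes a Ruby version where String#bytesize gives the byte length and String#length the character count, standing in for the length / mb_chars.length pair and the is_ascii? check in the listing.

```ruby
# Hypothetical standalone helper; not part of the library.
def pick_mime_encoding(text)
  return :none if text.ascii_only?
  # A byte count more than 10% above the character count suggests mostly
  # multibyte content, where base64 tends to be the shorter encoding.
  text.bytesize > text.length * 1.1 ? :base64 : :quoted_printable
end

pick_mime_encoding("plain text")                   # => :none
pick_mime_encoding("The caf\u00e9 is open today")  # => :quoted_printable
pick_mime_encoding("\u65e5\u672c\u8a9e")           # => :base64
```

Text with only an occasional accented character stays readable as quoted-printable, while text made mostly of multibyte characters goes to base64.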

#decode_base64 ⇒ Object

Returns this string decoded from base64 as defined in RFC2045, section 6.8.



# File 'lib/smail/mime/coding_extensions.rb', line 30

def decode_base64
  # This should simply be self.unpack("m*").first, but due to a bug in Ruby's
  # base64 decoder it only decodes input whose lines are multiples of 4
  # characters long. That is contrary to RFC2045, which says that all
  # characters other than the 65 in the base64 alphabet are to be ignored.
  # Currently we remove all such characters, but it might be better to follow
  # the RFC's advice and remove only line breaks and white space.
  self.tr("^A-Za-z0-9+/=", "").unpack("m*").first
end
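
A minimal sketch of the same filtering step, written as a hypothetical standalone helper so that it runs without the library. It uses String#delete, which removes every character outside the listed set when the set starts with ^.

```ruby
# Hypothetical helper mirroring the listing: drop everything outside the
# base64 alphabet (plus '=' padding) before handing the text to unpack.
def lenient_base64_decode(text)
  text.delete("^A-Za-z0-9+/=").unpack("m").first
end

lenient_base64_decode("SGVs\r\n bG8h")  # => "Hello!"
```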

#decode_base64! ⇒ Object

Performs decode_base64 in place, and returns the string.



# File 'lib/smail/mime/coding_extensions.rb', line 42

def decode_base64!
  self.replace(self.decode_base64)
end
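
The in-place variants all rely on String#replace, which swaps the contents of the receiver and returns that same object. A short sketch using only core Ruby (the variable names are illustrative):

```ruby
buffer = String.new("SGVsbG8h")

# replace mutates buffer and returns it, so callers holding a reference
# to the original string see the decoded contents.
same_object = buffer.replace(buffer.unpack("m").first)

same_object.equal?(buffer)  # => true
buffer                      # => "Hello!"
```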

#decode_mime(method = nil) ⇒ Object

Decodes this string according to method, where method is :base64, :quoted_printable, or :none.

If method is not supplied or is nil, guess_mime_encoding is used to try to pick an appropriate method.

The method can also be a string: 'q', 'quoted-printable', 'b', or 'base64'. This lets you pass in values taken directly from Content-Transfer-Encoding headers or from RFC2047 encoded words. Matching is case-insensitive, and any other value raises ArgumentError.



# File 'lib/smail/mime/coding_extensions.rb', line 111

def decode_mime(method = nil)
  method ||= guess_mime_encoding
  method = method.downcase if method.kind_of?(String)
  case method
    when :none
      self
    when :base64, 'b', 'base64'
      self.decode_base64
    when :quoted_printable, 'q', 'quoted-printable'
      self.decode_quoted_printable
    else
      raise ArgumentError, "Bad MIME encoding"
  end
end
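
To illustrate where the 'b' and 'q' strings come from, here is a sketch that splits a single RFC2047 encoded word and decodes its payload with core Ruby only. The constant and helper names are hypothetical. With the library loaded, the payload and encoding letter could presumably be passed to decode_mime instead.

```ruby
# Hypothetical pattern for one encoded word: =?charset?encoding?payload?=
ENCODED_WORD = /\A=\?([^?]+)\?([BbQq])\?([^?]*)\?=\z/

# Returns [charset, decoded_bytes], or nil if the input is not an encoded word.
def decode_encoded_word(word)
  match = ENCODED_WORD.match(word)
  return nil unless match

  charset, method, payload = match.captures
  bytes =
    case method.downcase
    when 'b' then payload.delete("^A-Za-z0-9+/=").unpack("m").first
    # In the RFC2047 "Q" encoding an underscore stands for a space.
    when 'q' then payload.tr("_", " ").unpack("M").first
    end
  [charset, bytes]
end

decode_encoded_word("=?UTF-8?B?SGVsbG8h?=")  # => ["UTF-8", "Hello!"]
```

The decoded bytes are still in the declared charset, so a charset conversion step would normally follow.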

#decode_mime!(method = nil) ⇒ Object

Performs decode_mime in place, and returns the string.



# File 'lib/smail/mime/coding_extensions.rb', line 127

def decode_mime!(method = nil)
  self.replace(self.decode_mime(method))
end

#decode_quoted_printable ⇒ Object

Returns this string decoded from quoted-printable as defined in RFC2045, section 6.7.



# File 'lib/smail/mime/coding_extensions.rb', line 65

def decode_quoted_printable
  self.unpack("M*").first
end

#decode_quoted_printable! ⇒ Object

Performs decode_quoted_printable in place, and returns the string.



# File 'lib/smail/mime/coding_extensions.rb', line 70

def decode_quoted_printable!
  self.replace(self.decode_quoted_printable)
end

#encode_base64 ⇒ Object

Returns this string encoded as base64 as defined in RFC2045, section 6.8.



# File 'lib/smail/mime/coding_extensions.rb', line 20

def encode_base64
  [self].pack("m*")
end
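
Array#pack with the "m" directive wraps its output: each 45 input bytes become one line of 60 encoded characters followed by a newline, which stays within the 76-character line limit of RFC2045. A short sketch using core Ruby only:

```ruby
encoded = ["x" * 100].pack("m")
lines = encoded.split("\n")

lines.map(&:length)                      # => [60, 60, 16]
encoded.unpack("m").first == "x" * 100   # => true
```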

#encode_base64! ⇒ Object

Performs encode_base64 in place, and returns the string.



# File 'lib/smail/mime/coding_extensions.rb', line 25

def encode_base64!
  self.replace(self.encode_base64)
end

#encode_quoted_printable ⇒ Object

Returns this string encoded as quoted-printable as defined in RFC2045, section 6.7.



# File 'lib/smail/mime/coding_extensions.rb', line 47

def encode_quoted_printable
  result = [self].pack("M*")
  # Ruby's quoted printable encoding uses soft line breaks to buffer spaces
  # at the end of lines, rather than encoding them with =20. We fix this.
  result.gsub!(/( +)=\n\n/) { "=20" * $1.length + "\n" }
  # Ruby's quoted printable encode puts a soft line break on the end of any
  # string that doesn't already end in a hard line break, so we have to
  # clean it up.
  result.gsub!(/=\n\Z/, '')
  result
end
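
The two clean-up steps are easier to follow next to the raw output of pack. The sketch below repeats the listing as a hypothetical standalone helper so that it runs without the library.

```ruby
# Hypothetical standalone copy of the listing above.
def qp_encode(text)
  result = [text].pack("M")
  # Turn "spaces, soft break, hard break" into =20 escapes plus the hard break.
  result.gsub!(/( +)=\n\n/) { "=20" * $1.length + "\n" }
  # Drop the soft break that pack appends when the input has no final newline.
  result.gsub!(/=\n\Z/, '')
  result
end

["abc"].pack("M")      # => "abc=\n"
qp_encode("abc")       # => "abc"
["abc  \n"].pack("M")  # => "abc  =\n\n"
qp_encode("abc  \n")   # => "abc=20=20\n"
```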

#encode_quoted_printable! ⇒ Object

Performs encode_quoted_printable in place, and returns the string.



# File 'lib/smail/mime/coding_extensions.rb', line 60

def encode_quoted_printable!
  self.replace(self.encode_quoted_printable)
end

#guess_mime_encoding ⇒ Object

Guesses whether this string is encoded in base64 or quoted-printable.

Returns either :base64 or :quoted_printable. It never returns :none.



# File 'lib/smail/mime/coding_extensions.rb', line 77

def guess_mime_encoding
  # Grab the first line and have a guess?
  # A multiple of 4 and no characters that aren't in base64 ?
  # Need to allow for = at end of base64 string
  squashed = self.tr("\r\n\s", '').strip.sub(/=*\Z/, '')
  if squashed.length.remainder(4) == 0 && squashed.count("^A-Za-z0-9+/") == 0
    :base64
  else
    :quoted_printable
  end
  # or should we just try both and see what works?
end
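
A standalone sketch of this kind of guess, written as a hypothetical helper. It is a variant rather than a copy of the listing: it measures the length before dropping the trailing = padding, so that padded input such as SGVsbG8= is still counted in groups of four.

```ruby
# Hypothetical helper; a variant of the heuristic, not the library's method.
def guess_transfer_encoding(text)
  squashed = text.delete("\r\n\t ")       # ignore line breaks and spaces
  body = squashed.sub(/=*\z/, '')         # payload without trailing padding
  if squashed.length % 4 == 0 && body.count("^A-Za-z0-9+/") == 0
    :base64
  else
    :quoted_printable
  end
end

guess_transfer_encoding("SGVsbG8=")        # => :base64
guess_transfer_encoding("caf=E9 au lait")  # => :quoted_printable
```

Any such guess is inherently ambiguous: a short plain word whose length is a multiple of four is indistinguishable from base64, so an explicit encoding from the message headers is preferable when available.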

#iconv(to_charset, from_charset) ⇒ Object

Converts this string to to_charset from from_charset using Iconv.

Because charsets are sometimes declared incorrectly on the net, we allow for the known cases and attempt to fix them up here. If conversion ultimately fails, we replace every byte 0x80 and above with a ! symbol, effectively making the result a US-ASCII (and therefore valid UTF-8) string.



# File 'lib/smail/mime/coding_extensions.rb', line 138

def iconv(to_charset, from_charset)
  failed = false
  begin
    converted = Iconv.new(to_charset, from_charset).iconv(self)
  rescue Iconv::IllegalSequence
    case from_charset.downcase
      when 'us-ascii'
        # Some mailers do not send a charset when it should be CP1252,
        # the default Windows Latin charset
        begin
          converted = Iconv.new(to_charset, 'cp1252').iconv(self)
        rescue Iconv::IllegalSequence
          failed = true
        end
      when 'ks_c_5601-1987'
        # Microsoft products erroneously use this for what should be CP949
        # see http://tagunov.tripod.com/cjk.html
        begin
          converted = Iconv.new(to_charset, 'cp949').iconv(self)
        rescue Iconv::IllegalSequence, Iconv::InvalidCharacter
          failed = true
        end
      else
        failed = true
    end
  rescue Iconv::InvalidCharacter
    if self =~ /\n$/
      # Some messages can come in with a superfluous new line on the end,
      # which screws up the encoding. (ISO-2022-JP for example.)
      begin
        converted = Iconv.new(to_charset, 'iso-2022-jp').iconv(self.chomp) + "\n"
      rescue Iconv::InvalidCharacter
        converted = self.tr("\200-\377", "\041")
      end
    else
      converted = self.tr("\200-\377", "\041")
    end
  end

  if failed
    begin
      converted = Iconv.new(to_charset + '//IGNORE', from_charset).iconv(self)
    rescue Iconv::InvalidCharacter
      converted = self.tr("\200-\377", "\041")
    end
  end

  converted
end
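
Iconv is no longer part of the Ruby standard library (it was removed in Ruby 2.0), so the listing will not run as-is on current interpreters. The following is a rough sketch of the same fallback idea using String#encode instead. The helper name and the fallback table are assumptions made for illustration, and the sketch does not reproduce the ISO-2022-JP trailing-newline handling or the //IGNORE retry.

```ruby
# Hypothetical table of commonly mislabelled charsets (an assumption based
# on the two cases handled in the listing).
FALLBACK_CHARSETS = {
  'us-ascii'       => 'Windows-1252',
  'ks_c_5601-1987' => 'CP949'
}.freeze

# Hypothetical helper; tries the declared charset, then a known fallback,
# then replaces every byte >= 0x80 with "!".
def convert_charset(text, to_charset, from_charset)
  candidates = [from_charset, FALLBACK_CHARSETS[from_charset.downcase]].compact
  candidates.each do |charset|
    begin
      candidate = text.dup.force_encoding(charset)
      return candidate.encode(to_charset) if candidate.valid_encoding?
    rescue EncodingError, ArgumentError
      # Unknown charset name or unconvertible data: try the next candidate.
      next
    end
  end
  text.bytes.map { |byte| byte < 0x80 ? byte : 0x21 }.pack('C*').force_encoding('US-ASCII')
end

convert_charset("caf\xE9".b, 'UTF-8', 'us-ascii')   # => "café"
convert_charset("caf\xE9".b, 'UTF-8', 'x-unknown')  # => "caf!"
```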

#iconv!(to_charset, from_charset) ⇒ Object



Performs iconv in place, and returns the string.

# File 'lib/smail/mime/coding_extensions.rb', line 188

def iconv!(to_charset, from_charset)
  self.replace(self.iconv(to_charset, from_charset))
end

#is_ascii? ⇒ Boolean

Returns true if the string contains only 7-bit ASCII characters (i.e. no byte above 127).

Returns:

  • (Boolean)


# File 'lib/smail/mime/coding_extensions.rb', line 15

def is_ascii?
  self.length == self.tr("\200-\377", '').length
end
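
The same check can be written as a byte-level test. The helper name below is hypothetical. On Ruby 1.9 and later, the built-in String#ascii_only? answers the same question.

```ruby
# Hypothetical helper: true when no byte has its high bit set.
def seven_bit?(text)
  text.bytes.all? { |byte| byte < 0x80 }
end

seven_bit?("hello")      # => true
seven_bit?("caf\u00e9")  # => false
"hello".ascii_only?      # => true
```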

#is_space? ⇒ Boolean

Returns true if the string consists entirely of whitespace. (The empty string will return false.)

Returns:

  • (Boolean)


# File 'lib/smail/mime/coding_extensions.rb', line 9

def is_space?
  return Regexp.new('\A\s+\Z', Regexp::MULTILINE).match(self) != nil
end
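
A short sketch of the edge cases, using a hypothetical standalone helper with an equivalent pattern. Because \s already matches line breaks, a string made only of newlines and tabs counts as whitespace, while the + quantifier is what excludes the empty string.

```ruby
# Hypothetical helper equivalent to the listing above.
def all_whitespace?(text)
  !text.match(/\A\s+\z/).nil?
end

all_whitespace?(" \t\r\n")  # => true
all_whitespace?("")         # => false
all_whitespace?(" a ")      # => false
```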