Method: Parser::Source::Buffer.recognize_encoding

Defined in:
lib/parser/source/buffer.rb

.recognize_encoding(string) ⇒ Encoding?

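Tries to recognize the encoding of string as Ruby would, i.e. by looking for a UTF-8 BOM or a magic encoding comment. The magic comment is read from the first line, or from the second line when the first line is a shebang (#!). string can be in any encoding.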

Parameters:

  • string (String)

Returns:

  • (Encoding, nil)

    the recognized encoding, or nil if string is empty or contains neither a BOM nor a magic comment

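Raises:

  • (Parser::UnknownEncodingInMagicComment)

    if the magic comment names an encoding that Encoding.find does not know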

# File 'lib/parser/source/buffer.rb', line 52

def self.recognize_encoding(string)
  return if string.empty?

  # extract the first two lines in an efficient way
  string =~ /\A(.*)\n?(.*\n)?/
  first_line, second_line = $1, $2

  if first_line.start_with?("\xef\xbb\xbf".freeze) # BOM
    return Encoding::UTF_8
  elsif first_line[0, 2] == '#!'.freeze
    encoding_line = second_line
  else
    encoding_line = first_line
  end

  return nil if encoding_line.nil? || encoding_line[0] != '#'

  if (result = ENCODING_RE.match(encoding_line))
    begin
      Encoding.find(result[3] || result[4] || result[6])
    rescue ArgumentError => e
      raise Parser::UnknownEncodingInMagicComment, e.message
    end
  else
    nil
  end
end
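
The detection order in the listing above (BOM first, then an optional shebang, then the magic comment) can be tried out without the parser gem using the rough sketch below. It is not the library's implementation: SKETCH_ENCODING_RE is a hypothetical, simplified stand-in for ENCODING_RE (whose definition is not shown on this page), and SketchUnknownEncoding is a hypothetical stand-in for Parser::UnknownEncodingInMagicComment. The sketch also assumes it is acceptable to scan a binary copy of the input.

```ruby
# Hypothetical, simplified stand-in for Parser's ENCODING_RE (not shown above).
SKETCH_ENCODING_RE = /\A#.*?(?:en)?coding[:=]\s*([A-Za-z0-9_.-]+)/

# Hypothetical error class, standing in for Parser::UnknownEncodingInMagicComment.
class SketchUnknownEncoding < ArgumentError; end

def sketch_recognize_encoding(string)
  return nil if string.empty?

  # Work on a binary copy so that input in any encoding can be scanned safely.
  first_line, second_line = string.b.lines.first(2)

  # 1. A UTF-8 BOM decides the encoding outright.
  return Encoding::UTF_8 if first_line.start_with?("\xEF\xBB\xBF".b)

  # 2. Skip a shebang line, if present.
  candidate = first_line.start_with?('#!') ? second_line : first_line
  return nil if candidate.nil? || !candidate.start_with?('#')

  # 3. Look for a magic comment and resolve its name to an Encoding.
  match = SKETCH_ENCODING_RE.match(candidate)
  return nil unless match

  begin
    Encoding.find(match[1])
  rescue ArgumentError => e
    # Translate the lookup failure into a domain-specific error.
    raise SketchUnknownEncoding, e.message
  end
end

p sketch_recognize_encoding("# coding: utf-8\nputs 1\n")
p sketch_recognize_encoding("#!/usr/bin/env ruby\n# encoding: koi8-r\n")
p sketch_recognize_encoding("puts 1\n")
```

The three calls cover a magic comment on the first line, a magic comment after a shebang, and a source with no encoding hint at all. Note that the value returned on success is an Encoding object rather than a name string.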