Module: LuckySneaks::Unidecoder

Defined in:
lib/lucky_sneaks/unidecoder.rb

Constant Summary

CODEPOINTS =

Maps each group of Unicode codepoints to its transliteration table, loading each group from its YAML file on first access and caching it in the hash

Hash.new { |h, k|
  h[k] = YAML::load_file(File.join(File.dirname(__FILE__), "unidecoder_data", "#{k}.yml"))
}
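The lazy-loading behavior comes from the block passed to Hash.new, which runs only when a key is missing and stores its result. The sketch below illustrates that pattern in isolation; `fake_loader` is a hypothetical stand-in for YAML::load_file and is not part of the library.

```ruby
# Record every "load" so we can see how often the default block runs.
loads = []

# Hypothetical stand-in for YAML::load_file (an assumption for illustration).
fake_loader = lambda do |group|
  loads << group
  ["zero", "one", "two"]
end

# Same shape as CODEPOINTS: the block fires on a missing key and caches the result.
table = Hash.new { |h, k| h[k] = fake_loader.call(k) }

first  = table["x00"][1] # first access triggers the load
second = table["x00"][2] # second access is served from the cached entry

puts first
puts second
puts loads.inspect
```

Because the block assigns `h[k]`, each group is read at most once per process, which keeps startup cheap when only a few codepoint ranges are ever used.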

Class Method Summary

Class Method Details

.decode(string) ⇒ Object

Returns string with each non-ASCII UTF-8 character transliterated to an ASCII equivalent; a character with no available transliteration is replaced by "?"

You’re probably better off just using the added String#to_ascii



# File 'lib/lucky_sneaks/unidecoder.rb', line 14

def decode(string)
  string.gsub(/[^\x00-\x7f]/u) do |codepoint|
    unpacked = codepoint.unpack("U")[0]
    begin
      CODEPOINTS[code_group(unpacked)][grouped_point(unpacked)]
    rescue
      # Hopefully this won't come up much
      "?"
    end
  end
end
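decode relies on two helpers, code_group and grouped_point, whose bodies are not shown on this page. The sketch below is one plausible reading, assuming each data file covers a block of 256 codepoints and is named after the high byte in hex (for example "x00"). Both helper bodies, the `decode_with` method, and the in-memory `table` are assumptions made for illustration, not the library's confirmed implementation.

```ruby
# Assumed: the group name is the codepoint's high byte, e.g. 0x4E2D -> "x4e".
def code_group(unpacked)
  "x%02x" % (unpacked >> 8)
end

# Assumed: the index within a group is the codepoint's low byte.
def grouped_point(unpacked)
  unpacked & 0xff
end

# Hypothetical variant of decode that takes the table as an argument,
# so it can run without the YAML data files.
def decode_with(table, string)
  string.gsub(/[^\x00-\x7f]/u) do |char|
    unpacked = char.unpack("U")[0]
    group = table[code_group(unpacked)]
    # Fall back to "?" when the group or the entry is missing.
    (group && group[grouped_point(unpacked)]) || "?"
  end
end

# Tiny in-memory table: only U+00E9 has a transliteration.
table = { "x00" => [] }
table["x00"][0xe9] = "e"

result = decode_with(table, "caf\u00e9 \u4e2d")
puts result
```

Here U+00E9 is found in group "x00" at index 233, while U+4E2D falls into group "x4e", which this toy table does not contain, so it becomes "?".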

.encode(codepoint) ⇒ Object

Returns the character for the given Unicode codepoint, where the codepoint is written as hexadecimal digits (for example "e9" for U+00E9)



# File 'lib/lucky_sneaks/unidecoder.rb', line 27

def encode(codepoint)
  ["0x#{codepoint}".to_i(16)].pack("U")
end
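A short usage sketch may help here, since the argument is interpolated after "0x" and parsed in base 16. The method body is copied from the listing above so the example runs on its own; the sample inputs are illustrative.

```ruby
# Copied from the listing above so this example is self-contained.
def encode(codepoint)
  ["0x#{codepoint}".to_i(16)].pack("U")
end

e_acute = encode("e9")   # U+00E9
zhong   = encode("4e2d") # U+4E2D

# An Integer argument is also read as hex digits: 41 becomes 0x41, i.e. "A".
from_int = encode(41)

puts e_acute
puts zhong
puts from_int
```

Note the last case: passing the decimal integer 41 yields "A" (U+0041), not the character with decimal codepoint 41, so it is safest to pass hex strings.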

.in_yaml_file(character) ⇒ Object

Returns a string naming the YAML data file, and the line within that file, that holds the transliteration value for the character



# File 'lib/lucky_sneaks/unidecoder.rb', line 33

def in_yaml_file(character)
  unpacked = character.unpack("U")[0]
  "#{code_group(unpacked)}.yml (line #{grouped_point(unpacked) + 2})"
end
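The "+ 2" offset is consistent with a data file that starts with a YAML document marker and then lists one entry per line: one line for the "---" header, plus one to convert a 0-based index into a 1-based line number. The sketch below checks that arithmetic against a small array dumped with Ruby's standard YAML library; that the real data files follow this layout is an assumption.

```ruby
require "yaml"

# A toy group with three entries (illustrative data, not from the library).
entries = ["a", "b", "c"]

# Dumping an array yields a "---" header line followed by one "- value" line per entry.
lines = entries.to_yaml.lines

index = 1                       # 0-based position of "b" in the array
line_number = index + 2         # 1-based line in the dumped file
line_text = lines[line_number - 1]

puts lines.first.inspect
puts "entry #{index} is on line #{line_number}: #{line_text.strip}"
```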