Class: LOLspeak::Tranzlator

Inherits:
Object
Defined in:
lib/lolspeak.rb

Overview

A class to perform English to LOLspeak translation based on a dictionary of words.

Constructor Details

#initialize(dictionary) ⇒ Tranzlator

Creates a Tranzlator from the given dictionary

:call-seq:

initialize(dictionary) -> Tranzlator


# File 'lib/lolspeak.rb', line 44

def initialize(dictionary)
  @dictionary = dictionary
  @traced_words = {}
  @try_heuristics = false
  @translated_heuristics = {}
  @heuristics_exclude = Set.new
end

Instance Attribute Details

#heuristics_exclude ⇒ Object

(Set) Words to exclude if heuristics are on.



# File 'lib/lolspeak.rb', line 25

def heuristics_exclude
  @heuristics_exclude
end

#trace ⇒ Object

(bool -> false) Whether or not to record translations.



# File 'lib/lolspeak.rb', line 15

def trace
  @trace
end

#traced_words ⇒ Object (readonly)

(Hash) Stores all translations, if trace is true.



# File 'lib/lolspeak.rb', line 20

def traced_words
  @traced_words
end

#translated_heuristics ⇒ Object (readonly)

(Hash) Stores all words translated via heuristics, if try_heuristics is true.



# File 'lib/lolspeak.rb', line 23

def translated_heuristics
  @translated_heuristics
end

#try_heuristics ⇒ Object

(bool -> false) If true, try heuristics when translating words. If false, only use the dictionary for translation.



# File 'lib/lolspeak.rb', line 18

def try_heuristics
  @try_heuristics
end

Class Method Details

.from_file(file) ⇒ Object

Creates a Tranzlator using a dictionary from a YAML file

:call-seq:

Tranzlator.from_file(file)        -> Tranzlator


# File 'lib/lolspeak.rb', line 33

def from_file(file)
  dictionary = YAML::load_file(file)
  return Tranzlator.new(dictionary)
end
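As a rough illustration of what from_file consumes, the sketch below writes a small YAML mapping to a temporary file and loads it with YAML.load_file, which is the call shown in the listing above. The two dictionary entries and the temp-file prefix are made up for the example; the flat word-to-translation layout is an assumption based on how translate_word indexes @dictionary.

```ruby
require "yaml"
require "tempfile"

# Hypothetical dictionary contents: one "english: lolspeak" pair per line.
yaml_source = "hi: oh hai\ncat: kitteh\n"

# Write the YAML to a temporary file and load it back as a Hash,
# the same way Tranzlator.from_file loads its dictionary.
dictionary = Tempfile.create(["lolspeak", ".yml"]) do |file|
  file.write(yaml_source)
  file.flush
  YAML.load_file(file.path)
end

puts dictionary.inspect
```

The resulting Hash is what would be handed to Tranzlator.new. Note that Ruby's YAML parser reads unquoted keys such as no or yes as booleans, so dictionary entries for those words would need to be quoted.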

Instance Method Details

#clear_traced_words ⇒ Object

Clears the trace word hash



# File 'lib/lolspeak.rb', line 122

def clear_traced_words
  @traced_words = {}
end

#clear_translated_heuristics ⇒ Object

Clears the hash storing words translated by heuristics



# File 'lib/lolspeak.rb', line 127

def clear_translated_heuristics
  @translated_heuristics = {}
end

#translate_word(word, &filter) ⇒ Object

Translates a single word into LOLspeak. By default, the result is in all lower case:

translator.translate_word("Hi") -> "oh hai"

If a block is given the word may be transformed. You could use this to upper case or XML encode the result. This example upper cases the result:

translator.translate_word("hi") { |w| w.upcase } -> "OH HAI"

If heuristics are off, then only words in the dictionary are translated. If heuristics are on, then words not in the dictionary may be translated using standard LOLspeak heuristics, such as “*tion” -> “*shun”.
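A minimal standalone sketch of the dictionary lookup and optional block described above. The lookup helper and the one-entry dictionary are hypothetical stand-ins, not part of the library; heuristics and apostrophe handling are left out.

```ruby
# Hypothetical dictionary; the real one is supplied to Tranzlator.new.
DICTIONARY = { "hi" => "oh hai" }.freeze

# Look the lowercased word up, fall back to the lowercased word itself,
# then pass the result through the block if one was given.
def lookup(word)
  lowered = word.downcase
  lol_word = DICTIONARY.fetch(lowered, lowered)
  block_given? ? yield(lol_word) : lol_word
end

puts lookup("Hi")                   # dictionary hit
puts lookup("hi") { |w| w.upcase }  # block transforms the result
puts lookup("Dog")                  # unknown word comes back lowercased
```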

:call-seq:

translate_word(word)                       -> String
translate_word(word) { |word| transform }  -> String


# File 'lib/lolspeak.rb', line 71

def translate_word(word, &filter)
  word = word.downcase
  lol_word = @dictionary[word]
  if lol_word.nil?
    lol_word = @dictionary[word.gsub("’", "'")]
  end
  
  if lol_word.nil? and word.match(/(.*)([\’\']\w+)$/)
    prefix, suffix = $1, $2
    lol_word = @dictionary[prefix]
    lol_word += suffix if !lol_word.nil?
  end
  
  if lol_word.nil? and @try_heuristics and !@heuristics_exclude.member?(word)
    if (word =~ /(.*)tion(s?)$/)
      lol_word = "#{$1}shun#{$2}"
    elsif (word =~ /(.*)ed$/)
      lol_word = "#{$1}d"
    elsif (word =~ /(.*)ing$/)
      lol_word = "#{$1}in"
    elsif (word =~ /(.*)ss$/)
      lol_word = "#{$1}s"
    elsif (word =~ /(.*)er$/)
      lol_word = "#{$1}r"
    elsif (word !~ /ous$/) and (word =~ /^([0-9A-Za-z_]+)s$/)
      lol_word = "#{$1}z"
    end
    if (word =~ /ph/)
      lol_word = word.dup if lol_word.nil?
      lol_word.gsub!(/ph/, 'f')
    end

    if !lol_word.nil?
      @translated_heuristics[word] = lol_word
    end
  end

  if lol_word.nil?
    lol_word = word
  else
    @traced_words[word] = lol_word
  end
  
  if !filter.nil?
    lol_word = filter.call(lol_word)
  end

  return lol_word
end
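The suffix heuristics in the listing above can be tried in isolation. The helper below, heuristic_guess, is a hypothetical name introduced for this sketch; it mirrors the rule order of the listing but skips the dictionary, the exclusion set, and the bookkeeping hashes.

```ruby
# Standalone sketch of the suffix rules; returns nil when no rule applies.
def heuristic_guess(word)
  guess =
    if word =~ /(.*)tion(s?)$/
      "#{$1}shun#{$2}"
    elsif word =~ /(.*)ed$/
      "#{$1}d"
    elsif word =~ /(.*)ing$/
      "#{$1}in"
    elsif word =~ /(.*)ss$/
      "#{$1}s"
    elsif word =~ /(.*)er$/
      "#{$1}r"
    elsif word !~ /ous$/ && word =~ /^([0-9A-Za-z_]+)s$/
      "#{$1}z"
    end

  # The "ph" -> "f" rule applies on top of any suffix rule.
  guess = (guess || word).gsub("ph", "f") if word.include?("ph")
  guess
end

%w[nation walked running photos famous].each do |word|
  puts "#{word} -> #{heuristic_guess(word).inspect}"
end
```

Only the first matching suffix rule is used, so rule order matters: "stations" is caught by the "tion" rule before the plural "s" rule is ever considered.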

#translate_words(words, &filter) ⇒ Object

Translates all the words in a string. If a block is given, it is called to transform each individual word.

:call-seq:

translate_words(words)                       -> String
translate_words(words) { |word| transform }  -> String


# File 'lib/lolspeak.rb', line 138

def translate_words(words, &filter)
  lol_words = words.gsub(/(\w[\w’\']*)(\s*)/) do
    word, space = $1, $2
    lol_word = translate_word(word, &filter)

    # Stick the space back on, as long as it's not empty
    lol_word += space if lol_word != ""
    lol_word
  end
  return lol_words
end
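To see how the regular expression in the listing above splits a string, the sketch below reuses the same pattern with a stand-in for translate_word. The translate_each_word helper and the two-entry mapping are hypothetical and exist only for this example.

```ruby
# Made-up mapping standing in for the real dictionary.
SAMPLE = { "hi" => "oh hai", "cat" => "kitteh" }.freeze

# Same pattern as translate_words: a word (letters, digits, apostrophes)
# followed by any trailing whitespace. Punctuation is left untouched.
def translate_each_word(text)
  text.gsub(/(\w[\w’']*)(\s*)/) do
    word, space = $1, $2
    result = yield(word)
    # Re-attach the whitespace unless the word was translated to nothing.
    result.empty? ? result : result + space
  end
end

output = translate_each_word("Hi,  cat!") { |w| SAMPLE.fetch(w.downcase, w.downcase) }
puts output
```

Because whitespace is captured together with the preceding word, a word that translates to the empty string takes its trailing space with it.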

#translate_xml_element!(xml_element, &filter) ⇒ Object

Translates the REXML::Text parts of a single REXML::Element. The element is modified in place.

If a block is given, it is called to transform each individual word. By default, each word is XML escaped, so this transform applies on top of that.

:call-seq:

translate_xml_element!(xml_element)
translate_xml_element!(xml_element) { |word| transform }


# File 'lib/lolspeak.rb', line 161

def translate_xml_element!(xml_element, &filter)
  xml_element.texts.each do |text|
    string = REXML::Text::unnormalize(text.to_s)
    string = self.translate_words(string) do |w|
      w = REXML::Text::normalize(w)
      w = filter.call(w) if !filter.nil?
      w
    end
    new_text = REXML::Text.new(string, true, nil, true)
    text.replace_with(new_text)
  end
end
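The listing above unescapes each text node before translating and re-escapes each word afterwards. A small sketch of that round trip, using the same REXML::Text.unnormalize and REXML::Text.normalize calls; the sample string is made up for the example.

```ruby
require "rexml/document"

# Escaped text as it would appear inside an XML document.
escaped = "Tom &amp; Jerry"

# unnormalize turns entity references back into plain characters...
plain = REXML::Text.unnormalize(escaped)

# ...and normalize escapes them again for output.
round_trip = REXML::Text.normalize(plain)

puts plain
puts round_trip
```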

#translate_xml_element_recursive!(xml_element, &filter) ⇒ Object

Translates the REXML::Text parts of an REXML::Element and all child elements. The elements are modified in place.

If a block is given, it is called to transform each individual word. By default, each word is XML escaped, so this transform applies on top of that.

:call-seq:

translate_xml_element_recursive!(xml_element)
translate_xml_element_recursive!(xml_element) { |word| transform }


# File 'lib/lolspeak.rb', line 185

def translate_xml_element_recursive!(xml_element, &filter)
  xml_element.each_recursive { |e| translate_xml_element!(e, &filter) }
end

#translate_xml_string(xml_string, &filter) ⇒ Object

Translates the text parts of a well-formed XML string. It parses the string using REXML and then translates the root element using translate_xml_element_recursive!.

If a block is given, it is called to transform each individual word.

:call-seq:

translate_xml_string(xml_string)                      -> String
translate_xml_string(xml_string) { |word| transform } -> String


# File 'lib/lolspeak.rb', line 199

def translate_xml_string(xml_string, &filter)
  xml_doc = REXML::Document.new xml_string
  translate_xml_element_recursive!(xml_doc, &filter)
  return xml_doc.to_s
end