Module: Prawn::Rtl::Connector::Logic
- Defined in:
- lib/prawn/rtl/connector/logic.rb
Overview
Handles the logic for Arabic letter connection and contextual form selection.
This module implements the core algorithm for determining which form (isolated, initial, medial, or final) an Arabic character should take based on its surrounding characters. It maintains a mapping of Arabic Unicode characters to their various contextual forms.
Defined Under Namespace
Classes: CharacterInfo
Constant Summary
- @@charinfos = nil
Class Method Summary
-
.add(common, isolated, final, initial, medial, connects, diacritic = false) ⇒ Object
private
Adds a character and its contextual forms to the character mapping.
-
.charinfos ⇒ Hash{String => CharacterInfo}
private
Returns the character information mapping for Arabic characters.
-
.determine_form(previous_previous_char, previous_char, next_char, next_next_char) ⇒ Symbol
Determines the contextual form of an Arabic character.
-
.transform(str) ⇒ String
Transforms Arabic text by applying contextual letter forms.
Class Method Details
.add(common, isolated, final, initial, medial, connects, diacritic = false) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Adds a character and its contextual forms to the character mapping.
# File 'lib/prawn/rtl/connector/logic.rb', line 197

def self.add(common, isolated, final, initial, medial, connects, diacritic = false)
  charinfo = CharacterInfo.new(
    [common.hex].pack('U'),
    [isolated.hex].pack('U'),
    [final.hex].pack('U'),
    [initial.hex].pack('U'),
    [medial.hex].pack('U'),
    connects,
    diacritic
  )
  @@charinfos[charinfo.common] = charinfo
end
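Each form is passed to .add as a hexadecimal code point string and converted to a character with Array#pack and the 'U' directive. A minimal sketch of that conversion in isolation; codepoint_to_char is a hypothetical helper name used only for illustration and is not part of the library:

```ruby
# Hypothetical helper (not part of the library): converts a hexadecimal
# code point string such as '0628' into the corresponding UTF-8 character.
def codepoint_to_char(hex_string)
  # String#hex parses the hex digits into an Integer;
  # Array#pack('U') encodes that Integer as a UTF-8 character.
  [hex_string.hex].pack('U')
end

ba_common  = codepoint_to_char('0628') # ARABIC LETTER BEH
ba_initial = codepoint_to_char('fe91') # ARABIC LETTER BEH INITIAL FORM

puts ba_common.encoding          # encoding of the packed string
puts ba_initial.unpack('U*').inspect # code points as integers
```

The same conversion is applied to all five form arguments, so the table is keyed and populated with real characters rather than hex strings.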
.charinfos ⇒ Hash{String => CharacterInfo}
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Returns the character information mapping for Arabic characters.
Lazily initializes and returns a hash mapping Arabic Unicode characters to their CharacterInfo objects containing contextual forms.
# File 'lib/prawn/rtl/connector/logic.rb', line 134

def self.charinfos
  return @@charinfos unless @@charinfos.nil?

  @@charinfos = {}
  add('0627', 'fe8d', 'fe8e', 'fe8d', 'fe8e', false) # Alef
  add('0628', 'fe8f', 'fe90', 'fe91', 'fe92', true)  # Ba2
  add('062a', 'fe95', 'fe96', 'fe97', 'fe98', true)  # Ta2
  add('062b', 'fe99', 'fe9a', 'fe9b', 'fe9c', true)  # Tha2
  add('062c', 'fe9d', 'fe9e', 'fe9f', 'fea0', true)  # Jeem
  add('062d', 'fea1', 'fea2', 'fea3', 'fea4', true)  # 7a2
  add('062e', 'fea5', 'fea6', 'fea7', 'fea8', true)  # 7'a2
  add('062f', 'fea9', 'feaa', 'fea9', 'feaa', false) # Dal
  add('0630', 'feab', 'feac', 'feab', 'feac', false) # Thal
  add('0631', 'fead', 'feae', 'fead', 'feae', false) # Ra2
  add('0632', 'feaf', 'feb0', 'feaf', 'feb0', false) # Zain
  add('0633', 'feb1', 'feb2', 'feb3', 'feb4', true)  # Seen
  add('0634', 'feb5', 'feb6', 'feb7', 'feb8', true)  # Sheen
  add('0635', 'feb9', 'feba', 'febb', 'febc', true)  # 9ad
  add('0636', 'febd', 'febe', 'febf', 'fec0', true)  # 9'ad
  add('0637', 'fec1', 'fec2', 'fec3', 'fec4', true)  # 6a2
  add('0638', 'fec5', 'fec6', 'fec7', 'fec8', true)  # 6'a2
  add('0639', 'fec9', 'feca', 'fecb', 'fecc', true)  # 3ain
  add('063a', 'fecd', 'fece', 'fecf', 'fed0', true)  # 3'ain
  add('0641', 'fed1', 'fed2', 'fed3', 'fed4', true)  # Fa2
  add('0642', 'fed5', 'fed6', 'fed7', 'fed8', true)  # Qaf
  add('0643', 'fed9', 'feda', 'fedb', 'fedc', true)  # Kaf
  add('0644', 'fedd', 'fede', 'fedf', 'fee0', true)  # Lam
  add('0645', 'fee1', 'fee2', 'fee3', 'fee4', true)  # Meem
  add('0646', 'fee5', 'fee6', 'fee7', 'fee8', true)  # Noon
  add('0647', 'fee9', 'feea', 'feeb', 'feec', true)  # Ha2
  add('0648', 'feed', 'feee', 'feed', 'feee', false) # Waw
  add('064a', 'fef1', 'fef2', 'fef3', 'fef4', true)  # Ya2
  add('0621', 'fe80', 'fe80', 'fe80', 'fe80', false) # Hamza
  add('0622', 'fe81', 'fe82', 'fe81', 'fe82', false) # Alef Madda
  add('0623', 'fe83', 'fe84', 'fe83', 'fe84', false) # Alef Hamza Above
  add('0624', 'fe85', 'fe86', 'fe85', 'fe86', false) # Waw Hamza
  add('0625', 'fe87', 'fe88', 'fe87', 'fe88', false) # Alef Hamza Below
  add('0626', 'fe89', 'fe8a', 'fe8b', 'fe8c', true)  # Ya2 Hamza
  add('0629', 'fe93', 'fe94', 'fe93', 'fe94', false) # Ta2 Marbu6a
  add('0640', '0640', '0640', '0640', '0640', true)  # Tatweel
  add('0649', 'feef', 'fef0', 'feef', 'fef0', false) # Alef Layyina

  add('0651', 'fe7c', 'fe7c', 'fe7c', 'fe7d', false, true) # Shadda
  add('0652', 'fe7e', 'fe7e', 'fe7e', 'fe7f', false, true) # Sukun
  add('064e', 'fe76', 'fe76', 'fe76', 'fe77', false, true) # Fatha
  add('0650', 'fe7a', 'fe7a', 'fe7a', 'fe7b', false, true) # Kasra
  add('064f', 'fe78', 'fe78', 'fe78', 'fe79', false, true) # Damma
  add('0653', '0653', '0653', '0653', '0653', false, true) # Madda
  add('064b', 'fe79', 'fe70', 'fe70', 'fe71', false, true) # Fathatan
  add('064d', 'fe74', 'fe74', 'fe74', 'fe74', false, true) # Kasratan
  add('064c', 'fe72', 'fe72', 'fe72', 'fe72', false, true) # Dammatan

  @@charinfos
end
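The lazy initialization and the shape of the resulting hash can be illustrated with a much smaller table. This is a sketch under stated assumptions, not the library's implementation: Entry is a hypothetical Struct standing in for CharacterInfo (whose real interface may differ), FormTable is a hypothetical module name, and a module-level instance variable with `||=` replaces the class variable and explicit nil check.

```ruby
# Hypothetical stand-in for CharacterInfo; the real class may differ.
Entry = Struct.new(:common, :isolated, :final, :initial, :medial,
                   :connects, :diacritic)

# Hypothetical module illustrating a lazily built lookup table.
module FormTable
  # Two rows copied from the table above: code points for the common,
  # isolated, final, initial and medial forms, then the connects flag.
  ROWS = [
    ['0627', 'fe8d', 'fe8e', 'fe8d', 'fe8e', false], # Alef
    ['0628', 'fe8f', 'fe90', 'fe91', 'fe92', true]   # Ba2
  ].freeze

  # Builds the table on first access and returns the same Hash afterwards.
  def self.entries
    @entries ||= build
  end

  def self.build
    ROWS.each_with_object({}) do |row, table|
      *codes, connects = row
      chars = codes.map { |code| [code.hex].pack('U') }
      # Keyed by the common (unshaped) character.
      table[chars.first] = Entry.new(*chars, connects, false)
    end
  end
end

ba = FormTable.entries["\u0628"]
puts ba.connects # whether Ba2 connects to the following letter
```

Keying the hash by the common character means a lookup doubles as the "is this character handled?" test, which is how the methods below use it.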
.determine_form(previous_previous_char, previous_char, next_char, next_next_char) ⇒ Symbol
Determines the contextual form of an Arabic character.
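Determines the form of the current character (:isolated, :initial, :medial, or :final) from its neighbors. If the previous or next character is a diacritic, it is skipped and the character one position further away (previous_previous_char or next_next_char) is used instead. In Arabic, all characters can connect with a previous character, but not all letters can connect with the next character (this is determined by CharacterInfo#connects?). A neighbor that is not in the character mapping is treated as non-Arabic and never connects.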
# File 'lib/prawn/rtl/connector/logic.rb', line 74

def self.determine_form(previous_previous_char, previous_char, next_char, next_next_char)
  charinfos = self.charinfos
  next_char = next_next_char if charinfos[next_char] && charinfos[next_char].diacritic?
  previous_char = previous_previous_char if charinfos[previous_char] && charinfos[previous_char].diacritic?
  if charinfos[previous_char] && charinfos[next_char]
    charinfos[previous_char].connects? ? :medial : :initial
    # If the current character does not connect,
    # its medial form will map to its final form,
    # and its initial form will map to its isolated form.
  elsif charinfos[previous_char]
    # The next character is not an arabic character.
    charinfos[previous_char].connects? ? :final : :isolated
  elsif charinfos[next_char]
    # The previous character is not an arabic character.
    :initial
    # If the current character does not connect, its initial form will map to its isolated form.
  else
    # Neither of the surrounding characters are arabic characters.
    :isolated
  end
end
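The branch structure above reduces to a small decision table over two facts: whether the previous character is in the mapping (and, if so, whether it connects forward), and whether the next character is in the mapping. The sketch below restates only that table. sketch_form is a hypothetical name, the diacritic-skipping step is deliberately left out, and the inputs are plain flags rather than characters.

```ruby
# Hypothetical restatement of the decision table (diacritic skipping omitted).
#
# prev_connects: nil when the previous character is not in the mapping,
#                otherwise true/false for whether it connects forward.
# next_known:    true when the next character is in the mapping.
def sketch_form(prev_connects, next_known)
  prev_known = !prev_connects.nil?

  if prev_known && next_known
    prev_connects ? :medial : :initial
  elsif prev_known
    prev_connects ? :final : :isolated
  elsif next_known
    :initial
  else
    :isolated
  end
end

# Print the full table for inspection.
[true, false, nil].product([true, false]).each do |prev_connects, next_known|
  form = sketch_form(prev_connects, next_known)
  puts "prev_connects=#{prev_connects.inspect} next_known=#{next_known} -> #{form}"
end
```

Note that the current character's own connects flag is never consulted here. As the source comments explain, a non-connecting letter simply stores its final form in the medial slot and its isolated form in the initial slot, so the lookup still yields the right glyph.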
.transform(str) ⇒ String
Transforms Arabic text by applying contextual letter forms.
Processes a string character by character, determining the appropriate contextual form for each Arabic letter based on its surrounding characters. Non-Arabic characters pass through unchanged.
# File 'lib/prawn/rtl/connector/logic.rb', line 99

def self.transform(str)
  res = ''
  charinfos = self.charinfos
  previous_previous_char = nil
  previous_char = nil
  current_char = nil
  next_char = nil
  next_next_char = nil
  consume_character = lambda do |char|
    previous_previous_char = previous_char
    previous_char = current_char
    current_char = next_char
    next_char = next_next_char
    next_next_char = char
    return unless current_char

    if charinfos.key?(current_char)
      form = determine_form(previous_previous_char, previous_char, next_char, next_next_char)
      res += charinfos[current_char].formatted[form]
    else
      res += current_char
    end
  end
  str.each_char { |char| consume_character.call(char) }
  2.times { consume_character.call(nil) }
  res
end
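The loop above is a sliding window: each incoming character shifts the window by one, and the character being emitted lags behind the input so that its successors are already known. A reduced sketch of the same technique with one character of lookahead instead of two; each_with_neighbors is a hypothetical name, and it collects [previous, current, next] triples rather than shaping text.

```ruby
# Hypothetical sketch of the sliding-window technique with a single
# character of lookahead. Returns [previous, current, next] triples.
def each_with_neighbors(str)
  triples = []
  previous_char = nil
  current_char = nil
  next_char = nil

  shift = lambda do |char|
    # Slide the window one position to the left and append the new character.
    previous_char = current_char
    current_char = next_char
    next_char = char
    # Nothing to emit until the window has a current character.
    next unless current_char

    triples << [previous_char, current_char, next_char]
  end

  str.each_char { |char| shift.call(char) }
  # Flush: one nil per character of lookahead pushes the last
  # real character into the current slot.
  shift.call(nil)
  triples
end

each_with_neighbors('abc').each { |triple| puts triple.inspect }
```

With two characters of lookahead, as in .transform, the window needs two trailing nil values to drain, which is why the source calls consume_character.call(nil) twice.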