Class: TextRank::CharFilter::AsciiFolding

Inherits:
Object
Defined in:
lib/text_rank/char_filter/ascii_folding.rb

Overview

Character filter that transforms non-ASCII (Unicode) characters into their closest ASCII equivalents.

= Example

  AsciiFolding.new.filter!("the Perigordian Abbé then made answer, because a poor beggar of the country of Atrébatie heard some foolish things said")
  => "the Perigordian Abbe then made answer, because a poor beggar of the country of Atrebatie heard some foolish things said"

Constant Summary

NON_ASCII_CHARS =

Non-ASCII characters to replace

'ÀÁÂÃÄÅàáâãäåĀāĂăĄąÇçĆćĈĉĊċČčÐðĎďĐđÈÉÊËèéêëĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħÌÍÎÏìíîïĨĩĪīĬĭĮįİıĴĵĶķĸĹĺĻļĽľĿŀŁłÑñŃńŅņŇňŉŊŋÒÓÔÕÖØòóôõöøŌōŎŏŐőŔŕŖŗŘřŚśŜŝŞşŠšſŢţŤťŦŧÙÚÛÜùúûüŨũŪūŬŭŮůŰűŲųŴŵÝýÿŶŷŸŹźŻżŽž'
EQUIVALENT_ASCII_CHARS =

"Equivalent" ASCII characters

'AAAAAAaaaaaaAaAaAaCcCcCcCcCcDdDdDdEEEEeeeeEeEeEeEeEeGgGgGgGgHhHhIIIIiiiiIiIiIiIiIiJjKkkLlLlLlLlLlNnNnNnNnnNnOOOOOOooooooOoOoOoRrRrRrSsSsSsSssTtTtTtUUUUuuuuUuUuUuUuUuUuWwYyyYyYZzZzZz'
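
The two constants are paired by position: the character at index i of NON_ASCII_CHARS is replaced by the character at index i of EQUIVALENT_ASCII_CHARS, so both strings need the same length. A minimal sketch of that pairing follows. FROM, TO and MAPPING are hypothetical names, and the strings are shortened stand-ins, not the gem's full constants.

```ruby
# Shortened, hypothetical stand-ins for NON_ASCII_CHARS and
# EQUIVALENT_ASCII_CHARS (assumption: not the gem's real values).
FROM = 'ÀÁàáÇçÈÉèéÑñ'
TO   = 'AAaaCcEEeeNn'

# String#tr pairs characters by position. If the replacement string were
# shorter, Ruby would pad it with its last character, silently shifting
# every later mapping, so check the lengths up front.
raise ArgumentError, 'length mismatch' unless FROM.length == TO.length

# Explicit lookup table, handy for inspecting a single pair.
MAPPING = FROM.chars.zip(TO.chars).to_h

folded = 'Señor Çelik'.tr(FROM, TO)
puts folded          # prints "Senor Celik"
puts MAPPING['É']    # prints "E"
```

A length check like this one is a cheap guard whenever the tables are edited by hand, because a single missing or extra character misaligns every pair after it.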

Instance Method Summary

Instance Method Details

#filter!(text) ⇒ String

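Perform the filter: every character of text that appears in NON_ASCII_CHARS is replaced by the character at the same position in EQUIVALENT_ASCII_CHARS. The string is modified in place.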

Parameters:

  • text (String)

Returns:

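  • (String) the same string object, folded in place; because the method delegates to String#tr!, the result is nil when no character was replaced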


# File 'lib/text_rank/char_filter/ascii_folding.rb', line 24

def filter!(text)
  text.tr!(NON_ASCII_CHARS, EQUIVALENT_ASCII_CHARS)
end
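
Because filter! delegates to String#tr!, it mutates its argument and, like tr! itself, yields nil when the text contains nothing to replace. Callers that need a string back in every case may prefer to work on a copy. The sketch below illustrates this with a hypothetical helper named fold and shortened, hypothetical tables SAMPLE_FROM and SAMPLE_TO. None of these names are part of the gem.

```ruby
# Shortened, hypothetical character tables (assumption: not the gem's constants).
SAMPLE_FROM = 'éèñ'
SAMPLE_TO   = 'een'

# Hypothetical helper: folds a copy and always returns a String.
def fold(text)
  copy = text.dup
  # tr! returns nil when nothing was replaced, so its return value is ignored.
  copy.tr!(SAMPLE_FROM, SAMPLE_TO)
  copy
end

accented = 'Abbé'.dup
plain    = 'Abbe'.dup

p accented.tr!(SAMPLE_FROM, SAMPLE_TO)  # prints "Abbe"
p plain.tr!(SAMPLE_FROM, SAMPLE_TO)     # prints nil (no character matched)
p fold('Abbe')                          # prints "Abbe"
```

The copy costs one extra allocation per call. In-place folding avoids that allocation, which is presumably why the bang variant is used in the method above.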