Unicode Collation
Unicode sorting is complicated (see unicode.org/reports/tr10/), and Ruby doesn't do it correctly: its built-in String comparison works byte by byte, so accented characters sort after all unaccented ASCII letters. There is, however, a widely used implementation of the Unicode Collation Algorithm in the ICU (International Components for Unicode) libraries. This gem is a simple C wrapper that adds the ucol_getSortKey function from the ICU Collation API to Ruby Strings as unicode_sort_key.
Usage:
['cafe', 'cafes', 'café'].sort
=> ['cafe', 'cafes', 'café']
require 'unicode_collation'
['cafe', 'cafes', 'café'].sort_by {|s| s.unicode_sort_key}
=> ['cafe', 'café', 'cafes']
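Why the plain sort puts the accented word last can be shown without the gem or ICU. The sketch below is plain Ruby (2.2 or later, for String#unicode_normalize). The helper names byte_order_sort, toy_sort_key and toy_collated_sort are made up for illustration. The toy key only strips combining marks and is not the ICU algorithm or what ucol_getSortKey returns; it just illustrates the general idea of sorting by a derived key instead of by raw bytes.

```ruby
# Plain-Ruby sketch; does not use this gem or ICU.

# String#<=> compares bytes. In UTF-8, e-acute (U+00E9) is encoded as
# 0xC3 0xA9, and 0xC3 is greater than the byte for "s" (0x73), so the
# accented word lands after "cafes".
def byte_order_sort(words)
  words.sort
end

# Hypothetical stand-in for a sort key (NOT the ICU algorithm):
# the first element is the string with combining marks removed,
# the second is the original string, used only to break ties.
def toy_sort_key(str)
  base = str.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
  [base, str]
end

# Sort by the derived key, the same pattern as sort_by with
# unicode_sort_key in the usage example.
def toy_collated_sort(words)
  words.sort_by { |w| toy_sort_key(w) }
end

words = ["cafe", "cafes", "caf\u00e9"]
p byte_order_sort(words)
p toy_collated_sort(words)
```

A real collation key also has to handle case, locale tailoring and scripts with no ASCII base letters, which is why the gem delegates to ICU instead of doing something like the above.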
Install:
You must install ICU first. You can download the source from site.icu-project.org/download or, on a Mac, install it with MacPorts:
sudo port install icu
Then install the gem:
sudo gem install ninjudd-unicode-collation -s http://gems.github.com
To do:
Add support for locales other than en-US.
License:
Copyright (c) 2009 Justin Balthrop, Geni.com; Published under The MIT License, see LICENSE