UnicodeUtils implements Unicode algorithms for case conversion,
normalization, text segmentation and more in pure Ruby code.
New in this release:
···
====================
Added the following UnicodeUtils methods:
* east_asian_width
* display_width
* default_ignorable_char_q
* gc
* graphic_char_q
* general_category
* char_type
* char_display_width
* debug
Usage
Ruby 1.9.1 or higher is required.
$ gem install unicode_utils
require "unicode_utils/display_width"
UnicodeUtils.display_width("Matzにっき") # => 10
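The result 10 comes from counting each Latin letter as one column and each Hiragana character as two. The following is only a rough sketch of that idea in core Ruby, not the library's implementation: the name approx_display_width and the WIDE_RANGES table are hypothetical, and the table covers just a few East Asian blocks, whereas UnicodeUtils presumably works from the full Unicode width data.

```ruby
# encoding: utf-8
# Hypothetical, heavily simplified stand-in for the idea behind
# UnicodeUtils.display_width. Only a handful of wide blocks are listed.
WIDE_RANGES = [
  0x1100..0x115F, # Hangul Jamo initial consonants
  0x3040..0x30FF, # Hiragana and Katakana
  0x4E00..0x9FFF, # CJK Unified Ideographs
  0xAC00..0xD7A3, # Hangul Syllables
  0xFF01..0xFF60  # Fullwidth forms
].freeze

# Count two columns for code points in WIDE_RANGES, one for all others.
def approx_display_width(str)
  str.each_codepoint.inject(0) do |sum, cp|
    sum + (WIDE_RANGES.any? { |range| range.cover?(cp) } ? 2 : 1)
  end
end

puts approx_display_width("Matzにっき") # 4 narrow + 3 wide characters
```

Unlike the real method, this sketch ignores combining marks, control characters and default ignorable code points, so use UnicodeUtils.display_width for anything beyond illustration.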
$ irb -r unicode_utils/u
irb(main):001:0> U.debug("Matzにっき")
 Char | Ordinal | Name                     | General Category | UTF-8
------+---------+--------------------------+------------------+----------
 "M"  |      4D | LATIN CAPITAL LETTER M   | Uppercase_Letter | 4D
 "a"  |      61 | LATIN SMALL LETTER A     | Lowercase_Letter | 61
 "t"  |      74 | LATIN SMALL LETTER T     | Lowercase_Letter | 74
 "z"  |      7A | LATIN SMALL LETTER Z     | Lowercase_Letter | 7A
 "に" |    306B | HIRAGANA LETTER NI       | Other_Letter     | E3 81 AB
 "っ" |    3063 | HIRAGANA LETTER SMALL TU | Other_Letter     | E3 81 A3
 "き" |    304D | HIRAGANA LETTER KI       | Other_Letter     | E3 81 8D
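The Ordinal and UTF-8 columns of this table can be reproduced with core Ruby alone, which may help when checking the output by hand. The sketch below does not call UnicodeUtils; ordinal_and_utf8 is a hypothetical helper name, and the Name and General Category columns are left out because they need the Unicode character database that the library ships.

```ruby
# encoding: utf-8
# Core Ruby only; this does not use UnicodeUtils.
# Returns one [char, hex ordinal, hex UTF-8 bytes] triple per character.
def ordinal_and_utf8(str)
  str.each_char.map do |ch|
    ordinal = ch.ord.to_s(16).upcase
    utf8 = ch.encode("UTF-8").bytes.map { |b| "%02X" % b }.join(" ")
    [ch, ordinal, utf8]
  end
end

ordinal_and_utf8("Matzにっき").each do |ch, ordinal, utf8|
  puts "#{ch.inspect} | #{ordinal} | #{utf8}"
end
```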
Documentation & Source
http://unicode-utils.rubyforge.org
http://github.com/lang/unicode_utils
Issues
UnicodeUtils should work on every implementation of Ruby 1.9.1
or higher, independent of the operating system. If it does not,
please report the problem on GitHub.
--
Stefan Lang