(keitai-l) Re: Sorting Yomi

From: Alex Shinn <foof_at_synthcode.com>
Date: 01/28/05
Message-ID: <867jly48lo.wl@lain.inunome.com>
At Fri, 14 Jan 2005 19:18:10 +0900 (JST), Curt Sampson wrote:
> 
> I've been thinking I'd like an efficient way to properly sort yomi
> (expressed in hiragana) in dictionary order, and I think I may have
> found one. However, it's not a field I'm very familiar with, so I
> thought I'd solicit comments here. I've described it at:
> 
>     http://pc.tabemo.com/cjs/ja-sort.html

Coming back to this after a little research, definitely read the
Unicode Collation Algorithm:

  http://www.unicode.org/reports/tr10/

It covers far more than Japanese needs, so it's a lot more
complicated, but many of the optimizations it suggests still apply.

The most important thing is you don't need to store your keys as
arrays.  When you compute the initial keys as

  'hapu'    =>  '55',  '02',  '12'
  'hahuko'  =>  '551', '040', '000'
  'habuki'  =>  '551', '021', '010'

you can then join them into a single string by inserting a separator
between the keys that sorts lower than any character used in the keys
themselves, say '.' (which comes before the digits in ASCII):

  'hapu'    =>  '55.02.12'
  'hahuko'  =>  '551.040.000'
  'habuki'  =>  '551.021.010'

Then just a normal string sort will work.
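
As a rough sketch of the idea (in Python, purely for illustration), the
three example keys above can be joined and sorted like this.  The
LEVEL_KEYS table and the function names are made up for the example;
real keys would come from whatever kana weighting scheme you use.  It
assumes the comparison is by raw character code, not a locale-aware
collation, so that '.' really does sort below the digits.

```python
# Per-level keys copied from the examples above. In practice these would
# be computed from the yomi; this table is a hypothetical stand-in.
LEVEL_KEYS = {
    "hapu":   ("55", "02", "12"),
    "hahuko": ("551", "040", "000"),
    "habuki": ("551", "021", "010"),
}

# '.' is 0x2E, which is below the digits '0'-'9' (0x30-0x39), so a key
# that ends sorts before a longer key sharing the same prefix.
SEPARATOR = "."


def joined_key(levels):
    """Join the per-level keys into one string sort key."""
    return SEPARATOR.join(levels)


def sort_by_joined_key(words):
    """Sort using the single joined string as the key."""
    return sorted(words, key=lambda w: joined_key(LEVEL_KEYS[w]))


def sort_by_level_arrays(words):
    """Sort using the per-level keys as a tuple, for comparison."""
    return sorted(words, key=lambda w: LEVEL_KEYS[w])


if __name__ == "__main__":
    words = ["hahuko", "habuki", "hapu"]
    for w in sort_by_joined_key(words):
        print(joined_key(LEVEL_KEYS[w]), w)
```

For these three keys the joined-string sort gives the same order as
comparing the arrays level by level, and the joined form is something
you can store in an ordinary indexed string column.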


-- 
Alex
Received on Fri Jan 28 03:59:21 2005