(keitai-l) Re: Sorting Yomi

From: Alex Shinn <foof_at_synthcode.com>
Date: 01/17/05
Message-ID: <86llas5wqr.wl@lain.inunome.com>
At Mon, 17 Jan 2005 14:57:59 +0900 (JST), Curt Sampson wrote:
> 
> This is not true, because sorts based on the numerical representation of
> a kana can't give dakuon a lower precedence than the kana following the
> kana with dakuon. For example, 「じゃきょう」 sorts before 「しゃく」 in my
> dictionary, but with a sort based on character codes, じ (0x3058) comes
> after し (0x3057), and so じゃきょう would sort after even 「しんぬ」.

Oops, sorry, don't mind me, I was asleep when I replied :(

I think your algorithm works for hiragana only.  Once you include kanji,
katakana and romaji, the JIS standard (JIS X 4061) defines 5 collation
levels - you can see an open source implementation of the full collation
in Perl's Lingua::JA::Sort::JIS:

  http://search.cpan.org/~sadahiro/Lingua-JA-Sort-JIS-0.04/JIS.pm
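
For the hiragana-only case, here is a rough Python sketch of the idea
being discussed: compare first with the dakuon/handakuon stripped, and
use the original string only to break ties.  This is an illustration,
not the algorithm from earlier in the thread and not the JIS collation;
the names yomi_key and VOICING_MARKS are made up for the example, and it
ignores small kana, the long-vowel mark, katakana and kanji entirely.

```python
import unicodedata

# Combining dakuten (U+3099) and handakuten (U+309A).
VOICING_MARKS = {"\u3099", "\u309a"}

def yomi_key(yomi):
    """Two-level sort key for a hiragana reading (hypothetical helper).

    Level 1: the reading with voicing marks removed, so じ compares as し.
    Level 2: the original reading, so unvoiced sorts before voiced
             when the level-1 strings are identical.
    """
    # NFD splits e.g. じ (U+3058) into し (U+3057) + U+3099.
    decomposed = unicodedata.normalize("NFD", yomi)
    base = "".join(ch for ch in decomposed if ch not in VOICING_MARKS)
    return (base, yomi)

words = ["しんぬ", "じゃきょう", "しゃく"]

by_code = sorted(words)               # plain code point order
by_key = sorted(words, key=yomi_key)  # voicing ignored at the first level

print(by_code)  # ['しゃく', 'しんぬ', 'じゃきょう']
print(by_key)   # ['じゃきょう', 'しゃく', 'しんぬ']
```

With the key, じゃきょう lands before しゃく as in the dictionary example
quoted above, while the plain sort puts it after しんぬ.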

-- 
Alex
Received on Mon Jan 17 09:40:51 2005