(keitai-l) Re: tech- ascii to hex conversion for url strings in email

From: Paul Bryan Lester <pbl1_at_livedoor.com>
Date: 04/02/01
Message-ID: <3AC7C255.12C42CFF@livedoor.com>
Cool, I'm not the only one checking for those single byte characters!!

But according to the Unicode Standard it can get hairy, especially if
you are using keitais which define some of their own characters!
(And some do!)

And according to Unicode there are some non-Japanese high-byte
characters that are double byte as well, so my code got a little
complicated there. (I'm using JavaScript and Java for my stuff instead
of Perl.) Maybe that's overkill. You probably don't need to be aware
of those things.
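
For what it's worth, here is a rough sketch of the high-byte check
discussed in this thread. It is written in Python rather than the Perl,
JavaScript or Java mentioned here, and it assumes the input is a string
of Shift_JIS bytes. The function names are made up for illustration. The
lead-byte ranges (0x81-0x9F and 0xE0-0xFC) are the usual Shift_JIS ones;
carrier-specific characters may not follow them exactly, so treat this as
a starting point only.

```python
def percent_encode_bytes(data: bytes) -> str:
    """Write every byte as %xx, the same idea as the Perl unpack approach."""
    return "".join(f"%{b:02x}" for b in data)


def is_sjis_lead_byte(b: int) -> bool:
    """True if b starts a 2-byte Shift_JIS character (assumed ranges)."""
    # 0xA1-0xDF is single-byte half-width katakana, so it is excluded here.
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def split_sjis_chars(data: bytes) -> list:
    """Split Shift_JIS bytes into 1-byte and 2-byte characters."""
    chars = []
    i = 0
    while i < len(data):
        # Take two bytes only when a lead byte is followed by another byte.
        if is_sjis_lead_byte(data[i]) and i + 1 < len(data):
            chars.append(data[i:i + 2])
            i += 2
        else:
            chars.append(data[i:i + 1])
            i += 1
    return chars


# Plain ASCII: one byte per character.
print(percent_encode_bytes(b"ren"))            # %72%65%6e

# Mixed input: "a", two 2-byte characters, then "b".
mixed = b"a\x93\xfa\x96\x7bb"
print([percent_encode_bytes(c) for c in split_sjis_chars(mixed)])
# ['%61', '%93%fa', '%96%7b', '%62']
```

Note that the second byte of a 2-byte character can fall in the ASCII
range (0x7b above), which is why the bytes have to be walked in order
instead of tested one at a time.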

Renfield Kuroda wrote:

> "M. David" wrote:
> >
> > Mika,
> >
> > Thanks for the code, with a few changes so that it can be used on a url-
> >
> > $hex_str = unpack("H*", $asc_str);  # H* covers the whole string; H32 stops after 16 bytes
> > $hex_str =~ s/(..)/%$1/g;           # put a % in front of each hex pair
> >
> > Am I right in assuming there are 2 characters for each 1 in regular ASCII?
>
> Not necessarily. Japanese might have single-byte ascii mixed in. Best to
> check for high bytes to determine if a byte is part of a 2-byte
> character or not. Or better yet use jcode.pl.
>
> r e n
>
> --
> ascii:  r       e       n       f       i       e       l       d
> octal:  \162    \145    \156    \146    \151    \145    \154    \144
> hex:    \x72    \x65    \x6e    \x66    \x69    \x65    \x6c    \x64
> morgan stanley dean witter japan
> e-business technologies | engineering and strategy
>
> [ Did you check the archives?   http://www.appelsiini.net/keitai-l/ ]

--
-Paul Lester
pbl1@cornell.edu
http://members.tripod.com/~pbl1/

"Don't Forget to Try in Mind"
      "May the Force be with you"
              "Ketchup is Good"
-"Ketchup, natto and kimchee, that's what Wogis are made of"
(I gotta start answering this stuff at work)


Received on Mon Apr 2 03:03:28 2001