(keitai-l) Re: Removing "emoji" from a string...

From: J. David Beutel <jdb_at_getsu.com> Date: 06/27/03 Message-ID: <Pine.LNX.4.44.0306271141390.16993-100000@tokimi.getsu.com>

On Fri, 27 Jun 2003, Nik Frengle wrote:

> > hopefully the providers are using encodings that include user defined
> > characters, and assigning their emoji to those, instead of making up
> > their own encodings.
> [Nik Frengle] What does this mean? There isn't enough room in ASCII or
> the first parts of SJIS to fit all of the emoji currently in use, and in

I mean that some encodings define character codes that are intended for
users (e.g., Docomo) to be able to assign to their own characters.  
Likewise, some character sets define characters that are intended for
users to be able to assign to their own glyphs.  For example, Unicode's
Private Use Area covers the range U+E000 through U+F8FF.
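
If the emoji do end up in the Private Use Area once the mail has been
decoded to Unicode, removing them is a simple filter.  Here is a minimal
sketch in Python, assuming the text is already a decoded string and that
the carrier really does map its emoji into U+E000 through U+F8FF (I have
not verified that for any particular carrier).  The function name is made
up for illustration.

```python
# Basic Multilingual Plane Private Use Area, as given above.
PUA_START = 0xE000
PUA_END = 0xF8FF


def strip_private_use(text: str) -> str:
    """Return text with every Private Use Area character removed."""
    return "".join(ch for ch in text if not PUA_START <= ord(ch) <= PUA_END)


# "\ue63e" is just an arbitrary code point inside the range.
print(repr(strip_private_use("hi\ue63e there")))
```

This only covers the range mentioned above; it does nothing for emoji that
arrive as bytes in some other, carrier-specific encoding.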

On the other hand, if a sequence of bytes contains a character code that 
is not defined in some encoding, or worse yet, if it uses a character code 
to represent a different character than the one assigned by the encoding, 
then that sequence of bytes is not an example of that encoding.  It is a 
different encoding.
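
The first half of that can be checked mechanically: ask a strict decoder
whether the bytes are defined in the encoding you think they are in.  A
rough sketch using Python's built-in codecs follows; the helper name is
hypothetical.  Note that it cannot catch the second case, where a defined
code has been reused for a different character, since those bytes decode
without error.

```python
def is_valid_in(data: bytes, encoding: str) -> bool:
    """Return True if data decodes cleanly under the named encoding."""
    try:
        data.decode(encoding, errors="strict")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_in(b"hello", "ascii"))     # plain ASCII bytes
print(is_valid_in(b"\xf8\x9f", "ascii"))  # bytes outside ASCII
```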

> any case there are no downloadable fonts for phones in Japan. So putting
> them somewhere that you can get at them more easily would be nice for
> developers, but make absolutely no difference to end users. Use Enfour's
> Keitai font package if you want to emulate i-mode emoji on your phone.
> Other carriers have free tools. 

I think Chris wants to know how he can remove emoji from the emails that
his server receives.  The fonts on the phones are irrelevant.

> Another thing to mention is that emoji are never input as a single
> character by a developer: They are input using the &#XXXXX; format,
> where XXXXX is a number.  You do this in a text editor. If you wanted to

This sounds like creating a web page using HTML numeric character
references.  Chris can't process the email on his server this way, unless
the emoji in the email are also received as numeric character references.
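
If the emoji did arrive as references of the &#XXXXX; form, a regular
expression could drop them.  This is only a sketch under that assumption,
and it further assumes the numbers fall in the Private Use Area range
U+E000 through U+F8FF; references outside that range are left alone.  The
names are invented for the example.

```python
import re

PUA_START = 0xE000
PUA_END = 0xF8FF

# Matches decimal (&#58942;) and hexadecimal (&#xE63E;) references.
NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def strip_pua_references(text: str) -> str:
    """Remove numeric character references that point into the PUA."""
    def replace(match):
        hex_digits, dec_digits = match.groups()
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Drop the reference if it is in the PUA, otherwise keep it as is.
        return "" if PUA_START <= code <= PUA_END else match.group(0)
    return NCR.sub(replace, text)


print(strip_pua_references("Hello &#58942; world &#xE63E; &#65;"))
```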
