(keitai-l) Re: Removing "emoji" from a string...

From: Nik Frengle <eseller_at_eimode.com>
Date: 06/26/03
Message-ID: <000b01c33bf9$a9a34190$504403db_at_hoshakuj1>
David,

> I don't know exactly.  
[Nik Frengle] You are right, you don't. 
> But for the emoji not in a standard character set,
> the provider must be using a custom character set, and perhaps a custom
> character encoding.
[Nik Frengle] The character encoding is SJIS, and the character set is
one that is embedded in the handset. The characters not in the usual
Japanese character set are stored in a private area that is not normally
used and that PCs generally treat as illegal. So even if you have a font
with the emoji stored at the same positions in the encoding as a mobile
phone uses, your PC won't look there, because it considers those
positions invalid.
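
To make the "private area" concrete, here is a minimal sketch that scans raw SJIS bytes for two-byte characters in that area. It assumes the area is the Shift_JIS user-defined range with lead bytes 0xF0 to 0xF9; that range and the function name are assumptions for illustration, not something taken from carrier documentation, so check them against the carrier's own tables.

```python
# Assumption: the "private area" is the Shift_JIS user-defined range,
# i.e. two-byte characters whose lead byte is 0xF0..0xF9.
USER_DEFINED_LEAD = range(0xF0, 0xFA)


def find_private_area_chars(data: bytes) -> list:
    """Return (offset, code) pairs for two-byte characters whose lead
    byte falls in the assumed user-defined range."""
    hits = []
    i = 0
    while i < len(data):
        lead = data[i]
        # Shift_JIS two-byte lead bytes are 0x81..0x9F and 0xE0..0xFC.
        if 0x81 <= lead <= 0x9F or 0xE0 <= lead <= 0xFC:
            if i + 1 < len(data) and lead in USER_DEFINED_LEAD:
                hits.append((i, (lead << 8) | data[i + 1]))
            i += 2  # skip the trail byte as well
        else:
            i += 1  # single-byte character (ASCII or half-width kana)
    return hits
```

The scan works on bytes rather than decoded text because a PC-side decoder may reject or replace exactly these byte pairs.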
> Hopefully each provider documents their own
> customization.
[Nik Frengle] They do. Use Google. Browse the Keitai-l archives.
> But anyone should be able to figure it out by emailing all
> the emoji from every provider to their PC and studying the character codes
> in those emails.  Hopefully someone has already done so.  Also, hopefully
> the providers are using encodings that include user defined characters,
> and assigning their emoji to those, instead of making up their own
> encodings.
[Nik Frengle] What does this mean? There isn't enough room in ASCII or
the first parts of SJIS to fit all of the emoji currently in use, and in
any case there are no downloadable fonts for phones in Japan. So putting
them somewhere that you can get at them more easily would be nice for
developers, but would make absolutely no difference to end users. Use
Enfour's Keitai font package if you want to emulate i-mode emoji on your
phone. Other carriers have free tools.
> 
> Note that the provider's email gateway could convert the emoji on their
> way out, depending on their destination.  For example, ezweb could convert
> emoji sent to a docomo.ne.jp address into the closest docomo emoji.  (I
> don't know if they do; I'm just saying they could.)
[Nik Frengle] The outgoing gateway could do that, but why would you
offer such a service as a carrier to the customers of your competitor?
It makes more business sense to offer incoming conversion to your own
customers. I believe that both J-Phone and au have some sort of service
that does this, though I may be wrong.
Another thing to mention is that emoji are never input as a single
character by a developer: They are input using the &#XXXXX; format,
where XXXXX is a number.  You do this in a text editor. If you wanted to
strip the emoji, you could do a grep or regex operation to look for
anything in the plain text that was in that format, and either delete it
or come up with some sort of conversion. For example, an i-mode angry
face is in the form of &#63894;, whereas you could have a hash table and
convert that to a more PC-readable ;< or something. Most of the emoji,
though, don't have simple equivalents.
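
The regex-and-hash-table idea above can be sketched roughly as follows. This is an illustration, not a complete tool: the function names are hypothetical, and the only table entry taken from this message is 63894 for the i-mode angry face. Note that the pattern matches every decimal &#NNNNN; reference, so it would also remove ordinary ones that are not emoji.

```python
import re

# Matches decimal numeric references such as &#63894;
NUMERIC_REF = re.compile(r"&#(\d+);")

# Hypothetical conversion table. Only the 63894 entry comes from the
# message; a real table would be filled in from the carrier's emoji list.
EMOJI_TEXT = {63894: ";<"}


def strip_emoji_refs(text: str) -> str:
    """Delete every &#NNNNN; reference from the text."""
    return NUMERIC_REF.sub("", text)


def convert_emoji_refs(text: str, table=EMOJI_TEXT, default: str = "") -> str:
    """Replace each &#NNNNN; reference with its table entry, or with
    `default` when the number has no PC-readable equivalent."""
    def repl(match):
        return table.get(int(match.group(1)), default)
    return NUMERIC_REF.sub(repl, text)


print(convert_emoji_refs("Late again &#63894;"))
```

Passing a placeholder string as `default` instead of the empty string keeps a visible marker wherever an emoji had no simple equivalent.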

Best Regards,
Nik Frengle

Received on Thu Jun 26 18:47:55 2003