(keitai-l) Re: emoji and choking waps

From: Craig Dunn <craig.dunn_at_conceptdevelopment.net>
Date: 01/15/01
Message-ID: <OGEGIKAMGPHPOJLMMLMLOEEECAAA.craig.dunn@conceptdevelopment.net>
Ron,

>>I'm looking for a way to weed out these pesky emoji (using ASP).

not sure whether you're seeing the bytecodes (eg. 0xF89F) or the entity
(&#decimal code;)

if option 2, the function below may be modified to catch (and maybe
'translate') the special characters [NOTE: if you're seeing the bytecodes, a
RegExp solution may still work for you, I just don't have any code around]

basically it's using Regular Expressions to find &#x2C8F; and &#12345;
patterns in a string. you could use the RegExp object's Execute and Replace
methods to weed them out, and possibly put something more useful in
(possibly the emoji 'name' from the docomo web site?)

hope this helps (if i'm on the right track i may attempt the rewrite - if
so will post). interested to see what you end up with.

craig

'/*******************************
'* Written using VBScript 5.0 (?) W2k/IIS5
'*******************************/
Function DetectHTMLUnicode(filein)
	'/* Should only be used on ASCII strings */
	dim whatcode
	dim RegEx, Matches, Match
	set RegEx = Server.CreateObject("VBScript.RegExp")
	RegEx.IgnoreCase = TRUE
	RegEx.Global = TRUE
	RegEx.Pattern = "&\#x[0-9A-F][0-9A-F][0-9A-F][0-9A-F];"
	if RegEx.Test(filein) then
		'/* At least one &#xHHHH; detected, assume bad characters */
		whatcode = "SGMLUNICODE"
	else
		RegEx.Pattern = "&\#[0-9]+;"
		if RegEx.Test(filein) then
			'/* At least one &#12345; detected, assume bad characters */
			whatcode = "HTMLUNICODE"
		else
			'/* Return ASCII - assumes input was ASCII */
			whatcode = "ASCII"
		end if
	end if
	DetectHTMLUnicode = whatcode
End Function
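
the "weed them out" step could look something like the sketch below. it is
untested and makes a few assumptions: the name StripHTMLUnicode is made up
for illustration, it runs inside ASP (so the Server object exists), and every
numeric entity in the input is unwanted. note that it will also strip
legitimate entities such as &#169;, so narrow the pattern if you need those.

```vbscript
'/* Hypothetical companion to DetectHTMLUnicode (name is an assumption) */
Function StripHTMLUnicode(filein)
	'/* Removes every &#12345; and &#xHHHH; style entity from the string */
	dim RegEx
	set RegEx = Server.CreateObject("VBScript.RegExp")
	RegEx.IgnoreCase = TRUE
	RegEx.Global = TRUE
	'/* Either x followed by hex digits, or plain decimal digits */
	RegEx.Pattern = "&\#(x[0-9A-F]+|[0-9]+);"
	'/* Swap the empty string for a placeholder to 'translate' instead */
	StripHTMLUnicode = RegEx.Replace(filein, "")
	set RegEx = Nothing
End Function
```

calling StripHTMLUnicode("hello &#63647; world &#xF89F;!") should remove both
entities and leave the surrounding text alone.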
--------------------------------


[ Did you check the archives?   http://www.appelsiini.net/keitai-l/ ]
Received on Mon Jan 15 23:08:27 2001