About charset setting and replacing
- From: gmclee@xxxxxxxx
- Date: 14 Jul 2006 03:29:09 -0700
Hi there,
I am writing a program that loads HTML from a file and sends it to IE
directly, and I have run into a problem with the charset setting. Most of
the HTML files declare the charset "us-ascii", but for some reason some
Unicode text has to be inserted into the HTML before it is sent to IE.
My questions are:
1) Can I specify a separate charset for a single element, e.g.
<span charset="UTF-8"> SOME UNICODE HERE</span>
2) If the answer to 1) is "no", is there any way to change the charset of
the original HTML? Because I have no HTML parser handy, I can only search
and replace the charset programmatically. I have checked several HTML
files and found that the charset declaration looks like
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
So, to make the program replace the right value, I search for the
keyword "charset=" to get its position, then search for the position of
the closing double quotation mark, and finally replace the substring
between them with UTF-8. Everything seems fine. However, I am worried
that there may be some exceptions. Could these, for example, occur?
<META http-equiv=Content-Type content="text/html;" charset="us-ascii">
OR
<META http-equiv=Content-Type content='text/html;' charset='us-ascii'>
OR
<META http-equiv=Content-Type content='text/html; charset=us-ascii'>
Is there a better approach to my problem?
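To make the search-and-replace idea concrete, here is a minimal sketch in Python (my program is not in Python, this is only to illustrate the matching). It assumes the charset is declared inside a <META> tag and that the value may be unquoted, single-quoted or double-quoted, which covers the variants listed above. The names META_CHARSET and set_meta_charset are made up for this example, and a regular expression like this is still no substitute for a real HTML parser.

```python
import re

# Hypothetical helper pattern: finds "charset=<value>" inside a <meta ...> tag.
# Group 1 is everything up to the value (including an optional opening quote),
# group 2 is the charset name itself.
META_CHARSET = re.compile(
    r'(<meta\b[^>]*?\bcharset\s*=\s*["\']?)([A-Za-z0-9_.:\-]+)',
    re.IGNORECASE,
)


def set_meta_charset(html, new_charset="utf-8"):
    """Replace the charset value in the first <meta> tag that declares one.

    Text outside a <meta> tag is left untouched, so a literal "charset="
    appearing in the page body is not rewritten.
    """
    return META_CHARSET.sub(lambda m: m.group(1) + new_charset, html, count=1)


if __name__ == "__main__":
    samples = [
        '<META http-equiv=Content-Type content="text/html; charset=us-ascii">',
        '<META http-equiv=Content-Type content="text/html;" charset="us-ascii">',
        "<META http-equiv=Content-Type content='text/html;' charset='us-ascii'>",
        "<META http-equiv=Content-Type content='text/html; charset=us-ascii'>",
    ]
    for sample in samples:
        print(set_meta_charset(sample))
```

Only the charset name is replaced, so whatever quoting the original tag used is kept as it is. Note that changing the declaration is only half of the job: the bytes actually sent to IE must then really be UTF-8 encoded.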
P.S. Someone suggested that I send the original HTML to IE and then call
IE's charset-setting function to change the charset. I tried, but after
changing the charset my Unicode text turns into meaningless characters!
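One alternative that would avoid touching the charset at all is to write the inserted text as numeric character references (&#NNNN;), which are plain ASCII and so stay valid in a "us-ascii" page. A minimal sketch in Python, assuming the inserted text is available as a Unicode string; to_ascii_html and insert_fragment are names made up for this example, and the marker is a hypothetical placeholder in the page.

```python
def to_ascii_html(text):
    """Encode text as ASCII bytes, turning every non-ASCII character
    into a decimal numeric character reference such as &#20013;."""
    return text.encode("ascii", "xmlcharrefreplace")


def insert_fragment(html_bytes, marker, text):
    """Replace the first occurrence of marker (bytes) in the page
    with the ASCII-safe form of text."""
    return html_bytes.replace(marker, to_ascii_html(text), 1)


if __name__ == "__main__":
    page = b"<html><body><span>@@TEXT@@</span></body></html>"
    print(insert_fragment(page, b"@@TEXT@@", "caf\u00e9 \u4e2d\u6587"))
```

This only converts non-ASCII characters. Characters such as "<" and "&" in the inserted text are not escaped here and would need separate handling.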
Thanks in advance.