Re: About charset setting and replacing




gmclee@xxxxxxxx wrote:
Hi there, I am writing a program to load HTML from a file and send it
to IE directly. I've run into a problem with the charset setting. Most of
the HTML has the charset "us-ascii", but for some reason some Unicode text
will be inserted into the HTML before it is sent to IE. The question is:

1) Can I specify a special charset for a single element, e.g. <span
charset="UTF-8"> SOME UNICODE HERE</span>

1. UTF-8 isn't a charset, it's an encoding.
Anyway, the following meta tag is taken from a real page (the HTML
source of a Google search results page):

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

2. The UTF-8 encoding includes and encompasses all of US-ASCII.
3. Encodings apply to pages, not to HTML fragments.

If you create a page that is encoded as UTF-8, and serve it as UTF-8,
US-ASCII characters will automatically be rendered correctly.
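
To make point 2 concrete, here is a minimal Python sketch (the sample markup is made up, not taken from the original page). It only checks that pure US-ASCII text produces the same bytes under both encodings, so relabelling such a page as UTF-8 should not change how its existing content is read.

```python
# Hypothetical sample markup; any pure US-ASCII text behaves the same way.
ascii_html = "<p>Hello, world</p>"

as_ascii = ascii_html.encode("ascii")
as_utf8 = ascii_html.encode("utf-8")

# Every US-ASCII character maps to the same single byte in UTF-8.
same_bytes = as_ascii == as_utf8

# Bytes written as US-ASCII decode unchanged when read as UTF-8.
round_trip = as_ascii.decode("utf-8") == ascii_html

print(same_bytes, round_trip)
```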
What I mean is: if some Unicode text (e.g. Asian characters) is inserted
into the HTML while the charset is US-ASCII, the text cannot be rendered
correctly.
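
The situation described here can be reproduced in a few lines of Python. This is only a sketch with a made-up fragment: it shows that non-ASCII characters cannot be encoded as US-ASCII at all, while UTF-8 handles them. The last step, numeric character references, is an assumption on my part and was not proposed in this thread; it is one way to keep the bytes within US-ASCII.

```python
# Hypothetical fragment containing three CJK characters (U+65E5, U+672C, U+8A9E).
fragment = "<span>\u65e5\u672c\u8a9e</span>"

# US-ASCII has no code points above 127, so encoding fails outright.
try:
    fragment.encode("ascii")
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False

# UTF-8 can represent the fragment; each of these characters takes 3 bytes.
utf8_bytes = fragment.encode("utf-8")

# Assumption, not from the thread: replace non-ASCII characters with
# numeric character references so the result is pure US-ASCII.
ncr_bytes = fragment.encode("ascii", errors="xmlcharrefreplace")

print(fits_in_ascii, len(utf8_bytes), ncr_bytes)
```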

What I don't understand is what you mean by "send it to IE directly".
Are you writing a server? If so, then you need to look into how to serve
pages encoded as UTF-8 (and that would be off-topic here).
I am sorry for misleading you. I am writing a client that sends the
HTML code to IE through Microsoft's IWebBrowser2 and IHTMLDocument2
interfaces. With those interfaces, I can change the HTML of any page
dynamically.
