Re: About charset setting and replacing



gmclee@xxxxxxxx wrote:
Hi there, I am writing a program to load HTML from a file and send it
to IE directly. I've run into a problem with the charset setting. Most of
the HTML has charset "us-ascii", but for some reason some Unicode text
will be inserted into the HTML before it is sent to IE. The problem is:

1) Can I specify a special charset for some component, e.g. <span
charset="UTF-8">SOME UNICODE HERE</span>?

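1. Strictly speaking, UTF-8 is a character encoding, not a character set
   (the character set is Unicode), even though the "charset" parameter in
   HTTP and HTML is where the encoding gets named.
2. The UTF-8 encoding includes all of US-ASCII.
3. An encoding applies to a whole page, not to HTML fragments; <span>
   has no charset attribute.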
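
Point 2 is easy to check for yourself. Here is a minimal Python sketch; the sample strings are made up for illustration and are not taken from the original poster's files:

```python
# Hypothetical sample; any pure US-ASCII string behaves the same way.
ascii_html = "<p>Hello, world</p>"

ascii_bytes = ascii_html.encode("ascii")
utf8_bytes = ascii_html.encode("utf-8")

# US-ASCII text produces identical bytes under either encoding.
same_bytes = ascii_bytes == utf8_bytes

# Text outside US-ASCII cannot be encoded as US-ASCII at all,
# but UTF-8 represents it as a multi-byte sequence.
snowman = "\u2603"
try:
    snowman.encode("ascii")
    ascii_can_encode = True
except UnicodeEncodeError:
    ascii_can_encode = False

utf8_snowman = snowman.encode("utf-8")
```

So a file that is valid US-ASCII is already valid UTF-8, byte for byte, and relabelling it as UTF-8 changes nothing about how its existing content is read.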

If you create a page that is encoded as UTF-8, and serve it as UTF-8,
US-ASCII characters will automatically be rendered correctly.
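
As a rough sketch of what "encoded as UTF-8" could look like when the page is assembled by a program: the helper name `build_utf8_page`, the page skeleton, and the sample text below are all assumptions for illustration, not the original poster's code.

```python
import html


def build_utf8_page(ascii_body: str, inserted_text: str) -> bytes:
    """Hypothetical helper: combine existing US-ASCII markup with
    Unicode text and return the whole page encoded as UTF-8."""
    page = (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        # Declare the encoding once, for the whole document.
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        "<title>Example</title>\n"
        "</head><body>\n"
        f"{ascii_body}\n"
        # Escape the inserted text so it cannot be mistaken for markup.
        f"<span>{html.escape(inserted_text)}</span>\n"
        "</body></html>\n"
    )
    # One encoding for the entire page, ASCII and non-ASCII parts alike.
    return page.encode("utf-8")


page_bytes = build_utf8_page("<p>Plain ASCII text</p>", "caf\u00e9 \u4e2d\u6587")
```

The existing US-ASCII markup passes through unchanged, and only the inserted characters become multi-byte sequences. Whatever delivers the bytes to the browser still has to label them as UTF-8 for the declaration to be honoured.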

What I don't understand is what you mean by "send it to IE directly".
Are you writing a server? If so, then you need to look into how to serve
pages encoded as UTF-8 (and that would be off-topic here).

--
Jack.