Re: persian languages charset, and what DOCTYPE?
- From: "Alan J. Flavell" <flavell@xxxxxxxxxxxxxxxxx>
- Date: Sat, 8 Apr 2006 16:04:31 +0100
On Sat, 8 Apr 2006, Harlan Messinger wrote:
> else, and the appearance in two places of "ØªØ³Øª2", once after the
> date at the top, and once as the first item in the list of Recent Posts.
> The first one appears in the page source as "ØªØ³Øª2"
Yes, I'd spotted that, and noted that if interpreted as utf-8 it turns
out as Arabic-script characters, which made it seem as if that part
had been inserted into it incorrectly.
> and the second appears as
> "&Oslash;&ordf;&Oslash;&sup3;&Oslash;&ordf;2", the character entity
> representation of the same thing.
Blimey, so it does! I hadn't spotted that at first look. So it's
worse than just broken!!
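
For anyone wanting to check the two forms by machine, here is a minimal Python sketch. It assumes the seven bytes below are what sits in the page source, and that the page was being read as iso-8859-1 (windows-1252 gives the same result for these particular bytes). The variable names are mine, purely for illustration.

```python
from html import unescape

# Assumed raw bytes of the first occurrence in the page source.
raw = bytes([0xD8, 0xAA, 0xD8, 0xB3, 0xD8, 0xAA, 0x32])

as_latin1 = raw.decode("iso-8859-1")  # one character per byte
as_utf8 = raw.decode("utf-8")         # two-byte sequences become Arabic letters

# The second occurrence: the same Latin-1 characters spelled as named entities.
entity_form = "&Oslash;&ordf;&Oslash;&sup3;&Oslash;&ordf;2"
from_entities = unescape(entity_form)

# Print code points rather than glyphs, so the output is terminal-safe.
print([f"U+{ord(c):04X}" for c in as_latin1])
print([f"U+{ord(c):04X}" for c in as_utf8])
print(from_entities == as_latin1)
```

The entity form can never be reinterpreted as utf-8 by the browser, since each entity names one Latin-1 character outright, which is why the second occurrence is the worse of the two.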
Furthermore, I now see loads of hrefs like these:
http://journalhome.com/razavi/21877/%26Oslash%3B%26ordf%3B%26Oslash%3B%26sup3%3B%26Oslash%3B%26ordf%3B2.html
*Shudder*
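
To see what that href is really carrying, one can peel the layers off one at a time. A rough Python sketch follows; it assumes that the last path segment is percent-encoded named entities which in turn stand for utf-8 bytes misread as iso-8859-1. The step names are hypothetical.

```python
from html import unescape
from urllib.parse import unquote

# Last path segment of the href quoted above.
segment = "%26Oslash%3B%26ordf%3B%26Oslash%3B%26sup3%3B%26Oslash%3B%26ordf%3B2.html"

step1 = unquote(segment)   # undo the percent-encoding: %26 -> "&", %3B -> ";"
step2 = unescape(step1)    # resolve the named entities to Latin-1 characters
step3 = step2.encode("iso-8859-1").decode("utf-8")  # reread those bytes as utf-8

print(step1)
print(ascii(step3))        # escaped form, safe on any terminal
```

That is three layers of encoding wrapped around three Arabic letters and a digit.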
For what it's worth - coming back to the ØªØ³Øª2 which we saw, if I
convert[1] that from utf-8 to us-ascii encoding then the result reads:
&#1578;&#1587;&#1578;2
which can be decoded e.g. with my trusty decoding ring (;-) at
http://ppewww.ph.gla.ac.uk/~flavell/unicode/unidata06.html
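
The same conversion and lookup can be approximated without a browser. This is only a sketch in Python, standing in for the Composer round trip rather than reproducing it: the `xmlcharrefreplace` error handler writes a decimal numeric character reference for anything us-ascii cannot hold, and `unicodedata` plays the part of the decoding ring.

```python
import unicodedata

# The string as it reads once the bytes are taken as utf-8.
text = "\u062a\u0633\u062a2"

# Characters outside us-ascii come out as decimal numeric character references.
ncr = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
print(ncr)

# Look each character up by code point and Unicode name.
for ch in text:
    print(f"U+{ord(ch):04X}  &#{ord(ch)};  {unicodedata.name(ch)}")
```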
At this kind of third-hand remove from the original complainant, and
with me understanding only the theory of the character representation,
without being able to read Farsi, and without the slightest inclination
to tangle with the mess that comes out of MS's attempts to extrude
something resembling HTML, I'm afraid I can't go much further than to
say that these pages seem to be dreadfully broken; it's a wonder that
anything comes out as intended.
good luck (you-all will need it!)
[1] By "convert" I mean, in Seamonkey (née Mozilla), manually set
View> Encoding to utf-8, then File> Edit Page, then in Composer,
"Save and change character encoding". Unfortunately it doesn't
offer us-ascii as an option, but any 8-bit encoding which doesn't
cover Arabic would suffice for this purpose - e.g. Armenian, Thai,
whatever you like. (Perhaps we should ask the Mozilla folks to
support saving in us-ascii explicitly?)
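
The point that any encoding lacking Arabic will do can be illustrated in Python, as a sketch only. The codec choice is my assumption: Python's standard library has no Armenian codec that I know of, so Thai (iso8859_11) and Greek (iso8859_7) stand in, with iso8859_6 (Arabic) as the contrasting case.

```python
text = "\u062a\u0633\u062a2"

results = {}
for codec in ("ascii", "iso8859_11", "iso8859_7", "iso8859_6"):
    # Anything the target encoding cannot hold becomes a numeric character reference.
    results[codec] = text.encode(codec, errors="xmlcharrefreplace")
    print(codec, results[codec])
```

Only the Arabic codec stores the letters as single bytes; every other target falls back to the same references, which is all the trick relies on.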