Re: Cross-platform e-mail text size problems



Sander Tekelenburg wrote:
In article <g4atm1$eh6$1@xxxxxxxx>, AV3 <arvimide@xxxxxxxxxxxxx> wrote:

Sander Tekelenburg wrote:

[...]

"Plain text" has nothing to do with character repertoires. [...]
My friend Google, sending me to Wikipedia, suggests the relationship to ASCII that I referred to.

The way I read <http://en.wikipedia.org/wiki/Plain_text>, ASCII is mentioned mostly as a historical reference. I don't see it suggesting that ASCII is more "plain text" than Unicode. It says that "plain text" used to require ASCII (and never one of the 'high ASCII' variants we were stuck with before Unicode) and goes on to explain how Unicode is replacing ASCII in plain text.

But I can see how the (IMO) too prominent placement of the 2nd and 3rd paragraphs can be misleading. They would fit better under "encoding" or "history".

When Apple began implementing Unicode, only .rtf-documents could preserve formatting with diacritics beyond the range of Latin 1; .txt-documents showed gobbledegook for characters beyond the first 256 of Unicode.

Sounds like a situation where the underlying system is not Unicode. It's probably easier to take something like RTF and wrap "special characters" in it than to make the entire (file) system Unicode savvy.

In any case, this seems anecdotal. It just tells us that one implementor made use of RTF to hack some sort of Unicode support into a non-Unicode aware system.
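
To make the "wrap special characters in RTF" idea concrete, here is a minimal Python sketch. It is an illustration only, not a claim about what Apple's implementation actually did. The function name rtf_escape and the sample word are assumptions; the \uN keyword followed by a fallback character is the RTF convention for Unicode characters, where N is a signed 16-bit decimal number.

```python
def rtf_escape(text: str) -> str:
    """Hypothetical helper: keep ASCII as-is, wrap everything else in RTF \\uN? escapes."""
    out = []
    for ch in text:
        if ch in "\\{}":
            # These three characters are RTF syntax and must be backslash-escaped.
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # \uN takes a signed 16-bit value, so work on UTF-16 code units.
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                # "?" is the fallback shown by readers that ignore \u.
                out.append(f"\\u{unit}?")
    return "".join(out)


# A character beyond the first 256 code points cannot be stored as Latin 1 at all...
try:
    "poŝta".encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# ...but it survives inside a pure-ASCII RTF stream.
escaped = rtf_escape("poŝta")
print(latin1_ok)
print(escaped)
```

The escaped result contains only ASCII bytes, which is why such a file could pass through a system that knows nothing about Unicode.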



At the risk of continuing to be anecdotal, I started using Unicode in Mac OS 9 (WorldText and SUE) and jumped to Mac OS 10.1, where WorldText documents were convertible to .rtf documents. In those days ASCII meant ISO-8859-1 to me, and I still encounter messages, possibly from Linux environments, encoded in the old Latin designations but congruent and readable with Unicode. Messages mistakenly encoded in ISO-8859-1 instead of UTF-8 are garbled.
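
The garbling from an ISO-8859-1/UTF-8 mix-up can be reproduced in a few lines of Python. This is a sketch under assumptions: the sample word and the variable names are made up for illustration and do not come from any actual message.

```python
original = "café"

# Case 1: the bytes are really UTF-8, but the reader treats them as ISO-8859-1.
utf8_bytes = original.encode("utf-8")
garbled = utf8_bytes.decode("iso-8859-1")

# As long as nothing was lost, reversing the wrong step recovers the text.
repaired = garbled.encode("iso-8859-1").decode("utf-8")

# Case 2: the bytes are really ISO-8859-1, but the reader treats them as UTF-8.
# The lone byte 0xE9 is not valid UTF-8, so it becomes the replacement character.
latin1_bytes = original.encode("iso-8859-1")
replaced = latin1_bytes.decode("utf-8", errors="replace")

print(garbled)   # cafÃ©
print(repaired)  # café
print(replaced)
```

Pure ASCII text looks the same in both cases, which is why the problem only shows up once diacritics are involved.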


I haven't checked since those early days, always formatting for .rtf, .doc, etc., and avoiding .txt.

...

[...]

Has plain text been upgraded without my noticing?

If you define "plain text" as "non-formatted", encoding is irrelevant.

If you define "plain text" as "lowest common denominator", I suppose you could say that it has indeed been upgraded from ASCII to Unicode, thanks to Unicode having become ubiquitous enough to be considered a "low enough common denominator".



I communicate beyond the Mac environment, so I try to observe conventions common to all. I think "plain text" has both technical and general meanings. I think a possibly outdated technical meaning was 'ASCII,' i. e., 'Latin 1.' I think the OP meant it as 'non-formatted,' but I wanted to raise the Latin 1 question, since that remains a problem for many.


--
++====+=====+=====+=====+=====+====+====+=====+=====+=====+=====+====++
||Arnold VICTOR, New York City, i. e., <arvimideQ@xxxxxxxxxxxxxx> ||
||Arnoldo VIKTORO, Nov-jorkurbo, t. e., <arvimideQ@xxxxxxxxxxxxxx> ||
||Remove capital letters from e-mail address for correct address/ ||
|| Forigu majusklajn literojn el e-poŝta adreso por ĝusta adreso ||
++====+=====+=====+=====+=====+====+====+=====+=====+=====+=====+====++
NOTICE: Due to Presidential Executive Orders, the National Security
Agency may have read this email without warning, warrant, or notice.
They may do this without any judicial or legislative oversight. You
have no recourse or protection.