Re: Expansion/Contraction



Albert ARIBAUD wrote:
Mok-Kong Shen wrote:

Albert ARIBAUD wrote:

Hmm... I would think the table aims at providing expansion or
contraction ratios for translations at human language level, not for
conversions at character encoding level; thus, variability is quite
understandable as each single Chinese ideogram may translate to / from
a good deal of different English phrases, thus to / from a highly
variable number of Roman letters.
Let me quote a sentence from that web page:

To avoid significant increases in formatting costs during
localization due to excessive struggles to fit too much copy into
too little space, documents should be formatted with sufficient
white space to accommodate text expansion.

In my understanding, at least one of the main concerns is the question:
How much "space" (on a printed media or other displays) would a
translated text occupy as compared to the original text?

Up to this point, I completely agree with you on how this page should be understood.

Now, in ASCII,
if one uses the same font and size, then the comparison can be simply
done by character counts, for one can readily assume that a word
displayed in one language is as "readable" (comfortably discernable by
the eye) as in another language. But a Chinese character (ideogram) is
inherently different in appearance from an English word that consists of
a sequence of ASCII characters. How large should the size of a Chinese
character be, in order that there be comparable "readability" to English
words of a given font size, that is a question that probably isn't yet
well studied to date. That would make a comparison of the spaces on
printed media occupied by the original and the translated text in the
case of Chinese vs. English (and similar languages) rather difficult.
That was my point.

The way you develop this point here is much more meaningful than it was in your previous post, because in that previous post, you were focusing exclusively on encodings, without any mention at all of readability or printing. OTOH, your explanation above has its emphasis on readability... and actually seems to acknowledge the page's "varies" comment rather than contradict it, as you seemed to be doing previously.

I did mention readability. In my previous post I put the word readability in quotes to indicate that I mean by it not
comprehensibility or understandability but ease/difficulty of
viewing the displayed text with the eye. Because a sensible comparison
is hampered by the difficulty of determining exactly when (through
adjusting the font size) equivalent "readability" has been achieved
for the original and the translated text, "unknown due to comparison
difficulty" would be more appropriate than "varies" in the
present context in my view. For at the current time one couldn't
yet exclude the possibility that the result of a detailed study
turns out indeed to be that Chinese always needs less space than
English, or the other way round.

Actually, the only thing in your exposition above which I still have an issue with is the mention of ASCII. Why mention ASCII (and before that, Unicode and byte counts), which is only a character encoding convention (and text storage measure), when the issue is with printing and readability, which is not an encoding or storage issue at all?

The phrase "in ASCII" (the first use of "ASCII") could be replaced by
"for source and target languages whose alphabets are commonly encoded
in ASCII, e.g. the European languages". The second use of "ASCII"
could be substituted by "English alphabetical", if that avoids
misunderstanding.

Thanks,

M. K. Shen