Re: unicode failure with inputenc package

r <inpost@xxxxxxxxx> writes:
> Upon further thought, with the instruction to use utf8 and the
> inputenc package, why does latex accept my tex files being created in
> a utf8-encoded text editor, yet fail to accept a utf8 character
> within those very same utf8-encoded text files?


the utf8 code in inputenc is extremely tricky, and it relies on each
character having a separate definition.

because this is a space hog, it doesn't even try to set up characters
it doesn't believe you can typeset (you tell it what you can typeset
via your fontenc declarations).
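
as a rough illustration of the point about fontenc declarations (a minimal sketch of my own, not taken from the inputenc documentation; the choice of T1 and of U+2264 is an assumption made purely for the example): with only T1 declared, accented latin letters get set up, but a character such as ≤ typically does not, and latex stops with an inputenc error saying the unicode character is not set up for use with latex. \DeclareUnicodeCharacter lets you supply the missing definition by hand.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}      % tells latex which glyphs you can typeset
\usepackage[utf8]{inputenc}   % sets up utf8 characters for the declared encodings

% assumption for this example: U+2264 is not covered by T1, so inputenc
% leaves it undefined; define it by hand (hex code point, no prefix)
\DeclareUnicodeCharacter{2264}{\ensuremath{\leq}}

\begin{document}
café, Straße, and a ≤ b
\end{document}
```

removing the \DeclareUnicodeCharacter line should reproduce the kind of failure described in the question, even though the file itself is perfectly valid utf8.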

all of this is a mess, which derives from tex only being able to deal
with fonts with at most 256 glyphs in them.

otoh, xetex, which people have suggested you use, uses opentype fonts
which can, in principle, contain a glyph for every registered unicode
code point. you don't need a macro to access that -- just the unicode
Robin Fairbairns, Cambridge
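
for comparison, a minimal xetex sketch (an illustration only: the fontspec package is the usual way to select opentype fonts under xelatex, and the font name "DejaVu Serif" is an assumption; substitute any installed font that covers the characters you need). there is no inputenc or fontenc declaration here, and the file must be compiled with xelatex rather than latex or pdflatex.

```latex
% compile with: xelatex file.tex
\documentclass{article}
\usepackage{fontspec}
% assumption: DejaVu Serif is installed as a system font; any opentype
% font containing the glyphs used below would do
\setmainfont{DejaVu Serif}

\begin{document}
café, Straße, Ελληνικά, кириллица, a ≤ b
\end{document}
```

if the chosen font lacks a glyph for some character, xetex typically leaves a gap in the output and notes the missing character in the log rather than raising an error.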
