Re: Problem defining \begin{CJK} . . . \end{CJK} in a macro
- From: "Ulrich Diez" <eu_angelion@xxxxxxxxxxxxxx>
- Date: Thu, 22 Jun 2006 17:38:27 +0200
ipsi/Andrew wrote:
> I define the following Macro:....
> \newcommand{\HelloGrandma}{%
> \begin{CJK}{UTF8}{song}%
> ä½ å¥½å¥¶å¥¶%
> \end{CJK}%
> }%
> I'm utterly confused by this.
> Now, I eventually realised that I could put the \begin{CJK} immediately
> after the \begin{document}, and similar for \end, and everything works,
> but I'd like to know *why* this happens.
The problem is about category-codes.
If you intend to make intensive use of the CJK-package, you need
to know about category-codes. So I'll try to elaborate a bit:
When (La)TeX reads a line of text from a file, one of the first
things that happens is that everything is transformed into
so-called "tokens".
A token can be, e.g., a control-sequence or a
character/category-code pair.
The process of transforming stuff into tokens while reading input
is called "tokenizing".
When (La)TeX reads a character from the input-file and forms a
token from it, the resulting token is a character with one
particular category-code. There are several category-codes, e.g.,
catcode 11 means that the character is an ordinary text-letter.
Catcode 13 means that the character is "active", i.e. it is to be
treated like a control-sequence.
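To look up the catcode that is currently assigned to a character, \the\catcode can be used. The following is only a minimal sketch, assuming a standard LaTeX format with the article class and no packages loaded that change these catcodes:

```latex
\documentclass{article}
\begin{document}
% \the\catcode`\<char> typesets the catcode currently assigned to <char>
Letter A: \the\catcode`\A.   % 11: letter
Tilde: \the\catcode`\~.      % 13: active (LaTeX defines ~ as a macro)
Backslash: \the\catcode`\\.  % 0: starts a control-sequence
\end{document}
```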
You can e.g. write:
  \catcode`\A=13\relax  -> now "A" is to be treated like a macro
                           when it is read from input.
  \defA{test}           -> as A is to be treated like a macro,
                           you can use it as macro-name-argument
                           when defining...
  A                     -> this yields the phrase "test"
  \catcode`\A=11\relax  -> now "A" is to be treated like a letter
                           when it is read from input.
  A                     -> this yields the letter "A"
  \catcode`\A=13\relax  -> now "A" is to be treated like a macro
                           when it is read from input.
  A                     -> this yields the phrase "test" as the
                           macro/catcode13-A is already defined.
  \catcode`\A=11\relax  -> now "A" is to be treated like a letter
                           when it is read from input.
  A                     -> this yields the letter "A"
  ...
Be aware that I always added the phrase "when it is read from
input." I did so because reading input and macro-defining/
expansion are different concepts.
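The difference can be tried out in a small document. This is a sketch only: the letter Q and the name \macrowithq are arbitrary choices for illustration, picked so that no command name used afterwards contains the re-catcoded letter, and the group keeps the catcode change local.

```latex
\documentclass{article}
\begin{document}
\begingroup
  \catcode`\Q=11\relax   % Q is a letter while \macrowithq is defined
  \def\macrowithq{Q}%
  \catcode`\Q=13\relax   % from here on, a Q read from the file is active
  \defQ{test}%
  Stored token: \macrowithq. Freshly read: Q.
\endgroup
\end{document}
```

With a standard LaTeX format this should typeset "Stored token: Q. Freshly read: test.", because the Q inside \macrowithq was tokenized as a letter before the catcode changed.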
If I write e.g.,
\catcode`\A=11\relax
\def\macrowitha{A}
\catcode`\A=13\relax
\defA{test}
\macrowitha
the last line will yield just the letter "A". \macrowitha was
defined while A had letter-catcode, so its definition contains a
letter-A-token, and it will always expand to a letter-A-token,
no matter how "A"-characters that are read from the input-file
later get tokenized.
The CJK-package also takes advantage of changing category-codes:
Within the CJK-environment, category-codes are changed so that
characters read from the input-file are treated like macros, and
these macros produce the nice Chinese/Japanese/Korean/whatsoever
letters.
That means: "\begin{CJK}..." is just a directive for changing
            how category-codes get assigned to characters when
            future input is read/tokenized from the text-file.
            "\end{CJK}" is there for undoing the change, i.e. for
            restoring "the old way" of assigning catcodes to
            characters when reading input-files.
The key point is: Putting these directives into a macro-definition
does not "execute" them. They do not get executed when the
definition is made, but only when the defined macro gets expanded.
So writing something like
1 \newcommand{\HelloGrandma}{%
2 \begin{CJK}{UTF8}{song}%
3 ä½ å¥½å¥¶å¥¶%
4 \end{CJK}%
5 }%
yields:
1 - Define a new macro \HelloGrandma to do the following:
2 - Start a CJK-environment
3 - Write some tokens "ä½ å¥½å¥¶å¥¶%"
4 - End the CJK-environment.
The crucial point is: The tokens of line 3 get tokenized
according to the rules/catcode-settings which are valid when
_defining_ takes place, i.e. when the definition-text is read
from the input-file. If the definition itself is not made within
a CJK-environment, unexpected results are very likely.
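The same effect can be reproduced without CJK and without any special fonts. The sketch below is only an analogy and not what the CJK-package does internally; Q and \latecatcode are made-up names for illustration.

```latex
\documentclass{article}
\begin{document}
\begingroup
  \catcode`\Q=13\relax
  \gdefQ{test}%   give the active Q a (global) meaning
\endgroup
% Q is a letter again here, so the Q in the body is stored as a letter token.
\newcommand\latecatcode{\begingroup\catcode`\Q=13\relax Q\endgroup}
Inside the macro: \latecatcode.

% Here the catcode changes before the Q is read from the file.
\begingroup\catcode`\Q=13\relax Read from the file: Q.\endgroup
\end{document}
```

The first line should come out as "Inside the macro: Q." and the second as "Read from the file: test.". The catcode change inside \latecatcode comes too late for the Q that is already stored in the definition, just as \begin{CJK} inside \HelloGrandma comes too late for the characters stored there.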
Expanding \HelloGrandma yields:
1. Spit out the tokens: \begin{CJK}{UTF8}{song}%
2. Spit out the tokens²: ä½ å¥½å¥¶å¥¶%
3. Spit out the tokens: \end{CJK}%
1. means: Change the directives for tokenizing characters
          that get read from the input-file.
2. means: Some tokens that do not come from consecutive
          input-file-reading but from expanding a macro
          whose definition was tokenized long ago, not
          according to CJK-rules.
3. means: Reset the directives for tokenizing characters
          that get read from the input-file.
²Sorry, but I have no Unicode available on my old machine
right now and have to use Unired. Pasting into latin1-encoded
text yields this.
So, if you want the correct result, you have to make sure that
the definition, i.e. the tokens of item 2 above, is also
tokenized according to the CJK-directives.
To achieve that, it is sufficient to make the definition
within a CJK-environment as well.
Another point is that you might wish to make \HelloGrandma
robust 8-) . Usually stuff that gets written to a file (e.g., toc)
is fully expanded before writing actually takes place.
With CJK a problem arises:
Stuff from within CJK gets fully expanded and characters get
written to the toc-file. In the next run the toc-file is read,
the characters get tokenized with the wrong catcodes, and thus
the table of contents does not show
Chinese/Japanese/Korean/whatsoever stuff any more. You can
prevent this full expansion with LaTeX's \protect- and
robustness-mechanisms. If you declare a robust command, it will
no longer be fully expanded when writing to the toc-file etc.
takes place, but will be written to the file verbatim.
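A small test document for comparing the two cases; \plaintest and \robusttest are hypothetical names chosen for illustration. Run LaTeX once and then look at the generated .toc file.

```latex
\documentclass{article}
\newcommand\plaintest{bla bla}             % gets expanded when written to the toc
\DeclareRobustCommand\robusttest{bla bla}  % is written to the toc as a command
\begin{document}
\tableofcontents
\section{\plaintest}
\section{\robusttest}
\end{document}
```

The entry for the first section should contain the characters "bla bla", while the entry for the second one should contain the command \robusttest instead. The exact shape of the lines in the .toc file depends on the LaTeX version.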
If you define e.g. \newcommand\test{bla bla} and write
\section{\test}, you will find in the toc-file the phrase/
the characters "bla bla".
If defining \test and calling \section took place within a
CJK-environment, the section-head in the text might contain
Chinese while the table of contents which is not read within
a CJK-environment would contain some unexpected/weird
character-conglomerate.
If you write \DeclareRobustCommand\test{bla bla} and write
\section{\test}, you will find in the toc-file the token
"\test".
Ulrich
\documentclass{article}
\usepackage{CJK}
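% \globalrobust\foo makes a command defined with \DeclareRobustCommand
% global: both \foo and the internal macro "\foo " (name with a trailing
% space) are \global\let to themselves, so they survive \end{CJK}.
\newcommand*\globalrobust[1]{%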
{%
\escapechar=-1\relax
\expandafter\global
\expandafter\let\csname\string#1 \expandafter
\endcsname\csname\string#1 \endcsname
\global\let#1#1%
}%
}%
\begin{CJK}{UTF8}{song}%
\DeclareRobustCommand*\HelloGrandma{%
\begin{CJK}{UTF8}{song}%
ä½ å¥½å¥¶å¥¶%
\end{CJK}%
}%
\globalrobust\HelloGrandma
\end{CJK}%
\begin{document}
\tableofcontents
\section{Dear Grandma \HelloGrandma}
Dear Grandma \HelloGrandma
\end{document}