Subject: catcode questions
From: François Patte
Newsgroups: comp.text.tex
Organization: A noiseless patient Spider
Date: Thu, 16 Jan 2025 13:44 UTC
Message-ID: <vmb2g9$3h7v4$1@dont-email.me>

Bonjour,

For typographic purposes, I need to make “~” a normal character,
redefine “_” and choose another non-breaking space, I've chosen: “¬”.

I then define an environment where these rules apply:

\catcode`\_=13 %
\catcode`\¬=13
\newenvironment{toto}{%
\catcode`\~=12 %
\catcode`\¬=13%
\def¬{\kern 1ex}%
\catcode`\_=13 %
\def_{}%
.................. and many other things....
}%
{%
\catcode`\_=8%
\catcode`\¬=12%
}%
\catcode`\_=8%
\catcode`\¬=12%

It works, but it seems redundant: I can't define \catcode`\¬=13,
\def¬{\kern 1ex} (idem for _) only in the environment, otherwise
latex will protest that a “control sequence” is missing, hence the
\catcode`\_=13 % \catcode`\¬=13 before the environment.

Likewise, putting their initial catcodes only in the end-of-environment
declaration isn't enough either.... hence the redundancy after the
environment definition.

Hence my question: is this a normal way of proceeding or is there a more
orthodox way?

I repeat: it works the way I want it to and, so far, I haven't had any
side effects.

Thank you for your advice.

F.P.

Subject: Re: catcode questions
From: Julian Bradfield
Newsgroups: comp.text.tex
Date: Thu, 16 Jan 2025 17:18 UTC
References: 1
Message-ID: <slrnvoiffc.661k.jcb@kotte.inf.ed.ac.uk>
References: <vmb2g9$3h7v4$1@dont-email.me>

On 2025-01-16, François Patte <francois.patte@mi.parisdescartes.fr> wrote:
> Bonjour,
>
> For typographic purposes, I need to make “~” a normal character,
> redefine “_” and choose another non-breaking space, I've chosen: “¬”.
>
> I then define an environment where these rules apply:
>
> \catcode`\_=13 %
> \catcode`\¬=13
> \newenvironment{toto}{%
> \catcode`\~=12 %
> \catcode`\¬=13%
> \def¬{\kern 1ex}%
> \catcode`\_=13 %
> \def_{}%
> ................. and many other things....
> }%
> {%
> \catcode`\_=8%
> \catcode`\¬=12%
> }%
> \catcode`\_=8%
> \catcode`\¬=12%
>
> It works, but it seems redundant: I can't define \catcode`\¬=13,
> \def¬{\kern 1ex} (idem for _) only in the environment, otherwise
> latex will protest that a “control sequence” is missing, hence the
> \catcode`\_=13 % \catcode`\¬=13 before the environment.

That's because catcodes are assigned at the early stage of processing
(what Knuth calls TeX's mouth). So to be able to write \def¬ , the ¬
has to have catcode 13 at the place you write \def¬ .
It's often irritating, but that's the way it is.
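
To illustrate the point above, here is a minimal sketch. The character | and the macro name \setupbar are arbitrary choices for illustration and do not come from the original post.

```latex
\documentclass{article}
\begin{document}
% Case 1: the catcode is changed BEFORE the definition is read,
% so the | after \def is tokenized as an active character.
\catcode`\|=13
\def|{\kern 1ex}
\catcode`\|=12

% Case 2: when the body of \setupbar is read, | still has catcode 12.
% Running \setupbar later does change the catcode, but the | stored in
% the macro body is already a catcode-12 token, so \def reports
% "Missing control sequence inserted".
\def\setupbar{\catcode`\|=13 \def|{\kern 1ex}}
% \setupbar   % uncommenting this line raises the error
\end{document}
```

This is why the original code needs the character to be active already while the \newenvironment definition is being read.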

Subject: Re: catcode questions
From: François Patte
Newsgroups: comp.text.tex
Organization: A noiseless patient Spider
Date: Thu, 16 Jan 2025 21:33 UTC
References: 1 2
Message-ID: <vmbtvt$3m4o9$1@dont-email.me>
References: <vmb2g9$3h7v4$1@dont-email.me>
<slrnvoiffc.661k.jcb@kotte.inf.ed.ac.uk>

On 16/01/2025 at 18:18, Julian Bradfield wrote:
> On 2025-01-16, François Patte <francois.patte@mi.parisdescartes.fr> wrote:
>> Bonjour,
>>
>> For typographic purposes, I need to make “~” a normal character,
>> redefine “_” and choose another non-breaking space, I've chosen: “¬”.
>>
>> I then define an environment where these rules apply:
>>
>> \catcode`\_=13 %
>> \catcode`\¬=13
>> \newenvironment{toto}{%
>> \catcode`\~=12 %
>> \catcode`\¬=13%
>> \def¬{\kern 1ex}%
>> \catcode`\_=13 %
>> \def_{}%
>> ................. and many other things....
>> }%
>> {%
>> \catcode`\_=8%
>> \catcode`\¬=12%
>> }%
>> \catcode`\_=8%
>> \catcode`\¬=12%
>>
>> It works, but it seems redundant: I can't define \catcode`\¬=13,
>> \def¬{\kern 1ex} (idem for _) only in the environment, otherwise
>> latex will protest that a “control sequence” is missing, hence the
>> \catcode`\_=13 % \catcode`\¬=13 before the environment.
>
> That's because catcodes are assigned at the early stage of processing
> (what Knuth calls TeX's mouth). So to be able to write \def¬ , the ¬
> has to have catcode 13 at the place you write \def¬ .
> It's often irritating, but that's the way it is.

May I consider that my syntax is correct?

F.P.

Subject: Re: catcode questions
From: Ulrich D i e z
Newsgroups: comp.text.tex
Date: Fri, 17 Jan 2025 02:08 UTC
References: 1
Message-ID: <vmce01$2m5m$1@solani.org>
References: <vmb2g9$3h7v4$1@dont-email.me>

François Patte wrote:

> Bonjour,
>
> For typographic purposes, I need to make “~” a normal character,
> redefine “_” and choose another non-breaking space, I've chosen: “¬”.
>
> I then define an environment where these rules apply:
>
> \catcode`\_=13 %
> \catcode`\¬=13
> \newenvironment{toto}{%
> \catcode`\~=12 %
> \catcode`\¬=13%
> \def¬{\kern 1ex}%
> \catcode`\_=13 %
> \def_{}%
> ................. and many other things....
> }%
> {%
> \catcode`\_=8%
> \catcode`\¬=12%
> }%
> \catcode`\_=8%
> \catcode`\¬=12%
>
> It works, but it seems redundant: I can't define \catcode`\¬=13,
> \def¬{\kern 1ex} (idem for _) only in the environment, otherwise
> latex will protest that a “control sequence” is missing, hence the
> \catcode`\_=13 % \catcode`\¬=13 before the environment.
>
> Likewise, putting their initial catcodes only in the end-of-environment
> declaration isn't enough either.... hence the redundancy after the
> environment definition.
>
> Hence my question: is this a normal way of proceeding or is there a more
> orthodox way?
>
> I repeat: it works the way I want it to and, so far, I haven't had any
> side effects.
>
> Thank you for your advice.
>
> F.P.

You are using the non-ASCII character "¬"?

What is the input encoding of your .tex file?

If the input encoding is UTF-8 and you use a traditional 8-bit TeX engine
(TeX, pdfTeX) with the package inputenc and its option "utf8", rather than
an engine with native UTF-8 support (XeTeX, LuaTeX), this might be a
problem: ¬ has code point 172 (decimal) in Unicode and is encoded in UTF-8
as the two bytes C2 (hex) AC (hex), which traditional TeX engines read as
_two_ input characters.

The backslash has code point 5C (hex) = 92 (decimal) in Unicode and is
encoded in UTF-8 as the single byte 5C.

Thus something like \¬ is read as the three bytes/characters 5C C2 AC.
The first character is the backslash, which has category 0 (escape).
The second and third characters, when the inputenc package is loaded
for interpreting UTF-8 input, have category 13 (active).

Tokenizing this therefore yields a control symbol token whose name is the
character with code C2 (hex) = 194 (decimal) in TeX's internal character
representation, followed by an active character token whose character
code is AC (hex) = 172 (decimal). As the byte AC cannot be the first byte
of a UTF-8-encoded character, the active AC triggers an error message.
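
A quick way to check which situation applies is to ask TeX for the category codes of the two byte values. The test file below is only a sketch and assumes compilation with pdflatex. According to the explanation above, both lines should then report 13 in the log. Engines with native UTF-8 support treat "C2 and "AC as code points rather than bytes, so they may report something else.

```latex
% Diagnostic sketch: compile with pdflatex and read the terminal/log output.
\documentclass{article}
\usepackage[utf8]{inputenc}
\begin{document}
% \the\catcode"C2 expands to the current category code of byte C2 (hex).
\typeout{catcode of byte C2 (hex): \the\catcode"C2}
\typeout{catcode of byte AC (hex): \the\catcode"AC}
\end{document}
```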

If you
- either use a TeX engine with native UTF-8 support, like XeTeX/LuaTeX,
and encode the .tex input file in UTF-8,
- or use a traditional 8-bit TeX engine, like TeX or pdfTeX, and
encode the .tex input file in a single-byte encoding like
ISO-8859-1 or Windows-1252,
then you can try \lccode/\lowercase trickery:

\documentclass{article}

\newcommand\MyActivate[2]{%
\begingroup
\lccode`\~ =`#1 %
\lowercase{\endgroup\def~}{#2}%
\catcode`#1 =13 %
}%

\newenvironment{toto}{%
\MyActivate{\¬}{\kern 1ex}%
\MyActivate{\_}{}%
\catcode`\~=12\relax
}{}%

\begin{document}

\message{^^JBefore toto^^J}

\showthe\catcode`\¬
\showthe\catcode`\_
\showthe\catcode`\~
\show ¬
\show _
\show ~

\begin{toto}
\message{^^JWithin toto^^J}
\showthe\catcode`\¬
\showthe\catcode`\_
\showthe\catcode`\~
\show ¬
\show _
\show ~
\end{toto}

\message{^^JAfter toto^^J}
\showthe\catcode`\¬
\showthe\catcode`\_
\showthe\catcode`\~
\show ¬
\show _
\show ~

\end{document}
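
Note that the end part of toto in the example above is empty. \begin{...} and \end{...} form a group, so \catcode assignments and \def made in the begin part are local and are undone at the end of the environment. This suggests that the catcode resets in the end part of the original environment are not needed. A minimal sketch of that behaviour, using a hypothetical environment named demo:

```latex
\documentclass{article}
% "demo" is a made-up environment name used only for this illustration.
\newenvironment{demo}{\catcode`\~=12 }{}
\begin{document}
\typeout{before: \the\catcode`\~}% 13, ~ is active in LaTeX
\begin{demo}
\typeout{inside: \the\catcode`\~}% 12, set by the begin part
\end{demo}
\typeout{after: \the\catcode`\~}% 13 again, restored when the group ends
\end{document}
```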

Sincerely

Ulrich

Subject: Re: catcode questions
From: Ulrich D i e z
Newsgroups: comp.text.tex
Date: Fri, 17 Jan 2025 15:18 UTC
References: 1 2
Message-ID: <vmds90$3a4n$1@solani.org>
References: <vmb2g9$3h7v4$1@dont-email.me> <vmce01$2m5m$1@solani.org>

Things like `\A and `A are alphabetic constants.

Pitfalls with alphabetic constants are:

An alphabetic constant formed by a one-letter control sequence token
cannot be passed as a macro argument if the one-letter control sequence
token is \outer, unless that token is "hit" by \noexpand.

Forming an alphabetic constant from a character token does not work
if, at the time of tokenization, the category of the character is
0 (escape), 5 (end of line), 9 (ignored), 14 (comment) or 15 (invalid).

Assuming % has category 14 and thus is a comment character,
something like

\catcode`%=12

does not work.

But

\catcode`\%=12

does work.

However, if \% is defined as \outer, the token \% can be used within a
macro argument only if it has first been "hit" by \noexpand.

The following variant of \MyActivate might be more flexible:

Syntax:

\MyActivate{<One letter control sequence whose
name is the character to activate>}%
{<Tokens where #1 is to be replaced by the
corresponding active character token>}%

The braces surrounding the arguments are mandatory. (Instead of the
characters { and } you can use any other characters of category 1
and 2, respectively.)

<One letter control sequence whose name is the character to activate>
may be \outer at the time the environment toto is carried out, but,
because \newenvironment is a macro, it must not be \outer at the time
the environment toto is defined.

The corresponding active character may be \outer both at the time of
defining and at the time of carrying out the environment toto.

When \MyActivate is used inside macro or environment definitions, the
hashes of #1 denoting the corresponding active character token need to
be doubled.

If <Tokens where #1 is to be replaced by the corresponding active
character token> is used for defining a macro with a parameter text,
the hashes belonging to macro parameters need to be doubled.

\documentclass{article}

\newcommand\MyGobble[1]{}%
\newcommand\MyActivate{%
% The one-letter control sequence might be \outer.
% So "hit" the one-letter control sequence with
% \noexpand before expanding \MyActivateB - this
% requires some brace-hacking:
\expandafter\expandafter\expandafter\expandafter
\expandafter\expandafter\expandafter\expandafter
\expandafter\expandafter\expandafter\expandafter
\expandafter\expandafter\expandafter\MyActivateB
\expandafter\expandafter\expandafter\expandafter
\expandafter\expandafter\expandafter\expandafter
\expandafter\expandafter\expandafter\expandafter
\expandafter\expandafter\expandafter{%
\expandafter\expandafter\expandafter\expandafter
\expandafter\expandafter\expandafter\noexpand
\expandafter\expandafter\expandafter\iffalse
\expandafter\expandafter\expandafter}%
\expandafter\expandafter\expandafter\fi
\expandafter\MyGobble
\string
}%
\newcommand\MyActivateB[2]{%
\begingroup
\lccode`\~ =`#1 %
\long\def\temp##1{\endgroup#2}%
\lowercase\expandafter{%
\expandafter\expandafter
\expandafter \temp
\expandafter\expandafter
\expandafter {%
\expandafter\noexpand
\noexpand~}%
}%
\catcode`#1 =13 %
}%

\newenvironment{toto}{%
\MyActivate{\¬}{\def##1{\kern 1ex}}%
\MyActivate{\_}{\def##1{}}%
\catcode`\~=12 %
}{}%

\begin{document}

%\outer\def\¬{outer macro}
%\outer\def\~{outer macro}
%\catcode`\~=13 \outer\def~{outer macro}
%\catcode`\¬=13 \outer\def¬{outer macro}

\message{^^JBefore toto^^J}

\showthe\catcode`\¬
\showthe\catcode`\_
\showthe\catcode`\~
\show ¬
\show _
\show ~

\begin{toto}
\message{^^JWithin toto^^J}
\showthe\catcode`\¬
\showthe\catcode`\_
\showthe\catcode`\~
\show ¬
\show _
\show ~
\end{toto}

\message{^^JAfter toto^^J}
\showthe\catcode`\¬
\showthe\catcode`\_
\showthe\catcode`\~
\show ¬
\show _
\show ~

\end{document}

Sincerely

Ulrich
