Subject: Printing UTF-8 mail to terminal
From: Loris Bennett
Newsgroups: comp.lang.python
Organization: FUB-IT, Freie Universität Berlin
Date: Thu, 31 Oct 2024 15:33 UTC

Hi,

I have a command-line program which creates an email containing German
umlauts. On receiving the mail, my mail client displays the subject and
body correctly:

Subject: Übung

Sehr geehrter Herr Dr. Bennett,

Dies ist eine Übung.

So far, so good. However, when I use the --verbose option to print
the mail to the terminal via

    if args.verbose:
        print(mail)

I get:

Subject: Übungsbetreff

Sehr geehrter Herr Dr. Bennett,

Dies ist eine =C3=9Cbung.

What do I need to do to prevent the body from getting mangled?

I seem to remember that I had issues in the past with a Perl version of
a similar program. As far as I recall, there was an issue with the fact
that the greeting is generated by querying a server, whereas the body is
read from a file, which led to oddities when the two bits were
concatenated. But that might just have been a Perl thing.

Cheers,

Loris

--
This signature is currently under constuction.

Subject: Re: Printing UTF-8 mail to terminal
From: Left Right
Newsgroups: comp.lang.python
Date: Thu, 31 Oct 2024 16:38 UTC
References: 1 2

There's quite a lot of misuse of terminology around terminal / console
/ shell. Please correct me if I'm wrong, but it looks like you are
printing that on MS Windows, right? MS Windows doesn't have or use
terminals (that's more of a Unix-related concept). And by "terminal"
I mean terminal emulator (i.e. a program that emulates the behavior of
a physical terminal). You can, of course, find some terminal programs
for Windows (e.g. mintty), but I doubt that that's what you are dealing
with.

What MS Windows users usually end up using is the console. If you
run, e.g., cmd.exe, it will create a process that displays a graphical
console. The console uses an encoding scheme to represent the text
output. I believe that the default on MS Windows is to use some
single-byte encoding. This answer from a Stack Exchange family site
tells you how to set the console encoding to UTF-8 permanently:
https://superuser.com/questions/269818/change-default-code-page-of-windows-console-to-utf-8
which, I believe, will solve your problem with how the text is
displayed.
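
Whichever platform is involved, Python can report the encoding it will use when writing to the terminal or console, which helps to separate display problems from encoding problems in the data itself. What follows is only a minimal sketch; the helper name `describe_output_encoding` is made up for illustration and is not part of the program discussed in this thread.

```python
import locale
import sys


def describe_output_encoding():
    """Collect the encodings Python associates with the current output."""
    return {
        # Encoding used by print(); may be None if stdout has been replaced
        # by an object without an encoding attribute value (e.g. io.StringIO).
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding the current locale suggests for text data.
        "locale": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    for name, value in describe_output_encoding().items():
        print(f"{name}: {value}")
```

If both values report UTF-8 and other umlauts print correctly, the console is probably not where the "=C3=9C" sequences come from.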

On Thu, Oct 31, 2024 at 5:19 PM Loris Bennett via Python-list
<python-list@python.org> wrote:
>
> Hi,
>
> I have a command-line program which creates an email containing German
> umlauts. On receiving the mail, my mail client displays the subject and
> body correctly:
>
> Subject: Übung
>
> Sehr geehrter Herr Dr. Bennett,
>
> Dies ist eine Übung.
>
> So far, so good. However, when I use the --verbose option to print
> the mail to the terminal via
>
>     if args.verbose:
>         print(mail)
>
> I get:
>
> Subject: Übungsbetreff
>
> Sehr geehrter Herr Dr. Bennett,
>
> Dies ist eine =C3=9Cbung.
>
> What do I need to do to prevent the body from getting mangled?
>
> I seem to remember that I had issues in the past with a Perl version of
> a similar program. As far as I recall, there was an issue with the fact
> that the greeting is generated by querying a server, whereas the body is
> read from a file, which led to oddities when the two bits were
> concatenated. But that might just have been a Perl thing.
>
> Cheers,
>
> Loris
>
> --
> This signature is currently under constuction.

Subject: Re: Printing UTF-8 mail to terminal (Posting On Python-List Prohibited)
From: Lawrence D'Oliveiro
Newsgroups: comp.lang.python
Organization: A noiseless patient Spider
Date: Thu, 31 Oct 2024 19:35 UTC
References: 1

On Thu, 31 Oct 2024 16:33:41 +0100, Loris Bennett wrote:

> Dies ist eine =C3=9Cbung.
>
> What do I need to do to prevent the body from getting mangled?

I don’t think that’s actually getting mangled; that is how the actual
message body looks. What you have there is called “quoted-printable”
encoding, and it’s a standard way to ensure the message body consists only
of 7-bit ASCII.

If you look at the source of the message, you should see a header line
like “Content-Transfer-Encoding: quoted-printable”. This is how your email
client knows how to display the text properly.
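
To make the two layers concrete, here is a small sketch using the standard-library `quopri` module. The sample sentence is taken from the original post; the variable names are only illustrative.

```python
import quopri

text = "Dies ist eine Übung."

# Step 1: Unicode text -> UTF-8 bytes (Ü becomes the two bytes C3 9C).
utf8_bytes = text.encode("utf-8")

# Step 2: UTF-8 bytes -> quoted-printable; each non-ASCII byte is
# written as "=" followed by its two-digit hexadecimal value.
qp_bytes = quopri.encodestring(utf8_bytes)

print(qp_bytes.decode("ascii"))
```

The printed line should contain the same "=C3=9Cbung" sequence as the body quoted above, and nothing outside 7-bit ASCII.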

Subject: Re: Printing UTF-8 mail to terminal
From: Cameron Simpson
Newsgroups: comp.lang.python
Date: Thu, 31 Oct 2024 20:50 UTC
References: 1 2

On 31Oct2024 16:33, Loris Bennett <loris.bennett@fu-berlin.de> wrote:
>I have a command-line program which creates an email containing German
>umlauts. On receiving the mail, my mail client displays the subject and
>body correctly:
[...]
>So far, so good. However, when I use the --verbose option to print
>the mail to the terminal via
>
>    if args.verbose:
>        print(mail)
>
>I get:
>
> Subject: Übungsbetreff
>
> Sehr geehrter Herr Dr. Bennett,
>
> Dies ist eine =C3=9Cbung.
>
>What do I need to do to prevent the body from getting mangled?

That looks to me like quoted-printable. This is an encoding for binary
transport of text to make it robust against transports that are not
8-bit clean. So your Unicode text is encoded as UTF-8, and then that is
encoded in quoted-printable for transport through the email system.

Your terminal probably accepts UTF-8 - I imagine other German text
renders correctly?

You need to get the text and undo the quoted-printable encoding.

If you're using the Python email module to parse (or construct) the
message as a `Message` object I'd expect that to happen automatically.

If you're just dealing with this directly, use the `quopri` stdlib
module: https://docs.python.org/3/library/quopri.html

Cheers,
Cameron Simpson <cs@cskk.id.au>
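
As a rough illustration of the `quopri` route mentioned above, the following sketch undoes both layers for the body line from the original post. It assumes the raw body is available as bytes and that the charset is UTF-8, as in this thread.

```python
import quopri

raw_body = b"Dies ist eine =C3=9Cbung."

# Undo quoted-printable first (this yields UTF-8 bytes),
# then decode the UTF-8 bytes into a Python str.
decoded = quopri.decodestring(raw_body).decode("utf-8")

print(decoded)
```

The order matters: quoted-printable is the outer layer, so it has to be removed before the bytes can be interpreted as UTF-8.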

Subject: Re: Printing UTF-8 mail to terminal
From: Loris Bennett
Newsgroups: comp.lang.python
Organization: FUB-IT, Freie Universität Berlin
Date: Fri, 1 Nov 2024 06:52 UTC
References: 1 2 3

Left Right <olegsivokon@gmail.com> writes:

> There's quite a lot of misuse of terminology around terminal / console
> / shell. Please correct me if I'm wrong, but it looks like you are
> printing that on MS Windows, right? MS Windows doesn't have or use
> terminals (that's more of a Unix-related concept). And by "terminal"
> I mean terminal emulator (i.e. a program that emulates the behavior of
> a physical terminal). You can, of course, find some terminal programs
> for Windows (e.g. mintty), but I doubt that that's what you are dealing
> with.
>
> What MS Windows users usually end up using is the console. If you
> run, e.g., cmd.exe, it will create a process that displays a graphical
> console. The console uses an encoding scheme to represent the text
> output. I believe that the default on MS Windows is to use some
> single-byte encoding. This answer from a Stack Exchange family site
> tells you how to set the console encoding to UTF-8 permanently:
> https://superuser.com/questions/269818/change-default-code-page-of-windows-console-to-utf-8
> which, I believe, will solve your problem with how the text is
> displayed.

I'm not using MS Windows. I am using a Gnome terminal on Debian 12
locally and connecting via SSH to an AlmaLinux 8 server, where I start a
tmux session.

> On Thu, Oct 31, 2024 at 5:19 PM Loris Bennett via Python-list
> <python-list@python.org> wrote:
>>
>> Hi,
>>
>> I have a command-line program which creates an email containing German
>> umlauts. On receiving the mail, my mail client displays the subject and
>> body correctly:
>>
>> Subject: Übung
>>
>> Sehr geehrter Herr Dr. Bennett,
>>
>> Dies ist eine Übung.
>>
>> So far, so good. However, when I use the --verbose option to print
>> the mail to the terminal via
>>
>>     if args.verbose:
>>         print(mail)
>>
>> I get:
>>
>> Subject: Übungsbetreff
>>
>> Sehr geehrter Herr Dr. Bennett,
>>
>> Dies ist eine =C3=9Cbung.
>>
>> What do I need to do to prevent the body from getting mangled?
>>
>> I seem to remember that I had issues in the past with a Perl version of
>> a similar program. As far as I recall, there was an issue with the fact
>> that the greeting is generated by querying a server, whereas the body is
>> read from a file, which led to oddities when the two bits were
>> concatenated. But that might just have been a Perl thing.
>>
>> Cheers,
>>
>> Loris
>>
>> --
>> This signature is currently under constuction.
--
Dr. Loris Bennett (Herr/Mr)
FUB-IT, Freie Universität Berlin

Subject: Re: Printing UTF-8 mail to terminal
From: Loris Bennett
Newsgroups: comp.lang.python
Organization: FUB-IT, Freie Universität Berlin
Date: Fri, 1 Nov 2024 07:11 UTC
References: 1 2 3

Cameron Simpson <cs@cskk.id.au> writes:

> On 31Oct2024 16:33, Loris Bennett <loris.bennett@fu-berlin.de> wrote:
>>I have a command-line program which creates an email containing German
>>umlauts. On receiving the mail, my mail client displays the subject and
>>body correctly:
> [...]
>>So far, so good. However, when I use the --verbose option to print
>>the mail to the terminal via
>>
>>    if args.verbose:
>>        print(mail)
>>
>>I get:
>>
>> Subject: Übungsbetreff
>>
>> Sehr geehrter Herr Dr. Bennett,
>>
>> Dies ist eine =C3=9Cbung.
>>
>>What do I need to do to prevent the body from getting mangled?
>
> That looks to me like quoted-printable. This is an encoding for binary
> transport of text to make it robust against transports that are not
> 8-bit clean. So your Unicode text is encoded as UTF-8, and then that
> is encoded in quoted-printable for transport through the email system.

As I mentioned, I think the problem is to do with the way the salutation
text provided by the "salutation server" and the mail body from a file
are encoded. This seems to be different.

> Your terminal probably accepts UTF-8 - I imagine other German text
> renders correctly?

Yes, it does.

> You need to get the text and undo the quoted-printable encoding.
>
> If you're using the Python email module to parse (or construct) the
> message as a `Message` object I'd expect that to happen automatically.

I am using

email.message.EmailMessage

as, from the Python documentation

https://docs.python.org/3/library/email.examples.html

I gathered that that is the standard approach.

And you are right that the encoding of the mail that is actually received
is sorted out automatically. If I display the raw email in my client I get
the following:

Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
...
Subject: =?utf-8?q?=C3=9Cbungsbetreff?=
...
Dies ist eine =C3=9Cbung.

I would interpret that as meaning that the subject and body are encoded
in the same way.
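
Strictly speaking, the two use related but distinct mechanisms: the subject is an RFC 2047 "encoded word" (the `=?utf-8?q?...?=` wrapper), while the body is governed by the Content-Transfer-Encoding header. A sketch of decoding such a subject with the standard-library `email.header` module, using the raw value shown above:

```python
from email.header import decode_header, make_header

raw_subject = "=?utf-8?q?=C3=9Cbungsbetreff?="

# decode_header() splits the value into (bytes, charset) chunks;
# make_header() joins them into a Header object that str() renders as text.
subject = str(make_header(decode_header(raw_subject)))

print(subject)
```

In both cases the underlying bytes are the UTF-8 encoding of the text, which is why the same "=C3=9C" pair shows up in the header and in the body.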

The problem just occurs with the unsent string representation printed to
the terminal.

Cheers,

Loris

--
This signature is currently under constuction.

Subject: Re: Printing UTF-8 mail to terminal
From: Loris Bennett
Newsgroups: comp.lang.python
Organization: FUB-IT, Freie Universität Berlin
Date: Fri, 1 Nov 2024 09:10 UTC
References: 1 2 3 4

"Loris Bennett" <loris.bennett@fu-berlin.de> writes:

> Cameron Simpson <cs@cskk.id.au> writes:
>
>> On 31Oct2024 16:33, Loris Bennett <loris.bennett@fu-berlin.de> wrote:
>>>I have a command-line program which creates an email containing German
>>>umlauts. On receiving the mail, my mail client displays the subject and
>>>body correctly:
>> [...]
>>>So far, so good. However, when I use the --verbose option to print
>>>the mail to the terminal via
>>>
>>>    if args.verbose:
>>>        print(mail)
>>>
>>>I get:
>>>
>>> Subject: Übungsbetreff
>>>
>>> Sehr geehrter Herr Dr. Bennett,
>>>
>>> Dies ist eine =C3=9Cbung.
>>>
>>>What do I need to do to prevent the body from getting mangled?
>>
>> That looks to me like quoted-printable. This is an encoding for binary
>> transport of text to make it robust against transports that are not
>> 8-bit clean. So your Unicode text is encoded as UTF-8, and then that
>> is encoded in quoted-printable for transport through the email system.
>
> As I mentioned, I think the problem is to do with the way the salutation
> text provided by the "salutation server" and the mail body from a file
> are encoded. This seems to be different.
>
>> Your terminal probably accepts UTF-8 - I imagine other German text
>> renders correctly?
>
> Yes, it does.
>
>> You need to get the text and undo the quoted-printable encoding.
>>
>> If you're using the Python email module to parse (or construct) the
>> message as a `Message` object I'd expect that to happen automatically.
>
> I am using
>
> email.message.EmailMessage
>
> as, from the Python documentation
>
> https://docs.python.org/3/library/email.examples.html
>
> I gathered that that is the standard approach.
>
> And you are right that the encoding of the mail that is actually received
> is sorted out automatically. If I display the raw email in my client I get
> the following:
>
> Content-Type: text/plain; charset="utf-8"
> Content-Transfer-Encoding: quoted-printable
> ...
> Subject: =?utf-8?q?=C3=9Cbungsbetreff?=
> ...
> Dies ist eine =C3=9Cbung.
>
> I would interpret that as meaning that the subject and body are encoded
> in the same way.
>
> The problem just occurs with the unsent string representation printed to
> the terminal.

If I log the body like this

    body = f"{salutation},\n\n{text}\n{signature}"
    logger.debug("body: " + body)

and look at the log file in my terminal I see

2024-11-01 09:59:12,318 - DEBUG - mailer:create_body - body: Sehr geehrter Herr Dr. Bennett,

Dies ist eine Übung.
...

as expected. The quoted-printable text appears when I do

    mail = EmailMessage()
    mail.set_content(body, cte="quoted-printable")
    ...

    if args.verbose:
        print(mail)

which is presumably also correct.

The question is: What conversion is necessary in order to print the
EmailMessage object to the terminal, such that the quoted-printable
parts are turned (back) into UTF-8?
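
One possible approach, sketched here under the assumption that the message is a simple, non-multipart text mail like the one in this thread, is to print the decoded pieces rather than the serialised message. `EmailMessage.get_content()` undoes the transfer encoding and the charset for a text part. The subject and body values are the samples from earlier in the thread, and the layout of `readable` is only an illustration.

```python
from email.message import EmailMessage

mail = EmailMessage()
mail["Subject"] = "Übungsbetreff"
mail.set_content("Dies ist eine Übung.", cte="quoted-printable")

# str(mail) serialises the message, so the body keeps its transfer encoding.
serialised = str(mail)

# get_content() returns the body as ordinary text for a single text part.
readable = f"Subject: {mail['Subject']}\n\n{mail.get_content()}"

print(readable)
```

For a multipart message this would not be enough as written; the text part would have to be selected first, for example with `get_body()`.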

Cheers,

Loris

--
This signature is currently under constuction.

Subject: Re: Printing UTF-8 mail to terminal
From: dieter.maurer@online.de
Newsgroups: comp.lang.python
Date: Fri, 1 Nov 2024 16:38 UTC
References: 1 2 3 4 5 6
Path: eternal-september.org!news.eternal-september.org!feeder2.eternal-september.org!fu-berlin.de!uni-berlin.de!not-for-mail
From: dieter.maurer@online.de
Newsgroups: comp.lang.python
Subject: Re: Printing UTF-8 mail to terminal
Date: Fri, 1 Nov 2024 17:38:01 +0100
Lines: 8
Message-ID: <mailman.67.1730480556.4695.python-list@python.org>
References: <878qu49tii.fsf@zedat.fu-berlin.de>
<ZyPtsLSme7IJ-q4j@cskk.homeip.net>
<mailman.63.1730408232.4695.python-list@python.org>
<87msijo2cd.fsf@zedat.fu-berlin.de>
<875xp7nwus.fsf@zedat.fu-berlin.de>
<26405.1001.772245.235696@ixdm.fritz.box>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Trace: news.uni-berlin.de UyAyk2zu3udpW0Y+5nCZIwVh7ejoneOOskqFx0pJo9EQ==
Cancel-Lock: sha1:6IDueu8FTS6yWgfwgqXTpA9s3NY= sha256:jG/c7Qu6NJzQKdPlevHFWHUFOD/SEITzcY6nkKMPqgw=
Return-Path: <dieter.maurer@online.de>
X-Original-To: python-list@python.org
Delivered-To: python-list@mail.python.org
Authentication-Results: mail.python.org; dkim=pass
reason="2048-bit key; unprotected key"
header.d=online.de header.i=dieter.maurer@online.de
header.b=Zqzw+8Ws; dkim-adsp=pass; dkim-atps=neutral
X-Spam-Status: OK 0.119
X-Spam-Level: *
X-Spam-Evidence: '*H*': 0.80; '*S*': 0.04; 'received:212.227': 0.07;
'cc:addr:python-list': 0.09; 'skip:` 10': 0.09; 'cc:no real
name:2**0': 0.14; 'bennett': 0.16; 'cc:addr:python.org': 0.20;
'received:de': 0.23; 'cc:2**0': 0.25; 'received:kundenserver.de':
0.32; 'received:mout.kundenserver.de': 0.32; 'header:In-Reply-
To:1': 0.34; 'skip:" 20': 0.34; 'request': 0.35; '...': 0.37;
'received:192.168': 0.37; 'use': 0.39; 'wrote': 0.39; 'skip:m 20':
0.63; 'pass': 0.64; 'above,': 0.70; 'content': 0.72;
'subject:UTF': 0.91; 'subject:mail': 0.95
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=online.de;
s=s42582890; t=1730480555; x=1731085355; i=dieter.maurer@online.de;
bh=H10rwTyMZ1JGS7ZWy9895/XC8hDzixuzkjE1vokqMWU=;
h=X-UI-Sender-Class:MIME-Version:Content-Type:
Content-Transfer-Encoding:Message-ID:Date:From:To:Cc:Subject:
In-Reply-To:References:cc:content-transfer-encoding:content-type:
date:from:message-id:mime-version:reply-to:subject:to;
b=Zqzw+8WsTXKzuxSKT2XAiPPuGkifBSJDEPl1Z+/OrJmayhmqBjD99TDdfGbel4OM
XazqjxjE24nl2jOZ1RGU0pAb4G8hJT0aZKlOJTt3f7RAq3amIJrbjoMvk07Cz/Juv
XhOlxUSGKkESWaM/JJXT/4Ek3LpARE7kUmBebXMZM5DPnE6Q9kH6zOgx76jtQ7Zx4
IhsdPdYT10bNObzyzEtEBHaud9HvRdJWZG4q2GLJJJUyDnV4H1JXZ4PLG0J/vwvMs
1NSyvJaeLKGS2LTqmQ+9PEESWH0zGf140mWZ1nZAqlQj8bNHy59wgDPzwPaCOwIq8
eTQSKNs0pZtqsES7vQ==
X-UI-Sender-Class: 6003b46c-3fee-4677-9b8b-2b628d989298
In-Reply-To: <875xp7nwus.fsf@zedat.fu-berlin.de>
X-Mailer: VM 8.0.12-devo-585 under 21.4 (patch 24) "Standard C" XEmacs Lucid
(x86_64-linux-gnu)
X-Provags-ID: V03:K1:xdKXDKPDdxAlQrDcWWWal5dohohSQXbYYQh5uyKo/+B/lbwt7u7
zQU6A9yk61JaWTM1Ay1WW1LSa6jrKNd7uHkzxmwyZkK/er3EWncNZ6ETu7EPI5v7GhYCP+s
Nt2Ul5L0r2BPG3REIv9KsklWfgK0ZWDkeN4M73n9lLz0DY8N5QwW+5Y6tLXNRZxmaGwPUvB
+PudzwryQTOJJRFeIgBGA==
X-Spam-Flag: NO
X-BeenThere: python-list@python.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: General discussion list for the Python programming language
<python-list.python.org>
List-Unsubscribe: <https://mail.python.org/mailman/options/python-list>,
<mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive: <https://mail.python.org/pipermail/python-list/>
List-Post: <mailto:python-list@python.org>
List-Help: <mailto:python-list-request@python.org?subject=help>
List-Subscribe: <https://mail.python.org/mailman/listinfo/python-list>,
<mailto:python-list-request@python.org?subject=subscribe>
X-Mailman-Original-Message-ID: <26405.1001.772245.235696@ixdm.fritz.box>
X-Mailman-Original-References: <878qu49tii.fsf@zedat.fu-berlin.de>
<ZyPtsLSme7IJ-q4j@cskk.homeip.net>
<mailman.63.1730408232.4695.python-list@python.org>
<87msijo2cd.fsf@zedat.fu-berlin.de>
<875xp7nwus.fsf@zedat.fu-berlin.de>

Loris Bennett wrote at 2024-11-1 10:10 +0100:
> ...
> mail.set_content(body, cte="quoted-printable")

In the line above, you request the "cte" (= "Content-Transfer-Encoding")
"quoted-printable", and consequently the content is encoded as
`quoted-printable`. Maybe you do not need to pass `cte`?
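
As an illustration of the point above (a minimal sketch, not code from the thread): the `cte` argument only selects the transfer encoding used when the message is serialised, and `get_content()` should return the same text either way. The sample body is the string used elsewhere in this thread; the helper name `build` is made up for the example.

```python
from email.message import EmailMessage

body = "Dies ist eine Übung\n"


def build(cte):
    """Return a message whose body is stored with the requested transfer encoding."""
    msg = EmailMessage()
    msg.set_content(body, cte=cte)
    return msg


qp_mail = build("quoted-printable")
b64_mail = build("base64")

# The header records the transfer encoding that was requested ...
print(qp_mail["Content-Transfer-Encoding"])
print(b64_mail["Content-Transfer-Encoding"])

# ... but the decoded content is the same text in both cases.
print(qp_mail.get_content() == b64_mail.get_content())
```

What happens when `cte` is omitted depends on the policy's heuristics, so it is deliberately left out of this sketch.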

Subject: Re: Printing UTF-8 mail to terminal
From: Cameron Simpson
Newsgroups: comp.lang.python
Date: Fri, 1 Nov 2024 21:44 UTC
References: 1 2
Path: eternal-september.org!news.eternal-september.org!feeder2.eternal-september.org!fu-berlin.de!uni-berlin.de!not-for-mail
From: cs@cskk.id.au (Cameron Simpson)
Newsgroups: comp.lang.python
Subject: Re: Printing UTF-8 mail to terminal
Date: Sat, 2 Nov 2024 08:44:18 +1100
Lines: 41
Message-ID: <mailman.68.1730497471.4695.python-list@python.org>
References: <87msijo2cd.fsf@zedat.fu-berlin.de>
<ZyVLsn2l9jJ2Fikl@cskk.homeip.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
X-Trace: news.uni-berlin.de icP4HYD7+Der+lEcbh569w+15SjMeNSXwW9qvy68AQ8A==
Cancel-Lock: sha1:FvwwmjqROz2Z0JgkRkJAFKzDheM= sha256:mJG7vIw5gvx9B2dJrL8P3MC4HgYFZN1DlVBOLr3WqEM=
Return-Path: <cameron@cskk.id.au>
X-Original-To: python-list@python.org
Delivered-To: python-list@mail.python.org
Authentication-Results: mail.python.org; dkim=none reason="no signature";
dkim-adsp=none (unprotected policy); dkim-atps=neutral
X-Spam-Status: OK 0.004
Mail-Followup-To: Loris Bennett <loris.bennett@fu-berlin.de>,
python-list@python.org
Content-Disposition: inline
In-Reply-To: <87msijo2cd.fsf@zedat.fu-berlin.de>
User-Agent: Mutt/2.2.13 (2024-03-09)
X-Mailman-Original-Message-ID: <ZyVLsn2l9jJ2Fikl@cskk.homeip.net>
X-Mailman-Original-References: <87msijo2cd.fsf@zedat.fu-berlin.de>

On 01Nov2024 08:11, Loris Bennett <loris.bennett@fu-berlin.de> wrote:
>Cameron Simpson <cs@cskk.id.au> writes:
>> If you're using the Python email module to parse (or construct) the
>> message as a `Message` object I'd expect that to happen automatically.
>
>I am using
> email.message.EmailMessage

Noted. That seems like the correct approach to me.

>And you are right that encoding for the actual mail which is received
>is
>automatically sorted out. If I display the raw email in my client I get
>the following:
>
> Content-Type: text/plain; charset="utf-8"
> Content-Transfer-Encoding: quoted-printable
> ...
> Subject: =?utf-8?q?=C3=9Cbungsbetreff?=
> ...
> Dies ist eine =C3=9Cbung.

Right. Quoted-printable encoding for the transport.
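
For readers unfamiliar with the encoding, here is a small sketch of what decoding the raw lines quoted above amounts to. It uses only the standard-library `quopri` and `email.header` modules; the two input strings are copied from the raw mail shown in the quote, and the variable names are illustrative.

```python
import quopri
from email.header import decode_header, make_header

# Raw transport forms, as shown in the quoted mail source.
raw_body = b"Dies ist eine =C3=9Cbung."
raw_subject = "=?utf-8?q?=C3=9Cbungsbetreff?="

# Body: undo quoted-printable, then interpret the bytes as UTF-8.
body_text = quopri.decodestring(raw_body).decode("utf-8")

# Subject: RFC 2047 "encoded word", decoded via the email.header helpers.
subject_text = str(make_header(decode_header(raw_subject)))

print(subject_text)
print(body_text)
```

In normal use the email package does this for you; the sketch only shows that the same UTF-8 bytes are behind both the subject and the body.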

>I would interpret that as meaning that the subject and body are encoded
>in the same way.

Yes.

>The problem just occurs with the unsent string representation printed to
>the terminal.

Yes, and I was thinking about this yesterday. I suspect that
`print(some_message_object)` is intended to transcribe it for transport.
For example, one could write to an mbox file and just print() the
message into it and get correct transport/storage formatting, which
includes the qp encoding.

Can you show the code (or example code) which leads to the qp output?
I suspect there's a straightforward way to get the decoded Unicode, but
I'd need to see how what you've got was obtained.

Subject: Re: Printing UTF-8 mail to terminal
From: Cameron Simpson
Newsgroups: comp.lang.python
Date: Fri, 1 Nov 2024 21:47 UTC
References: 1 2
Path: eternal-september.org!news.eternal-september.org!feeder2.eternal-september.org!fu-berlin.de!uni-berlin.de!not-for-mail
From: cs@cskk.id.au (Cameron Simpson)
Newsgroups: comp.lang.python
Subject: Re: Printing UTF-8 mail to terminal
Date: Sat, 2 Nov 2024 08:47:39 +1100
Lines: 23
Message-ID: <mailman.69.1730497664.4695.python-list@python.org>
References: <875xp7nwus.fsf@zedat.fu-berlin.de>
<ZyVMe3Jspc0fJrel@cskk.homeip.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
X-Trace: news.uni-berlin.de aiGkdu7mnBlC+QpbtciATQM7CP9PzZlKXCabZx5ABdCg==
Cancel-Lock: sha1:0D12DiuWhBmeStqJ0I7xYwzKgns= sha256:HzT96hZXHpHL9RF1EoAIDjF2IBOYjROoIoDVQ3LdPZI=
Return-Path: <cameron@cskk.id.au>
X-Original-To: python-list@python.org
Delivered-To: python-list@mail.python.org
Authentication-Results: mail.python.org; dkim=none reason="no signature";
dkim-adsp=none (unprotected policy); dkim-atps=neutral
X-Spam-Status: OK 0.004
Mail-Followup-To: Loris Bennett <loris.bennett@fu-berlin.de>,
python-list@python.org
Content-Disposition: inline
In-Reply-To: <875xp7nwus.fsf@zedat.fu-berlin.de>
User-Agent: Mutt/2.2.13 (2024-03-09)
X-Mailman-Original-Message-ID: <ZyVMe3Jspc0fJrel@cskk.homeip.net>
X-Mailman-Original-References: <875xp7nwus.fsf@zedat.fu-berlin.de>

On 01Nov2024 10:10, Loris Bennett <loris.bennett@fu-berlin.de> wrote:
>as expected. The non-UTF-8 text occurs when I do
>
> mail = EmailMessage()
> mail.set_content(body, cte="quoted-printable")
> ...
>
> if args.verbose:
> print(mail)
>
>which is presumably also correct.
>
>The question is: What conversion is necessary in order to print the
>EmailMessage object to the terminal, such that the quoted-printable
>parts are turned (back) into UTF-8?

Do you still have access to `body`? That would be the original message
text. Otherwise maybe:

print(mail.get_content())

The objective is to obtain the message body Unicode text (i.e. a regular
Python string with the original text, unencoded). And to print that.
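
A minimal sketch of that suggestion, assuming a simple single-part text message built the way the quoted snippet builds it (the subject and body strings are taken from earlier in the thread; `wire_form` and `decoded` are names invented here):

```python
from email.message import EmailMessage

body = "Dies ist eine Übung.\n"

mail = EmailMessage()
mail["Subject"] = "Übungsbetreff"
mail.set_content(body, cte="quoted-printable")

# str(mail) is the serialised form, so the body is still quoted-printable.
wire_form = str(mail)

# get_content() undoes the transfer encoding and returns a regular str.
decoded = mail.get_content()

print(wire_form)
print(decoded)
```

For a multipart message `get_content()` would not be the right call; something like `get_body()` would be needed first, which is outside the scope of this sketch.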

Subject: Re: Printing UTF-8 mail to terminal
From: Inada Naoki
Newsgroups: comp.lang.python
Date: Sun, 3 Nov 2024 03:08 UTC
References: 1 2 3 4 5
Path: eternal-september.org!news.eternal-september.org!feeder2.eternal-september.org!fu-berlin.de!uni-berlin.de!not-for-mail
From: songofacandy@gmail.com (Inada Naoki)
Newsgroups: comp.lang.python
Subject: Re: Printing UTF-8 mail to terminal
Date: Sun, 3 Nov 2024 12:08:41 +0900
Lines: 86
Message-ID: <mailman.75.1730603335.4695.python-list@python.org>
References: <878qu49tii.fsf@zedat.fu-berlin.de>
<CAJQBtgmwNbjYNr-LWYCia-9+CoRzaLj22YxzyP_EhwSspRD8_g@mail.gmail.com>
<mailman.61.1730392745.4695.python-list@python.org>
<87v7x7o37z.fsf@zedat.fu-berlin.de>
<CAEfz+Ty9LNyHWzouvxxSxmtjQJ=zGhBWBxq3NdH=bJLACBpmBg@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Trace: news.uni-berlin.de u/HO0jSrGa7fVFUh4jl+kw2NMW2rF6TVr9DeMkKMsl2w==
Cancel-Lock: sha1:pfUJ1pOkz6slXm16KGqblwJhN0o= sha256:N8hXZGdqa5waSLgw7oqnbT17gawS9zxDC6gdKwd9h80=
Return-Path: <songofacandy@gmail.com>
X-Original-To: python-list@python.org
Delivered-To: python-list@mail.python.org
Authentication-Results: mail.python.org; dkim=pass
reason="2048-bit key; unprotected key"
header.d=gmail.com header.i=@gmail.com header.b=D42OhfyA;
dkim-adsp=pass; dkim-atps=neutral
X-Spam-Status: OK 0.000
In-Reply-To: <87v7x7o37z.fsf@zedat.fu-berlin.de>
X-Content-Filtered-By: Mailman/MimeDel 2.1.39
X-Mailman-Original-Message-ID: <CAEfz+Ty9LNyHWzouvxxSxmtjQJ=zGhBWBxq3NdH=bJLACBpmBg@mail.gmail.com>
X-Mailman-Original-References: <878qu49tii.fsf@zedat.fu-berlin.de>
<CAJQBtgmwNbjYNr-LWYCia-9+CoRzaLj22YxzyP_EhwSspRD8_g@mail.gmail.com>
<mailman.61.1730392745.4695.python-list@python.org>
<87v7x7o37z.fsf@zedat.fu-berlin.de>

Try setting the PYTHONUTF8=1 environment variable.

On Sat, 2 Nov 2024 at 0:36, Loris Bennett via Python-list <python-list@python.org> wrote:

> Left Right <olegsivokon@gmail.com> writes:
>
> > There's quite a lot of misuse of terminology around terminal / console
> > / shell. Please, correct me if I'm wrong, but it looks like you are
> > printing that on MS Windows, right? MS Windows doesn't have or use
> > terminals (that's more of a Unix-related concept). And, by "terminal"
> > I mean terminal emulator (i.e. a program that emulates the behavior of
> > a physical terminal). You can, of course, find some terminal programs
> > for windows (eg. mintty), but I doubt that that's what you are dealing
> > with.
> >
> > What MS Windows users usually end up using is the console. If you
> > run, eg. cmd.exe, it will create a process that displays a graphical
> > console. The console uses an encoding scheme to represent the text
> > output. I believe that the default on MS Windows is to use some
> > single-byte encoding. This answer from SE family site tells you how to
> > set the console encoding to UTF-8 permanently:
> >
> https://superuser.com/questions/269818/change-default-code-page-of-windows-console-to-utf-8
> > , which, I believe, will solve your problem with how the text is
> > displayed.
>
> I'm not using MS Windows. I am using a Gnome terminal on Debian 12
> locally and connecting via SSH to an AlmaLinux 8 server, where I start a
> tmux session.
>
> > On Thu, Oct 31, 2024 at 5:19 PM Loris Bennett via Python-list
> > <python-list@python.org> wrote:
> >>
> >> Hi,
> >>
> >> I have a command-line program which creates an email containing German
> >> umlauts. On receiving the mail, my mail client displays the subject and
> >> body correctly:
> >>
> >> Subject: Übung
> >>
> >> Sehr geehrter Herr Dr. Bennett,
> >>
> >> Dies ist eine Übung.
> >>
> >> So far, so good. However, when I use the --verbose option to print
> >> the mail to the terminal via
> >>
> >> if args.verbose:
> >> print(mail)
> >>
> >> I get:
> >>
> >> Subject: Übungsbetreff
> >>
> >> Sehr geehrter Herr Dr. Bennett,
> >>
> >> Dies ist eine =C3=9Cbung.
> >>
> >> What do I need to do to prevent the body from getting mangled?
> >>
> >> I seem to remember that I had issues in the past with a Perl version of
> >> a similar program. As far as I recall there was an issue with the fact that the
> >> greeting is generated by querying a server, whereas the body is being
> >> read from a file, which led to oddities when the two bits were
> >> concatenated. But that might just have been a Perl thing.
> >>
> >> Cheers,
> >>
> >> Loris
> >>
> >> --
> >> This signature is currently under constuction.
> >> --
> >> https://mail.python.org/mailman/listinfo/python-list
> --
> Dr. Loris Bennett (Herr/Mr)
> FUB-IT, Freie Universität Berlin
> --
> https://mail.python.org/mailman/listinfo/python-list
>

Subject: Re: Printing UTF-8 mail to terminal
From: Loris Bennett
Newsgroups: comp.lang.python
Organization: FUB-IT, Freie Universität Berlin
Date: Mon, 4 Nov 2024 10:44 UTC
References: 1 2 3
Path: eternal-september.org!news.eternal-september.org!feeder2.eternal-september.org!fu-berlin.de!uni-berlin.de!not-for-mail
From: loris.bennett@fu-berlin.de (Loris Bennett)
Newsgroups: comp.lang.python
Subject: Re: Printing UTF-8 mail to terminal
Date: Mon, 04 Nov 2024 11:44:03 +0100
Organization: FUB-IT, Freie Universität Berlin
Lines: 106
Message-ID: <87ed3rmg7g.fsf@zedat.fu-berlin.de>
References: <875xp7nwus.fsf@zedat.fu-berlin.de>
<ZyVMe3Jspc0fJrel@cskk.homeip.net>
<mailman.69.1730497664.4695.python-list@python.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de I6sr7DVD6wifKNv+7xA1QQRivru4J6Uerphy/jK84H61hg
Cancel-Lock: sha1:uhtHbYWYvAP5o4PG3Lei4QpP57s= sha1:eK9GbQf19aaAxFNmrCkYT+l04CQ= sha256:5IT9ussCKR7J9nvefP8DJlpGpwjQWwh6XsHrbgM/RjY=
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Cameron Simpson <cs@cskk.id.au> writes:

> On 01Nov2024 10:10, Loris Bennett <loris.bennett@fu-berlin.de> wrote:
>>as expected. The non-UTF-8 text occurs when I do
>>
>> mail = EmailMessage()
>> mail.set_content(body, cte="quoted-printable")
>> ...
>>
>> if args.verbose:
>> print(mail)
>>
>>which is presumably also correct.
>>
>>The question is: What conversion is necessary in order to print the
>>EmailMessage object to the terminal, such that the quoted-printable
>>parts are turned (back) into UTF-8?
>
> Do you still have access to `body` ? That would be the original
> message text? Otherwise maybe:
>
> print(mail.get_content())
>
> The objective is to obtain the message body Unicode text (i.e. a
> regular Python string with the original text, unencoded). And to print
> that.

With the following:

######################################################################

import email.message

m = email.message.EmailMessage()

m['Subject'] = 'Übung'

m.set_content('Dies ist eine Übung')
print('== cte: default == \n')
print(m)

print('-- full mail ---')
print(m)
print('-- just content--')
print(m.get_content())

m.set_content('Dies ist eine Übung', cte='quoted-printable')
print('== cte: quoted-printable ==\n')
print('-- full mail --')
print(m)
print('-- just content --')
print(m.get_content())

######################################################################

I get the following output:

######################################################################

== cte: default ==

Subject: Übung
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
MIME-Version: 1.0

RGllcyBpc3QgZWluZSDDnGJ1bmcK

-- full mail ---
Subject: Übung
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
MIME-Version: 1.0

RGllcyBpc3QgZWluZSDDnGJ1bmcK

-- just content--
Dies ist eine Übung

== cte: quoted-printable ==

-- full mail --
Subject: Übung
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Dies ist eine =C3=9Cbung

-- just content --
Dies ist eine Übung

######################################################################

So in both cases the subject is fine, but it is unclear to me how to
print the body. Or rather, I know how to print the body OK, but I don't
know how to print the headers separately - there seems to be nothing
like 'get_headers()'. I can use get('Subject') etc. and reconstruct the
headers, but that seems a little clunky.

Cheers,

Loris

--
This signature is currently under constuction.

Subject: Re: Printing UTF-8 mail to terminal
From: Loris Bennett
Newsgroups: comp.lang.python
Organization: FUB-IT, Freie Universität Berlin
Date: Mon, 4 Nov 2024 10:48 UTC
References: 1 2 3 4 5 6
Path: eternal-september.org!news.eternal-september.org!feeder2.eternal-september.org!fu-berlin.de!uni-berlin.de!not-for-mail
From: loris.bennett@fu-berlin.de (Loris Bennett)
Newsgroups: comp.lang.python
Subject: Re: Printing UTF-8 mail to terminal
Date: Mon, 04 Nov 2024 11:48:15 +0100
Organization: FUB-IT, Freie Universität Berlin
Lines: 79
Message-ID: <87a5efmg0g.fsf@zedat.fu-berlin.de>
References: <878qu49tii.fsf@zedat.fu-berlin.de>
<CAJQBtgmwNbjYNr-LWYCia-9+CoRzaLj22YxzyP_EhwSspRD8_g@mail.gmail.com>
<mailman.61.1730392745.4695.python-list@python.org>
<87v7x7o37z.fsf@zedat.fu-berlin.de>
<CAEfz+Ty9LNyHWzouvxxSxmtjQJ=zGhBWBxq3NdH=bJLACBpmBg@mail.gmail.com>
<mailman.75.1730603335.4695.python-list@python.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de 5kZ3YHKmVL/k0XpGW3AJrQwkNtDA2bCOHnSlTn5bbZNE+E
Cancel-Lock: sha1:FRGU1w6PVSajxiFoyZp7qOPdyaE= sha1:Gfms76l48SoyYgtLCZGcho+UsU8= sha256:EVB0+DTCYLHstjRBpVdIZDFxjhbvCEZnTlxp5Y5G8fE=
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Inada Naoki <songofacandy@gmail.com> writes:

> On Sat, 2 Nov 2024 at 0:36, Loris Bennett via Python-list <python-list@python.org> wrote:
>
>> Left Right <olegsivokon@gmail.com> writes:
>>
>> > There's quite a lot of misuse of terminology around terminal / console
>> > / shell. Please, correct me if I'm wrong, but it looks like you are
>> > printing that on MS Windows, right? MS Windows doesn't have or use
>> > terminals (that's more of a Unix-related concept). And, by "terminal"
>> > I mean terminal emulator (i.e. a program that emulates the behavior of
>> > a physical terminal). You can, of course, find some terminal programs
>> > for windows (eg. mintty), but I doubt that that's what you are dealing
>> > with.
>> >
>> > What MS Windows users usually end up using is the console. If you
>> > run, eg. cmd.exe, it will create a process that displays a graphical
>> > console. The console uses an encoding scheme to represent the text
>> > output. I believe that the default on MS Windows is to use some
>> > single-byte encoding. This answer from SE family site tells you how to
>> > set the console encoding to UTF-8 permanently:
>> >
>> https://superuser.com/questions/269818/change-default-code-page-of-windows-console-to-utf-8
>> > , which, I believe, will solve your problem with how the text is
>> > displayed.
>>
>> I'm not using MS Windows. I am using a Gnome terminal on Debian 12
>> locally and connecting via SSH to an AlmaLinux 8 server, where I start a
>> tmux session.
>>
>> > On Thu, Oct 31, 2024 at 5:19 PM Loris Bennett via Python-list
>> > <python-list@python.org> wrote:
>> >>
>> >> Hi,
>> >>
>> >> I have a command-line program which creates an email containing German
>> >> umlauts. On receiving the mail, my mail client displays the subject and
>> >> body correctly:
>> >>
>> >> Subject: Übung
>> >>
>> >> Sehr geehrter Herr Dr. Bennett,
>> >>
>> >> Dies ist eine Übung.
>> >>
>> >> So far, so good. However, when I use the --verbose option to print
>> >> the mail to the terminal via
>> >>
>> >> if args.verbose:
>> >> print(mail)
>> >>
>> >> I get:
>> >>
>> >> Subject: Übungsbetreff
>> >>
>> >> Sehr geehrter Herr Dr. Bennett,
>> >>
>> >> Dies ist eine =C3=9Cbung.
>> >>
>> >> What do I need to do to prevent the body from getting mangled?
>> >>
>> >> I seem to remember that I had issues in the past with a Perl version of
>> >> a similar program. As far as I recall there was an issue with the fact that the
>> >> greeting is generated by querying a server, whereas the body is being
>> >> read from a file, which led to oddities when the two bits were
>> >> concatenated. But that might just have been a Perl thing.
>> >>
>
> Try setting the PYTHONUTF8=1 environment variable.
>

This does not seem to affect the way the email body is printed.
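
One possible explanation, sketched under the assumption that the message was built with cte="quoted-printable" as elsewhere in this thread: the body that print() emits is already pure ASCII, so the encoding Python uses for stdout never comes into play for it. The variable names below are illustrative.

```python
import sys
from email.message import EmailMessage

mail = EmailMessage()
mail.set_content("Dies ist eine Übung", cte="quoted-printable")

# Split the serialised message at the first blank line: headers, then body.
rendered = str(mail)
header_part, _, body_part = rendered.partition("\n\n")

print(sys.stdout.encoding)   # whatever the environment selects
print(body_part.isascii())   # the serialised body contains only ASCII
print(body_part)
```

If that is what is happening, the fix lies in decoding the content before printing rather than in changing the terminal or interpreter encoding.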

Cheers,

Loris

--
This signature is currently under constuction.

Subject: Re: Printing UTF-8 mail to terminal
From: Loris Bennett
Newsgroups: comp.lang.python
Organization: FUB-IT, Freie Universität Berlin
Date: Mon, 4 Nov 2024 10:57 UTC
References: 1 2 3 4
Path: eternal-september.org!news.eternal-september.org!feeder2.eternal-september.org!fu-berlin.de!uni-berlin.de!not-for-mail
From: loris.bennett@fu-berlin.de (Loris Bennett)
Newsgroups: comp.lang.python
Subject: Re: Printing UTF-8 mail to terminal
Date: Mon, 04 Nov 2024 11:57:37 +0100
Organization: FUB-IT, Freie Universität Berlin
Lines: 110
Message-ID: <875xp3mfku.fsf@zedat.fu-berlin.de>
References: <875xp7nwus.fsf@zedat.fu-berlin.de>
<ZyVMe3Jspc0fJrel@cskk.homeip.net>
<mailman.69.1730497664.4695.python-list@python.org>
<87ed3rmg7g.fsf@zedat.fu-berlin.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de K3NFHovAT4eiybs6ZWzXlwQ2SAvDFsVuO+/zgfdq87F6Pw
Cancel-Lock: sha1:dCJHonEM2ycvtkvQ3gYP8nEzZa0= sha1:aWb9lRBT+flGz+gjOvbOP1bYa9M= sha256:tuwMZ8fJUZg5Fd3ngrcj0mDobU4VFHGrzypxDdxhooY=
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

"Loris Bennett" <loris.bennett@fu-berlin.de> writes:

> Cameron Simpson <cs@cskk.id.au> writes:
>
>> On 01Nov2024 10:10, Loris Bennett <loris.bennett@fu-berlin.de> wrote:
>>>as expected. The non-UTF-8 text occurs when I do
>>>
>>> mail = EmailMessage()
>>> mail.set_content(body, cte="quoted-printable")
>>> ...
>>>
>>> if args.verbose:
>>> print(mail)
>>>
>>>which is presumably also correct.
>>>
>>>The question is: What conversion is necessary in order to print the
>>>EmailMessage object to the terminal, such that the quoted-printable
>>>parts are turned (back) into UTF-8?
>>
>> Do you still have access to `body` ? That would be the original
>> message text? Otherwise maybe:
>>
>> print(mail.get_content())
>>
>> The objective is to obtain the message body Unicode text (i.e. a
>> regular Python string with the original text, unencoded). And to print
>> that.
>
> With the following:
>
> ######################################################################
>
> import email.message
>
> m = email.message.EmailMessage()
>
> m['Subject'] = 'Übung'
>
> m.set_content('Dies ist eine Übung')
> print('== cte: default == \n')
> print(m)
>
> print('-- full mail ---')
> print(m)
> print('-- just content--')
> print(m.get_content())
>
> m.set_content('Dies ist eine Übung', cte='quoted-printable')
> print('== cte: quoted-printable ==\n')
> print('-- full mail --')
> print(m)
> print('-- just content --')
> print(m.get_content())
>
> ######################################################################
>
> I get the following output:
>
> ######################################################################
>
> == cte: default ==
>
> Subject: Übung
> Content-Type: text/plain; charset="utf-8"
> Content-Transfer-Encoding: base64
> MIME-Version: 1.0
>
> RGllcyBpc3QgZWluZSDDnGJ1bmcK
>
> -- full mail ---
> Subject: Übung
> Content-Type: text/plain; charset="utf-8"
> Content-Transfer-Encoding: base64
> MIME-Version: 1.0
>
> RGllcyBpc3QgZWluZSDDnGJ1bmcK
>
> -- just content--
> Dies ist eine Übung
>
> == cte: quoted-printable ==
>
> -- full mail --
> Subject: Übung
> MIME-Version: 1.0
> Content-Type: text/plain; charset="utf-8"
> Content-Transfer-Encoding: quoted-printable
>
> Dies ist eine =C3=9Cbung
>
> -- just content --
> Dies ist eine Übung
>
> ######################################################################
>
> So in both cases the subject is fine, but it is unclear to me how to
> print the body. Or rather, I know how to print the body OK, but I don't
> know how to print the headers separately - there seems to be nothing
> like 'get_headers()'. I can use get('Subject') etc. and reconstruct the
> headers, but that seems a little clunky.

Sorry, I am confusing the terminology here. The 'body' seems to be the
headers plus the 'content'. So I can print the *content* without the
headers OK, but I can't easily print all the headers separately. If I
just print the body, i.e. headers plus content, the umlauts in the
content are not resolved.
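
One way this could be approached, as a sketch rather than a definitive answer: `EmailMessage.items()` returns all (name, value) header pairs, so the headers can be joined by hand and followed by the decoded content. The helper name `readable` is hypothetical, and the sketch assumes a simple, non-multipart text message like the one in the test script.

```python
from email.message import EmailMessage


def readable(msg):
    """Return decoded headers, a blank line, and the decoded text content."""
    header_lines = [f"{name}: {value}" for name, value in msg.items()]
    return "\n".join(header_lines) + "\n\n" + msg.get_content()


m = EmailMessage()
m["Subject"] = "Übung"
m.set_content("Dies ist eine Übung", cte="quoted-printable")

text = readable(m)
print(text)
```

Note that the Content-Transfer-Encoding header is still listed even though the content below it has been decoded, so the result is meant for display only and is not a valid serialised message.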

--
This signature is currently under constuction.

Subject: Re: Printing UTF-8 mail to terminal
From: Loris Bennett
Newsgroups: comp.lang.python
Organization: FUB-IT, Freie Universität Berlin
Date: Mon, 4 Nov 2024 12:02 UTC
References: 1 2 3 4 5
Path: eternal-september.org!news.eternal-september.org!feeder2.eternal-september.org!fu-berlin.de!uni-berlin.de!not-for-mail
From: loris.bennett@fu-berlin.de (Loris Bennett)
Newsgroups: comp.lang.python
Subject: Re: Printing UTF-8 mail to terminal
Date: Mon, 04 Nov 2024 13:02:21 +0100
Organization: FUB-IT, Freie Universität Berlin
Lines: 131
Message-ID: <871pzrmcky.fsf@zedat.fu-berlin.de>
References: <875xp7nwus.fsf@zedat.fu-berlin.de>
<ZyVMe3Jspc0fJrel@cskk.homeip.net>
<mailman.69.1730497664.4695.python-list@python.org>
<87ed3rmg7g.fsf@zedat.fu-berlin.de>
<875xp3mfku.fsf@zedat.fu-berlin.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de qcYYdOG7c7e7KSYkx8/6twZ3kLenB6GSyOLOMeUpcfwmxa
Cancel-Lock: sha1:f5YK6mGo+QOXycGyerakkhD5MEM= sha1:4Y0kJ4CwD39m5hxgmNvhqaVY63s= sha256:eLDRXfJWKFlBr2B7A/UCvcnDDZNaUnkdNwbZGepKcyE=
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

"Loris Bennett" <loris.bennett@fu-berlin.de> writes:

> "Loris Bennett" <loris.bennett@fu-berlin.de> writes:
>
>> Cameron Simpson <cs@cskk.id.au> writes:
>>
>>> On 01Nov2024 10:10, Loris Bennett <loris.bennett@fu-berlin.de> wrote:
>>>>as expected. The non-UTF-8 text occurs when I do
>>>>
>>>> mail = EmailMessage()
>>>> mail.set_content(body, cte="quoted-printable")
>>>> ...
>>>>
>>>> if args.verbose:
>>>> print(mail)
>>>>
>>>>which is presumably also correct.
>>>>
>>>>The question is: What conversion is necessary in order to print the
>>>>EmailMessage object to the terminal, such that the quoted-printable
>>>>parts are turned (back) into UTF-8?
>>>
>>> Do you still have access to `body` ? That would be the original
>>> message text? Otherwise maybe:
>>>
>>> print(mail.get_content())
>>>
>>> The objective is to obtain the message body Unicode text (i.e. a
>>> regular Python string with the original text, unencoded). And to print
>>> that.
>>
>> With the following:
>>
>> ######################################################################
>>
>> import email.message
>>
>> m = email.message.EmailMessage()
>>
>> m['Subject'] = 'Übung'
>>
>> m.set_content('Dies ist eine Übung')
>> print('== cte: default == \n')
>> print(m)
>>
>> print('-- full mail ---')
>> print(m)
>> print('-- just content--')
>> print(m.get_content())
>>
>> m.set_content('Dies ist eine Übung', cte='quoted-printable')
>> print('== cte: quoted-printable ==\n')
>> print('-- full mail --')
>> print(m)
>> print('-- just content --')
>> print(m.get_content())
>>
>> ######################################################################
>>
>> I get the following output:
>>
>> ######################################################################
>>
>> == cte: default ==
>>
>> Subject: Übung
>> Content-Type: text/plain; charset="utf-8"
>> Content-Transfer-Encoding: base64
>> MIME-Version: 1.0
>>
>> RGllcyBpc3QgZWluZSDDnGJ1bmcK
>>
>> -- full mail ---
>> Subject: Übung
>> Content-Type: text/plain; charset="utf-8"
>> Content-Transfer-Encoding: base64
>> MIME-Version: 1.0
>>
>> RGllcyBpc3QgZWluZSDDnGJ1bmcK
>>
>> -- just content--
>> Dies ist eine Übung
>>
>> == cte: quoted-printable ==
>>
>> -- full mail --
>> Subject: Übung
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset="utf-8"
>> Content-Transfer-Encoding: quoted-printable
>>
>> Dies ist eine =C3=9Cbung
>>
>> -- just content --
>> Dies ist eine Übung
>>
>> ######################################################################
>>
>> So in both cases the subject is fine, but it is unclear to me how to
>> print the body. Or rather, I know how to print the body OK, but I don't
>> know how to print the headers separately - there seems to be nothing
>> like 'get_headers()'. I can use 'get('Subject) etc. and reconstruct the
>> headers, but that seems a little clunky.
>
> Sorry, I am confusing the terminology here. The 'body' seems to be the
> headers plus the 'content'. So I can print the *content* without the
> headers OK, but I can't easily print all the headers separately. If
> just print the body, i.e. headers plus content, the umlauts in the
> content are not resolved.

OK, so I can do:

######################################################################
if args.verbose:
    for k in mail.keys():
        print(f"{k}: {mail.get(k)}")
    print('')
    print(mail.get_content())
######################################################################

This prints what I want and is not wildly clunky, but I am a little
surprised that I can't get a string representation of the whole email
in one go.
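
For what it's worth, the loop above can be wrapped in a small helper that
returns one string. This is only a sketch: render_for_terminal is a
made-up name, not part of the email package, and it assumes a single-part
text message (for a multipart message get_content() would not work as is).

```python
from email.message import EmailMessage


def render_for_terminal(mail: EmailMessage) -> str:
    """Return decoded headers, a blank line, then the decoded text content.

    Hypothetical helper: assumes a single-part text message.
    """
    header_lines = [f"{name}: {value}" for name, value in mail.items()]
    return "\n".join(header_lines) + "\n\n" + mail.get_content()


# Same test message as earlier in the thread.
m = EmailMessage()
m["Subject"] = "Übung"
m.set_content("Dies ist eine Übung", cte="quoted-printable")

print(render_for_terminal(m))
```

Returning a string rather than printing directly keeps the verbose output
easy to log or test.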

Cheers,

Loris

--
Dr. Loris Bennett (Herr/Mr)
FUB-IT, Freie Universität Berlin

Subject: Re: Printing UTF-8 mail to terminal
From: Peter J. Holzer
Newsgroups: comp.lang.python
Date: Tue, 5 Nov 2024 20:39 UTC
References: 1 2 3 4 5 6 7
Attachments: signature.asc (application/pgp-signature)
Path: eternal-september.org!news.eternal-september.org!feeder2.eternal-september.org!fu-berlin.de!uni-berlin.de!not-for-mail
From: hjp-python@hjp.at (Peter J. Holzer)
Newsgroups: comp.lang.python
Subject: Re: Printing UTF-8 mail to terminal
Date: Tue, 5 Nov 2024 21:39:32 +0100
Lines: 78
Message-ID: <mailman.81.1730839621.4695.python-list@python.org>
References: <875xp7nwus.fsf@zedat.fu-berlin.de>
<ZyVMe3Jspc0fJrel@cskk.homeip.net>
<mailman.69.1730497664.4695.python-list@python.org>
<87ed3rmg7g.fsf@zedat.fu-berlin.de>
<875xp3mfku.fsf@zedat.fu-berlin.de>
<871pzrmcky.fsf@zedat.fu-berlin.de>
<20241105203932.rclo4j5g763cfnmh@hjp.at>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha512;
protocol="application/pgp-signature"; boundary="x3fu3allroqu7sen"
X-Trace: news.uni-berlin.de Exk6+O+WaBhGUVY33AkcmwR2keULdHuYUkGjkB2vsr7Q==
Cancel-Lock: sha1:DZUmMS8W0q7Ik47ddQfKO+IZkLA= sha256:dqRVgLQClB3B5Hc5FrU32IyjqN51WWFkI01kdznOmVg=
Mail-Followup-To: python-list@python.org
Content-Disposition: inline
In-Reply-To: <871pzrmcky.fsf@zedat.fu-berlin.de>

On 2024-11-04 13:02:21 +0100, Loris Bennett via Python-list wrote:
> "Loris Bennett" <loris.bennett@fu-berlin.de> writes:
> > "Loris Bennett" <loris.bennett@fu-berlin.de> writes:
> >> Cameron Simpson <cs@cskk.id.au> writes:
> >>> On 01Nov2024 10:10, Loris Bennett <loris.bennett@fu-berlin.de> wrote:
> >>>>as expected. The non-UTF-8 text occurs when I do
> >>>>
> >>>> mail = EmailMessage()
> >>>> mail.set_content(body, cte="quoted-printable")
> >>>> ...
> >>>>
> >>>> if args.verbose:
> >>>> print(mail)
> >>>>
> >>>>which is presumably also correct.
> >>>>
> >>>>The question is: What conversion is necessary in order to print the
> >>>>EmailMessage object to the terminal, such that the quoted-printable
> >>>>parts are turned (back) into UTF-8?
[...]
> OK, so I can do:
>
> ######################################################################
> if args.verbose:
> for k in mail.keys():
> print(f"{k}: {mail.get(k)}")
> print('')
> print(mail.get_content())
> ######################################################################
>
> prints what I want and is not wildly clunky, but I am a little surprised
> that I can't get a string representation of the whole email in one go.

Mails can contain lots of stuff, so there is in general no suitable
human-readable string representation of a whole email. You have to go
through it part by part and decide what you want to do with each. For
example, if you have a multipart/alternative with a text/plain and a
text/html part, what should the "string representation" be? For some uses
the text/plain part might be sufficient. For some you might want the
HTML part or some rendering of it. Or what would you do with an image?
Omit it completely? Just use the filename (if any)? Try to convert it to
ASCII art? Use an AI to describe it?
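
One possible set of answers to those questions, as a sketch only: prefer
the text/plain alternative, fall back to text/html, and list attachments
by filename. The helper names readable_text and attachment_names and the
test message are made up for illustration; get_body() and
iter_attachments() are the EmailMessage methods used.

```python
from email.message import EmailMessage


def readable_text(mail: EmailMessage) -> str:
    """Pick the text/plain body if present, else text/html, else nothing."""
    part = mail.get_body(preferencelist=("plain", "html"))
    return part.get_content() if part is not None else ""


def attachment_names(mail: EmailMessage) -> list:
    """One choice among many: represent attachments by filename only."""
    return [att.get_filename() or "(unnamed)" for att in mail.iter_attachments()]


# Hypothetical message with two alternatives and a (fake) image attachment.
m = EmailMessage()
m["Subject"] = "Übung"
m.set_content("Dies ist eine Übung")
m.add_alternative("<p>Dies ist eine Übung</p>", subtype="html")
m.add_attachment(b"\x89PNG", maintype="image", subtype="png",
                 filename="bild.png")

print(readable_text(m))
print(attachment_names(m))
```

Other policies (rendering the HTML, describing the image) are equally
valid; the point is that the caller has to choose.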

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"

Subject: Re: Printing UTF-8 mail to terminal
From: Cameron Simpson
Newsgroups: comp.lang.python
Date: Tue, 5 Nov 2024 21:20 UTC
References: 1 2
Path: eternal-september.org!news.eternal-september.org!feeder2.eternal-september.org!fu-berlin.de!uni-berlin.de!not-for-mail
From: cs@cskk.id.au (Cameron Simpson)
Newsgroups: comp.lang.python
Subject: Re: Printing UTF-8 mail to terminal
Date: Wed, 6 Nov 2024 08:20:44 +1100
Lines: 38
Message-ID: <mailman.84.1730841650.4695.python-list@python.org>
References: <871pzrmcky.fsf@zedat.fu-berlin.de>
<ZyqMLOUxvnwARS2e@cskk.homeip.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
X-Trace: news.uni-berlin.de jIbULRnIjMmC8kIz6LzvqQ/9DA9+55Ijbk2TzNf74JuA==
Cancel-Lock: sha1:8ginHsUQk9nkalDSVgVUAqnZqXI= sha256:AcLfFAjFJI7MMBw2Lso/QyctG4rCvy6fSJfGugZeuMk=
Mail-Followup-To: Loris Bennett <loris.bennett@fu-berlin.de>,
python-list@python.org
Content-Disposition: inline
In-Reply-To: <871pzrmcky.fsf@zedat.fu-berlin.de>
User-Agent: Mutt/2.2.13 (2024-03-09)

On 04Nov2024 13:02, Loris Bennett <loris.bennett@fu-berlin.de> wrote:
>OK, so I can do:
>
>######################################################################
>if args.verbose:
> for k in mail.keys():
> print(f"{k}: {mail.get(k)}")
> print('')
> print(mail.get_content())
>######################################################################
>
>prints what I want and is not wildly clunky, but I am a little surprised
>that I can't get a string representation of the whole email in one go.

A string representation of the whole message needs to be correctly
encoded so that its components can be identified mechanically. So it
needs to be a syntactically valid RFC 5322 message. Thus the encoding.

As an example (slightly contrived) of why this is important, multipart
messages are delimited with distinct lines, and their content must not
contain such a line (even if it's in the "raw" original data).

So printing a whole message transcribes it in the encoded form so that
it can be decoded mechanically. And conservatively, this is usually an
ASCII-compatible encoding so that it can traverse various systems
undamaged. This means the text requiring UTF-8 encoding gets further
encoded as quoted-printable to avoid ambiguity about the meaning of
bytes/octets which have their high bit set.
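
To make the two layers concrete, here is a small sketch using the encoded
bodies quoted earlier in the thread. It uses the standard base64 and
quopri modules directly rather than the email package; the variable names
are just for illustration.

```python
import base64
import quopri

# Encoded bodies copied from the output quoted earlier in the thread.
b64_body = b"RGllcyBpc3QgZWluZSDDnGJ1bmcK"
qp_body = b"Dies ist eine =C3=9Cbung"

# Both transfer forms are pure ASCII: no octet has its high bit set.
both_ascii = b64_body.isascii() and qp_body.isascii()

# Undo the transfer encoding first (giving UTF-8 bytes),
# then the charset encoding (giving a Python str).
from_b64 = base64.b64decode(b64_body).decode("utf-8")
from_qp = quopri.decodestring(qp_body).decode("utf-8")

print(both_ascii)
print(from_b64.rstrip("\n"))
print(from_qp)
```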

BTW, doesn't this:

    for k in mail.keys():
        print(f"{k}: {mail.get(k)}")

print the quoted printable (i.e. not decoded) form of subject lines?
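
As far as I can tell, that depends on the policy of the message object.
The sketch below parses the same hand-written raw message twice: once
with email.policy.default (the policy an EmailMessage uses) and once with
the legacy compat32 policy that email.message_from_bytes() uses when none
is given. The raw bytes are made up for illustration.

```python
import email
import email.policy
from email.header import decode_header, make_header

# Hand-written raw message with an RFC 2047 encoded subject (illustrative).
raw = (
    b"Subject: =?utf-8?q?=C3=9Cbung?=\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="utf-8"\n'
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"Dies ist eine =C3=9Cbung\n"
)

modern = email.message_from_bytes(raw, policy=email.policy.default)
legacy = email.message_from_bytes(raw)  # no policy given: compat32

# Under the default policy the header value comes back already decoded.
modern_subject = str(modern["Subject"])

# Under compat32 the raw encoded-word form is returned ...
legacy_subject = legacy["Subject"]
# ... and has to be decoded explicitly.
decoded_legacy = str(make_header(decode_header(legacy_subject)))

print(modern_subject)
print(legacy_subject)
print(decoded_legacy)
```

So with an EmailMessage created as in the original post, the loop should
show decoded header values; the encoded form would appear with the older
Message API.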

Cheers,
Cameron Simpson <cs@cskk.id.au>
