Subject: Re: First two bytes of 'stdout' are lost
From: Cameron Simpson
Newsgroups: comp.lang.python
Date: Thu, 11 Apr 2024 21:55 UTC
Message-ID: <mailman.100.1712872561.3468.python-list@python.org>
References: <CA+cSArh=U-H6nW7nYBBD5v=ZUAvGpE1M_nB9BUoYzvca9XsKfA@mail.gmail.com>
<Zhhca5UQ5Sql3ln8@cskk.homeip.net>

On 11Apr2024 14:42, Olivier B. <perso.olivier.barthelemy@gmail.com> wrote:
>I am trying to use StringIO to capture stdout, in code that looks like this:
>
>import sys
>from io import StringIO
>old_stdout = sys.stdout
>sys.stdout = mystdout = StringIO()
>print( "patate")
>mystdout.seek(0)
>sys.stdout = old_stdout
>print(mystdout.read())
>
>Well, it is not exactly like this, since this works properly

Aye, I just tried that. All good.
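
For comparison, the same pure-Python capture can be sketched with `contextlib.redirect_stdout` from the standard library, which restores `sys.stdout` even if the body raises. This only covers the plain interpreter case; it makes no claim about what happens when the code is driven from C++ through the limited C API.

```python
import contextlib
import io

buf = io.StringIO()

# Everything printed inside the block goes to buf instead of the real stdout.
with contextlib.redirect_stdout(buf):
    print("patate")

# getvalue() returns the whole buffer, so no seek(0) is needed.
captured = buf.getvalue()
print(repr(captured))
```

Printing the `repr()` rather than the raw text makes any missing or extra leading characters easy to spot.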

>This code is actually run from C++ using the C Python API.
>This worked quite well, so the code was right at some point. But now,
>two things changed:
> - Now using python 3.11.7 instead of 3.7.12
> - Now using only the python limited C API

Maybe you should post the code then: the exact Python code and the exact
C++ code.

>And it seems that now, mystdout.read() always misses the first two
>characters that have been written to stdout.
>
>My first idea was something related to the BOM being improperly truncated
>at some point, but I am manipulating UTF-8, so the BOM would be 3
>bytes, not 2.

I didn't think UTF-8 needed a BOM. Someone will doubtless correct me.

However, does the `mystdout.read()` code _know_ you're using UTF-8? I
have the vague impression that e.g. some Windows systems default to UTF-16
of some flavour, possibly _with_ a BOM.
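
A quick way to compare the BOM sizes under discussion is the standard `codecs` module. This is a minimal sketch using the test string from the quoted code; it shows what Python's codecs produce and does not diagnose the C++ embedding.

```python
import codecs

# The UTF-8 BOM is 3 bytes; the UTF-16 BOMs are 2 bytes.
print(codecs.BOM_UTF8, len(codecs.BOM_UTF8))
print(codecs.BOM_UTF16_LE, len(codecs.BOM_UTF16_LE))
print(codecs.BOM_UTF16_BE, len(codecs.BOM_UTF16_BE))

# The plain "utf-8" codec writes no BOM; "utf-8-sig" prepends one.
utf8_bytes = "patate".encode("utf-8")
utf8_sig_bytes = "patate".encode("utf-8-sig")

# The generic "utf-16" codec prepends a 2-byte BOM in native byte order.
utf16_bytes = "patate".encode("utf-16")
starts_with_bom = utf16_bytes.startswith(codecs.BOM_UTF16)

print(utf8_bytes, utf8_sig_bytes, utf16_bytes)
```

So a loss of exactly two bytes would match a UTF-16 BOM in size, whereas a UTF-8 BOM would account for three.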

I'm suggesting that you rigorously check that the bytes->text bits know
what text encoding they're using. If you've left an encoding out
anywhere, put it in explicitly.
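
As an illustration of putting the encoding in explicitly, here is a small sketch that names the encoding on both the text->bytes and the bytes->text side. The `raw` and `writer` names are made up for the example; the real code may be structured quite differently.

```python
import io

# Bytes sink standing in for wherever the output really ends up.
raw = io.BytesIO()

# Name the encoding explicitly instead of relying on the platform default.
writer = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
writer.write("patate\n")
writer.flush()

# Inspect the raw bytes first: a BOM or a truncation would show up here.
data = raw.getvalue()
print(repr(data))

# Decode with the same explicit encoding used for writing.
text = data.decode("utf-8")
print(repr(text))
```

Comparing `repr(data)` against `repr(text)` shows whether characters go missing on the encode side or the decode side.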

>Hopefully someone has a clue on what would have changed in Python for
>this to stop working compared to python 3.7?

None at all, alas. My experience with the Python C API is very limited.
