
Subject                                               Author
* strlcpy and how CPUs can defy common sense          Ben Collver
+* Re: strlcpy and how CPUs can defy common sense     Stefan Ram
|`* Re: strlcpy and how CPUs can defy common sense    Stefan Ram
| `- Re: strlcpy and how CPUs can defy common sense   Lawrence D'Oliveiro
+* Re: strlcpy and how CPUs can defy common sense     Lawrence D'Oliveiro
|`* Re: strlcpy and how CPUs can defy common sense    John McCue
| +- Re: strlcpy and how CPUs can defy common sense   Joerg Mertens
| `* Re: strlcpy and how CPUs can defy common sense   Lawrence D'Oliveiro
|  `- Re: strlcpy and how CPUs can defy common sense  Joerg Mertens
`* Re: strlcpy and how CPUs can defy common sense     Bruce Horrocks
 `- Re: strlcpy and how CPUs can defy common sense    Johanne Fairchild

Subject: strlcpy and how CPUs can defy common sense
From: Ben Collver
Newsgroups: comp.misc
Organization: A noiseless patient Spider
Date: Fri, 26 Jul 2024 15:36 UTC
Path: eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: bencollver@tilde.pink (Ben Collver)
Newsgroups: comp.misc
Subject: strlcpy and how CPUs can defy common sense
Date: Fri, 26 Jul 2024 15:36:17 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 256
Message-ID: <slrnva7gg6.39g.bencollver@svadhyaya.localdomain>
Injection-Date: Fri, 26 Jul 2024 17:36:18 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="25c82515a3793f0405007c5555736d3a";
logging-data="3051943"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18UzezBadE+UlSEjcbOWfzX9bHkRUydp+0="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:n/wW5GS3Oo5+79tmoYd5blm3E7s=

strlcpy and how CPUs can defy common sense
==========================================
24 Jul 2024

Recently one of my older posts about strlcpy has sparked some
discussion on various forums. Presumably the recently released POSIX
edition had something to do with it. One particular counter-argument
was raised by multiple posters - and it's an argument that I've heard
before as well:

* In the common case where the source string fits into the
destination buffer, strlcpy would only traverse the string once
whereas strlen + memcpy would always traverse it twice.

Hidden in this argument is the assumption that traversing the string
once is faster. Which - to be clear - is not at all an unreasonable
assumption. But is it actually true? That's the focus of today's
article.

<https://nrk.neocities.org/articles/not-a-fan-of-strlcpy>

CPU vs common sense
===================
> Computers do not have common sense. Computers are surprising.
> - Tony Hoare to Lomuto

The following is from openbsd, where strlcpy originated - modified a
bit for brevity.

size_t strlcpy(char *dst, const char *src, size_t dsize)
{
    const char *osrc = src;
    size_t nleft = dsize;

    /* Copy as many bytes as will fit. */
    if (nleft != 0) while (--nleft != 0) {
        if ((*dst++ = *src++) == '\0')
            break;
    }

    /* Not enough room in dst, add NUL and traverse rest of src. */
    if (nleft == 0) {
        if (dsize != 0) *dst = '\0'; /* NUL-terminate dst */
        while (*src++) ;
    }

    return(src - osrc - 1); /* count does not include NUL */
}

It starts by copying from src to dst as much as it can, and if it has
to truncate due to insufficient dst size, then traverses the rest of
src in order to get the strlen(src) value for returning. And so if
the source string fits, it will be traversed only once.

Now if you try to take a look at the glibc implementation of strlcpy,
immediately you'll notice that the first line is this...

    size_t src_length = strlen (src);

... followed by the rest of the code using memcpy to do the copying.
This already shatters the illusion that strlcpy will traverse the
string once. There's no requirement for that to happen, and as you
can see in practice, one of the major libcs will always traverse the
string twice: once in strlen and once in memcpy.
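
To make the shape of that approach concrete, here's a minimal sketch of a strlcpy built from strlen + memcpy. It is not the actual glibc source; the name two_pass_strlcpy is made up for illustration, and the only thing assumed is the usual strlcpy contract (always NUL-terminate when dsize > 0, return strlen(src)).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name; a sketch of the strlen + memcpy approach,
 * not the real glibc implementation. */
size_t two_pass_strlcpy(char *dst, const char *src, size_t dsize)
{
    size_t src_length = strlen(src);   /* first pass: find the length */

    if (dsize > 0) {
        /* Copy at most dsize - 1 bytes so the NUL always fits. */
        size_t to_copy = src_length < dsize ? src_length : dsize - 1;
        memcpy(dst, src, to_copy);     /* second pass: bulk copy */
        dst[to_copy] = '\0';
    }

    /* Caller detects truncation by checking result >= dsize. */
    return src_length;
}
```

A caller would typically compare the return value against the buffer size: a result greater than or equal to dsize means the copy was truncated.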

But before you open a bug report against glibc for being inefficient,
here are some benchmark numbers from copying a 512 byte string
repeatedly in a loop:

512 byte
openbsd: 242us
glibc: 12us

<https://gist.github.com/N-R-K/ebf096448c0a7f3fdd8b93d280747550>

Perhaps the string is so small that the double traversal doesn't
matter? How about a string of 1MiB?

1MiB
openbsd: 501646us
glibc: 31793us

The situation only gets worse for the openbsd version here, not
better. To be fair, this huge speed up is coming from the fact that
glibc punts all the work over to strlen and memcpy which on glibc are
SIMD optimized. But regardless, we can already see that doing
something fast, twice - is faster than doing it once but slowly.
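
For readers who want to reproduce this kind of measurement, here's a rough sketch of a timing loop. It is not the benchmark from the gist linked above; copy_fn, copy_strlen_memcpy and time_copies are hypothetical names, and clock() measures CPU time at fairly coarse resolution, so treat any numbers it produces as ballpark figures only.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature shared by every strlcpy-like candidate. */
typedef size_t (*copy_fn)(char *dst, const char *src, size_t dsize);

/* Hypothetical stand-in for the function under test. */
size_t copy_strlen_memcpy(char *dst, const char *src, size_t dsize)
{
    size_t len = strlen(src);
    if (dsize > 0) {
        size_t n = len < dsize ? len : dsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/* Calls fn `iters` times and returns the elapsed CPU time in microseconds.
 * The volatile sink keeps the results observable; it makes it harder, though
 * not impossible, for the compiler to drop or hoist the calls. */
double time_copies(copy_fn fn, char *dst, const char *src,
                   size_t dsize, long iters)
{
    volatile size_t sink = 0;
    clock_t start = clock();
    for (long i = 0; i < iters; ++i)
        sink += fn(dst, src, dsize);
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) * 1e6 / CLOCKS_PER_SEC;
}
```

Passing the candidate as a function pointer keeps the loop identical for every implementation being compared, which is the part that matters for fairness.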

Apples to apples
================
In order to do an apples to apples comparison I've written the
following strlcpy implementation, which is pretty close to the glibc
implementation except with the strlen and memcpy calls written out in
for loops.

size_t bespoke_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = 0;
    for (; src[len] != '\0'; ++len) {} // strlen() loop

    if (size > 0) {
        size_t to_copy = len < size ? len : size - 1;
        for (size_t i = 0; i < to_copy; ++i) // memcpy() loop
            dst[i] = src[i];
        dst[to_copy] = '\0';
    }
    return len;
}

It's important to note that in order to do a truly apples to apples
comparison, you'd need to also use -fno-builtin when compiling.
Otherwise gcc will realize that the "strlen loop" can be "optimized"
down to a strlen call and emit that. -fno-builtin prevents that from
happening and keeps the comparison fair.

So how does this version, which traverses src twice, perform against
the openbsd variant, which traverses src only once?

512 byte
openbsd: 237us
bespoke: 139us

It's almost twice as fast. How about on bigger strings?

1MiB
openbsd: 488469us
bespoke: 277183us

Still roughly twice as fast. How come?

Dependencies
============
The importance of cache misses (rightfully) gets plenty of spotlight;
dependencies, on the other hand, are not talked about as much. Your cpu
has multiple cores, and each core has multiple ports (or logic units)
capable of executing instructions. Which means that if you have some
instructions like this (in pseudo assembly, where upper case alphabet
denotes a register):

    A <- add B, C
    X <- add Y, Z
    E <- add A, X

The computations of A and X are independent, and thus can be executed
in parallel. But the computation of E requires the results of A and X
and thus cannot start until both of them are done. This ability to
execute independent instructions simultaneously is called
instruction-level parallelism (or ILP). And dependencies are its
kryptonite.
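
The same idea can be sketched in C. This is an illustration only, with made-up function names: the first loop forms a single chain where each add waits on the previous one, while the second keeps four independent chains going. Whether the second is measurably faster depends on the CPU and the compiler, and an optimizing compiler may well perform this transformation on the first loop by itself.

```c
#include <stddef.h>
#include <stdint.h>

/* One accumulator: every add depends on the result of the previous add. */
uint64_t sum_single_chain(const uint32_t *v, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += v[i];
    return sum;
}

/* Four accumulators: four independent dependency chains that the
 * CPU is free to overlap. */
uint64_t sum_four_chains(const uint32_t *v, size_t n)
{
    uint64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    for (; i < n; ++i)          /* leftover elements */
        s0 += v[i];
    return (s0 + s1) + (s2 + s3);
}
```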

If you try to profile the "bespoke" strlcpy version, you'll notice
that nearly 100% of the cpu time is spent on the "strlen loop" while
the copy loop is basically free. Indeed, if you replace the "strlen
loop" with an actual strlen call (reminder: it's SIMD optimized on
glibc) then the bespoke version starts competing with the glibc
version quite well even though we aren't using an optimized memcpy.
In order to understand why this is happening, let's look at the
"strlen loop", written in a verbose manner below:

len = 0;
while (true) {
    if (src[len] == '\0')
        break; // <- this affects the next iteration
    else
        ++len;
}

In the above loop, whether or not the next iteration of the loop will
execute depends on the result of the previous iteration (whether
src[len] was nul or not). We pay for this in our strlen loop. But our
memcpy loop is free of such loop-carried dependencies: the current
iteration happens regardless of what happened on the last iteration.

for (size_t i = 0; i < to_copy; ++i) // memcpy() loop
    dst[i] = src[i]; // <- does NOT depend on previous iteration

In the openbsd version, because the length and copy loop are fused
together, whether or not the next byte will be copied depends on the
byte value of the previous iteration.

while (--nleft != 0) { // openbsd copy loop
    // <- the branch taken here affects the next iteration
    if ((*dst++ = *src++) == '\0')
        break;
}

Effectively the cost of this dependency is now not just imposed on
the length computation but also on the copy operation. And to add
insult to injury, dependencies are not just difficult for the CPU,
they are also difficult for the compiler to optimize/auto-vectorize
resulting in worse code generation - a compounding effect.

Addendum: don't throw the length away
=====================================
> The key to making programs fast is to make them do practically nothing.
> - Mike Haertel, why GNU grep is fast

<https://lists.freebsd.org/pipermail/freebsd-current/2010-August/
019310.html>

2 years ago when I wrote the strlcpy article I was still of the
opinion that nul-terminated strings were "fine" and the problem was
due to the standard library being poor. But even with better
nul-string routines, I noticed that a disproportionate amount of
mental effort was spent, and bugs written, trying to program with
them. Two very important observations since then:

* The length of a string is invaluable information.

<https://www.symas.com/post/the-sad-state-of-c-strings>

Without knowing the length, strings become closer to a linked
list - forcing a serial access pattern - rather than an array that
can be randomly accessed. Many common string functions are better
expressed (read: less error-prone) when the length can be cheaply
known. Nul-terminated strings on the other hand encourage you to
continuously keep throwing this very valuable information away -
leading to having to spuriously recompute it again and again and
again (the GTA loading screen incident always comes to mind).

* Ability to have zero-copy substrings is huge.

<https://nee.lv/2021/02/28/How-I-cut-GTA-Online-loading-times-by-70/>

They get rid of a lot of spurious copies (i.e more efficiency) as
well as allocations (i.e avoids unnecessary memory management). And
as a result, a great deal of logic and code that were necessary when
managing nul-terminated strings simply disappear.
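
The recomputation problem from the first point can be sketched like this. It's a simplified illustration with made-up function names, not the code from the GTA incident: the first version re-derives the length on every iteration, the second is handed the length once.

```c
#include <stddef.h>
#include <string.h>

/* Re-derives the length on every iteration: O(n) work per step,
 * O(n^2) overall unless the compiler manages to hoist the strlen call. */
size_t count_commas_rescan(const char *s)
{
    size_t count = 0;
    for (size_t i = 0; i < strlen(s); ++i)
        if (s[i] == ',')
            ++count;
    return count;
}

/* The length is known up front, so the loop is a single O(n) pass. */
size_t count_commas_sized(const char *s, size_t len)
{
    size_t count = 0;
    for (size_t i = 0; i < len; ++i)
        if (s[i] == ',')
            ++count;
    return count;
}
```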

With these two in mind, nowadays I just use sized-strings (something
akin to C++'s std::string_view) and only convert to nul-string when
an external API demands it. This topic is worth an article on its
own, but since that is not the focus of this article, I won't digress
further.
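
For a rough idea of what such a sized-string type might look like in C, here's a minimal sketch. The names str_view, sv_from_cstr, sv_substr, sv_equal and sv_to_cstr are made up for illustration and don't come from any particular library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sized-string type: pointer plus length, no terminator needed. */
typedef struct {
    const char *data;
    size_t      len;
} str_view;

/* Pay for strlen once, at the boundary where a nul-string comes in. */
str_view sv_from_cstr(const char *s)
{
    str_view v = { s, strlen(s) };
    return v;
}

/* Zero-copy substring: points into the same storage.
 * pos and count are clamped to the bounds of s. */
str_view sv_substr(str_view s, size_t pos, size_t count)
{
    if (pos > s.len)
        pos = s.len;
    if (count > s.len - pos)
        count = s.len - pos;
    str_view v = { s.data + pos, count };
    return v;
}

/* Length check first, so unequal lengths never touch the bytes. */
int sv_equal(str_view a, str_view b)
{
    return a.len == b.len &&
           (a.len == 0 || memcmp(a.data, b.data, a.len) == 0);
}

/* Convert to a nul-string only when an external API demands it.
 * Returns 0 on success, -1 if buf is too small. */
int sv_to_cstr(str_view s, char *buf, size_t bufsize)
{
    if (s.len >= bufsize)
        return -1;
    memcpy(buf, s.data, s.len);
    buf[s.len] = '\0';
    return 0;
}
```

Taking a substring here costs two assignments and no allocation, which is where the savings described above would come from.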


Subject: Re: strlcpy and how CPUs can defy common sense
From: Stefan Ram
Newsgroups: comp.misc
Organization: Stefan Ram
Date: Fri, 26 Jul 2024 16:19 UTC
References: 1
Path: eternal-september.org!news.eternal-september.org!feeder3.eternal-september.org!fu-berlin.de!uni-berlin.de!not-for-mail
From: ram@zedat.fu-berlin.de (Stefan Ram)
Newsgroups: comp.misc
Subject: Re: strlcpy and how CPUs can defy common sense
Date: 26 Jul 2024 16:19:01 GMT
Organization: Stefan Ram
Lines: 50
Expires: 1 Jul 2025 11:59:58 GMT
Message-ID: <strings-20240726170151@ram.dialup.fu-berlin.de>
References: <slrnva7gg6.39g.bencollver@svadhyaya.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de gl5Czs0axO2Ho+65LsKcFgIeOay4xxD05TwFBakCBAVP5I
Cancel-Lock: sha1:pNadoyQiNkBuhImB+wJFScFfxyw= sha256:kpOR7wi3YEdKwoXkpLLAg+m9RRztBNcW/NWOd5gR1GU=
X-Copyright: (C) Copyright 2024 Stefan Ram. All rights reserved.
Distribution through any means other than regular usenet
channels is forbidden. It is forbidden to publish this
article in the Web, to change URIs of this article into links,
and to transfer the body without this notice, but quotations
of parts in other Usenet posts are allowed.
X-No-Archive: Yes
Archive: no
X-No-Archive-Readme: "X-No-Archive" is set, because this prevents some
services to mirror the article in the web. But the article may
be kept on a Usenet archive server with only NNTP access.
X-No-Html: yes
Content-Language: en-US

Ben Collver <bencollver@tilde.pink> wrote or quoted:
>Hidden in this argument is the assumption that traversing the string
>once is faster. Which - to be clear - is not at all an unreasonable
>assumption. But is it actually true? That's the focus of today's
>article.

If the string is long, it might not fit into some cache levels,
meaning it would need to be fetched from main memory twice,
which is a drag. But if the string is short, that's not the
case, and looping through it twice doesn't have to be slower.

>surprises. And so the performance of an algorithm doesn't just depend
>on high level algorithmic factors - lower level factors such as cache
>misses, ILP, branch mispredictions etc, also need to be taken into
>account. Many things which seem to be faster from a common sense
>perspective might in practice end up being slower and vice versa.

Especially when it comes to cache misses. And for loops,
there's also the regularity factor.

This video might be a hit with some folks: "The strange details
of std::string at Facebook" - Nicholas Ormrod, CppCon 2016.

And who said this?

|Rob Pike's 5 Rules of Programming
|
|Rule 1. You can't tell where a program is going to spend
|its time. Bottlenecks occur in surprising places, so don't
|try to second guess and put in a speed hack until you've
|proven that's where the bottleneck is.
|
|Rule 2. Measure. Don't tune for speed until you've measured, and
|even then don't unless one part of the code overwhelms the rest.
|
|Rule 3. Fancy algorithms are slow when n is small, and n is
|usually small. Fancy algorithms have big constants. Until you
|know that n is frequently going to be big, don't get fancy.
|(Even if n does get big, use Rule 2 first.)
|
|Rule 4. Fancy algorithms are buggier than simple ones, and
|they're much harder to implement. Use simple algorithms as
|well as simple data structures.
|
|Rule 5. Data dominates. If you've chosen the right data
|structures and organized things well, the algorithms will
|almost always be self-evident. Data structures, not algorithms,
|are central to programming.

If you said, "Rob Pike", you were right!

Subject: Re: strlcpy and how CPUs can defy common sense
From: Stefan Ram
Newsgroups: comp.misc
Organization: Stefan Ram
Date: Fri, 26 Jul 2024 16:34 UTC
References: 1 2
Path: eternal-september.org!news.eternal-september.org!feeder3.eternal-september.org!fu-berlin.de!uni-berlin.de!not-for-mail
From: ram@zedat.fu-berlin.de (Stefan Ram)
Newsgroups: comp.misc
Subject: Re: strlcpy and how CPUs can defy common sense
Date: 26 Jul 2024 16:34:25 GMT
Organization: Stefan Ram
Lines: 26
Expires: 1 Jul 2025 11:59:58 GMT
Message-ID: <Cache-20240726173402@ram.dialup.fu-berlin.de>
References: <slrnva7gg6.39g.bencollver@svadhyaya.localdomain> <strings-20240726170151@ram.dialup.fu-berlin.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de 67mVfpzbUTEc/GX++tP25AYQnAu5KepuI3StACVAh78TS4
Cancel-Lock: sha1:vFoeYw/7JeZWMAhLdbKmhtRQ74U= sha256:0Sa+TJXAeqyxSpJqkcJjTlrqSzUTQo6OwhoN/l7FwJc=
X-Copyright: (C) Copyright 2024 Stefan Ram. All rights reserved.
Distribution through any means other than regular usenet
channels is forbidden. It is forbidden to publish this
article in the Web, to change URIs of this article into links,
and to transfer the body without this notice, but quotations
of parts in other Usenet posts are allowed.
X-No-Archive: Yes
Archive: no
X-No-Archive-Readme: "X-No-Archive" is set, because this prevents some
services to mirror the article in the web. But the article may
be kept on a Usenet archive server with only NNTP access.
X-No-Html: yes
Content-Language: en-US

ram@zedat.fu-berlin.de (Stefan Ram) wrote or quoted:
>This video might be a hit with some folks: "The strange details
>of std::string at Facebook" - Nicholas Ormrod, CppCon 2016.

Also, "Efficiency with Algorithms, Performance with Data Structures"
- Chandler Carruth, CppCon 2014, from which I take:

CPUS HAVE A HIERARCHICAL CACHE SYSTEM

One cycle on a 3 GHz processor                1 ns
L1 cache reference                          0.5 ns
Branch mispredict                             5 ns
L2 cache reference                            7 ns             14x L1 cache
Mutex lock/unlock                            25 ns
Main memory reference                       100 ns             20x L2, 200x L1
Compress 1K bytes with Snappy             3,000 ns
Send 1K bytes over 1 Gbps network        10,000 ns   0.01 ms
Read 4K randomly from SSD               150,000 ns   0.15 ms
Read 1 MB sequentially from memory      250,000 ns   0.25 ms
Round trip within same datacenter       500,000 ns   0.5 ms
Read 1 MB sequentially from SSD       1,000,000 ns   1 ms      4x memory
Disk seek                            10,000,000 ns   10 ms     20x datacen. RT
Read 1 MB sequentially from disk     20,000,000 ns   20 ms     80x mem., 20x SSD
Send packet CA->Netherlands->CA     150,000,000 ns   150 ms

.

Subject: Re: strlcpy and how CPUs can defy common sense
From: Lawrence D'Oliv
Newsgroups: comp.misc
Organization: A noiseless patient Spider
Date: Sat, 27 Jul 2024 00:31 UTC
References: 1
Path: eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: ldo@nz.invalid (Lawrence D'Oliveiro)
Newsgroups: comp.misc
Subject: Re: strlcpy and how CPUs can defy common sense
Date: Sat, 27 Jul 2024 00:31:05 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 9
Message-ID: <v81f49$32fuh$1@dont-email.me>
References: <slrnva7gg6.39g.bencollver@svadhyaya.localdomain>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 27 Jul 2024 02:31:06 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="923de006914fc707793912d16389c2d1";
logging-data="3227601"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+Xpi+Lk91JzME0b8PE3wHX"
User-Agent: Pan/0.159 (Vovchansk; )
Cancel-Lock: sha1:Ja9ILtsPL9OJFQoA4vE8kQF5ysk=

On Fri, 26 Jul 2024 15:36:17 -0000 (UTC), Ben Collver wrote:

> The situation only gets worse for the openbsd version here, not better.

Not the only time the GNU folks have done something smarter than the BSD
folks.

<http://trillian.mit.edu/~jc/humor/ATT_Copyright_true.html>
<http://www.theregister.co.uk/2016/02/10/line_break_ep2/>

Subject: Re: strlcpy and how CPUs can defy common sense
From: Lawrence D'Oliv
Newsgroups: comp.misc
Organization: A noiseless patient Spider
Date: Sat, 27 Jul 2024 02:09 UTC
References: 1 2 3
Path: eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: ldo@nz.invalid (Lawrence D'Oliveiro)
Newsgroups: comp.misc
Subject: Re: strlcpy and how CPUs can defy common sense
Date: Sat, 27 Jul 2024 02:09:05 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 9
Message-ID: <v81ks1$372d7$1@dont-email.me>
References: <slrnva7gg6.39g.bencollver@svadhyaya.localdomain>
<strings-20240726170151@ram.dialup.fu-berlin.de>
<Cache-20240726173402@ram.dialup.fu-berlin.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 27 Jul 2024 04:09:16 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="923de006914fc707793912d16389c2d1";
logging-data="3377575"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+W1sY3G+CDnbOCMgEzVxih"
User-Agent: Pan/0.159 (Vovchansk; )
Cancel-Lock: sha1:97/YSMvCGKzciKsnas8tH6DWc5E=

On 26 Jul 2024 16:34:25 GMT, Stefan Ram wrote:

> One cycle on a 3 GHz processor 1 ns

Shouldn’t that be ⅓ns?

> Send 1K bytes over 1 Gbps network 10,000 ns 0.01 ms

Perhaps more easily written as 10µs.

Subject: Re: strlcpy and how CPUs can defy common sense
From: Bruce Horrocks
Newsgroups: comp.misc
Date: Sat, 27 Jul 2024 10:32 UTC
References: 1
Path: eternal-september.org!news.eternal-september.org!feeder3.eternal-september.org!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: 07.013@scorecrow.com (Bruce Horrocks)
Newsgroups: comp.misc
Subject: Re: strlcpy and how CPUs can defy common sense
Date: Sat, 27 Jul 2024 11:32:25 +0100
Lines: 9
Message-ID: <c1cf4f8b-00e3-4c69-8ea3-160ef0872861@scorecrow.com>
References: <slrnva7gg6.39g.bencollver@svadhyaya.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Trace: individual.net yT5FDlxTQv7Yx/P3FI+1TQ6Rp+LKKR++rmixUF6/UNjmu9sp4x
Cancel-Lock: sha1:JXLCh/bP7SqX4UhOqsdYQpwRmm8= sha256:ARehFRmF7OJ1fdyWPd8qGxOMCe7VWjdgaKSM7GIr5Ms=
User-Agent: Mozilla Thunderbird
Content-Language: en-GB
In-Reply-To: <slrnva7gg6.39g.bencollver@svadhyaya.localdomain>

On 26/07/2024 16:36, Ben Collver wrote:
> strlcpy and how CPUs can defy common sense

Thank-you Ben for re-posting that. Very interesting.

--
Bruce Horrocks
Surrey, England

Subject: Re: strlcpy and how CPUs can defy common sense
From: Johanne Fairchild
Newsgroups: comp.misc
Organization: A noiseless patient Spider
Date: Sat, 27 Jul 2024 12:30 UTC
References: 1 2
Path: eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: jfairchild@tudado.org (Johanne Fairchild)
Newsgroups: comp.misc
Subject: Re: strlcpy and how CPUs can defy common sense
Date: Sat, 27 Jul 2024 09:30:30 -0300
Organization: A noiseless patient Spider
Lines: 8
Message-ID: <875xsrj9uh.fsf@tudado.org>
References: <slrnva7gg6.39g.bencollver@svadhyaya.localdomain>
<c1cf4f8b-00e3-4c69-8ea3-160ef0872861@scorecrow.com>
MIME-Version: 1.0
Content-Type: text/plain
Injection-Date: Sat, 27 Jul 2024 14:30:30 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="3de007a49cdacb6523db77ab69dfcb58";
logging-data="3563889"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19Uv84IMY7EUwghukT29WwJA8QK00aHqHw="
Cancel-Lock: sha1:pAO1y4C4pBKTApjr+ExJNq1PCgc=
sha1:jUSFJQYfUD9Y2KQSAvQhUEhmtvU=

Bruce Horrocks <07.013@scorecrow.com> writes:

> On 26/07/2024 16:36, Ben Collver wrote:
>> strlcpy and how CPUs can defy common sense
>
> Thank-you Ben for re-posting that. Very interesting.

Ditto. I loved it.

Subject: Re: strlcpy and how CPUs can defy common sense
From: John McCue
Newsgroups: comp.misc
Organization: A noiseless patient Spider
Date: Sat, 27 Jul 2024 19:58 UTC
References: 1 2
Path: eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: jmccue@magnetar.jmcunx.com (John McCue)
Newsgroups: comp.misc
Subject: Re: strlcpy and how CPUs can defy common sense
Date: Sat, 27 Jul 2024 19:58:06 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 22
Message-ID: <v83jge$3h76j$1@dont-email.me>
References: <slrnva7gg6.39g.bencollver@svadhyaya.localdomain> <v81f49$32fuh$1@dont-email.me>
Reply-To: jmclnx@SPAMisBADgmail.com
Injection-Date: Sat, 27 Jul 2024 21:58:07 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="68591cb28b418ed31cbcedccd71dedc4";
logging-data="3710163"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+Xb301JbVvozOKfhgRm0i1"
User-Agent: tin/2.6.1-20211226 ("Convalmore") (Linux/5.15.161 (x86_64))
Cancel-Lock: sha1:bS3q2IJQFy8YXjC78jz1kELL80s=
X-OS-Version: Slackware 15.0 x86_64

Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
> On Fri, 26 Jul 2024 15:36:17 -0000 (UTC), Ben Collver wrote:
>
>> The situation only gets worse for the openbsd version here, not better.
>
> Not the only time the GNU folks have done something smarter than the BSD
> folks.

I do not understand this statement in regards to true(1).

> <http://trillian.mit.edu/~jc/humor/ATT_Copyright_true.html>

This is interesting

> <http://www.theregister.co.uk/2016/02/10/line_break_ep2/>

How is GNU's version of true better than OpenBSD's?
See page 2 in the article.

--
[t]csh(1) - "An elegant shell, for a more... civilized age."
- Paraphrasing Star Wars

Subject: Re: strlcpy and how CPUs can defy common sense
From: Joerg Mertens
Newsgroups: comp.misc
Organization: A noiseless patient Spider
Date: Sat, 27 Jul 2024 20:53 UTC
References: 1 2 3
Path: eternal-september.org!news.eternal-september.org!jmertens.eternal-september.org!.POSTED!not-for-mail
From: joerg-mertens@t-online.de (Joerg Mertens)
Newsgroups: comp.misc
Subject: Re: strlcpy and how CPUs can defy common sense
Date: Sat, 27 Jul 2024 22:53:11 +0200
Organization: A noiseless patient Spider
Lines: 36
Message-ID: <v83mno$3hssg$1@jmertens.eternal-september.org>
References: <slrnva7gg6.39g.bencollver@svadhyaya.localdomain> <v81f49$32fuh$1@dont-email.me> <v83jge$3h76j$1@dont-email.me>
Injection-Date: Sat, 27 Jul 2024 22:53:15 +0200 (CEST)
Injection-Info: jmertens.eternal-september.org; posting-host="d180ea570a579f6feeab76ab09f3a39d";
logging-data="3732383"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+IxKNDwvF093+NGv8T2SG/A1vfeZmRkRc="
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (OpenBSD/7.5 (amd64)) tinews.pl/1.1.61
Cancel-Lock: sha1:K5sROdJGIdV1WtUL9ifT45V59bE=

John McCue <jmccue@magnetar.jmcunx.com> wrote:
> Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
>> On Fri, 26 Jul 2024 15:36:17 -0000 (UTC), Ben Collver wrote:
>>
>>> The situation only gets worse for the openbsd version here, not better.
>>
>> Not the only time the GNU folks have done something smarter than the BSD
>> folks.
>
> I do not understand this statement in regards to true(1).
>
>> <http://trillian.mit.edu/~jc/humor/ATT_Copyright_true.html>
>
> This is interesting
>
>> <http://www.theregister.co.uk/2016/02/10/line_break_ep2/>
>
> How is GNU's version of true better than OpenBSD's ?

It's definitely better if speed is your only quality criterion.

> See page 2 in the article.

A similar case is yes(1):

https://github.com/coreutils/coreutils/blob/master/src/yes.c

versus

https://github.com/openbsd/src/blob/master/usr.bin/yes/yes.c

It was discussed in this Hacker News article:

https://news.ycombinator.com/item?id=14542938

Regards

Subject: Re: strlcpy and how CPUs can defy common sense
From: Lawrence D'Oliv
Newsgroups: comp.misc
Organization: A noiseless patient Spider
Date: Sat, 27 Jul 2024 22:40 UTC
References: 1 2 3
Path: eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: ldo@nz.invalid (Lawrence D'Oliveiro)
Newsgroups: comp.misc
Subject: Re: strlcpy and how CPUs can defy common sense
Date: Sat, 27 Jul 2024 22:40:38 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 31
Message-ID: <v83t16$3iv51$2@dont-email.me>
References: <slrnva7gg6.39g.bencollver@svadhyaya.localdomain>
<v81f49$32fuh$1@dont-email.me> <v83jge$3h76j$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 28 Jul 2024 00:40:38 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="3b558a041f0a0aed486aeb8fa027d259";
logging-data="3767457"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/3HmxVLPpxl8Nrhu5IWOIx"
User-Agent: Pan/0.159 (Vovchansk; )
Cancel-Lock: sha1:jCND40atKhs9XSG5H8bes+TJwFg=

On Sat, 27 Jul 2024 19:58:06 -0000 (UTC), John McCue wrote:

> Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
>
>> On Fri, 26 Jul 2024 15:36:17 -0000 (UTC), Ben Collver wrote:
>>
>>> The situation only gets worse for the openbsd version here, not
>>> better.
>>
>> Not the only time the GNU folks have done something smarter than the
>> BSD folks.
>
> I do not understand this statement in regards to true(1).
>
>> <http://trillian.mit.edu/~jc/humor/ATT_Copyright_true.html>
>
> This is interesting
>
>> <http://www.theregister.co.uk/2016/02/10/line_break_ep2/>
>
> How is GNU's version of true better than OpenBSD's ?
> See page 2 in the article.

You have to put the two together to realize how hilariously wrong the
“Register” article is. The OpenBSD version of “true” may seem concise and
elegant, until you notice that it requires the loading of an entirely new
shell instance to run each time.

Whereas the GNU version, with its much longer source code entirely in C,
loads faster and runs in less memory. Which was one of the points made in
the first article.

Subject: Re: strlcpy and how CPUs can defy common sense
From: Joerg Mertens
Newsgroups: comp.misc
Organization: A noiseless patient Spider
Date: Sun, 28 Jul 2024 09:42 UTC
References: 1 2 3 4
Path: eternal-september.org!news.eternal-september.org!jmertens.eternal-september.org!.POSTED!not-for-mail
From: joerg-mertens@t-online.de (Joerg Mertens)
Newsgroups: comp.misc
Subject: Re: strlcpy and how CPUs can defy common sense
Date: Sun, 28 Jul 2024 11:42:29 +0200
Organization: A noiseless patient Spider
Lines: 49
Message-ID: <v853q6$3slc2$1@jmertens.eternal-september.org>
References: <slrnva7gg6.39g.bencollver@svadhyaya.localdomain> <v81f49$32fuh$1@dont-email.me> <v83jge$3h76j$1@dont-email.me> <v83t16$3iv51$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 28 Jul 2024 11:42:36 +0200 (CEST)
Injection-Info: jmertens.eternal-september.org; posting-host="414235435f258140e15466b87bfb5a84";
logging-data="4085148"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+tOOdJA4OEoGmzoh/5EPPgowJBCGph+n0="
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (OpenBSD/7.5 (amd64)) tinews.pl/1.1.61
Cancel-Lock: sha1:x/5ddwGZNBIs1Rbun8huvQVxtdY=

Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
> On Sat, 27 Jul 2024 19:58:06 -0000 (UTC), John McCue wrote:
>
>> Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
>>
>>> On Fri, 26 Jul 2024 15:36:17 -0000 (UTC), Ben Collver wrote:
>>>
>>>> The situation only gets worse for the openbsd version here, not
>>>> better.
>>>
>>> Not the only time the GNU folks have done something smarter than the
>>> BSD folks.
>>
>> I do not understand this statement in regards to true(1).
>>
>>> <http://trillian.mit.edu/~jc/humor/ATT_Copyright_true.html>
>>
>> This is interesting
>>
>>> <http://www.theregister.co.uk/2016/02/10/line_break_ep2/>
>>
>> How is GNU's version of true better than OpenBSD's ?
>> See page 2 in the article.
>
> You have to put the two together to realize how hilariously wrong the
> “Register” article is. The OpenBSD version of “true” may seem concise and
> elegant, until you notice that it requires the loading of an entirely new
> shell instance to run each time.
>
> Whereas the GNU version, with its much longer source code entirely in C,
> loads faster and runs in less memory. Which was one of the points made in
> the first article.

At least Theo de Raadt agrees with you in this commit message from
about eight years ago¹:

-----
Switch back to C versions of true/false. I do not accept any of the
arguments made 20 years ago. A small elf binary is smaller and faster
than a large elf binary running a script. Noone cares about the file
sizes on disk.
-----

The interesting word is "back", which means they had already had a
C version in earlier days and then at some point switched to a
script. Someone would have to go through the CVS history to find
out why.

1) https://cvsweb.openbsd.org/src/usr.bin/true/true.c
