Subject: Re: "sed" question
From: Grant Taylor
Newsgroups: comp.unix.shell, comp.lang.awk
Followup: comp.lang.awk
Organization: TNet Consulting
Date: Fri, 8 Mar 2024 02:38 UTC
Message-ID: <usdtn4$j2n$1@tncsrv09.home.tnetconsulting.net>

On 3/7/24 18:09, Keith Thompson wrote:
> I know that's what awk does, but I don't think I would have expected
> it if I didn't know about it.

Okay. I think that's a fair observation.

> $0 is the current input line.

Or $0 is the current /record/ in awk parlance.

> If you don't change anything, or if you modify $0 itself, whitespace
> between fields is preserved.

> If you modify any of the fields, $0 is recomputed and whitespace
> between tokens is collapsed.

I don't agree with that.

% echo 'one  two   three' | awk '{print $0; print $1,$2,$3}'
one  two   three
one two three

I didn't /modify/ anything and awk does print the fields with different
white space.
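
A minimal sketch of the three cases under discussion, assuming an
input line with irregular spacing (the sample string is illustrative,
not taken from the thread). Printing $0 untouched keeps the original
spacing, printing fields separated by commas joins them with OFS, and
assigning to a field makes awk rebuild $0 with OFS as well:

```shell
# Illustrative input with uneven runs of spaces between fields.
line='one   two  three'

# Unmodified record: the original spacing is kept.
unmodified=$(printf '%s\n' "$line" | awk '{ print $0 }')

# Comma-separated print: fields are joined with OFS (one space by default).
joined=$(printf '%s\n' "$line" | awk '{ print $1, $2, $3 }')

# Assigning to a field makes awk rebuild $0 from the fields using OFS.
rebuilt=$(printf '%s\n' "$line" | awk '{ $2 = "fifteen"; print $0 }')

printf '%s\n' "$unmodified" "$joined" "$rebuilt"
```

So the different white space in the second case comes from print
joining its arguments with OFS, not from any change to $0.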

> awk *could* have been defined to preserve inter-field whitespace even
> when you modify individual fields,

I question the veracity of that. Specifically when lengthening or
shortening the value of a field. E.g. replacing "two" with "fifteen".
This is particularly germane when you look at $0 as a fixed width
formatted output.
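
To illustrate the fixed width concern: once a field changes length,
the old padding no longer lines up, so the layout has to be chosen
again anyway. A sketch, assuming 8-character columns (the widths and
the data are made up for illustration), that re-aligns explicitly
with printf rather than relying on preserved white space:

```shell
# Hypothetical input: three fields laid out in 8-character columns.
line='one     two     three'

# Replace field 2, then lay the columns out again with explicit widths.
realigned=$(printf '%s\n' "$line" |
  awk '{ $2 = "fifteen"; printf "%-8s%-8s%s\n", $1, $2, $3 }')

printf '%s\n' "$realigned"
```

A replacement longer than the column width would still run into the
next column, which is the ambiguity any "preserve the white space"
rule would have to resolve somehow.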

> and I think I would have found that more intuitive.

I don't agree.

> (And ideally there would be a way to refer to that inter-field
> whitespace.)

Remember, awk is meant for working on fields of data in a record. By
default, the fields are delimited by white space characters. I'll say
it this way: awk is meant for working on the non-white space characters.
Or yet another way: awk is not meant for working on white space
characters.

> The fact that modifying a field has the side effect of messing up $0
> seems counterintuitive.

Maybe.

But I think it's one that is acceptable for what awk is intended to do.

> Perhaps the behavior matches your intuition better than it matches
> mine.

I sort of feel like you are wanting to / trying to use awk in places
where sed might be better. sed just sees a string of text and is
ignorant of any structure without a carefully crafted RE to provide it.
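
As a sketch of what such a carefully crafted RE might look like, the
following replaces the second white space delimited token while
leaving the original spacing alone. It assumes a sed that accepts -E
for extended regular expressions (GNU and BSD sed both do) and a line
with no leading white space; the sample input is illustrative:

```shell
# Illustrative input with uneven runs of spaces between tokens.
line='one   two  three'

# Capture the first token plus the white space after it, then replace
# the second token; the rest of the line is left untouched.
edited=$(printf '%s\n' "$line" |
  sed -E 's/^([^[:space:]]+[[:space:]]+)[^[:space:]]+/\1fifteen/')

printf '%s\n' "$edited"
```

The spacing survives because sed never splits the line into fields;
the price is that the RE has to describe the structure itself.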

Conversely, awk is quite happy working with a field that is easily
identified by its position, with fields separated by one or more white
space characters.

Consider the output of `netstat -an` wherein you have multiple columns
of IP addresses.

Please find a quick way, preferably one that doesn't involve negation
(because what needs to be negated may be highly dynamic), that lists
inbound SMTP connections on an email server but doesn't list outbound
SMTP connections.

awk makes it trivial to identify and print records that have the SMTP
port in the local IP column, thus ignoring outbound connections with
SMTP in the remote column.
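
A minimal sketch of that filter. The sample lines are made-up
stand-ins for `netstat -an` output, assuming the common layout where
the local address is field 4 and the remote address is field 5; the
addresses are documentation-range examples. The port separator is
matched as either ':' or '.' since it differs between systems:

```shell
# Hypothetical lines: Proto Recv-Q Send-Q Local-Address Foreign-Address State
sample='tcp        0      0 192.0.2.10:25       198.51.100.7:51234  ESTABLISHED
tcp        0      0 192.0.2.10:40112    203.0.113.5:25      ESTABLISHED
tcp        0      0 192.0.2.10:22       198.51.100.9:50022  ESTABLISHED'

# Keep only records whose local address (field 4) ends in port 25.
inbound=$(printf '%s\n' "$sample" | awk '$4 ~ /[:.]25$/ { print $4, $5 }')

printf '%s\n' "$inbound"
```

On a live system the same awk program would be fed from
`netstat -an` instead of the sample text. A listening socket on port
25 would also match; adding a test on the state field would narrow
that down.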

Aside: Yes, I know that ss and the like have more features for this,
but this is my example and ss is not installed everywhere.

I sort of view awk as somewhat akin to SQL wherein fields in awk are
like columns in SQL.

I'd be more than a little bit surprised to find an SQL interface that
preserved white space /between/ columns. -- Many will do it /within/
columns.

awk makes it trivial to take field oriented output from commands and
apply some logic / parsing / action on specific fields in records.

> (And perhaps this should be moved to comp.lang.awk if it doesn't die
> out soon.

comp.lang.awk added and followup pointed there.

> Though both sed and awk are both languages in their own right
> and tools that can be used from the shell, so I'd argue there's a
> topicality overlap.)

;-)

--
Grant. . . .
