Subject: Re: Correct syntax for pathological re.search()
From: Peter J. Holzer
Newsgroups: comp.lang.python
Date: Fri, 18 Oct 2024 21:09 UTC

On 2024-10-12 08:51:57 -0400, Thomas Passin via Python-list wrote:
> On 10/12/2024 6:59 AM, Peter J. Holzer via Python-list wrote:
> > On 2024-10-11 17:13:07 -0400, AVI GROSS via Python-list wrote:
> > > Is there some utility function out there that can be called to show what the
> > > regular expression you typed in will look like by the time it is ready to be
> > > used?
> >
> > I assume that by "ready to be used" you mean the compiled form?
> >
> > No, there doesn't seem to be a way to dump that. You can
> >
> > p = re.compile("\\\\sout{")
> > print(p.pattern)
> >
> > but that just prints the input string, which you could do without
> > compiling it first.
>
> It prints the escaped version,

Did you mean the *un*escaped version? Well, yeah, that's what print
does.

> so you can see if you escaped the string as you intended. In this
> case, the print will display '\\sout{'.

print("\\\\sout{")
will do the same.

It seems to me that for any string s which is a valid regular expression
(i.e. re.compile doesn't throw an exception)

assert re.compile(s).pattern == s

holds.

So it doesn't give you anything you didn't already know.

As a trivial example, the regular expressions r"\\sout{" and r"\\sout\{"
are equivalent (the \ before the { is redundant). Yet
re.compile(s).pattern preserves the difference between the two strings.
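
A minimal sketch of the point above, assuming nothing beyond the standard re module. The sample patterns are chosen for illustration only (the first two are the ones from this thread). It checks that .pattern hands back the source string unchanged, and that two regexes which match the same text still keep their different spellings:

```python
import re

# Sample patterns for illustration; the first two come from the thread.
samples = [r"\\sout{", r"\\sout\{", r"a+b*", r"[0-9]{2,3}"]

for s in samples:
    # .pattern is simply the string the pattern object was compiled from.
    assert re.compile(s).pattern == s

plain = re.compile(r"\\sout{")
escaped = re.compile(r"\\sout\{")

# The two source strings differ by one backslash, and .pattern keeps that difference.
print(plain.pattern)    # \\sout{
print(escaped.pattern)  # \\sout\{
patterns_differ = plain.pattern != escaped.pattern
```

So .pattern is useful for checking how the string literal was escaped, but it says nothing about how the regex engine interpreted it.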

hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Subject: Re: Correct syntax for pathological re.search()
From: jak
Newsgroups: comp.lang.python
Organization: A noiseless patient Spider
Date: Fri, 18 Oct 2024 22:15 UTC

Peter J. Holzer wrote:
> As a trivial example, the regular expressions r"\\sout{" and r"\\sout\{"
> are equivalent (the \ before the { is redundant). Yet
> re.compile(s).pattern preserves the difference between the two strings.

Hi,
Allow me to be fussy: r"\\sout{" and r"\\sout\{" are similar but not
equivalent. If you omit the backslash, the parser has to determine
whether the brace starts a {n,m} repetition, and that takes more
time. Some online regex testers give these results:

r"\\sout{" : 1 match ( 7 steps, 620 μs )

r"\\sout\{" : 1 match ( 7 steps, 360 μs )
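
The figures above come from an online tester, which may not use Python's engine at all. As a rough sketch of how one might measure the parsing cost in Python itself, the following times uncached re.compile() calls with timeit. The helper name compile_time and the repetition count are made up for this example, and the numbers will vary by machine and Python version, so no result is claimed here:

```python
import re
import timeit


def compile_time(pattern, number=2000):
    """Average seconds per uncached re.compile() call for the given pattern."""
    def run():
        re.purge()            # clear re's internal cache so compile() really parses
        re.compile(pattern)
    return timeit.timeit(run, number=number) / number


t_plain = compile_time(r"\\sout{")
t_escaped = compile_time(r"\\sout\{")
print(f"plain:   {t_plain * 1e6:.2f} us per compile")
print(f"escaped: {t_escaped * 1e6:.2f} us per compile")
```

Note that this measures compilation only. Once a pattern is compiled (or cached by the re module), any difference in parsing effort no longer affects matching.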

Subject: Re: Correct syntax for pathological re.search()
From: Peter J. Holzer
Newsgroups: comp.lang.python
Date: Mon, 21 Oct 2024 19:10 UTC

On 2024-10-19 00:15:23 +0200, jak via Python-list wrote:
> Peter J. Holzer wrote:
> > As a trivial example, the regular expressions r"\\sout{" and r"\\sout\{"
> > are equivalent (the \ before the { is redundant). Yet
> > re.compile(s).pattern preserves the difference between the two strings.
>
> Allow me to be fussy: r"\\sout{" and r"\\sout\{" are similar but not
> equivalent.

They are. Both will match the 6-character string
0005c \ REVERSE SOLIDUS
00073 s LATIN SMALL LETTER S
0006f o LATIN SMALL LETTER O
00075 u LATIN SMALL LETTER U
00074 t LATIN SMALL LETTER T
0007b { LEFT CURLY BRACKET
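
This can be checked directly. The sketch below builds that 6-character string, prints its code points with unicodedata, and tests both patterns against it with re.fullmatch. The variable names are made up for the example:

```python
import re
import unicodedata

# Ordinary (non-raw) literal: one backslash, then s, o, u, t, {
target = "\\sout{"
assert len(target) == 6

# Print one line per character: code point, character, Unicode name.
for ch in target:
    print(f"{ord(ch):05x} {ch} {unicodedata.name(ch)}")

# Both spellings of the regex should match the whole string.
matches = [re.fullmatch(p, target) is not None
           for p in (r"\\sout{", r"\\sout\{")]
print(matches)
```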

> If you omit the backslash, the parser has to determine whether the
> brace starts a {n,m} repetition, and that takes more time.

Yes, that's the parser. But the result of parsing will be the same:
The string will end in a literal left curly bracket.
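
One way to look at the result of parsing, as a sketch: the re module has a documented re.DEBUG flag that prints debug information about a compiled expression to standard output. The helper debug_dump below is a made-up name for this example. It captures that output for each pattern so the two can be compared. The exact format of the dump is an implementation detail and may differ between Python versions:

```python
import contextlib
import io
import re


def debug_dump(pattern):
    """Return the text that re.compile(pattern, re.DEBUG) prints to stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        re.compile(pattern, re.DEBUG)
    return buf.getvalue()


dump_plain = debug_dump(r"\\sout{")
dump_escaped = debug_dump(r"\\sout\{")

print(dump_plain)
# If the parser produced the same result for both spellings, the dumps are identical.
same_parse = dump_plain == dump_escaped
print(same_parse)
```

This also comes close to the utility asked about earlier in the thread: it shows what the regex engine made of the pattern rather than the string that was typed in.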

hp

Subject: Re: Correct syntax for pathological re.search()
From: Stefan Ram
Newsgroups: comp.lang.python
Organization: Stefan Ram
Date: Mon, 21 Oct 2024 20:24 UTC
X-Copyright: (C) Copyright 2024 Stefan Ram. All rights reserved.
Distribution through any means other than regular usenet
channels is forbidden. It is forbidden to publish this
article in the Web, to change URIs of this article into links,
and to transfer the body without this notice, but quotations
of parts in other Usenet posts are allowed.
X-No-Archive: Yes
Archive: no
X-No-Archive-Readme: "X-No-Archive" is set, because this prevents some
services to mirror the article in the web. But the article may
be kept on a Usenet archive server with only NNTP access.

"Peter J. Holzer" <hjp-python@hjp.at> wrote or quoted:
>On 2024-10-19 00:15:23 +0200, jak via Python-list wrote:
>>Allow me to be fussy: r"\\sout{" and r"\\sout\{" are similar but not
>>equivalent.
.. . .
>Yes, that's the parser. But the result of parsing will be the same:
>The string will end in a literal left curly bracket.

Functional reqs lay out what your system's got to do, while
non-functional reqs are all about time and other resource
constraints.

When you're crunching through parsing, what pops out is
your functional bread and butter.

But the time it takes to chew through that data?
That's non-functional and implementation-dependent territory.

So, we can say they're functionally equivalent.
