Rocksolid Light

News from da outaworlds

mail  files  register  groups  login

Message-ID:  

Generosity and perfection are your everlasting goals.


comp / comp.lang.lisp / Re: Faster remove-duplicates with sorted list.

SubjectAuthor
* Re: Faster remove-duplicates with sorted list.B. Pym
`* Re: Faster remove-duplicates with sorted list.B. Pym
 `* Re: Faster remove-duplicates with sorted list.Joerg Mertens
  +- Re: Faster remove-duplicates with sorted list.Joerg Mertens
  `- Re: Faster remove-duplicates with sorted list.steve g

1
Subject: Re: Faster remove-duplicates with sorted list.
From: B. Pym
Newsgroups: comp.lang.lisp
Organization: A noiseless patient Spider
Date: Wed, 19 Jun 2024 15:18 UTC
Path: eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: No_spamming@noWhere_7073.org (B. Pym)
Newsgroups: comp.lang.lisp
Subject: Re: Faster remove-duplicates with sorted list.
Date: Wed, 19 Jun 2024 15:18:17 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 41
Message-ID: <v4usrm$21fur$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Injection-Date: Wed, 19 Jun 2024 17:18:17 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="a0e44262dbf0ff3ef7cd29835841a1f0";
logging-data="2146267"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX187JILPxrr+urVaHBfGG2lg"
User-Agent: XanaNews/1.18.1.6
Cancel-Lock: sha1:I2OhEolrIaw7KEEMO9frKsulPaE=
View all headers

> > I believe that remove-duplicates has to assume that the list is not
> > sorted. Therefore it has to compare each element, element by element
> > ( ~N^2 ). Whereas a side by side goes like 2*N.
> >
> > The only problem I have is excessive consing which significantly slows
> > down the algorithm.
>
> (defun uniquify-sorted-list (list &key (key #'identity) (test #'eql))
> (loop for element in list
> for element-key = (funcall key element)
> for last-element-key = (load-time-value (gensym))
> then element-key
> unless (funcall test element-key last-element-key)
> collect element))

Gauche Scheme

(use srfi-1) ;; pair-fold

(pair-fold
(lambda (xs accum)
(if (and (pair? (cdr xs)) (apply equal? (take xs 2)))
accum
(cons (car xs) accum)))
'()
'(0 0 1 2 3 3 3 4 5 5))

===>
(5 4 3 2 1 0)

(pair-fold
(lambda (xs accum)
(if (and (pair? (cdr xs)) (apply equal? (take xs 2)))
accum
(cons (car xs) accum)))
'()
'(0 1 2 3 3 3 4 5))

===>
(5 4 3 2 1 0)

Subject: Re: Faster remove-duplicates with sorted list.
From: B. Pym
Newsgroups: comp.lang.lisp
Organization: A noiseless patient Spider
Date: Wed, 19 Jun 2024 18:47 UTC
References: 1
Path: eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: No_spamming@noWhere_7073.org (B. Pym)
Newsgroups: comp.lang.lisp
Subject: Re: Faster remove-duplicates with sorted list.
Date: Wed, 19 Jun 2024 18:47:31 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 173
Message-ID: <v4v93u$23sti$1@dont-email.me>
References: <v4usrm$21fur$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Injection-Date: Wed, 19 Jun 2024 20:47:32 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="a0e44262dbf0ff3ef7cd29835841a1f0";
logging-data="2225074"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/SPkEpujLTwBaLPNXPnicl"
User-Agent: XanaNews/1.18.1.6
Cancel-Lock: sha1:bDLjpXdz8BV7fBhLcvvNv+QfgKU=
View all headers

"Pierre R. Mai" wrote:

> (defun uniquify-sorted-list (list &key (key #'identity) (test #'eql))
> (loop for element in list
> for element-key = (funcall key element)
> for last-element-key = (load-time-value (gensym))
> then element-key
> unless (funcall test element-key last-element-key)
> collect element))

Testing under SBCL:

(defun uniquify-sorted-list (list &key (key #'identity) (test #'eql))
(loop for element in list
for element-key = (funcall key element)
for last-element-key = (load-time-value (gensym))
then element-key
unless (funcall test element-key last-element-key)
collect element))

(uniquify-sorted-list '(a b b c d d e f f f g))
===>
(A)

(uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7))
===>
(2)

(uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7) :test #'=)
===>
debugger invoked on a SIMPLE-TYPE-ERROR in thread
#<THREAD "main thread" RUNNING {23EAC0D1}>:
Argument Y is not a NUMBER: #:G5

It seems that except in the first pass through the loop
"last-element-key" is always equal to "element-key".

Again we have proof of the unusability of Common Lisp and, in
particular, Loop.

Again we have proof of the mindless arrogance of fans of
Common Lisp. That complex chunk of code was not tested even
once by its author.

Paul Graham:

I consider Loop one of the worst flaws in CL, and an example
to be borne in mind by both macro writers and language designers.

[In "ANSI Common Lisp", Graham makes the following comments:]

The loop macro was originally designed to help inexperienced
Lisp users write iterative code. Instead of writing Lisp code,
you express your program in a form meant to resemble English,
and this is then translated into Lisp. Unfortunately, loop is
more like English than its designers ever intended: you can
use it in simple cases without quite understanding how it
works, but to understand it in the abstract is almost
impossible.
....
the ANSI standard does not really give a formal specification
of its behavior.
....
The first thing one notices about the loop macro is that it
has syntax. A loop expression contains not subexpressions but
clauses. The clauses are not delimited by parentheses;
instead, each kind has a distinct syntax. In that, loop
resembles traditional Algol-like languages. But the other
distinctive feature of loop, which makes it as unlike Algol as
Lisp, is that the order in which things happen is only
loosely related to the order in which the clauses occur.
....
For such reasons, the use of loop cannot be recommended.

Dan Weinreb, one of the designers of Common Lisp:

.... the problem with LOOP was that it turned out to be hard to
predict what it would do, when you started using a lot of
different facets of LOOP all together. This is a serious problem
since the whole idea of LOOP was to let you use many facets
together; if you're not doing that, LOOP is overkill.

Barry Margolin:

My recommendation is based on seeing many question in the past
of the form "What happens if you use both XXX and YYY in the
same LOOP?" The unfortunate fact is that when we were writing
the standard we didn't have time to nail down all the possible
interactions between different LOOP features, so many of these
are not well specified. And even if we did get it right in
the standard, it's likely to be difficult to find them and I
wouldn't trust that all implementors got it right (many of
those questions were probably from implementors, trying to
figure out what they were supposed to do). And even if they
all got it right, someone reading your code may not be able to
figure it out.

So, with all those potential problems, my feeling is that if
you have to ask, it's probably better to use something other
than LOOP.

John Foderaro:

I'm not trying to join a debate on loop. I just wanted to present
the other side of [the issue so that] the intelligent people can
then weigh the arguments on both sides.

I'm not suggesting that loop can be fixed either by adding
parenthesis or coming up with ways of indenting it to make it
understandable. It's a lost cause.

....

Another great example from kmp:

=== from kmp

For example, you might think
(loop with i = (random 100) for x from 1 to 10 do (print (list i x)))
and
(loop for i = (random 100) for x from 1 to 10 do (print (list i x)))
meant the same in English, [but they don't do the same thing in loop]

=== end kmp

loop lulls you into thinking that you understand the program since
you understand English. Make no mistake about it, loop is its
own language. If you use it you condem everyone who reads the
code to also learn the loop language.

Gauche Scheme

(use gauche.collection) ;; fold2

(define (uniquify-sorted-list lst :key (key identity) (test equal?))
(define (cons-non-dup x accum old-x-key)
(let ((x-key (key x)))
(values
(if (or (null? accum) (not (test x-key old-x-key)))
(cons x accum)
accum)
x-key)))
(reverse
(fold2
cons-non-dup
'() #f
lst)))

(uniquify-sorted-list '(a b b c d d e f f f g))
===>
(a b c d e f g)

(uniquify-sorted-list '(#f a b b c d d e f f f g #f))
===>
(#f a b c d e f g #f)

(uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7))
===>
(2 3 4 5 6 7)

(uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7) :test =)
===>
(2 3 4 5 6 7)

Subject: Re: Faster remove-duplicates with sorted list.
From: Joerg Mertens
Newsgroups: comp.lang.lisp
Organization: A noiseless patient Spider
Date: Fri, 21 Jun 2024 10:22 UTC
References: 1 2
Path: eternal-september.org!news.eternal-september.org!jmertens.eternal-september.org!.POSTED!not-for-mail
From: joerg-mertens@t-online.de (Joerg Mertens)
Newsgroups: comp.lang.lisp
Subject: Re: Faster remove-duplicates with sorted list.
Date: Fri, 21 Jun 2024 12:22:31 +0200
Organization: A noiseless patient Spider
Lines: 51
Message-ID: <v53k98$349qd$1@jmertens.eternal-september.org>
References: <v4usrm$21fur$1@dont-email.me> <v4v93u$23sti$1@dont-email.me>
Injection-Date: Fri, 21 Jun 2024 12:22:34 +0200 (CEST)
Injection-Info: jmertens.eternal-september.org; posting-host="7ae42e8280091382f0aff5976b7e5038";
logging-data="3286872"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19sTVA/U0RlDcrm654E8EiRtXc8CIyQ1kc="
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (OpenBSD/7.5 (amd64)) tinews.pl/1.1.61
Cancel-Lock: sha1:pYsg3WynJkHQnkXgNhjmEzV3koc=
View all headers

B. Pym <No_spamming@nowhere_7073.org> wrote:
> "Pierre R. Mai" wrote:
>
>> (defun uniquify-sorted-list (list &key (key #'identity) (test #'eql))
>> (loop for element in list
>> for element-key = (funcall key element)
>> for last-element-key = (load-time-value (gensym))
>> then element-key
>> unless (funcall test element-key last-element-key)
>> collect element))
>
> Testing under SBCL:
>
> (defun uniquify-sorted-list (list &key (key #'identity) (test #'eql))
> (loop for element in list
> for element-key = (funcall key element)
> for last-element-key = (load-time-value (gensym))
> then element-key
> unless (funcall test element-key last-element-key)
> collect element))
>
> (uniquify-sorted-list '(a b b c d d e f f f g))
> ===>
> (A)
>
> (uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7))
> ===>
> (2)
>
> (uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7) :test #'=)
> ===>
> debugger invoked on a SIMPLE-TYPE-ERROR in thread
> #<THREAD "main thread" RUNNING {23EAC0D1}>:
> Argument Y is not a NUMBER: #:G5

(defun uniquify-sorted-list (list &key (key #'identity) (test #'eql))
(loop for element in list
for last-element-key = nil then element-key
for element-key = (funcall key element)
for result = nil then (funcall test element-key last-element-key)
unless result collect element))

Tested with GNU Clisp:

[3]> (uniquify-sorted-list '(a b b c d d e f f f g))
(A B C D E F G)
[4]> (uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7))
(2 3 4 5 6 7)
[5]> (uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7) :test #'=)
(2 3 4 5 6 7)

Subject: Re: Faster remove-duplicates with sorted list.
From: Joerg Mertens
Newsgroups: comp.lang.lisp
Organization: A noiseless patient Spider
Date: Fri, 21 Jun 2024 19:42 UTC
References: 1 2 3
Path: eternal-september.org!news.eternal-september.org!jmertens.eternal-september.org!.POSTED!not-for-mail
From: joerg-mertens@t-online.de (Joerg Mertens)
Newsgroups: comp.lang.lisp
Subject: Re: Faster remove-duplicates with sorted list.
Date: Fri, 21 Jun 2024 21:42:07 +0200
Organization: A noiseless patient Spider
Lines: 19
Message-ID: <v54l2g$3ai5i$1@jmertens.eternal-september.org>
References: <v4usrm$21fur$1@dont-email.me> <v4v93u$23sti$1@dont-email.me> <v53k98$349qd$1@jmertens.eternal-september.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 21 Jun 2024 21:42:11 +0200 (CEST)
Injection-Info: jmertens.eternal-september.org; posting-host="5fec433df73c4bbd0209172575181d3d";
logging-data="3492038"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19TGFzb8pGl4loiPHq9Tp+08pby065h2ww="
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (OpenBSD/7.5 (amd64)) tinews.pl/1.1.61
Cancel-Lock: sha1:oe/GSlD+f1Hi2F1TjZQvMrYHr+E=
View all headers

Joerg Mertens <joerg-mertens@t-online.de> wrote:

> (loop for element in list
> for last-element-key = nil then element-key
> for element-key = (funcall key element)
> for result = nil then (funcall test element-key last-element-key)
> unless result collect element))

This is clearer:

(defun uniquify-sorted-list (list &key (key #'identity) (test #'eql))
(loop for element in list
for element-key = (funcall key element)
and last-element-key = nil then element-key
for result = nil then (funcall test element-key last-element-key)
unless result collect element))

I forgot that the loop way to step two clauses in parallel is to
combine them with an `and´.

Subject: Re: Faster remove-duplicates with sorted list.
From: steve g
Newsgroups: comp.lang.lisp
Date: Sat, 10 Aug 2024 18:13 UTC
References: 1 2 3
Path: eternal-september.org!news.eternal-september.org!feeder3.eternal-september.org!border-2.nntp.ord.giganews.com!nntp.giganews.com!local-2.nntp.ord.giganews.com!Xl.tags.giganews.com!local-4.nntp.ord.giganews.com!news.giganews.com.POSTED!not-for-mail
NNTP-Posting-Date: Sat, 10 Aug 2024 18:13:19 +0000
From: sgonedes1977@gmail.com (steve g)
Newsgroups: comp.lang.lisp
Subject: Re: Faster remove-duplicates with sorted list.
References: <v4usrm$21fur$1@dont-email.me> <v4v93u$23sti$1@dont-email.me>
<v53k98$349qd$1@jmertens.eternal-september.org>
Date: Sat, 10 Aug 2024 14:13:19 -0400
Message-ID: <8734ncfdq8.fsf@gmail.com>
User-Agent: Gnus/5.13 (Gnus v5.13)
Cancel-Lock: sha1:onspQTgFcZ4SqPUu/eX+RLWk0BY=
MIME-Version: 1.0
Content-Type: text/plain
Lines: 13
X-Usenet-Provider: http://www.giganews.com
X-Trace: sv3-J9Szw9TB8AHLBB9gAzS2MbJE3sD7C7CQ8H5GRJy+odK1pCpty59riiMQPdUW2y6Mo2Tirp/5QdHN72u!H1SmqyUpbP0T4SAk5jZmLpWFmgy0MlK5CpWseFAh8RG2YFY=
X-Complaints-To: abuse@giganews.com
X-DMCA-Notifications: http://www.giganews.com/info/dmca.html
X-Abuse-and-DMCA-Info: Please be sure to forward a copy of ALL headers
X-Abuse-and-DMCA-Info: Otherwise we will be unable to process your complaint properly
X-Postfilter: 1.3.40
View all headers

Joerg Mertens <joerg-mertens@t-online.de> writes:

> Tested with GNU Clisp:
>
> [3]> (uniquify-sorted-list '(a b b c d d e f f f g))
> (A B C D E F G)
> [4]> (uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7))
> (2 3 4 5 6 7)
> [5]> (uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7) :test #'=)
> (2 3 4 5 6 7)

(remove-duplicates '(1 2 3 2 5 a 2) )

1

rocksolid light 0.9.8
clearnet tor