Subject: Re: why is this procedure is so slow?
From: steve
Newsgroups: comp.lang.lisp
Date: Thu, 16 May 2024 20:41 UTC
References: 1
NNTP-Posting-Date: Thu, 16 May 2024 20:41:49 +0000
From: sgonedes1977@gmail.com (steve)
Newsgroups: comp.lang.lisp
Subject: Re: why is this procedure is so slow?
References: <87sf22ykaw.fsf@yaxenu.org>
Date: Thu, 16 May 2024 16:41:43 -0400
Message-ID: <87v83dxzeg.fsf@gmail.com>
User-Agent: Gnus/5.13 (Gnus v5.13)
Cancel-Lock: sha1:BPe4mIiELJf14pThxVFf/n8nw/I=
MIME-Version: 1.0
Content-Type: text/plain
Lines: 115

Julieta Shem <jshem@yaxenu.org> writes:

> First I define x to be a list of bytes of an NNTP article.
>
> (defvar x (fetch-article "local.test" "28"))
>
> Now x is a list of integers, the bytes of the article 28, which is a
> MIME message containing a PDF of 610 KiB.
>
> Now I want to split it line by line. I am able to do it, but it's very
> slow.
>
> * (time (length (split-sequence (list 13 10) x nil)))
> Evaluation took:
> 65.710 seconds of real time
> 65.671875 seconds of total run time (47.093750 user, 18.578125 system)
> [ Run times consist of 23.968 seconds GC time, and 41.704 seconds non-GC time. ]
> 99.94% CPU
> 170,322,749,168 processor cycles
> 79,439,358,864 bytes consed
>
> 11585
> *
>
> Less than 12k lines. The number 79,439,358,864 of bytes is way more
> than I would have expected for anything regarding this invocation above,
> but I have no idea what this number really measures here. I appreciate
> any help. Thank you.
>
> Here's the procedure:
>
> (defun split-sequence (delim ls acc &key limit (so-far 1))
>   (let* ((len (length ls))
>          (delim delim)
>          (pos (search delim ls))
>          (n-take (or pos len))
>          (n-drop (if pos
>                      (+ n-take (length delim))
>                      n-take)))
>     (cond ((zerop len) acc)
>           ((and limit (= so-far limit)) (list ls))
>           (t (split-sequence
>               delim (drop n-drop ls)
>               (cons (take n-take ls) acc)
>               :limit limit
>               :so-far (1+ so-far))))))
>
> (defun take (n seq) (subseq seq 0 n))
> (defun drop (n seq) (subseq seq n))
>
> (*) A sample of the article
>
> All lines in it should be pretty short.
>
> Message-Id: <tnkqcqnuujaljsvmzvuc@loop>
> Content-Type: multipart/mixed; boundary="------------PeB0GiqcER01ZhCmBvnP2yr6"
> Date: Wed, 7 Feb 2024 22:22:57 -0300
> Mime-Version: 1.0
> User-Agent: Mozilla Thunderbird
> Newsgroups: local.test
> Content-Language: en-US
> From: Someone <someone@somewhere.org>
> Subject: juris hartmanis
>
> This is a multi-part message in MIME format.
> --------------PeB0GiqcER01ZhCmBvnP2yr6
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 7bit
>
> --------------PeB0GiqcER01ZhCmBvnP2yr6
> Content-Type: application/pdf;
> name="juris-hartmanis-godel-von-neumann-1956-89-994.pdf"
> Content-Disposition: attachment;
> filename="juris-hartmanis-godel-von-neumann-1956-89-994.pdf"
> Content-Transfer-Encoding: base64
>
> JVBERi0xLjIKekdf1fnfSqQYt7AjczYfpmRSIEyEcx8KMSAwIG9iago8PAovVHlwZSAvQ2F0
> YWxvZwovUGFnZXMgMyAwIFIKL091dGxpbmVzIDIgMCBSCj4+CmVuZG9iagoyIDAgb2JqCjw8
> [...]
> IDAwMDAwIG4gCjAwMDA2MjI5NTQgMDAwMDAgbiAKdHJhaWxlcgo8PAovU2l6ZSA0NgovUm9v
> dCAxIDAgUgovSW5mbyA0NSAwIFIKPj4Kc3RhcnR4cmVmCjYyMzI4NgolJUVPRgo=
>
> --------------PeB0GiqcER01ZhCmBvnP2yr6--
>
> Here's the first 1000 bytes of it.
>
> * (take 1000 x)

> (77 101 115 115 97 103 101 45 73 100 58 32 60 116 110 107 113 99 113 110 117
> 117 106 97 108 106 115 118 109 122 118 117 99 64 108 111 111 112 62 13 10 67
[ ... ]

> 103 98 50 74 113 67 106 119 56 13 10 67 105 57 85 101 88 66 108 73 67 57 80
> 100 88 82 115 97 87 53 108 99 119 111 118 81 50 57 49 98 110 81 103 77 84 65
> 75 76 48 90 112 99 110 78 48 73 68 77 49 73 68 65 103 85 103 111 118 84 71 70
> 122 100 67 65 48 78 67 65 119 73 70 73 75 13 10 80 106 52 75 90 87 53 107 98
> 50 74 113 67 106 77 103 77 67 66 118 89 109 111)

Here is something that may help when printing the list:

(let ((*print-right-margin* 80))
  (pprint var))

There are more options, such as *print-length* and *print-lines*; see
the pretty printer documentation.
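As a rough sketch of how those printer options combine, the helper below prints only a short, wrapped preview of a long list. The name preview and its keyword arguments are made up for illustration; only the *print-...* variables and pprint are standard.

```lisp
;; PREVIEW is a hypothetical helper, not part of the original post.
;; It rebinds standard printer variables around a call to PPRINT.
(defun preview (object &key (margin 80) (max-elements 40)
                            (stream *standard-output*))
  (let ((*print-right-margin* margin)   ; wrap output at this column
        (*print-length* max-elements))  ; elide longer lists with "..."
    (pprint object stream)))

;; Example: show at most 3 elements of a list.
;; (preview '(1 2 3 4 5) :max-elements 3)
```

With *print-length* bound, the printer stops after that many elements and writes "..." in place of the rest, so there should be no need to call take just to look at the data.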

Also, subseq copies the list every time it is called, so each call to
take and drop allocates a fresh list. It is best to wrap it up in a
struct for the printer.
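To make the copying concrete: in the quoted split-sequence, every recursive step calls drop, which copies everything after the current line, and also calls length and search on the remaining list. Over 11585 lines that repeated copying likely accounts for most of the bytes consed. A minimal sketch of a single pass that conses each byte only once is below. The name split-list-on-crlf is an assumption, the delimiter is hard-coded to 13 10, the :limit option is not implemented, and lines come back in article order rather than reversed.

```lisp
;; Sketch only: SPLIT-LIST-ON-CRLF is a hypothetical name, and the
;; delimiter is fixed to CR LF (13 10) to keep the example short.
(defun split-list-on-crlf (bytes)
  "Split BYTES, a list of integers, at each 13 10 pair.
Returns a list of lines, each line a fresh list of integers."
  (let ((lines '())
        (line '()))
    (loop while bytes
          do (if (and (eql (first bytes) 13)
                      (eql (second bytes) 10))
                 ;; End of line: store it and skip the two delimiter bytes.
                 (progn (push (nreverse line) lines)
                        (setf line '()
                              bytes (cddr bytes)))
                 ;; Ordinary byte: move it onto the current line.
                 (push (pop bytes) line)))
    ;; Keep trailing bytes that are not terminated by CR LF.
    (when line (push (nreverse line) lines))
    (nreverse lines)))

;; (split-list-on-crlf '(72 105 13 10 79 75 13 10))
```

The caller's list is not modified; only the local variable bytes is advanced.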

You could use rplacd (setcdr is the Emacs Lisp name) to modify the list
destructively instead of copying it. I don't really understand what you
are trying to do, though. Maybe use a string or a vector instead of a
list?
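Both ideas can be sketched roughly as follows. The function names are assumptions, the delimiter is fixed to 13 10, and neither version handles the :limit option of the quoted code. The first cuts the original list in place with rplacd, so it allocates only the list of lines; the second works on a vector and uses the :start2 argument of search so that nothing before the current position is rescanned or copied.

```lisp
;; Sketch only: both function names are hypothetical.

(defun nsplit-list-on-crlf (bytes)
  "Destructively split the list BYTES at each 13 10 pair.
The returned lines share cons cells with BYTES, which is ruined."
  (let ((lines '())
        (line-start bytes)   ; first cons of the current line
        (prev nil)           ; last cons of the current line, if any
        (cell bytes))
    (loop while cell
          do (if (and (eql (car cell) 13) (eql (cadr cell) 10))
                 (let ((next (cddr cell)))
                   (cond (prev (rplacd prev nil)      ; cut the line off
                               (push line-start lines))
                         (t (push '() lines)))        ; empty line
                   (setf line-start next
                         cell next
                         prev nil))
                 (setf prev cell
                       cell (cdr cell))))
    (when line-start (push line-start lines))
    (nreverse lines)))

(defun split-vector-on-crlf (bytes)
  "Split the vector BYTES at each 13 10 pair; return a list of vectors."
  (let ((delim #(13 10))
        (lines '())
        (start 0))
    (loop for pos = (search delim bytes :start2 start)
          while pos
          do (push (subseq bytes start pos) lines)
             (setf start (+ pos (length delim))))
    ;; Keep trailing bytes that are not terminated by CR LF.
    (when (< start (length bytes))
      (push (subseq bytes start) lines))
    (nreverse lines)))

;; A list can be converted once up front, for example:
;; (split-vector-on-crlf (coerce x '(vector (unsigned-byte 8))))
```

The destructive version should only be given a list that nothing else needs afterwards, and never a quoted literal. The vector version copies each line once with subseq, which is linear in the size of the article overall.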
