Rocksolid Light




Subject  Author
* why is this procedure is so slow?  Julieta Shem
+- Re: why is this procedure is so slow?  Spiros Bousbouras
+* Re: why is this procedure is so slow?  Raymond Wiker
|+* Re: why is this procedure is so slow?  Spiros Bousbouras
||`- Re: why is this procedure is so slow?  Spiros Bousbouras
|`* Re: why is this procedure is so slow?  Julieta Shem
| +* Re: why is this procedure is so slow?  Raymond Wiker
| |+* Re: why is this procedure is so slow?  Spiros Bousbouras
| ||`* Re: why is this procedure is so slow?  Julieta Shem
| || `* Re: why is this procedure is so slow?  Spiros Bousbouras
| ||  `* Re: why is this procedure is so slow?  Julieta Shem
| ||   +* Re: why is this procedure is so slow?  Alan Bawden
| ||   |`* Re: why is this procedure is so slow?  Jeff Barnett
| ||   | +* Re: why is this procedure is so slow?  Ben Bacarisse
| ||   | |+* Re: why is this procedure is so slow?  Spiros Bousbouras
| ||   | ||`- Re: why is this procedure is so slow?  Ben Bacarisse
| ||   | |+* Re: why is this procedure is so slow?  Jeff Barnett
| ||   | ||`* Re: why is this procedure is so slow?  Ben Bacarisse
| ||   | || `* Re: why is this procedure is so slow?  Jeff Barnett
| ||   | ||  `* Re: why is this procedure is so slow?  Ben Bacarisse
| ||   | ||   `* Re: why is this procedure is so slow?  Jeff Barnett
| ||   | ||    `- Re: why is this procedure is so slow?  Ben Bacarisse
| ||   | |`- Re: why is this procedure is so slow?  Spiros Bousbouras
| ||   | +* Re: why is this procedure is so slow?  Spiros Bousbouras
| ||   | |`- Re: why is this procedure is so slow?  Spiros Bousbouras
| ||   | `- Re: why is this procedure is so slow?  Alan Bawden
| ||   `- Re: why is this procedure is so slow?  Spiros Bousbouras
| |`- Re: why is this procedure is so slow?  Julieta Shem
| `* Re: why is this procedure is so slow?  Spiros Bousbouras
|  +- Re: why is this procedure is so slow?  Julieta Shem
|  `- Re: why is this procedure is so slow?  Lawrence D'Oliveiro
`- Re: why is this procedure is so slow?  Kaz Kylheku

Re: why is this procedure is so slow?

<87h6idmmku.fsf@bsb.me.uk>


https://www.rocksolidbbs.com/devel/article-flat.php?id=18008&group=comp.lang.lisp#18008

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: ben.usenet@bsb.me.uk (Ben Bacarisse)
Newsgroups: comp.lang.lisp
Subject: Re: why is this procedure is so slow?
Date: Mon, 12 Feb 2024 20:53:05 +0000
Organization: A noiseless patient Spider
Lines: 64
Message-ID: <87h6idmmku.fsf@bsb.me.uk>
References: <87sf22ykaw.fsf@yaxenu.org> <m24jeiyeb1.fsf@MacBook-Pro-2.home>
<87v86xpy8a.fsf@yaxenu.org> <m2a5o9h1bw.fsf@MacBook-Pro-2.home>
<ogD0Mv+vfv1CrR+QE@bongo-ra.co> <87mss8idlr.fsf@yaxenu.org>
<rjGq+aS7Ay3xAtLwb@bongo-ra.co> <87il2wdyqy.fsf@yaxenu.org>
<86le7rqycf.fsf@williamsburg.bawden.org> <uq9t8d$spnu$1@dont-email.me>
<87wmrbmgju.fsf@bsb.me.uk> <uqc7e8$1c368$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain
Injection-Info: dont-email.me; posting-host="8ae65d6fdeba3daedfa5f3f09e667b97";
logging-data="1801091"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18Vq6Wj9pVJHq9KifNYXQJMzVtsS18OSDM="
User-Agent: Gnus/5.13 (Gnus v5.13)
Cancel-Lock: sha1:Pp7rdgRIQZa9/GLoWk2SCt9uu24=
sha1:fcoMqy4g/KPQMltrK77khbEBkCg=
X-BSB-Auth: 1.f994db2bf4fd858fc0cb.20240212205306GMT.87h6idmmku.fsf@bsb.me.uk
 by: Ben Bacarisse - Mon, 12 Feb 2024 20:53 UTC

Jeff Barnett <jbb@notatt.com> writes:

> On 2/11/2024 3:38 AM, Ben Bacarisse wrote:
>> Jeff Barnett <jbb@notatt.com> writes:
>>
>>> On 2/11/2024 12:00 AM, Alan Bawden wrote:
>>>> Julieta Shem <jshem@yaxenu.org> writes:
>>>> Vectors are not convenient when we want to build something one
>>>> element
>>>> at a time. I think coercing a list into a vector is O(n), but that
>>>> seems inevitable because to make a vector we need to know the size and
>>>> we don't know the size up front.
>>>> This is why we have arrays with fill pointers. You should read up on
>>>> how arrays (and vectors) really work, with an eye on eventually being
>>>> able to use VECTOR-PUSH-EXTEND on a array that's specialized to contain
>>>> just 8-bit bytes.
>>>
>>> N.B. The use of vector-push-extend, where extend happens many (for many
>>> definitions of "many") times, will consume O(n^2) time AND probably memory
>>> too. There is no free (or even cheap) lunch when you don't have a good
>>> estimate of final total size up front.
>> Is this true for real implementations? I would have thought they would
>> typically use an exponential allocation strategy to give what is often
>> called "amortised O(n)" growth.
>> That is certainly what I found in a quick experiment using clisp:
>> (defun build-vec (size)
>> (let ((a (make-array 0 :adjustable t :fill-pointer t)))
>> (dotimes (i size (length a)) (vector-push-extend i a))))
>> shows linear growth in execution time.
>
> Every time you do a rebuild initiated by a push-extend you end up copying
> all the current items in the vector;

Why? Having to copy everything, even in a tight reallocation loop like
the one I used, will be a problem.
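[For what it's worth, the arithmetic behind "amortised O(n)" is easy to sketch. Under a hypothetical doubling policy (nothing the standard mandates), each reallocation copies the old capacity's worth of elements, and those copies telescope to less than n in total:]

```lisp
;; Sketch: total elements copied while the capacity grows 1, 2, 4, ...
;; up to at least N under a (hypothetical) doubling policy.
(defun total-copies (n)
  (loop for cap = 1 then (* 2 cap)
        while (< cap n)
        sum cap))

;; (total-copies 8)           ; => 7, i.e. 1 + 2 + 4
;; (total-copies (expt 2 23)) ; => 8388607, just under n itself
```

[So the copying costs only O(1) work per push on average, which is consistent with the linear timings reported in this thread.]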

> and each time you do the copy, more
> items are copied than for the last copy. This leads, finally, to O(n^2)
> complexity. And as to storage, you must look at all the allocations you
> have done and the overhead to either GC or tack them into your
> heap. Usually, you only notice this degradation if you push a whole lot of
> items - recall, O(.) is really an asymptotic measure. Lots of tricks will
> hide the problem for a while but one usually gets bit in the end.
>
> When you ran build-vec, did you let size be a few billion? What
> happened to run time?

No, I could only go up to (expt 2 23) before clisp seg faulted (surely a
bug?). That's only about 8 million items, but 134 million bytes of
storage. The time taken was almost perfectly linear.
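[A rough harness along these lines is enough to eyeball the per-element cost at doubling sizes; this is only a sketch using GET-INTERNAL-REAL-TIME, with no attempt to control for GC noise:]

```lisp
(defun build-vec (size)
  (let ((a (make-array 0 :adjustable t :fill-pointer t)))
    (dotimes (i size (length a))
      (vector-push-extend i a))))

;; Internal time units spent per element pushed.  A roughly flat value
;; across doubling sizes suggests amortised O(n) growth overall.
(defun time-per-element (size)
  (let ((start (get-internal-real-time)))
    (build-vec size)
    (/ (- (get-internal-real-time) start) (float size))))

;; (loop for n = 4096 then (* 2 n)
;;       repeat 10
;;       collect (time-per-element n))
```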

> My guess is that most storage heap management schemes take about
> the same amount of time and storage to allocate a few bytes and a few
> thousand bytes. The growth effect, if it is real in the implementation you
> are using for testing, must build vectors many many times longer than the
> internal allocation chunk size to see the quadratic effect.

Can you post an example? I don't know how to get either clisp or SBCL
to go much larger. They both have options which would seem to control
this but I can't seem to get them to cooperate! Any help would be much
appreciated.

--
Ben.

Re: why is this procedure is so slow?

<0HHBWJbFZpW4pdXpr@bongo-ra.co>


https://www.rocksolidbbs.com/devel/article-flat.php?id=18010&group=comp.lang.lisp#18010

Path: i2pn2.org!i2pn.org!newsfeed.endofthelinebbs.com!nyheter.lysator.liu.se!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: spibou@gmail.com (Spiros Bousbouras)
Newsgroups: comp.lang.lisp
Subject: Re: why is this procedure is so slow?
Date: Mon, 12 Feb 2024 22:21:17 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 37
Message-ID: <0HHBWJbFZpW4pdXpr@bongo-ra.co>
References: <87sf22ykaw.fsf@yaxenu.org> <m24jeiyeb1.fsf@MacBook-Pro-2.home> <87v86xpy8a.fsf@yaxenu.org>
<m2a5o9h1bw.fsf@MacBook-Pro-2.home> <ogD0Mv+vfv1CrR+QE@bongo-ra.co> <87mss8idlr.fsf@yaxenu.org>
<rjGq+aS7Ay3xAtLwb@bongo-ra.co> <87il2wdyqy.fsf@yaxenu.org> <86le7rqycf.fsf@williamsburg.bawden.org>
<uq9t8d$spnu$1@dont-email.me> <87wmrbmgju.fsf@bsb.me.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 12 Feb 2024 22:21:17 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="4735e20e02677e58ec38222b2752a50f";
logging-data="1829799"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/jLJti6iid1XvxoFgSZ6ls"
Cancel-Lock: sha1:jmwGlCVEnpBLKBtv20XNVvF2KoM=
X-Server-Commands: nowebcancel
In-Reply-To: <87wmrbmgju.fsf@bsb.me.uk>
X-Organisation: Weyland-Yutani
 by: Spiros Bousbouras - Mon, 12 Feb 2024 22:21 UTC

On Sun, 11 Feb 2024 10:38:45 +0000
Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:
> Is this true for real implementations? I would have thought they would
> typically use an exponential allocation strategy to give what is often
> called "amortised O(n)" growth.
>
> That is certainly what I found in a quick experiment using clisp:
>
> (defun build-vec (size)
> (let ((a (make-array 0 :adjustable t :fill-pointer t)))
> (dotimes (i size (length a)) (vector-push-extend i a))))
>
> shows linear growth in execution time.

[17]> (build-vec 16000000)

*** - VECTOR-PUSH-EXTEND: extending the vector by 8388608 elements makes it too long
The following restarts are available:
ABORT :R1 Abort main loop
Break 1 [18]> :r1
[19]> most-positive-fixnum
16777215

I tried the following variation :

(defvar b 11)

(defun build-vec (size)
  (let ((a (make-array 0 :adjustable t :fill-pointer t)))
    (dotimes (i size (length a)) (setq b i) (vector-push-extend i a))))

I get the same error message and in the debugger the value of b is
8388608. These experiments suggest an exponential reallocation policy.
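[One can also watch the capacity directly rather than inferring it from the error, since ARRAY-DIMENSION reports the allocated length while LENGTH reports the fill pointer. A sketch:]

```lisp
;; Record the allocated capacity after every reallocation while pushing
;; N elements.  Roughly doubling values would corroborate an
;; exponential policy; the exact numbers are implementation-dependent.
(defun growth-steps (n)
  (let ((a (make-array 0 :adjustable t :fill-pointer t))
        (caps '()))
    (dotimes (i n (nreverse caps))
      (let ((cap (array-dimension a 0)))
        (vector-push-extend i a)
        (unless (= cap (array-dimension a 0))
          (push (array-dimension a 0) caps))))))

;; (growth-steps 100) ; a strictly increasing list whose last entry is >= 100
```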

--
vlaho.ninja/menu

Re: why is this procedure is so slow?

<uqeqa6$1ukop$1@dont-email.me>


https://www.rocksolidbbs.com/devel/article-flat.php?id=18012&group=comp.lang.lisp#18012

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: jbb@notatt.com (Jeff Barnett)
Newsgroups: comp.lang.lisp
Subject: Re: why is this procedure is so slow?
Date: Mon, 12 Feb 2024 21:13:56 -0700
Organization: A noiseless patient Spider
Lines: 101
Message-ID: <uqeqa6$1ukop$1@dont-email.me>
References: <87sf22ykaw.fsf@yaxenu.org> <m24jeiyeb1.fsf@MacBook-Pro-2.home>
<87v86xpy8a.fsf@yaxenu.org> <m2a5o9h1bw.fsf@MacBook-Pro-2.home>
<ogD0Mv+vfv1CrR+QE@bongo-ra.co> <87mss8idlr.fsf@yaxenu.org>
<rjGq+aS7Ay3xAtLwb@bongo-ra.co> <87il2wdyqy.fsf@yaxenu.org>
<86le7rqycf.fsf@williamsburg.bawden.org> <uq9t8d$spnu$1@dont-email.me>
<87wmrbmgju.fsf@bsb.me.uk> <uqc7e8$1c368$1@dont-email.me>
<87h6idmmku.fsf@bsb.me.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 13 Feb 2024 04:13:59 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="6bdd3ad7d74f56e68d9eac1573e2bd10";
logging-data="2052889"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19PgeF5oVc5Y5j2FVT2KaHmUHtbo4Ii3lI="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:fPQn0EGvN4Z5F6yl4uJ2ZVSPNlY=
In-Reply-To: <87h6idmmku.fsf@bsb.me.uk>
Content-Language: en-US
 by: Jeff Barnett - Tue, 13 Feb 2024 04:13 UTC

On 2/12/2024 1:53 PM, Ben Bacarisse wrote:
> Jeff Barnett <jbb@notatt.com> writes:
>
>> On 2/11/2024 3:38 AM, Ben Bacarisse wrote:
>>> Jeff Barnett <jbb@notatt.com> writes:
>>>
>>>> On 2/11/2024 12:00 AM, Alan Bawden wrote:
>>>>> Julieta Shem <jshem@yaxenu.org> writes:
>>>>> Vectors are not convenient when we want to build something one
>>>>> element
>>>>> at a time. I think coercing a list into a vector is O(n), but that
>>>>> seems inevitable because to make a vector we need to know the size and
>>>>> we don't know the size up front.
>>>>> This is why we have arrays with fill pointers. You should read up on
>>>>> how arrays (and vectors) really work, with an eye on eventually being
>>>>> able to use VECTOR-PUSH-EXTEND on a array that's specialized to contain
>>>>> just 8-bit bytes.
>>>>
>>>> N.B. The use of vector-push-extend, where extend happens many (for many
>>>> definitions of "many") times, will consume O(n^2) time AND probably memory
>>>> too. There is no free (or even cheap) lunch when you don't have a good
>>>> estimate of final total size up front.
>>> Is this true for real implementations? I would have thought they would
>>> typically use an exponential allocation strategy to give what is often
>>> called "amortised O(n)" growth.
>>> That is certainly what I found in a quick experiment using clisp:
>>> (defun build-vec (size)
>>> (let ((a (make-array 0 :adjustable t :fill-pointer t)))
>>> (dotimes (i size (length a)) (vector-push-extend i a))))
>>> shows linear growth in execution time.
>>
>> Every time you do a rebuild initiated by a push-extend you end up copying
>> all the current items in the vector;
>
> Why? Having to copy everything, even in a tight reallocation loop like
> the one I used, will be a problem.
>
>> and each time you do the copy, more
>> items are copied than for the last copy. This leads, finally, to O(n^2)
>> complexity. And as to storage, you must look at all the allocations you
>> have done and the overhead to either GC or tack them into your
>> heap. Usually, you only notice this degradation if you push a whole lot of
>> items - recall, O(.) is really an asymptotic measure. Lots of tricks will
>> hide the problem for a while but one usually gets bit in the end.
>>
>> When you ran build-vec, did you let size be a few billion? What
>> happened to run time?
>
> No, I could only go up to (expt 2 23) before clisp seg faulted (surely a
> bug?). That's only about 8 million items, but 134 million bytes of
> storage. The time taken was almost perfectly linear.
>
>> My guess is that most storage heap management schemes take about
>> the same amount of time and storage to allocate a few bytes and a few
>> thousand bytes. The growth effect, if it is real in the implementation you
>> are using for testing, must build vectors many many times longer than the
>> internal allocation chunk size to see the quadratic effect.
>
> Can you post an example? I don't know how to get either clisp or SBCL
> to go much larger. They both have options which would seem to control
> this but I can't seem to get them to cooperate! Any help would be much
> appreciated.

At the moment I have no system to build executable code. About a year
ago, I contacted Franz Lisp to see what they wanted for a non commercial
license - what they used to lease to me years ago as some sort of
academic thing. Unfortunately, the price was more than I wanted to pay
to putter around so I passed. At present, I'm working from the back of
an envelope:

I don't know if it will help but I'll tell you a few of the assumptions
I made leading to what I wrote above.

1. Most variants of push-extend want to maintain the use of indexing to
find the nth entry. That basically means that you demand that all memory
used be contiguous (in the address space). There are ways around this by
using heap (as in heap sort) algorithms but retrieval times go from O(1)
to O(log size).

2. Many variants of Malloc/Free set a minimum size, under the table, for
the blocks they actually maintain: a size like 256 bytes or the same
size as the virtual address mechanism handles, e.g., 4k bytes. If this
is the case, you need REALLY big test cases that get to the point where
you go quadratic in these size chunks.

3. If you have a good handle on the structure (its ultimate size, growth
characteristics, and reference patterns) you can develop or find
algorithms that are better than I'm estimating above. However, and this
is where the gotchas come in, using these specialized approaches in
something like a general purpose Lisp push-extend primitive is all
wrong: the overhead will eat little applications alive and cause one to
wonder what's wrong with the system. Microseconds will turn into
milliseconds, etc. Of course one could take the same approach taken by
most Lisp hash packages: implement hash as an object type that can not
only grow but morph its type as the table gets larger. For example, a
hash table is originally implemented (under the table) as an assoc list;
when the number of items crosses a threshold another set of methods
provide hash functionality. And so on.
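[That morphing scheme can be sketched in a few lines; the MTABLE- names and *MORPH-THRESHOLD* are made up for illustration, and real hash packages are more careful about counts, thresholds, and rehashing:]

```lisp
;; Sketch: a table that starts life as an assoc list and morphs into a
;; hash table once it crosses a size threshold.
(defparameter *morph-threshold* 8)

(defstruct mtable
  (rep '())   ; an alist below the threshold, a hash table above it
  (count 0))

(defun mtable-get (key tab)
  (let ((rep (mtable-rep tab)))
    (if (hash-table-p rep)
        (gethash key rep)
        (cdr (assoc key rep :test #'equal)))))

(defun mtable-put (key val tab)
  (let ((rep (mtable-rep tab)))
    (if (hash-table-p rep)
        (setf (gethash key rep) val)
        (progn
          (push (cons key val) (mtable-rep tab))
          (when (> (incf (mtable-count tab)) *morph-threshold*)
            ;; Morph: rebuild oldest-first so newer entries win.
            (let ((h (make-hash-table :test #'equal)))
              (dolist (pair (reverse (mtable-rep tab)))
                (setf (gethash (car pair) h) (cdr pair)))
              (setf (mtable-rep tab) h))))))
  val)
```

[The point of the design is that ASSOC is cheap for short lists, and the hash machinery's constant factors are only paid once the table is big enough to repay them.]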
--
Jeff Barnett

Re: why is this procedure is so slow?

<87le7olaea.fsf@bsb.me.uk>


https://www.rocksolidbbs.com/devel/article-flat.php?id=18013&group=comp.lang.lisp#18013

Path: i2pn2.org!i2pn.org!news.niel.me!news.gegeweb.eu!gegeweb.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: ben.usenet@bsb.me.uk (Ben Bacarisse)
Newsgroups: comp.lang.lisp
Subject: Re: why is this procedure is so slow?
Date: Tue, 13 Feb 2024 14:13:49 +0000
Organization: A noiseless patient Spider
Lines: 77
Message-ID: <87le7olaea.fsf@bsb.me.uk>
References: <87sf22ykaw.fsf@yaxenu.org> <m24jeiyeb1.fsf@MacBook-Pro-2.home>
<87v86xpy8a.fsf@yaxenu.org> <m2a5o9h1bw.fsf@MacBook-Pro-2.home>
<ogD0Mv+vfv1CrR+QE@bongo-ra.co> <87mss8idlr.fsf@yaxenu.org>
<rjGq+aS7Ay3xAtLwb@bongo-ra.co> <87il2wdyqy.fsf@yaxenu.org>
<86le7rqycf.fsf@williamsburg.bawden.org> <uq9t8d$spnu$1@dont-email.me>
<87wmrbmgju.fsf@bsb.me.uk> <uqc7e8$1c368$1@dont-email.me>
<87h6idmmku.fsf@bsb.me.uk> <uqeqa6$1ukop$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain
Injection-Info: dont-email.me; posting-host="7afb96008295e9afd859ffb3bddfc990";
logging-data="2229708"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+l4gubYc0I4ebUqKUNTqx3aujhQpgj9rc="
User-Agent: Gnus/5.13 (Gnus v5.13)
Cancel-Lock: sha1:114vkZX1bU8EAuXR77LcdZK3t0k=
sha1:F/1S9URTPIv5ap+t51QKJGoAvXM=
X-BSB-Auth: 1.ebe31fde507eaf794e5a.20240213141349GMT.87le7olaea.fsf@bsb.me.uk
 by: Ben Bacarisse - Tue, 13 Feb 2024 14:13 UTC

Jeff Barnett <jbb@notatt.com> writes:

> On 2/12/2024 1:53 PM, Ben Bacarisse wrote:
>> Jeff Barnett <jbb@notatt.com> writes:
>>
>>> On 2/11/2024 3:38 AM, Ben Bacarisse wrote:
>>>> Jeff Barnett <jbb@notatt.com> writes:
>>>>
>>>>> On 2/11/2024 12:00 AM, Alan Bawden wrote:
>>>>>> Julieta Shem <jshem@yaxenu.org> writes:
>>>>>> Vectors are not convenient when we want to build something one
>>>>>> element
>>>>>> at a time. I think coercing a list into a vector is O(n), but that
>>>>>> seems inevitable because to make a vector we need to know the size and
>>>>>> we don't know the size up front.
>>>>>> This is why we have arrays with fill pointers. You should read up on
>>>>>> how arrays (and vectors) really work, with an eye on eventually being
>>>>>> able to use VECTOR-PUSH-EXTEND on a array that's specialized to contain
>>>>>> just 8-bit bytes.
>>>>>
>>>>> N.B. The use of vector-push-extend, where extend happens many (for many
>>>>> definitions of "many") times, will consume O(n^2) time AND probably memory
>>>>> too. There is no free (or even cheap) lunch when you don't have a good
>>>>> estimate of final total size up front.
>>>> Is this true for real implementations? I would have thought they would
>>>> typically use an exponential allocation strategy to give what is often
>>>> called "amortised O(n)" growth.
>>>> That is certainly what I found in a quick experiment using clisp:
>>>> (defun build-vec (size)
>>>> (let ((a (make-array 0 :adjustable t :fill-pointer t)))
>>>> (dotimes (i size (length a)) (vector-push-extend i a))))
>>>> shows linear growth in execution time.
>>>
>>> Every time you do a rebuild initiated by a push-extend you end up copying
>>> all the current items in the vector;
>> Why? Having to copy everything, even in a tight reallocation loop like
>> the one I used, will be a problem.
>>
>>> and each time you do the copy, more
>>> items are copied than for the last copy. This leads, finally, to O(n^2)
>>> complexity. And as to storage, you must look at all the allocations you
>>> have done and the overhead to either GC or tack them into your
>>> heap. Usually, you only notice this degradation if you push a whole lot of
>>> items - recall, O(.) is really an asymptotic measure. Lots of tricks will
>>> hide the problem for a while but one usually gets bit in the end.
>>>
>>> When you ran build-vec, did you let size be a few billion? What
>>> happened to run time?
>> No, I could only go up to (expt 2 23) before clisp seg faulted (surely a
>> bug?). That's only about 8 million items, but 134 million bytes of
>> storage. The time taken was almost perfectly linear.
>>
>>> My guess is that most storage heap management schemes take about
>>> the same amount of time and storage to allocate a few bytes and a few
>>> thousand bytes. The growth effect, if it is real in the implementation you
>>> are using for testing, must build vectors many many times longer than the
>>> internal allocation chunk size to see the quadratic effect.
>> Can you post an example? I don't know how to get either clisp or SBCL
>> to go much larger. They both have options which would seem to control
>> this but I can't seem to get them to cooperate! Any help would be much
>> appreciated.
....
> I don't know if it will help but I'll tell you a few of the assumptions I
> made leading to what I wrote above.

Thanks for what you wrote, but I was hoping for some help to extend my
tests so I could see the effects you think I should be seeing but which I
was not. I saw linear growth up to the point where the implementations
crapped out and I could not see how to extend them.

By the way, I don't think one can assume that in-place reallocation with
an exponential growth strategy is always going to be in place. I was
just suggesting that it's likely to be what is used in some
implementations.

--
Ben.

Re: why is this procedure is so slow?

<uqgohs$297e9$1@dont-email.me>


https://www.rocksolidbbs.com/devel/article-flat.php?id=18014&group=comp.lang.lisp#18014

Path: i2pn2.org!i2pn.org!usenet.network!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: jbb@notatt.com (Jeff Barnett)
Newsgroups: comp.lang.lisp
Subject: Re: why is this procedure is so slow?
Date: Tue, 13 Feb 2024 14:56:10 -0700
Organization: A noiseless patient Spider
Lines: 99
Message-ID: <uqgohs$297e9$1@dont-email.me>
References: <87sf22ykaw.fsf@yaxenu.org> <m24jeiyeb1.fsf@MacBook-Pro-2.home>
<87v86xpy8a.fsf@yaxenu.org> <m2a5o9h1bw.fsf@MacBook-Pro-2.home>
<ogD0Mv+vfv1CrR+QE@bongo-ra.co> <87mss8idlr.fsf@yaxenu.org>
<rjGq+aS7Ay3xAtLwb@bongo-ra.co> <87il2wdyqy.fsf@yaxenu.org>
<86le7rqycf.fsf@williamsburg.bawden.org> <uq9t8d$spnu$1@dont-email.me>
<87wmrbmgju.fsf@bsb.me.uk> <uqc7e8$1c368$1@dont-email.me>
<87h6idmmku.fsf@bsb.me.uk> <uqeqa6$1ukop$1@dont-email.me>
<87le7olaea.fsf@bsb.me.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 13 Feb 2024 21:56:13 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="6bdd3ad7d74f56e68d9eac1573e2bd10";
logging-data="2399689"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+EH1Wu6SNdlXVZVKZTc/vShy6qVEEcFRI="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:OYkMsrAum/Q5TAtbbH27axT5z90=
Content-Language: en-US
In-Reply-To: <87le7olaea.fsf@bsb.me.uk>
 by: Jeff Barnett - Tue, 13 Feb 2024 21:56 UTC

On 2/13/2024 7:13 AM, Ben Bacarisse wrote:
> Jeff Barnett <jbb@notatt.com> writes:
>
>> On 2/12/2024 1:53 PM, Ben Bacarisse wrote:
>>> Jeff Barnett <jbb@notatt.com> writes:
>>>
>>>> On 2/11/2024 3:38 AM, Ben Bacarisse wrote:
>>>>> Jeff Barnett <jbb@notatt.com> writes:
>>>>>
>>>>>> On 2/11/2024 12:00 AM, Alan Bawden wrote:
>>>>>>> Julieta Shem <jshem@yaxenu.org> writes:
>>>>>>> Vectors are not convenient when we want to build something one
>>>>>>> element
>>>>>>> at a time. I think coercing a list into a vector is O(n), but that
>>>>>>> seems inevitable because to make a vector we need to know the size and
>>>>>>> we don't know the size up front.
>>>>>>> This is why we have arrays with fill pointers. You should read up on
>>>>>>> how arrays (and vectors) really work, with an eye on eventually being
>>>>>>> able to use VECTOR-PUSH-EXTEND on a array that's specialized to contain
>>>>>>> just 8-bit bytes.
>>>>>>
>>>>>> N.B. The use of vector-push-extend, where extend happens many (for many
>>>>>> definitions of "many") times, will consume O(n^2) time AND probably memory
>>>>>> too. There is no free (or even cheap) lunch when you don't have a good
>>>>>> estimate of final total size up front.
>>>>> Is this true for real implementations? I would have thought they would
>>>>> typically use an exponential allocation strategy to give what is often
>>>>> called "amortised O(n)" growth.
>>>>> That is certainly what I found in a quick experiment using clisp:
>>>>> (defun build-vec (size)
>>>>> (let ((a (make-array 0 :adjustable t :fill-pointer t)))
>>>>> (dotimes (i size (length a)) (vector-push-extend i a))))
>>>>> shows linear growth in execution time.
>>>>
>>>> Every time you do a rebuild initiated by a push-extend you end up copying
>>>> all the current items in the vector;
>>> Why? Having to copy everything, even in a tight reallocation loop like
>>> the one I used, will be a problem.
>>>
>>>> and each time you do the copy, more
>>>> items are copied than for the last copy. This leads, finally, to O(n^2)
>>>> complexity. And as to storage, you must look at all the allocations you
>>>> have done and the overhead to either GC or tack them into your
>>>> heap. Usually, you only notice this degradation if you push a whole lot of
>>>> items - recall, O(.) is really an asymptotic measure. Lots of tricks will
>>>> hide the problem for a while but one usually gets bit in the end.
>>>>
>>>> When you ran build-vec, did you let size be a few billion? What
>>>> happened to run time?
>>> No, I could only go up to (expt 2 23) before clisp seg faulted (surely a
>>> bug?). That's only about 8 million items, but 134 million bytes of
>>> storage. The time taken was almost perfectly linear.
>>>
>>>> My guess is that most storage heap management schemes take about
>>>> the same amount of time and storage to allocate a few bytes and a few
>>>> thousand bytes. The growth effect, if it is real in the implementation you
>>>> are using for testing, must build vectors many many times longer than the
>>>> internal allocation chunk size to see the quadratic effect.
>>> Can you post an example? I don't know how to get either clisp or SBCL
>>> to go much larger. They both have options which would seem to control
>>> this but I can't seem to get them to cooperate! Any help would be much
>>> appreciated.
> ...
>> I don't know if it will help but I'll tell you a few of the assumptions I
>> made leading to what I wrote above.
>
> Thanks for what you wrote, but I was hoping for some help to extend my
> tests so I could see the effects you think I should be seeing but which I
> was not. I saw linear growth up to the point where the implementations
> crapped out and I could not see how to extend them.
>
> By the way, I don't think one can assume that in-place reallocation with
> an exponential growth strategy in always going to be in place. I was
> just suggesting that's it's likely to be what is used in some
> implementations.

I wasn't assuming an in-place reallocation. My thoughts were more along
the following lines:

An implementation of a vector with extend potential must still properly
allow all vector methods (as in object methods) to meet their original
contract. So, for example, the vector must present at least the illusion
of elements being consecutive in the address space. In Lisp it also
means that one can (after doing appropriate index validation) access the
vector by computing its address in most implementations. One can do
better by ignoring such "requirements" but then the implementation will
display some kinks. It's just easier to assign new contiguous address
space to the vector, copy the old contents, then put a magic forwarding
pointer in the original header, and leave the rest of the original so
the GC can have at it.
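[A toy version of that allocate-and-copy scheme, minus the forwarding pointer (which needs GC cooperation), might look like this:]

```lisp
;; Sketch: a toy extendable vector that grows by allocating a fresh,
;; larger contiguous block and copying, doubling the capacity each
;; time.  The abandoned old block is simply left for the GC.
(defstruct tvec
  (data (make-array 1))
  (fill 0))

(defun tvec-push (x v)
  (let ((cap (length (tvec-data v))))
    (when (= (tvec-fill v) cap)
      (let ((new (make-array (* 2 cap))))
        (replace new (tvec-data v))   ; copy the old contents
        (setf (tvec-data v) new))))
  (setf (aref (tvec-data v) (tvec-fill v)) x)
  (incf (tvec-fill v))
  x)
```

[Indexing stays O(1) because the live data is always one contiguous block; the price is exactly the copying discussed upthread.]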

I'm not sure if I understood what you were trying to get at. If I
flubbed it, please try again.

PS I hope all goes well with you. It seems we've both been lurking in
murky parts of USENET.
--
Jeff Barnett

Re: why is this procedure is so slow?

<uqgpjo$29928$4@dont-email.me>


https://www.rocksolidbbs.com/devel/article-flat.php?id=18015&group=comp.lang.lisp#18015

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: ldo@nz.invalid (Lawrence D'Oliveiro)
Newsgroups: comp.lang.lisp
Subject: Re: why is this procedure is so slow?
Date: Tue, 13 Feb 2024 22:14:16 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 20
Message-ID: <uqgpjo$29928$4@dont-email.me>
References: <87sf22ykaw.fsf@yaxenu.org> <m24jeiyeb1.fsf@MacBook-Pro-2.home>
<87v86xpy8a.fsf@yaxenu.org> <oFTEKHKSwjuoXLNeu@bongo-ra.co>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 13 Feb 2024 22:14:16 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="24b9a84a64cf4b4b69a117a024a5ec4b";
logging-data="2401352"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+k4FSHB4oyHmTQSzTVLS+S"
User-Agent: Pan/0.155 (Kherson; fc5a80b8)
Cancel-Lock: sha1:+k8pHZnGx9q+PBJRxCGf4Ic1888=
 by: Lawrence D'Oliveiro - Tue, 13 Feb 2024 22:14 UTC

On Fri, 9 Feb 2024 21:13:18 -0000 (UTC), Spiros Bousbouras wrote:

> (declaim (ftype (function (vector vector list
> &key (:limit (or null integer)) (:so-far
> integer))
> list)
> split-vector))

Not a fan of “parenthesis-pileup” layout. This helps make it clearer
to me:

(declaim
  (ftype
    (function
      (vector vector list &key (:limit (or null integer)) (:so-far integer))
      list
    )
    split-vector
  )
)

Re: why is this procedure is so slow?

<874jealqr7.fsf@bsb.me.uk>


https://www.rocksolidbbs.com/devel/article-flat.php?id=18017&group=comp.lang.lisp#18017

Path: i2pn2.org!i2pn.org!news.neodome.net!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: ben.usenet@bsb.me.uk (Ben Bacarisse)
Newsgroups: comp.lang.lisp
Subject: Re: why is this procedure is so slow?
Date: Wed, 14 Feb 2024 20:45:00 +0000
Organization: A noiseless patient Spider
Lines: 91
Message-ID: <874jealqr7.fsf@bsb.me.uk>
References: <87sf22ykaw.fsf@yaxenu.org> <m24jeiyeb1.fsf@MacBook-Pro-2.home>
<87v86xpy8a.fsf@yaxenu.org> <m2a5o9h1bw.fsf@MacBook-Pro-2.home>
<ogD0Mv+vfv1CrR+QE@bongo-ra.co> <87mss8idlr.fsf@yaxenu.org>
<rjGq+aS7Ay3xAtLwb@bongo-ra.co> <87il2wdyqy.fsf@yaxenu.org>
<86le7rqycf.fsf@williamsburg.bawden.org> <uq9t8d$spnu$1@dont-email.me>
<87wmrbmgju.fsf@bsb.me.uk> <uqc7e8$1c368$1@dont-email.me>
<87h6idmmku.fsf@bsb.me.uk> <uqeqa6$1ukop$1@dont-email.me>
<87le7olaea.fsf@bsb.me.uk> <uqgohs$297e9$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain
Injection-Info: dont-email.me; posting-host="d38f6305b0b3271efacf92c8115bd102";
logging-data="2964658"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19JNyU4HLFXFJFCfwHXkGDV/dEcbUlR6SM="
User-Agent: Gnus/5.13 (Gnus v5.13)
Cancel-Lock: sha1:N3pazE2H/LSZ488hqbNEFBWuJOo=
sha1:/91kEtvneut4YTkKxfBC64/tJHY=
X-BSB-Auth: 1.19d140e02867905dc92d.20240214204500GMT.874jealqr7.fsf@bsb.me.uk
 by: Ben Bacarisse - Wed, 14 Feb 2024 20:45 UTC

Jeff Barnett <jbb@notatt.com> writes:

> On 2/13/2024 7:13 AM, Ben Bacarisse wrote:
>> Jeff Barnett <jbb@notatt.com> writes:
>>
>>> On 2/12/2024 1:53 PM, Ben Bacarisse wrote:
>>>> Jeff Barnett <jbb@notatt.com> writes:
>>>>
>>>>> On 2/11/2024 3:38 AM, Ben Bacarisse wrote:
>>>>>> Jeff Barnett <jbb@notatt.com> writes:
>>>>>>
>>>>>>> On 2/11/2024 12:00 AM, Alan Bawden wrote:
>>>>>>>> Julieta Shem <jshem@yaxenu.org> writes:
>>>>>>>> Vectors are not convenient when we want to build something one
>>>>>>>> element
>>>>>>>> at a time. I think coercing a list into a vector is O(n), but that
>>>>>>>> seems inevitable because to make a vector we need to know the size and
>>>>>>>> we don't know the size up front.
>>>>>>>> This is why we have arrays with fill pointers. You should read up on
>>>>>>>> how arrays (and vectors) really work, with an eye on eventually being
>>>>>>>> able to use VECTOR-PUSH-EXTEND on a array that's specialized to contain
>>>>>>>> just 8-bit bytes.
>>>>>>>
>>>>>>> N.B. The use of vector-push-extend, where extend happens many (for many
>>>>>>> definitions of "many") times, will consume O(n^2) time AND probably memory
>>>>>>> too. There is no free (or even cheap) lunch when you don't have a good
>>>>>>> estimate of final total size up front.
>>>>>> Is this true for real implementations? I would have thought they would
>>>>>> typically use an exponential allocation strategy to give what is often
>>>>>> called "amortised O(n)" growth.
>>>>>> That is certainly what I found in a quick experiment using clisp:
>>>>>> (defun build-vec (size)
>>>>>> (let ((a (make-array 0 :adjustable t :fill-pointer t)))
>>>>>> (dotimes (i size (length a)) (vector-push-extend i a))))
>>>>>> shows linear growth in execution time.
>>>>>
>>>>> Every time you do a rebuild initiated by a push-extend you end up copying
>>>>> all the current items in the vector;
>>>> Why? Having to copy everything, even in a tight reallocation loop like
>>>> the one I used, will be a problem.
>>>>
>>>>> and each time you do the copy, more
>>>>> items are copied than for the last copy. This leads, finally, to O(n^2)
>>>>> complexity. And as to storage, you must look at all the allocations you
>>>>> have done and the overhead to either GC or tack them into your
>>>>> heap. Usually, you only notice this degradation if you push a whole lot of
>>>>> items - recall, O(.) is really an asymptotic measure. Lots of tricks will
>>>>> hide the problem for a while but one usually gets bit in the end.
>>>>>
>>>>> When you ran build-vec, did you let size be a few billion? What
>>>>> happened to run time?
>>>> No, I could only go up to (expt 2 23) before clisp seg faulted (surely a
>>>> bug?). That's only about 8 million items, but 134 million bytes of
>>>> storage. The time taken was almost perfectly linear.
>>>>
>>>>> My guess is that most storage heap management schemes take about
>>>>> the same amount of time and storage to allocate a few bytes and a few
>>>>> thousand bytes. The growth effect, if it is real in the implementation you
>>>>> are using for testing, must build vectors many many times longer than the
>>>>> internal allocation chunk size to see the quadratic effect.
>>>> Can you post an example? I don't know how to get either clisp or SBCL
>>>> to go much larger. They both have options which would seem to control
>>>> this but I can't seem to get them to cooperate! Any help would be much
>>>> appreciated.
>> ...
>>> I don't know if it will help but I'll tell you a few of the assumptions I
>>> made leading to what I wrote above.
>> Thanks for what you wrote, but I was hoping for some help to extend my
>> tests so I could see the effects you think I should be seeing but which I
>> was not. I saw linear growth up to the point where the implementations
>> crapped out and I could not see how to extend them.
>> By the way, I don't think one can assume that in-place reallocation with
>> an exponential growth strategy is always going to be in place. I was
>> just suggesting that it's likely to be what is used in some
>> implementations.
>
> I wasn't assuming an in-place reallocation.

Yes, I know. I was just remarking, from my own observations, that just
because at least two implementations appear to offer amortised O(n)
cost, that result can't be relied upon, as the language does not
stipulate any underlying strategy.

> I'm not sure if I understood what you were trying to get at. If I flubbed
> it, please try again.

I was simply reporting the result of a measurement up to the maximum
size I could manage.

--
Ben.
