
Re: Designing a memory image for language runtime

<bf39e8ad-6bda-4dda-8d88-9744ae1edcacn@googlegroups.com>


https://www.rocksolidbbs.com/devel/article-flat.php?id=2147&group=comp.lang.misc#2147

X-Received: by 2002:a05:622a:4d98:b0:3a6:68cb:cac1 with SMTP id ff24-20020a05622a4d9800b003a668cbcac1mr3078995qtb.92.1673154469301;
Sat, 07 Jan 2023 21:07:49 -0800 (PST)
X-Received: by 2002:a0c:fba9:0:b0:4c6:fa63:e944 with SMTP id
m9-20020a0cfba9000000b004c6fa63e944mr3773241qvp.4.1673154469057; Sat, 07 Jan
2023 21:07:49 -0800 (PST)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.misc
Date: Sat, 7 Jan 2023 21:07:48 -0800 (PST)
In-Reply-To: <91db7065-d24e-427f-91dc-d8399feaa0c4n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=24.107.184.18; posting-account=G1KGwgkAAAAyw4z0LxHH0fja6wAbo7Cz
NNTP-Posting-Host: 24.107.184.18
References: <e9c9354b-7a43-4e14-8d08-a89cb88afc0cn@googlegroups.com>
<86cz89yxvz.fsf@linuxsc.com> <6e38f8f9-a9cd-43ac-8286-49cc2a9cb4d8n@googlegroups.com>
<863592yip4.fsf@linuxsc.com> <2261e058-12d8-4e4b-ba7e-fb4e2bc65b1en@googlegroups.com>
<86h6xdtzkk.fsf@linuxsc.com> <91db7065-d24e-427f-91dc-d8399feaa0c4n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <bf39e8ad-6bda-4dda-8d88-9744ae1edcacn@googlegroups.com>
Subject: Re: Designing a memory image for language runtime
From: mijoryx@yahoo.com (luserdroog)
Injection-Date: Sun, 08 Jan 2023 05:07:49 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 11070
 by: luserdroog - Sun, 8 Jan 2023 05:07 UTC

On Thursday, December 29, 2022 at 8:06:37 PM UTC-6, Tim Rentsch wrote:
> luserdroog <mij...@yahoo.com> writes:
>
> > On Sunday, December 25, 2022 at 8:58:33 PM UTC-6, Tim Rentsch wrote:
> [...]
> >> I'm still hoping to see a summary description (concise but complete)
> >> of all the different parts of the environment, [...]
> >
> > [.. long and detailed description ..]
>
> A blizzard of information... more than is needed in some areas,
> and (I suspect) less than is needed in others.
>
> Part of the reason for asking for a _concise_ but complete
> summary description is to get you to organize the information so
> it can be so presented. Going through the effort of organizing
> the information in this way should go a long way towards helping
> you implement the state-saving functionality.
>
> For example, any state that is held in integer data types can all
> be lumped together, because saving integers is well-understood
> and pretty easy.
>
> Conversely, function pointers need special care, because they
> cannot just be stored directly.
>
> The long description given mentions "objects" but as far as I can
> tell what an "object" (Xpost_Object?) is is never defined.
>
> Are references to objects done with pointers or by means of an
> object table? If there were an object table that would greatly
> simplify (probably) the state-saving operations.
>
> You mention "operations" but don't say what an operation is.
>
> Also, there is some amount of state for the garbage collector.
> Probably that state does not need to be (directly) saved for
> a state-saving operation. What information is important to save,
> and what information is incidental and can be ignored?
>
> Do these comments help you see what I'm getting at?

Yes. Let me try again having cleaned and straightened up my bifocals.

From the top level, and abstracting away all the fiddlybits, the whole
interpreter is just a collection of execution contexts. Since it's
just a simple round robin scheduling algorithm, it doesn't even really
need to remember the current context to resume execution.

Interpreter
collection of Contexts

Next, an execution context has a global memory and a local memory,
where there is a rule that global memory ought not to contain any
references to things in local memory. So, the global memory can be
considered self-contained with the local memory forming a shell around
it.

Context
Global Memory
Local Memory
a collection of integers (flags, offsets into local memory)
a collection of Objects (current object, window device, window
device event handler)

Skipping ahead to the Objects themselves, these are designed to be 64
bits long. There are Simple Objects which contain their value entirely
within the 64 bit representation. And there are Composite Objects,
such as arrays, strings, and dictionaries, which have their values in
either Global or Local Memory. There are File objects which have a
pointer to a C structure in memory, indexed by an entity number.
There are also Name objects which have associated strings in one of
the memories. Operator objects contain an integer code which indexes
the operator table which is in Global memory (although this is not a
requirement, perhaps it would make more sense to have the operator
table exist outside the memory arena). There is a Glob object which
is not directly accessible to the user but exists during the execution
of the `filenameforall` looping operator. There is also a Magic
object which is intended to exist only in the value part of a
key/value pair in a dictionary, in order to implement the Magic
Dictionaries from Sun's NeWS (where something like `canvas /mapped
true put` would instantly make the window visible).

Object = Integer int-val
| Boolean bool-val
| Composite Global? Entity Size Offset
| File Global? Entity
| Name Global? name-index
| Operator opcode
| Glob pointer-to-POSIX-glob_t
| Magic pointer-to-struct{(*get)();(*set)();}
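A minimal sketch of how the index-based variants might be packed into
64 bits in C. The field widths, the tag numbering, and the position of
the Global flag are assumptions made for illustration, not taken from
the actual source. The Glob and Magic variants are left out because
their pointers do not fit this layout on a 64-bit machine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type tags; the real interpreter's numbering may differ. */
enum { T_INTEGER, T_BOOLEAN, T_ARRAY, T_STRING, T_DICT,
       T_FILE, T_NAME, T_OPERATOR };

/* Assumed: the high bit of the tag word records Global vs Local memory. */
#define FLAG_GLOBAL 0x8000u

typedef struct { uint16_t tag, pad; int32_t val; } int_obj;   /* simple    */
typedef struct { uint16_t tag, ent, sz, off; } comp_obj;      /* composite */

typedef union {
    uint16_t tag;     /* every variant starts with the tag word */
    int_obj  int_;
    comp_obj comp_;
} object;

static_assert(sizeof(object) == 8, "an object must fit in 64 bits");

/* Build a simple Integer object: the value lives inside the object. */
static object make_int(int32_t v) {
    object o;
    o.int_.tag = T_INTEGER;
    o.int_.pad = 0;
    o.int_.val = v;
    return o;
}

/* Build a composite object: it only names an entity, a size, an offset. */
static object make_composite(unsigned type, int global,
                             uint16_t ent, uint16_t sz, uint16_t off) {
    object o;
    o.comp_.tag = (uint16_t)(type | (global ? FLAG_GLOBAL : 0u));
    o.comp_.ent = ent;
    o.comp_.sz  = sz;
    o.comp_.off = off;
    return o;
}

static unsigned obj_type(object o)      { return o.tag & ~FLAG_GLOBAL; }
static int      obj_is_global(object o) { return (o.tag & FLAG_GLOBAL) != 0; }
```

Because no variant here holds a machine address, an array of such
objects can be written to disk and read back unchanged.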

A composite object always has a bit specifying whether it's in Global
or Local memory, then an Entity number which is an index into the
Memory Table for that memory, Size (in bytes for a String, objects for
an Array, key/value pairs for a Dictionary), and an Offset which will
be added to the address looked up from the Memory Table.

[Aside: I think both kinds of pointer need to be removed from the
Object representation and replaced with indexes into global tables.
With 64bit pointers, these violate the "64bit design" and force the
objects to be larger than intended.]

Each Memory has an associated Memory Table, indexed by the Entity
number, and a flat area of raw memory with size in use and total size
available.

Memory
Memory arena (big block of raw data, size in use, size available)
Memory Table

The Memory Table has a size in use and total size available, and an
array of allocation records, each containing an address (offset into
the arena data), size in use, size available, GC mark, and tag:

Memory Table
size in use, size available
Array of (address (==offset), size used, size available, mark, tag)
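To make the lookup concrete, here is a hedged sketch of a Memory with
its Memory Table in C. The struct and function names (`mtab_entry`,
`memory`, `mem_alloc`, `mem_addr`) and the 32-bit field widths are
hypothetical, and the allocator is a plain bump allocator with no free
list or garbage collection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t adr;    /* offset of the allocation within the arena */
    uint32_t used;   /* bytes in use */
    uint32_t max;    /* bytes available */
    uint8_t  mark;   /* GC mark */
    uint8_t  tag;    /* kind of data stored here */
} mtab_entry;

typedef struct {
    unsigned char *base;   /* arena: big block of raw data */
    uint32_t used, max;    /* arena bytes in use / available */
    mtab_entry *tab;       /* Memory Table */
    uint32_t nused, nmax;  /* table slots in use / available */
} memory;

/* Reserve `size` bytes and return the new entity number, or -1 if the
   arena or the table is full. */
static int mem_alloc(memory *m, uint32_t size, uint8_t tag) {
    assert(m->used <= m->max);
    if (m->nused == m->nmax || size > m->max - m->used)
        return -1;
    mtab_entry *e = &m->tab[m->nused];
    e->adr  = m->used;
    e->used = size;
    e->max  = size;
    e->mark = 0;
    e->tag  = tag;
    m->used += size;
    return (int)m->nused++;
}

/* Turn (entity, offset) into an address valid for the current mapping.
   Returns NULL for an unknown entity or an offset past the allocation. */
static void *mem_addr(const memory *m, uint32_t ent, uint32_t off) {
    if (ent >= m->nused || off > m->tab[ent].used)
        return NULL;
    return m->base + m->tab[ent].adr + off;
}
```

Only `base` is a real pointer. Everything an Object refers to goes
through `mem_addr`, so the arena can be mapped at a different address
on the next run without touching any stored Object.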

For better or worse, all the other features of the PostScript Virtual
Memory are grafted on top of this basic structure of Objects which
index the Memory Table to get the offset into the raw data arena.

sidebar: How some other features are grafted on:

The first few slots of the Memory Table hold Special Entities:
[0]: Free List (32bit word at address is the index of next free slot)
[1]: Save Stack (address locates head of stack of stacks of Save Records)
[2]: Context List (array of ids of all contexts sharing this memory)
[3]: Name Stack (address locates stack of string objects)
[4]: Name Tree (address locates head of Ternary Search Tree)
[5]: Bogus Name (special internal string returned by a failed name lookup)
( [6]: Operator Table if this is a Global Memory )
[...]: Live Allocations of Entities

The Operator Table is organized as an array of records

Operator Table
Array of (name stack index of operator's name,
number of operator Signatures,
address of array of Signatures)

Signature
pointer to function which implements the operator's action
number of argument objects
address of array of tag patterns
pointer to stack checking function (or NULL)
number of output objects
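A simplified sketch of how an operator might be dispatched through its
Signatures. The `object` type here is a two-field stand-in, the tag
values are hypothetical, operator names are plain C strings rather
than name stack indexes, and the stack checking function is omitted.

```c
#include <assert.h>
#include <stddef.h>

enum { T_INT, T_BOOL, T_ANY };             /* hypothetical tag patterns */

typedef struct { int tag; int val; } object;   /* simplified stand-in */

typedef int (*op_fn)(object *stack, int *top);

typedef struct {
    op_fn      fp;        /* function implementing the action */
    int        n_in;      /* number of argument objects */
    const int *pattern;   /* tag pattern, one entry per argument */
    int        n_out;     /* number of output objects */
} signature;

typedef struct {
    const char      *name;
    int              n_sig;
    const signature *sig;
} oper;

/* int int add -> int */
static int add_ints(object *stack, int *top) {
    assert(*top >= 2);
    stack[*top - 2].val += stack[*top - 1].val;
    --*top;
    return 0;
}

static const int pat_int_int[] = { T_INT, T_INT };
static const signature add_sigs[] = { { add_ints, 2, pat_int_int, 1 } };
static const oper optab[] = { { "add", 1, add_sigs } };

/* Call the first signature whose pattern matches the top of the stack.
   Returns the function's result, or -1 if nothing matches. */
static int op_exec(const oper *op, object *stack, int *top) {
    for (int i = 0; i < op->n_sig; i++) {
        const signature *s = &op->sig[i];
        if (*top < s->n_in)
            continue;                      /* not enough operands */
        int ok = 1;
        for (int j = 0; j < s->n_in; j++) {
            int want = s->pattern[j];
            int have = stack[*top - s->n_in + j].tag;
            if (want != T_ANY && want != have) { ok = 0; break; }
        }
        if (ok)
            return s->fp(stack, top);
    }
    return -1;
}
```

The function pointer in each signature is the part that cannot be
stored in a memory image as-is; the opcode index and the patterns can.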

The window device is a Dictionary whose contents are in Local Memory.
The window device event handler is an Operator object which indexes
into the Operator table like any other operator to locate its function
pointer. One complication is that a window object has a block of internal
data that it stores in a PostScript String object. For an xcb device
this block of data contains an xcb_connection_t * and xcb_screen_t *
which would no longer be valid, although some crucial information,
like the dimensions, would still be stored in the dictionary. So it
shouldn't be too difficult to create a new window with the old specs.

What needs to be done for the interpreter to resume a stored memory
image:

The original design was to have all the various pieces naturally
live inside the memory arena, so then the Saving/Resuming behavior
would just happen automatically by saving and loading the raw data.
The memory arena is allocated using mmap() so the saving part is
already done. Without any extra effort, exiting the interpreter
leaves its final gmemXXXXXX and lmemXXXXXX files sitting right
there on the disk.
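For illustration, a file-backed arena of this kind can be set up with
a few POSIX calls. This is a sketch under assumptions, not the
interpreter's code: `arena_open` and `arena_close` are hypothetical
names, the size is fixed, and the file name is supplied by the caller
rather than generated from a gmemXXXXXX template.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map `size` bytes of the file at `path` read/write, creating the file
   if it does not exist. Returns NULL on failure. */
static void *arena_open(const char *path, size_t size) {
    assert(size > 0);
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;
    /* Make sure the file is as long as the mapping. */
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return NULL;
    }
    /* MAP_SHARED: stores to the mapping are carried through to the file. */
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                 /* the mapping outlives the descriptor */
    return p == MAP_FAILED ? NULL : p;
}

/* Flush the arena to disk and unmap it. Returns 0 on success. */
static int arena_close(void *p, size_t size) {
    if (msync(p, size, MS_SYNC) != 0)
        return -1;
    return munmap(p, size);
}
```

Reopening the same file with `arena_open` brings the raw bytes back,
possibly at a different address, which is why anything stored inside
the arena has to be an offset rather than a pointer.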

I put the Operator table in the arena so the memory image would
naturally correspond to the operator definitions it would work
with... but that doesn't really solve anything, it seems. The Memory
Tables were originally implemented as a linked list of fixed-size
tables in the arena, but they were pulled out for better performance.

So, in broad strokes, the Memory Table needs to go back into the arena
and the Operator Table needs to come out. Perhaps some kind of CRC or
hash could be computed on the operator names to establish the
correspondence between the codes in memory and the functions they
reference. And the Magic pointers need to be replaced with an
index into a table.
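One way the name-hash idea could work, sketched in C. The choice of
32-bit FNV-1a is mine, not something decided above, and the names
`builtin`, `rebind_ops`, `op_add`, and `op_sub` are hypothetical. The
image would store one hash per opcode; on resume each hash is matched
against the operators compiled into the running binary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*op_fn)(void);

typedef struct { const char *name; op_fn fn; } builtin;

/* Placeholder operator bodies, only here to have distinct addresses. */
static void op_add(void) {}
static void op_sub(void) {}

static const builtin builtins[] = { { "add", op_add }, { "sub", op_sub } };

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t fnv1a(const char *s) {
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* For each saved opcode i, find the builtin whose name hashes to
   saved[i] and store its function in out[i], or NULL if none matches.
   Returns the number of opcodes left unresolved. */
static size_t rebind_ops(const uint32_t *saved, size_t n,
                         const builtin *tab, size_t ntab, op_fn *out) {
    assert(saved && tab && out);
    size_t missing = 0;
    for (size_t i = 0; i < n; i++) {
        out[i] = NULL;
        for (size_t j = 0; j < ntab; j++) {
            if (fnv1a(tab[j].name) == saved[i]) {
                out[i] = tab[j].fn;
                break;
            }
        }
        if (out[i] == NULL)
            missing++;
    }
    return missing;
}
```

A nonzero return means the image refers to an operator this build does
not have. Two names could in principle hash to the same value, so a
real implementation would want to check for collisions when the table
is built, or store the names themselves.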

Upon resuming, a sweep needs to be done to invalidate all
Glob pointers. And something needs to be done about FILE *s.
Stdio files like stdin, stdout, stderr should be possible
to reconnect -- if they are stored in a recognizable form.
And it seems possible -- with much additional work -- to
remember the position and filename of a repositionable file
and to fopen() and fseek() to the same place. But maybe FILE *s
should just be invalidated except for stdio files which do
seem essential.
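The remember-and-reopen option could look roughly like this. The
`file_record` layout and both function names are hypothetical, and the
sketch assumes the interpreter already knows the filename and mode it
used to open the file, since a FILE * alone does not provide them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* What would be stored in the image in place of a FILE *. */
typedef struct {
    char path[256];
    char mode[4];
    long pos;
} file_record;

/* Capture enough about an open, repositionable file to reopen it
   later. Returns 0 on success, -1 if the position cannot be read or
   the strings do not fit. */
static int file_save(FILE *f, const char *path, const char *mode,
                     file_record *r) {
    assert(f && path && mode && r);
    long pos = ftell(f);
    if (pos < 0 || strlen(path) >= sizeof r->path
                || strlen(mode) >= sizeof r->mode)
        return -1;
    strcpy(r->path, path);
    strcpy(r->mode, mode);
    r->pos = pos;
    return 0;
}

/* Reopen the file and seek back to the saved position.
   Returns NULL if the file is gone or the seek fails. */
static FILE *file_restore(const file_record *r) {
    FILE *f = fopen(r->path, r->mode);
    if (f == NULL)
        return NULL;
    if (fseek(f, r->pos, SEEK_SET) != 0) {
        fclose(f);
        return NULL;
    }
    return f;
}
```

This is only safe for modes that do not truncate. Reopening with "w"
would destroy the file's contents, so a write-mode file would need to
be reopened with "r+" or simply invalidated.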

And finally, all the top level structures need to be packaged
up into a record and stashed into the Local Memory arena.
Well, wait a second. I think it does actually need one special
top-level file to list all the contexts. Each context is associated
with exactly one Local memory so all the per-context info can be
stored there. But two contexts may be sharing that same Local memory.
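A possible shape for that top-level file, as a hedged sketch. The
`manifest` layout, the magic number, and the idea of naming each Local
memory by an index are all assumptions. Two contexts sharing one Local
memory simply carry the same `lmem_index`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_CTX        8
#define MANIFEST_MAGIC 0x58504f53u   /* hypothetical file signature */

typedef struct {
    uint32_t ctx_id;       /* context id */
    uint32_t lmem_index;   /* which Local memory file this context uses */
} ctx_entry;

typedef struct {
    uint32_t  magic;
    uint32_t  n_ctx;
    ctx_entry ctx[MAX_CTX];
} manifest;

/* Write the manifest as one raw record. Returns 0 on success. */
static int manifest_write(const char *path, const manifest *m) {
    assert(m->n_ctx <= MAX_CTX);
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    size_t ok = fwrite(m, sizeof *m, 1, f);
    return (fclose(f) == 0 && ok == 1) ? 0 : -1;
}

/* Read the manifest back and sanity-check it. Returns 0 on success. */
static int manifest_read(const char *path, manifest *m) {
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t ok = fread(m, sizeof *m, 1, f);
    fclose(f);
    if (ok != 1 || m->magic != MANIFEST_MAGIC || m->n_ctx > MAX_CTX)
        return -1;
    return 0;
}
```

Writing the struct raw ties the file to the machine that wrote it
(byte order and struct layout), which matches the mmap'd arenas but
would need a defined encoding for a portable image.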


Re: Designing a memory image for language runtime

<861qnuicex.fsf@linuxsc.com>


https://www.rocksolidbbs.com/devel/article-flat.php?id=2149&group=comp.lang.misc#2149

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.lang.misc
Subject: Re: Designing a memory image for language runtime
Date: Mon, 16 Jan 2023 10:04:22 -0800
Organization: A noiseless patient Spider
Lines: 149
Message-ID: <861qnuicex.fsf@linuxsc.com>
References: <e9c9354b-7a43-4e14-8d08-a89cb88afc0cn@googlegroups.com> <86cz89yxvz.fsf@linuxsc.com> <6e38f8f9-a9cd-43ac-8286-49cc2a9cb4d8n@googlegroups.com> <863592yip4.fsf@linuxsc.com> <2261e058-12d8-4e4b-ba7e-fb4e2bc65b1en@googlegroups.com> <86h6xdtzkk.fsf@linuxsc.com> <91db7065-d24e-427f-91dc-d8399feaa0c4n@googlegroups.com> <bf39e8ad-6bda-4dda-8d88-9744ae1edcacn@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: reader01.eternal-september.org; posting-host="fcb23827ff43ef5909d76507def8df8e";
logging-data="2974924"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19qMPXI30a29RpOg/tW5JYkT2QA8ZA3PH8="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:gunOXRxDRtHhrelZnUIUsBxvGpY=
sha1:W3CRY032/CildtAlqRTF142pKPM=
 by: Tim Rentsch - Mon, 16 Jan 2023 18:04 UTC

luserdroog <mijoryx@yahoo.com> writes:

> On Thursday, December 29, 2022 at 8:06:37 PM UTC-6, Tim Rentsch wrote:
>
>> luserdroog <mij...@yahoo.com> writes:
>>
>>> On Sunday, December 25, 2022 at 8:58:33 PM UTC-6, Tim Rentsch wrote:
>>
>> [...]
>>
>>>> I'm still hoping to see a summary description (concise but complete)
>>>> of all the different parts of the environment, [...]
>>>
>>> [.. long and detailed description ..]
>>
>> A blizzard of information... more than is needed in some areas,
>> and (I suspect) less than is needed in others.
>>
>> Part of the reason for asking for a _concise_ but complete
>> summary description is to get you to organize the information so
>> it can be so presented. Going through the effort of organizing
>> the information in this way should go a long way towards helping
>> you implement the state-saving functionality.
>>
>> For example, any state that is held in integer data types can all
>> be lumped together, because saving integers is well-understood
>> and pretty easy.
>>
>> Conversely, function pointers need special care, because they
>> cannot just be stored directly.
>>
>> The long description given mentions "objects" but as far as I can
>> tell what an "object" (Xpost_Object?) is is never defined.
>>
>> Are references to objects done with pointers or by means of an
>> object table? If there were an object table that would greatly
>> simplify (probably) the state-saving operations.
>>
>> You mention "operations" but don't say what an operation is.
>>
>> Also, there is some amount of state for the garbage collector.
>> Probably that state does not need to be (directly) saved for
>> a state-saving operation. What information is important to save,
>> and what information is incidental and can be ignored?
>>
>> Do these comments help you see what I'm getting at?
>
> Yes. Let me try again having cleaned and straightened up my bifocals.
>
> From the top level, and abstracting away all the fiddlybits, the
> whole interpreter is just a collection of execution contexts.
> Since it's just a simple round robin scheduling algorithm, it
> doesn't even really need to remember the current context to resume
> execution.
>
> Interpreter
> collection of Contexts

Okay. It might be good to say whether the order is important,
especially if the ordering information needs to be held extrinsic
to a Context rather than being stored in the Context itself.

> Next, an execution context has a global memory and a local memory,
> where there is a rule that global memory ought not to contain any
> references to things in local memory. So, the global memory can be
> considered self-contained with the local memory forming a shell
> around it.
>
> Context
> Global Memory
> Local Memory
> a collection of integers (flags, offsets into local memory)
> a collection of Objects (current object, window device, window
> device event handler)
>
> [another 150 lines]

The names Global Memory and Local Memory don't say very much.

Here the word "collection" feels wrong. For example a struct
is a kind of "collection", but normally we wouldn't call it
a collection. The descriptions here are both a little bit
vague and have unnecessary (?) information.

Some general comments...

Probably anything related to internal memory management (e.g.,
garbage collector, free space) can be left out. Unless there
is some compelling reason they need to be stored on the disk,
they shouldn't be, and hence there is no need to describe them.

Presumably all the stuff that needs to be stored is made up of
"atoms" and some sort of links between atoms. (Atoms are just
blobs of memory that are referred to only as a whole, never as a
subatom except for things like integers and links. In particular
atoms are not the same as an atom in Lisp.) All the ordinary
data in an atom can be lumped together and doesn't need to be
mentioned further. The questions that come up are

1. How do we know what kind of atom a particular atom is?

2. Given a particular kind of atom, how do we know where
the links are in that atom?

3. What are the different kinds of links?

4. Do some atoms live in "spaces", which could affect
the nature of links to those atoms?

5. Are spaces of atoms some sort of compound structures
to be stored, or are they implicit somehow?

6. Do we also need to store some sort of links to
external entities that are not going to be held
in the database but are understood to reference
some sort of thing that the interpreter will
know about, without needing to store it?

These questions need accurate answers but not necessarily
precise answers. For example, we might say

Each atom has a tag that says what kind of atom it is.
The tag has enough information to be able to locate
where the links are, and what is the kind of each
link.

Probably that description isn't accurate, but if it were then
that is most of what we need to say about atoms. The description
isn't precise enough that we could write code from it, but it has
enough information that we could start to design code to read a
stored image back into memory. Looking at the structure from the
other end of the spectrum, we need to know how the data storage
is compartmentalized. Are links allowed to cross compartment
boundaries (probably some are), and if so what kinds of links?
How are the kinds of different compartments to be identified?
Compartments may be the same as spaces or they may be different
somehow, probably the main difference being how they participate
(or do not participate) in different kinds of links.
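As an illustration of the tag-based description of atoms given above,
here is a small sketch in C of a per-kind table that says where the
links in an atom are and what kind each one is. The atom kinds, link
kinds, and all names are hypothetical and are not a claim about how
the interpreter in question is organized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum link_kind { LINK_LOCAL, LINK_GLOBAL, LINK_EXTERNAL };

typedef struct { size_t offset; enum link_kind kind; } link_desc;
typedef struct { size_t n_links; const link_desc *links; } atom_desc;

enum { ATOM_STRING, ATOM_PAIR, ATOM_KINDS };

/* Every atom begins with its tag; the rest is ordinary data or links. */
typedef struct { uint32_t tag, len; char bytes[8]; } string_atom;
typedef struct { uint32_t tag, key, value; } pair_atom;

static const link_desc pair_links[] = {
    { offsetof(pair_atom, key),   LINK_LOCAL  },
    { offsetof(pair_atom, value), LINK_GLOBAL },
};

/* Answers "given a kind of atom, where are its links?" */
static const atom_desc atom_table[ATOM_KINDS] = {
    [ATOM_STRING] = { 0, NULL },        /* no links, only data */
    [ATOM_PAIR]   = { 2, pair_links },
};

typedef void (*link_visitor)(uint32_t *link, enum link_kind kind,
                             void *ctx);

/* Call `visit` on every link in the atom. Returns the number of links
   visited; an unknown tag is treated as having none. */
static size_t visit_links(void *atom, link_visitor visit, void *ctx) {
    assert(atom && visit);
    uint32_t tag;
    memcpy(&tag, atom, sizeof tag);
    if (tag >= ATOM_KINDS)
        return 0;
    const atom_desc *d = &atom_table[tag];
    for (size_t i = 0; i < d->n_links; i++) {
        uint32_t *link = (uint32_t *)((char *)atom + d->links[i].offset);
        visit(link, d->links[i].kind, ctx);
    }
    return d->n_links;
}

/* Example visitor: shift every local link by the delta in *ctx, as a
   loader might do when a compartment lands at a new position. */
static void shift_local(uint32_t *link, enum link_kind kind, void *ctx) {
    if (kind == LINK_LOCAL)
        *link += *(const uint32_t *)ctx;
}
```

With a table like this, code that reads a stored image needs to know
only the tag of each atom; the ordinary data is copied as a lump and
only the links receive per-kind treatment.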

The key is to give just enough information so the overall
structure stands out, but not more information than that. You
might try writing descriptions at three different levels: one
with not enough information, one with too much information, and
one somewhere in between those two. By adding bits to the "too
little" description and taking away bits from the "too much"
description you may find it easier to discover the "just right"
description.

I hope this is enough to help you in the next iteration.

Re: Designing a memory image for language runtime

<86wn5mgxse.fsf@linuxsc.com>


https://www.rocksolidbbs.com/devel/article-flat.php?id=2150&group=comp.lang.misc#2150

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.lang.misc
Subject: Re: Designing a memory image for language runtime
Date: Mon, 16 Jan 2023 10:05:37 -0800
Organization: A noiseless patient Spider
Lines: 1
Message-ID: <86wn5mgxse.fsf@linuxsc.com>
References: <e9c9354b-7a43-4e14-8d08-a89cb88afc0cn@googlegroups.com> <86cz89yxvz.fsf@linuxsc.com> <6e38f8f9-a9cd-43ac-8286-49cc2a9cb4d8n@googlegroups.com> <863592yip4.fsf@linuxsc.com> <2261e058-12d8-4e4b-ba7e-fb4e2bc65b1en@googlegroups.com> <86h6xdtzkk.fsf@linuxsc.com> <91db7065-d24e-427f-91dc-d8399feaa0c4n@googlegroups.com> <bf39e8ad-6bda-4dda-8d88-9744ae1edcacn@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: reader01.eternal-september.org; posting-host="fcb23827ff43ef5909d76507def8df8e";
logging-data="2974924"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19e/uSDUGAd1p0PjuLqrH3o/5jwLncaF2U="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:vAE9UUpPHhM5bL9ld4xpkel9Lw4=
sha1:0jetZI+PgNQ5JUcseuOqigkoMIw=
 by: Tim Rentsch - Mon, 16 Jan 2023 18:05 UTC

P.S. Tsk, tsk.. posting through google groups...
