Pointer to address zero

<sd71dn$pil$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=274&group=comp.lang.misc#274
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: james.harris.1@gmail.com (James Harris)
Newsgroups: comp.lang.misc
Subject: Pointer to address zero
Date: Tue, 20 Jul 2021 18:33:43 +0100
Organization: A noiseless patient Spider
Lines: 22
Message-ID: <sd71dn$pil$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 20 Jul 2021 17:33:43 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="1f365fd120c100f059d4bd246cb8e761";
logging-data="26197"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/ghfFfl8cDA3lLogi1irGHOEzXXEVxQzQ="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:0DQp5XIoCh1z1wuZhUB2iQ0gtiU=
Content-Language: en-GB
X-Mozilla-News-Host: snews://news.eternal-september.org:563
 by: James Harris - Tue, 20 Jul 2021 17:33 UTC

For my OS project I have been looking at program loading and that has
led me to query what would be required in a language to support address
zero being accessible and a pointer to it being considered to be valid.

If I use paging I can reserve the lowest page so that address zero is
inaccessible. A dereference of a zeroed pointer would be caught by the
CPU triggering a fault.

However, if on x86 I don't use paging then a reference to address zero
would not trigger a fault so it would not be caught and diagnosed. And
it's not just that case. Other CPUs or microcontrollers may, presumably,
allow access to address zero. Therefore there are cases where a program
may have a pointer to address zero and that pointer could be legitimate.

Hence the question: how should a language support access to address
zero? Any ideas?

--
James Harris

Re: Pointer to address zero

<sd731k$1b50$1@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=276&group=comp.lang.misc#276
Path: i2pn2.org!i2pn.org!aioe.org!N/bBT90+fJ5f2hH/+d3Lnw.user.46.165.242.91.POSTED!not-for-mail
From: mailbox@dmitry-kazakov.de (Dmitry A. Kazakov)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Tue, 20 Jul 2021 20:01:25 +0200
Organization: Aioe.org NNTP Server
Message-ID: <sd731k$1b50$1@gioia.aioe.org>
References: <sd71dn$pil$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="44192"; posting-host="N/bBT90+fJ5f2hH/+d3Lnw.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.12.0
Content-Language: en-US
X-Notice: Filtered by postfilter v. 0.9.2
 by: Dmitry A. Kazakov - Tue, 20 Jul 2021 18:01 UTC

On 2021-07-20 19:33, James Harris wrote:

> Hence the question: how should a language support access to address
> zero? Any ideas?

Where is a problem? Machine address is neither integer nor pointer, any
resemblance to persons living or dead is purely coincidental as they
write before the movie starts...

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Re: Pointer to address zero

<sd7cck$btm$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=277&group=comp.lang.misc#277
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: james.harris.1@gmail.com (James Harris)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Tue, 20 Jul 2021 21:40:51 +0100
Organization: A noiseless patient Spider
Lines: 32
Message-ID: <sd7cck$btm$1@dont-email.me>
References: <sd71dn$pil$1@dont-email.me> <sd731k$1b50$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 20 Jul 2021 20:40:52 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="1f365fd120c100f059d4bd246cb8e761";
logging-data="12214"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18jENgUiH+7TTgpj0qYs+HW/t5tCmDFxvc="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:x7CZeQrBH/kxBSgoxn7nwWwIlNc=
In-Reply-To: <sd731k$1b50$1@gioia.aioe.org>
Content-Language: en-GB
 by: James Harris - Tue, 20 Jul 2021 20:40 UTC

On 20/07/2021 19:01, Dmitry A. Kazakov wrote:
> On 2021-07-20 19:33, James Harris wrote:
>
>> Hence the question: how should a language support access to address
>> zero? Any ideas?
>
> Where is a problem?

The problem is in the implementation. Should a language recognise such a
thing as an invalid address and, if so, what value or range of values
should indicate that a given address is invalid?

Compiled code often uses address zero to indicate 'invalid' but that
would not be possible if address zero were to be accessible.

>
> Machine address is neither integer nor pointer, any
> resemblance to persons living or dead is purely coincidental as they
> write before the movie starts...
>

How are you distinguishing between pointer and address? In C a pointer
is usually, though it does not have to be always, implemented as an
address.

(Doesn't that annotation usually come at the end?)

--
James Harris

Re: Pointer to address zero

<sd8hqm$7c2$1@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=278&group=comp.lang.misc#278
Path: i2pn2.org!i2pn.org!aioe.org!N/bBT90+fJ5f2hH/+d3Lnw.user.46.165.242.91.POSTED!not-for-mail
From: mailbox@dmitry-kazakov.de (Dmitry A. Kazakov)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Wed, 21 Jul 2021 09:19:50 +0200
Organization: Aioe.org NNTP Server
Message-ID: <sd8hqm$7c2$1@gioia.aioe.org>
References: <sd71dn$pil$1@dont-email.me> <sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="7554"; posting-host="N/bBT90+fJ5f2hH/+d3Lnw.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.12.0
Content-Language: en-US
X-Notice: Filtered by postfilter v. 0.9.2
 by: Dmitry A. Kazakov - Wed, 21 Jul 2021 07:19 UTC

On 2021-07-20 22:40, James Harris wrote:
> On 20/07/2021 19:01, Dmitry A. Kazakov wrote:
>> On 2021-07-20 19:33, James Harris wrote:
>>
>>> Hence the question: how should a language support access to address
>>> zero? Any ideas?
>>
>> Where is a problem?
>
> The problem is in the implementation. Should a language recognise such a
> thing as an invalid address and, if so, what value or range of values
> should indicate that a given address is invalid?

All addresses are valid. Some cannot be converted to some pointer types.

The system-dependent package would usually provide a constant
Null_Address that is guaranteed to never indicate accessible memory on
the given machine. The representation of Null_Address is irrelevant.

> Compiled code often uses address zero to indicate 'invalid' but that
> would not be possible if address zero were to be accessible.

In a properly designed language you would not be able to write a program
in such a manner.

> How are you distinguishing between pointer and address?

That thing is called the type.

> In C a pointer
> is usually, though it does not have to be always, implemented as an
> address.

So what?

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Re: Pointer to address zero

<sd9jbm$efc$1@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=279&group=comp.lang.misc#279
Path: i2pn2.org!i2pn.org!aioe.org!JKehOyGOGgs2f2NKLRXdGg.user.46.165.242.75.POSTED!not-for-mail
From: noemail@basdxcqvbe.com (Rod Pemberton)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Wed, 21 Jul 2021 12:53:27 -0500
Organization: Aioe.org NNTP Server
Message-ID: <sd9jbm$efc$1@gioia.aioe.org>
References: <sd71dn$pil$1@dont-email.me>
<sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="14828"; posting-host="JKehOyGOGgs2f2NKLRXdGg.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2
 by: Rod Pemberton - Wed, 21 Jul 2021 17:53 UTC

On Tue, 20 Jul 2021 21:40:51 +0100
James Harris <james.harris.1@gmail.com> wrote:

> On 20/07/2021 19:01, Dmitry A. Kazakov wrote:
> > On 2021-07-20 19:33, James Harris wrote:

> >> Hence the question: how should a language support access to
> >> address zero? Any ideas?
> >
> > Where is a problem?
>
> The problem is in the implementation. Should a language recognise
> such a thing as an invalid address and, if so, what value or range of
> values should indicate that a given address is invalid?

What exactly do you mean by,
"recognize such a thing as an invalid address"?

e.g., prohibiting any of the following,

a) assignment of zero value to an address pointer
b) comparison of zero value with an address pointer
c) writing to location with address zero, i.e., dereference

First, let's for sake of argument say the language's NULL pointer (or
equivalent) is actually of value zero, as it doesn't have to be, at
least for the C language, but usually is implemented as a zero value.
This eliminates the "thought" complexity over NULL in C etc being a
non-zero value.

I'd strongly argue that allowing b) is a language requirement.
I'd argue that a) is up to the language implementation.
I'd argue that c) is up to the operating system implementation or
hardware.

> Compiled code often uses address zero to indicate 'invalid' but that
> would not be possible if address zero were to be accessible.

Sure it is.

The 'invalid' condition of which you speak is the result of a pointer
to pointer comparison, as nothing is written to address zero.

The writing to address zero is a dereference, not a pointer comparison,
and will be allowed if the hardware is incapable of blocking the write
to the address location, i.e., zero, e.g., via an invalid or unmapped
page.
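
The distinction between comparing and dereferencing can be sketched in C. This is only an illustration under stated assumptions: it simulates a 64 KiB flat memory with no protection hardware, and the names sim_mem, sim_ptr, SIM_NULL, sim_is_null, sim_store and sim_load are hypothetical, not taken from any real system.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 64 KiB flat memory with no protection hardware. */
#define SIM_MEM_SIZE 65536u
static uint8_t sim_mem[SIM_MEM_SIZE];

typedef uint16_t sim_ptr;        /* a simulated pointer: just an address */
#define SIM_NULL ((sim_ptr)0)    /* assumption: null is represented as address zero */

/* b) comparison: looks only at the pointer value, never at memory */
static bool sim_is_null(sim_ptr p) { return p == SIM_NULL; }

/* c) dereference: nothing on this simulated machine blocks address zero */
static void sim_store(sim_ptr p, uint8_t v) { sim_mem[p] = v; }
static uint8_t sim_load(sim_ptr p) { return sim_mem[p]; }
```

With these definitions a pointer holding SIM_NULL compares as null, yet sim_store(SIM_NULL, 99) followed by sim_load(SIM_NULL) still returns 99, because only hardware (or an explicit check) could refuse the access.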

--
Liberals preach diversity, equity, and inclusion, but engage in
misandry, are racist against whites, promote hatred of conservatives.

Re: Pointer to address zero

<sd9l7n$geq$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=280&group=comp.lang.misc#280
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: james.harris.1@gmail.com (James Harris)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Wed, 21 Jul 2021 18:24:01 +0100
Organization: A noiseless patient Spider
Lines: 94
Message-ID: <sd9l7n$geq$1@dont-email.me>
References: <sd71dn$pil$1@dont-email.me> <sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me> <sd8hqm$7c2$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 21 Jul 2021 17:24:07 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="9e7bc7cf7aa3e00b00b7f3e5bcadfdbe";
logging-data="16858"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18RMi3WBU8vxA14p/KpzHdwgTxAFO2DQRk="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:OtSGb+XrNCkLyhNuHYNvuLbuSc8=
In-Reply-To: <sd8hqm$7c2$1@gioia.aioe.org>
Content-Language: en-GB
 by: James Harris - Wed, 21 Jul 2021 17:24 UTC

On 21/07/2021 08:19, Dmitry A. Kazakov wrote:
> On 2021-07-20 22:40, James Harris wrote:
>> On 20/07/2021 19:01, Dmitry A. Kazakov wrote:
>>> On 2021-07-20 19:33, James Harris wrote:
>>>
>>>> Hence the question: how should a language support access to address
>>>> zero? Any ideas?
>>>
>>> Where is a problem?
>>
>> The problem is in the implementation. Should a language recognise such
>> a thing as an invalid address and, if so, what value or range of
>> values should indicate that a given address is invalid?
>
> All addresses are valid. Some cannot be converted to some pointer types.
>
> The system-dependent package would usually provide a constant
> Null_Address that is guaranteed to never indicate accessible memory on
> the given machine. The representation of Null_Address is irrelevant.
>
>> Compiled code often uses address zero to indicate 'invalid' but that
>> would not be possible if address zero were to be accessible.
>
> In a properly designed language you would not be able to write a program
> in such a manner.

I said /compiled/ code, not program source. In the /source/ the
programmer would still be able to write tests akin to

if p != null

I may even allow

if p

as meaning the same as the above though I guess some folk (e.g. Dmitry?)
will not like the idea of treating a pointer as a boolean.

As I say, compiled code often uses address zero as a null pointer,
knowing that the OS will make the lowest page inaccessible so that an
attempt to access it would generate a fault. And that's what I have long
intended to do. But when looking at different ways of loading a program
I realised that it was not always possible to reserve address zero.

In fact, sometimes a lot more than just one page is reserved. According
to a video explainer I watched a while ago 32-bit Linux sets the lowest
accessible address to something like 0x0040_0000 so that the hardware
will trap not just

*p

but also

p[q]

for some significant size of q (all where p is null). The trouble is
that that takes away 4M of the address space and, more importantly,
means that the addresses a programmer will see in a debugging session or
a dump would have more significant digits than they need to have and,
therefore, be harder to read than necessary.
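
The arithmetic behind "some significant size of q" can be sketched in C. The assumptions are that null is address zero and that the inaccessible low region is 0x00400000 bytes, which is the figure quoted above rather than a verified Linux constant. GUARD_BYTES and guard_catches_null_index are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Size of the inaccessible low region (assumed, taken from the figure quoted above). */
#define GUARD_BYTES UINT32_C(0x00400000)

/* With null == 0, p[q] starts at byte address q * elem_size.
   The access faults only if that address lies inside the guard region. */
static bool guard_catches_null_index(uint32_t q, uint32_t elem_size) {
    uint64_t addr = (uint64_t)q * elem_size;  /* 64-bit product cannot overflow */
    return addr < GUARD_BYTES;
}
```

For 4-byte elements this would catch indexes up to 1048575 and miss index 1048576, whose first byte is at 0x00400000.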

If, by contrast, null is set a little higher than the accessible
memory, program data areas could be at lower addresses, making debugging
sessions and dumps easier to read.

The hardware would still trap on both of

*p
p[q]

In fact, q could potentially be a lot higher than in the 'normal' model
because not just 4M but all the addresses from p[0] to the highest
memory address would be inaccessible and would trap (given a suitable CPU).

To make this work I would have to have null (the null address)
determined not at compile time but at program load time.

That ought to cope well with various OS memory models though it does
have a downside. If a structure containing a pointer is mapped over
zeroed memory the pointer will not be null but will be considered to be
valid. (It will point at location zero.)
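
A rough C model of the load-time null idea, including the downside just mentioned. Everything here is a sketch under assumptions: addresses are simulated as 32-bit integers, the trap is modelled as a simple limit check, and lt_load_program, lt_limit, lt_null, lt_access_traps, lt_is_null and struct lt_node are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t lt_addr;          /* simulated machine address */

/* Set once by the (hypothetical) loader, not known at compile time. */
static lt_addr lt_limit;           /* first address past the accessible memory */
static lt_addr lt_null;            /* null address chosen for this program */

static void lt_load_program(lt_addr accessible_bytes) {
    lt_limit = accessible_bytes;
    lt_null  = accessible_bytes;   /* null sits just above accessible memory */
}

/* Models a limit check: an access traps iff its address is at or above the limit.
   The sum is formed in 64 bits so it cannot wrap around. */
static bool lt_access_traps(lt_addr p, lt_addr byte_offset) {
    return (uint64_t)p + byte_offset >= lt_limit;
}

static bool lt_is_null(lt_addr p) { return p == lt_null; }

/* A structure holding a simulated pointer, used to show the zeroed-memory case. */
struct lt_node { lt_addr next; };
```

After lt_load_program(0x10000), lt_access_traps(lt_null, q) is true for every offset q, but a struct lt_node cleared with memset has next == 0, which lt_is_null reports as not null and which does not trap.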

>
>> How are you distinguishing between pointer and address?
>
> That thing is called the type.

OK ... then what, to you, distinguishes them? Alignment? Range? History?
Something else?

--
James Harris

Re: Pointer to address zero

<sd9mdv$aep$1@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=281&group=comp.lang.misc#281
Path: i2pn2.org!i2pn.org!aioe.org!N/bBT90+fJ5f2hH/+d3Lnw.user.46.165.242.91.POSTED!not-for-mail
From: mailbox@dmitry-kazakov.de (Dmitry A. Kazakov)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Wed, 21 Jul 2021 19:44:33 +0200
Organization: Aioe.org NNTP Server
Message-ID: <sd9mdv$aep$1@gioia.aioe.org>
References: <sd71dn$pil$1@dont-email.me> <sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me> <sd8hqm$7c2$1@gioia.aioe.org>
<sd9l7n$geq$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="10713"; posting-host="N/bBT90+fJ5f2hH/+d3Lnw.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.12.0
X-Notice: Filtered by postfilter v. 0.9.2
Content-Language: en-US
 by: Dmitry A. Kazakov - Wed, 21 Jul 2021 17:44 UTC

On 2021-07-21 19:24, James Harris wrote:

> I said /compiled/ code, not program source. In the /source/ the
> programmer would still be able to write tests akin to
>
>   if p != null
>
> I may even allow
>
>   if p
>
> as meaning the same as the above though I guess some folk (e.g. Dmitry?)
> will not like the idea of treating a pointer as a boolean.
>
> As I say, compiled code often uses address zero as a null pointer,

No, it uses the representation of null, whatever it be [*].

Furthermore, in a decent language with memory pools support each pool
could have its own null.

>> That thing is called the type.
>
> OK ... then what, to you, distinguishes them? Alignment? Range? History?
> Something else?

https://en.wikipedia.org/wiki/Nominal_type_system

-------------------------
[*] In an advanced language pointer comparisons could be non-trivial, e.g.
when two pointers indicate different classes of the same object under
multiple inheritance. In that case memory representations of p and q
could be different, yet semantically p = q because both ultimately point
to the same object [provided, the language lets p and q be comparable].
An implementation would convert both p and q to the pointers of specific
type and then compare these.
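
The conversion-then-compare step can be imitated in C, which has no inheritance, by embedding two "base" structs in one object. This is a sketch only: the struct names and helper functions are hypothetical, and it assumes both pointers really do point into a struct mi_widget.

```c
#include <stdbool.h>
#include <stddef.h>

struct mi_shape { int id; };
struct mi_named { const char *name; };

/* A "derived" object embedding two bases, roughly as a compiler
   might lay out multiple inheritance. */
struct mi_widget {
    struct mi_shape shape;   /* first base, offset 0 */
    struct mi_named named;   /* second base, non-zero offset */
};

/* Convert each base pointer back to the enclosing complete object. */
static struct mi_widget *mi_from_shape(struct mi_shape *s) {
    return (struct mi_widget *)((char *)s - offsetof(struct mi_widget, shape));
}
static struct mi_widget *mi_from_named(struct mi_named *n) {
    return (struct mi_widget *)((char *)n - offsetof(struct mi_widget, named));
}

/* Semantic equality: same complete object, even though the raw addresses differ. */
static bool mi_same_object(struct mi_shape *s, struct mi_named *n) {
    return mi_from_shape(s) == mi_from_named(n);
}
```

For a single widget w, the addresses of w.shape and w.named differ, yet mi_same_object(&w.shape, &w.named) is true because both are converted to the same struct mi_widget pointer before the comparison.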

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Re: Pointer to address zero

<sda39h$kr7$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=282&group=comp.lang.misc#282
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: james.harris.1@gmail.com (James Harris)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Wed, 21 Jul 2021 22:24:00 +0100
Organization: A noiseless patient Spider
Lines: 122
Message-ID: <sda39h$kr7$1@dont-email.me>
References: <sd71dn$pil$1@dont-email.me> <sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me> <sd9jbm$efc$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 21 Jul 2021 21:24:01 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="9e7bc7cf7aa3e00b00b7f3e5bcadfdbe";
logging-data="21351"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+uu5JeoOP1lhm9Gm1uCgRjPwaPtJWgI4g="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:8/6J3P1nvMuRg6yFk9ZFqRbWg8I=
In-Reply-To: <sd9jbm$efc$1@gioia.aioe.org>
Content-Language: en-GB
 by: James Harris - Wed, 21 Jul 2021 21:24 UTC

On 21/07/2021 18:53, Rod Pemberton wrote:
> On Tue, 20 Jul 2021 21:40:51 +0100
> James Harris <james.harris.1@gmail.com> wrote:
>
>> On 20/07/2021 19:01, Dmitry A. Kazakov wrote:
>>> On 2021-07-20 19:33, James Harris wrote:
>
>>>> Hence the question: how should a language support access to
>>>> address zero? Any ideas?
>>>
>>> Where is a problem?
>>
>> The problem is in the implementation. Should a language recognise
>> such a thing as an invalid address and, if so, what value or range of
>> values should indicate that a given address is invalid?
>
> What exactly do you mean by,
> "recognize such a thing as an invalid address"?
>
> e.g., prohibiting any of the following,
>
> a) assignment of zero value to an address pointer
> b) comparison of zero value with an address pointer
> c) writing to location with address zero, i.e., dereference

I'm not really sure what you mean by the above so my replies below may
not be in line with what you were thinking of.

I was really asking whether a language ought to have the concept of one
address which is invalid. Imagine a 64k machine. One might want to point
to any of those 64k memory cells. In which case it's maybe a bad idea to
reserve one address as invalid.

But the rest of the topic is assuming that one address would be reserved
as the null address.

>
>
> First, let's for sake of argument say the language's NULL pointer (or
> equivalent) is actually of value zero, as it doesn't have to be, at
> least for the C language, but usually is implemented as a zero value.
> This eliminates the "thought" complexity over NULL in C etc being a
> non-zero value.
>
> I'd strongly argue that allowing b) is a language requirement.

Do you mean as in C's

if (p == 0)

?

Why not, instead, require the comparison to be against NULL, instead of
zero, as in

if (p == NULL)

or, indeed, just allow

if (p)

?

> I'd argue that a) is up to the language implementation.

I presume you mean

p = 0;

but is zero really needed when that could be, instead,

p = NULL;

?
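
For reference, in standard C these spellings are interchangeable: a literal 0 in a pointer context is a null pointer constant, whatever bit pattern the implementation uses for null. A small sketch (the function names are hypothetical):

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when the three spellings of a null test agree for p. */
static bool nt_null_tests_agree(const int *p) {
    bool by_zero  = (p == 0);     /* 0 is a null pointer constant in this context */
    bool by_macro = (p == NULL);
    bool by_truth = !p;           /* "if (p)" tests the opposite condition */
    return by_zero == by_macro && by_macro == by_truth;
}

/* Assignment works the same way: both spellings store a null pointer. */
static bool nt_null_assignments_agree(void) {
    const int *a = 0;
    const int *b = NULL;
    return a == b;
}
```

So the choice between 0 and NULL in C source is a matter of style. It says nothing about whether the machine address zero is usable.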

> I'd argue that c) is up to the operating system implementation or
> hardware.

OK.

>
>> Compiled code often uses address zero to indicate 'invalid' but that
>> would not be possible if address zero were to be accessible.
>
> Sure it is.

I don't understand what you mean.

>
> The 'invalid' condition of which you speak is the result of a pointer
> to pointer comparison, as nothing is written to address zero.

No, I was really thinking of paging reserving page 0 as inaccessible -
so that any dereference of a null pointer (null being 0, in this case)
whether read or write would generate a fault.

But it's only really possible to prohibit access to address zero if one
is using paging. And as you know, that's not the only way to design an
OS. That's why I was thinking to make null an address /above/ the
user-accessible memory space. That would work whether one was using
paging or not. (The actual address for null would be set when a program
was loaded rather than being a constant known at compile time. That's a
bit different from normal but I think it could be done.)

>
> The writing to address zero is a dereference, not a pointer comparison,
> and will be allowed if the hardware is incapable of blocking the write
> to the address location, i.e., zero, e.g., via an invalid or unmapped
> page.

Interesting. So you are suggesting that if p is zero then p will compare
as NULL (and, hence, invalid) but that

*p = 99;

would still work?

--
James Harris

Re: Pointer to address zero

<sda87i$kmn$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=283&group=comp.lang.misc#283
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: bc@freeuk.com (Bart)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Wed, 21 Jul 2021 23:48:09 +0100
Organization: A noiseless patient Spider
Lines: 48
Message-ID: <sda87i$kmn$1@dont-email.me>
References: <sd71dn$pil$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 21 Jul 2021 22:48:18 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="5931370de13184dd22c1e81964028b1d";
logging-data="21207"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19O7rq0McLXS0HDB4NXoqr+aAjK+P6rvAQ="
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:m/udG3RyXLz2L2Tpe8f7pgoTWDc=
In-Reply-To: <sd71dn$pil$1@dont-email.me>
X-Antivirus-Status: Clean
Content-Language: en-GB
X-Antivirus: AVG (VPS 210721-14, 21/7/2021), Outbound message
 by: Bart - Wed, 21 Jul 2021 22:48 UTC

On 20/07/2021 18:33, James Harris wrote:
> For my OS project I have been looking at program loading and that has
> led me to query what would be required in a language to support address
> zero being accessible and a pointer to it being considered to be valid.
>
> If I use paging I can reserve the lowest page so that address zero is
> inaccessible. A dereference of a zeroed pointer would be caught by the
> CPU triggering a fault.
>
> However, if on x86 I don't use paging then a reference to address zero
> would not trigger a fault so it would not be caught and diagnosed.

How is that different from any other access to invalid memory? Or an
access to address 1 (or 8 if aligned)?

> And
> it's not just that case. Other CPUs or microcontrollers may, presumably,
> allow access to address zero. Therefore there are cases where a program
> may have a pointer to address zero and that pointer could be legitimate.
>
> Hence the question: how should a language support access to address
> zero? Any ideas?

If the hardware allows a meaningful dereference to address 0, and you
need to have your HLL access that same location via a pointer, then you
need to make it possible.

However, if zero is also used for a null pointer value, so that for
example P=null means that P has not been assigned to anything, then that
might interfere with it.

Then you might look at using an alternate representation for a null
pointer value.

Personally, I'd just make the first few bytes of memory special. Make
sure address 0 never occurs as a heap allocation, and rarely comes up as
the address of an object in the HLL.

My latest language has a 'nil' value for pointers (you can't use 0),
whose value is not specified. But it is generally understood to be all
zeros.

That means that data structures that exist in the zero-data segment
(BSS?) will be guaranteed to have any embedded pointers set to nil.
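
The same property can be checked in C as an illustration. C itself guarantees that pointers in static storage start out null. Whether memory that is merely filled with zero bytes (calloc, memset) also reads as null depends on null being all-bits-zero, which holds on mainstream platforms but is an assumption here. The zb_ names are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

struct zb_list { struct zb_list *next; int value; };

/* Static storage is zero-initialised, modelling a BSS-resident structure. */
static struct zb_list zb_static_head;

/* True when every byte of an object is zero. */
static bool zb_all_bits_zero(const void *obj, size_t n) {
    const unsigned char *b = obj;
    for (size_t i = 0; i < n; i++)
        if (b[i] != 0) return false;
    return true;
}

/* Reports whether this platform represents a null pointer as all zero bytes. */
static bool zb_null_is_all_zero(void) {
    struct zb_list *p = NULL;
    return zb_all_bits_zero(&p, sizeof p);
}
```

If zb_null_is_all_zero() returned false on some platform, a structure mapped over zeroed memory would hold a non-null pointer to address zero, which is the trade-off raised earlier in the thread.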

Just stick something at address 0 that is not going to be dereferenced
via a HLL pointer. But if you really need to access that location, then
just do it.

Re: Pointer to address zero

<sdap3d$1llt$2@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=284&group=comp.lang.misc#284
Path: i2pn2.org!i2pn.org!aioe.org!JKehOyGOGgs2f2NKLRXdGg.user.46.165.242.75.POSTED!not-for-mail
From: noemail@basdxcqvbe.com (Rod Pemberton)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Wed, 21 Jul 2021 23:37:34 -0500
Organization: Aioe.org NNTP Server
Message-ID: <sdap3d$1llt$2@gioia.aioe.org>
References: <sd71dn$pil$1@dont-email.me>
<sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me>
<sd9jbm$efc$1@gioia.aioe.org>
<sda39h$kr7$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="54973"; posting-host="JKehOyGOGgs2f2NKLRXdGg.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2
 by: Rod Pemberton - Thu, 22 Jul 2021 04:37 UTC

On Wed, 21 Jul 2021 22:24:00 +0100
James Harris <james.harris.1@gmail.com> wrote:

> On 21/07/2021 18:53, Rod Pemberton wrote:
> > On Tue, 20 Jul 2021 21:40:51 +0100
> > James Harris <james.harris.1@gmail.com> wrote:
> >> On 20/07/2021 19:01, Dmitry A. Kazakov wrote:
> >>> On 2021-07-20 19:33, James Harris wrote:

> >>>> Hence the question: how should a language support access to
> >>>> address zero? Any ideas?
> >>>
> >>> Where is a problem?
> >>
> >> The problem is in the implementation. Should a language recognise
> >> such a thing as an invalid address and, if so, what value or range
> >> of values should indicate that a given address is invalid?
> >
> > What exactly do you mean by,
> > "recognize such a thing as an invalid address"?
> >
> > e.g., prohibiting any of the following,
> >
> > a) assignment of zero value to an address pointer
> > b) comparison of zero value with a address pointer
> > c) writing to location with address zero, i.e., dereference
>
> I'm not really sure what you mean by the above so my replies below
> may not be in line with what you were thinking of.
>
> I was really asking whether a language ought to have the concept of
> one address which is invalid.

Invalid for what? i.e., comparison? writing? reading?

> Imagine a 64k machine. One might want
> to point to any of those 64k memory cells. In which case it's maybe a
> bad idea to reserve one address as invalid.
>
> But the rest of the topic is assuming that one address would be
> reserved as the null address.
>

The fact that one address is reserved as the null address doesn't
prohibit writing to that address or dereferencing that address. The
address only ensures pointers don't compare equal, in order to detect
their initialization or lack thereof.

> > First, let's for sake of argument say the language's NULL pointer
> > (or equivalent) is actually of value zero, as it doesn't have to
> > be, at least for the C language, but usually is implemented as a
> > zero value. This eliminates the "thought" complexity over NULL in C
> > etc being a non-zero value.
> >
> > I'd strongly argue that allowing b) is a language requirement.
>
> Do you mean as in C's
>
> if (p == 0)
>
> ?
>
> Why not, instead, require the comparison to be against NULL, instead
> of zero, as in
>
> if (p == NULL)
>
> or, indeed, just allow
>
> if (p)
>
> ?

Since I defined NULL as zero for argument's sake, these are equivalent.
There is no difference between using NULL and using zero. If we had
chosen to use or allow a non-zero value for NULL, then we'd need to do
the latter, i.e., use (p==NULL), to ensure the correct non-zero value
is used for NULL in the comparison.
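
A small C sketch of the three spellings being discussed (the function names are invented for illustration). One hedge worth stating: in standard C a literal 0 in a pointer context is itself a null pointer constant, so as far as the language is concerned `p == 0` should agree with `p == NULL` even on an implementation whose null representation is not all-zero bits. A different language could of course choose otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Three spellings of the same test. In standard C a literal 0 in a
   pointer context is a null pointer constant, so the compiler maps
   it to the platform's null representation, zero bits or not. */
int is_null_literal(const void *p) { return p == 0; }
int is_null_macro(const void *p)   { return p == NULL; }
int is_null_bang(const void *p)    { return !p; }

/* Checks that the three forms agree for one pointer. */
void check_forms_agree(const void *p)
{
    int a = is_null_literal(p);
    assert(a == is_null_macro(p));
    assert(a == is_null_bang(p));
}
```

Calling `check_forms_agree` with both a null pointer and the address of a real object exercises both outcomes.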

> > I'd argue that a) is up to the language implementation.
>
> I presume you mean
>
> p = 0;
>
> but is zero really needed when that could be, instead,
>
> p = NULL;
>
> ?

Since I defined NULL as zero for argument's sake, these are equivalent.
There is no difference between using NULL and using zero. If we had
chosen to use or allow a non-zero value for NULL, then, yes, we'd
really need to have a distinct zero value (since NULL would be
non-zero), as there would be no other way to access address zero.

> > [snip]
>
> No, I was really thinking of paging reserving page 0 as inaccessible
> - so that any dereference of a null pointer (null being 0, in this
> case) whether read or write would generate a fault.
>
> But it's only really possible to prohibit access to address zero if
> one is using paging. And as you know, that's not the only way to
> design an OS. That's why I was thinking to make null an address
> /above/ the user-accessible memory space. That would work whether one
> was using paging or not. (The actual address for null would be set
> when a program was loaded rather than being a constant known at
> compile time. That's a bit different from normal but I think it could
> be done.)

Yes. Didn't I link you to one of the C specification's authors on an
example non-zero implementation of NULL?
https://groups.google.com/g/comp.std.c/c/ez822gwxxYA/m/Jt94XH7AVacJ

> > The writing to address zero is a dereference, not a pointer
> > comparison, and will be allowed if the hardware is incapable of
> > blocking the write to the address location, i.e., zero, e.g., via
> > an invalid or unmapped page.
>
> Interesting. So you are suggesting that if p is zero then p will
> compare as NULL (and, hence, invalid) but that
>

Let's at least initialize p. Should we declare it with some type?

p = NULL;

> *p = 99;
>
> would still work?
>

In general, that's "Yes", but it's conditional. On Linux or Windows,
I'd expect a page fault, since they use paging and probably mark the
zero'th page as invalid. For DOS, I'd **usually** - but not always -
expect a write to address location zero with value 99.

If NULL is defined as zero, and writing to address zero isn't blocked
due to paging etc, then the answer is: Yes. That is how most C
compilers work/worked on environments a) without paging or prior to
paging, and b) where NULL is defined as zero (which is most of them).
For example, it works this way for many DOS C compilers, as most of
them are set up for 16-bit DOS, or for most 32-bit DOS DPMI hosts which
don't use paging. However, some 32-bit DOS DPMI hosts do use paging,
where such a dereference would page fault, just like for Windows or
Linux.

If you remember, assignment of an integer to a pointer is explicitly
undefined behavior (UB) for C, but most C compilers define this
behavior to be valid in order to access memory locations outside of C's
memory allocations. This is especially useful for memory-mapped
hardware or data structures, e.g., BIOS BDA, EBDA, IVT, etc. Typically,
you'll need to declare such access to be "volatile" as well, as the
compiler doesn't recognize the region as being allocated or in-use, and
may optimize away access.
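
A hedged sketch of that access pattern in C. Real code would use a fixed address taken from the hardware documentation; here the address of an ordinary variable (`fake_register`, a made-up stand-in) is borrowed so the example can run anywhere. The helper names are also invented, and the alignment check is an assumption of the sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a device register so the sketch runs on any host. */
static uint32_t fake_register;

/* Turns an integer address into a pointer to a volatile 32-bit cell.
   The result of the integer-to-pointer conversion depends on the
   implementation. */
static volatile uint32_t *reg_at(uintptr_t address)
{
    /* Sketch assumption: 32-bit cells sit on 4-byte boundaries. */
    assert(address % sizeof(uint32_t) == 0);
    return (volatile uint32_t *)(void *)address;
}

/* Writes a value through the volatile pointer and reads it back. */
uint32_t poke_and_peek(uint32_t value)
{
    volatile uint32_t *reg = reg_at((uintptr_t)(void *)&fake_register);
    *reg = value;   /* volatile: the store must actually be performed */
    return *reg;    /* volatile: the load must actually be performed */
}
```

Without `volatile`, a compiler that sees no other use of the location would be free to drop or merge such accesses.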

--
Liberals preach diversity, equity, and inclusion, but engage in
misandry, are racist against whites, promote hatred of conservatives.

Re: Pointer to address zero

<sdap6p$1llt$3@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=285&group=comp.lang.misc#285
Path: i2pn2.org!i2pn.org!aioe.org!JKehOyGOGgs2f2NKLRXdGg.user.46.165.242.75.POSTED!not-for-mail
From: noemail@basdxcqvbe.com (Rod Pemberton)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Wed, 21 Jul 2021 23:39:21 -0500
Organization: Aioe.org NNTP Server
Message-ID: <sdap6p$1llt$3@gioia.aioe.org>
References: <sd71dn$pil$1@dont-email.me>
<sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me>
<sd8hqm$7c2$1@gioia.aioe.org>
<sd9l7n$geq$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="54973"; posting-host="JKehOyGOGgs2f2NKLRXdGg.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2
 by: Rod Pemberton - Thu, 22 Jul 2021 04:39 UTC

On Wed, 21 Jul 2021 18:24:01 +0100
James Harris <james.harris.1@gmail.com> wrote:

> As I say, compiled code often uses address zero as a null pointer,
> knowing that the OS will make the lowest page inaccessible so that an
> attempt to access it would generate a fault. And that's what I have
> long intended to do. But when looking at different ways of loading a
> program I realised that it was not always possible to reserve address
> zero.

Yes.

You'll have problems on any environment without paging, e.g., any 8-bit
computers (C64, Apple II), 16-bit computers (Macintosh, Amiga), 16-bit
DOS, 32-bit DOS DPMI without a paging DPMI host, etc.

> In fact, sometimes a lot more than just one page is reserved.
> According to a video explainer I watched a while ago 32-bit Linux
> sets the lowest accessible address to something like 0x0040_0000 so
> that the hardware will trap not just
>
> *p
>
> but also
>
> p[q]
>
> for some significant size of q (all where p is null).

Good idea. That will prohibit C programmers from directly programming
the hardware. Direct hardware access is only needed on rudimentary OSes
like DOS, and isn't needed and shouldn't be allowed on advanced OSes like
Linux or Windows, which have paging. I.e., for safety, the modern C programmer
should be restricted to just the C application space - no hardware. Of
course, if the programmer was doing OS development, they'd need a way
around that restriction. E.g., DJGPP has special functions to allow
access to memory below 1MB.

> The trouble is that that takes away 4M of the address space and, more
> importantly, means that the addresses a programmer will see in a
> debugging session or a dump would have more significant digits than
> they need to have and, therefore, be harder to read than necessary.

To me, this is irrelevant. I use printf() to debug.

> If, by contrast, null is set to a little higher than the accessible
> memory program data areas could be at lower addresses making
> debugging sessions and dumps easier to read.
>
> The hardware would still trap on both of
>
> *p
> p[q]
>
> In fact, q could potentially be a lot higher than in the 'normal'
> model because not just 4M but all the addresses from p[0] to the
> highest memory address would be inaccessible and would trap (given a
> suitable CPU).
>
> To make this work I would have to have null (the null address)
> determined not at compile time but at program load time.

Why would it need to be any value above 1MB? ... It's usually only RAM
above 1MB, especially for older computers. I.e., in general, all the
older BIOS, I/O ports, VESA BIOS, IVT, BDA, EBDA, RTC, CMOS, etc.
hardware are all located low, below 1MB. It's only modern
memory-mapped devices, e.g., video frame buffers etc., which are located
high above 1MB, and indicated via the E820h memory map. Don't ask me about
UEFI.

> That ought to cope well with various OS memory models though it does
> have a downside. If a structure containing a pointer is mapped over
> zeroed memory the pointer will not be null but will be considered to
> be valid. (It will point at location zero.)

True. If that's a concern, you'd need to write NULL's value.
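
A rough sketch of what "write NULL's value" could look like when null is chosen at load time. Everything here is hypothetical: addresses are modelled as plain integers (`addr_t`), and `loader_set_null`, `record_init` and `struct record` are invented names standing in for whatever the loader and language runtime would actually provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch only: addresses are modelled as integers, and the "null"
   address is chosen by a hypothetical loader, not by the compiler. */
typedef uintptr_t addr_t;

static addr_t null_addr;            /* set once at program load */

void loader_set_null(addr_t top_of_user_memory)
{
    null_addr = top_of_user_memory; /* first address past usable memory */
}

struct record {
    int    id;
    addr_t next;                    /* an address, possibly null_addr */
};

int is_null(addr_t a) { return a == null_addr; }

/* Explicit initialization: zero the record, then write null's value. */
void record_init(struct record *r)
{
    assert(r != NULL);
    memset(r, 0, sizeof *r);
    r->next = null_addr;
}
```

With null set to, say, 0x10000, a record that was merely zero-filled has `next` pointing at location zero and is not null; only after `record_init` does `is_null(r.next)` hold.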


Re: Pointer to address zero

<ae991986-8485-4756-861e-426a12f5ad27n@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=286&group=comp.lang.misc#286
X-Received: by 2002:ae9:c316:: with SMTP id n22mr2690778qkg.481.1627011729350;
Thu, 22 Jul 2021 20:42:09 -0700 (PDT)
X-Received: by 2002:a05:620a:749:: with SMTP id i9mr2899268qki.307.1627011729126;
Thu, 22 Jul 2021 20:42:09 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.misc
Date: Thu, 22 Jul 2021 20:42:08 -0700 (PDT)
In-Reply-To: <sd9l7n$geq$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=24.207.183.245; posting-account=G1KGwgkAAAAyw4z0LxHH0fja6wAbo7Cz
NNTP-Posting-Host: 24.207.183.245
References: <sd71dn$pil$1@dont-email.me> <sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me> <sd8hqm$7c2$1@gioia.aioe.org> <sd9l7n$geq$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ae991986-8485-4756-861e-426a12f5ad27n@googlegroups.com>
Subject: Re: Pointer to address zero
From: mijoryx@yahoo.com (luserdroog)
Injection-Date: Fri, 23 Jul 2021 03:42:09 +0000
Content-Type: text/plain; charset="UTF-8"
 by: luserdroog - Fri, 23 Jul 2021 03:42 UTC

On Wednesday, July 21, 2021 at 12:24:09 PM UTC-5, James Harris wrote:
> On 21/07/2021 08:19, Dmitry A. Kazakov wrote:
> > On 2021-07-20 22:40, James Harris wrote:

> >> How are you distinguishing between pointer and address?
> >
> > That thing is called the type.
> OK ... then what, to you, distinguishes them? Alignment? Range? History?
> Something else?
>

I think the point others are trying to make is that your compiler should
make this distinction at a higher level, using higher level information
that the language describes.

The compiled code should have no need to make this determination
at runtime using only the machine representation of a memory address.

Re: Pointer to address zero

<sde6tv$mdt$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=287&group=comp.lang.misc#287
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: james.harris.1@gmail.com (James Harris)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Fri, 23 Jul 2021 11:50:38 +0100
Organization: A noiseless patient Spider
Lines: 223
Message-ID: <sde6tv$mdt$1@dont-email.me>
References: <sd71dn$pil$1@dont-email.me> <sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me> <sd9jbm$efc$1@gioia.aioe.org>
<sda39h$kr7$1@dont-email.me> <sdap3d$1llt$2@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 23 Jul 2021 10:50:39 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="ab2fa568c193e99f33826ae13001cd3a";
logging-data="22973"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19Ry0dp36oe2DFW3YC7GdWpRUgOFx99BiY="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:ZjOPaxmsARBCwzDE6+ns/63U/yQ=
In-Reply-To: <sdap3d$1llt$2@gioia.aioe.org>
Content-Language: en-GB
X-Mozilla-News-Host: news://news.eternal-september.org
 by: James Harris - Fri, 23 Jul 2021 10:50 UTC

On 22/07/2021 05:37, Rod Pemberton wrote:
> On Wed, 21 Jul 2021 22:24:00 +0100
> James Harris <james.harris.1@gmail.com> wrote:
>
>> On 21/07/2021 18:53, Rod Pemberton wrote:
>>> On Tue, 20 Jul 2021 21:40:51 +0100
>>> James Harris <james.harris.1@gmail.com> wrote:
>>>> On 20/07/2021 19:01, Dmitry A. Kazakov wrote:
>>>>> On 2021-07-20 19:33, James Harris wrote:
>
>>>>>> Hence the question: how should a language support access to
>>>>>> address zero? Any ideas?
>>>>>
>>>>> Where is a problem?
>>>>
>>>> The problem is in the implementation. Should a language recognise
>>>> such a thing as an invalid address and, if so, what value or range
>>>> of values should indicate that a given address is invalid?
>>>
>>> What exactly do you mean by,
>>> "recognize such a thing as an invalid address"?
>>>
>>> e.g., prohibiting any of the following,
>>>
>>> a) assignment of zero value to an address pointer
>>> b) comparison of zero value with a address pointer
>>> c) writing to location with address zero, i.e., dereference
>>
>> I'm not really sure what you mean by the above so my replies below
>> may not be in line with what you were thinking of.
>>
>> I was really asking whether a language ought to have the concept of
>> one address which is invalid.
>
> Invalid for what? i.e., comparison? writing? reading?

By "invalid" I mean "guaranteed not to be the address of any object" and
that it would be an error to attempt to dereference an invalid pointer.

>
>> Imagine a 64k machine. One might want
>> to point to any of those 64k memory cells. In which case it's maybe a
>> bad idea to reserve one address as invalid.
>>
>> But the rest of the topic is assuming that one address would be
>> reserved as the null address.
>>
>
> The fact that one address is reserved as the null address doesn't
> prohibit writing to that address or dereferencing that address. The
> address only ensures pointers don't compare equal, in order to detect
> their initialization or lack thereof.

That's an interesting idea. How would it work? Say one had a routine
which checked whether a pointer was valid before dereferencing it, along
the lines of

if (p != NULL) {
/* work with the object at p */
}

Most of the time that routine would work properly but if it were to be
passed the address of an object which just happened to sit at the null
address then the routine would fail - a nasty bug which might not show
up in testing.
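
To illustrate the failure mode, here is a toy model in C of a small machine in which every address, including 0, names a real cell. The memory size and the names `mem`, `guarded_store` and `load` are invented for the sketch, and addresses are modelled as array indices rather than real pointers:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a small machine in which every address, including 0,
   names a real memory cell. Sizes and names are made up. */
#define MEM_SIZE  16u
#define NULL_ADDR 0u

static uint8_t mem[MEM_SIZE];

/* Guarded store in the style of `if (p != NULL) { ... }`.
   Returns 1 if the store happened, 0 if the guard rejected it. */
int guarded_store(uint16_t p, uint8_t value)
{
    assert(p < MEM_SIZE);
    if (p != NULL_ADDR) {
        mem[p] = value;
        return 1;
    }
    return 0;   /* cell 0 is perfectly usable, yet it is skipped */
}

/* Reads one cell of the toy memory. */
uint8_t load(uint16_t p)
{
    assert(p < MEM_SIZE);
    return mem[p];
}
```

A store to address 5 succeeds, while the same store to address 0 is silently dropped even though cell 0 exists, which is the kind of bug that testing could easily miss.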

>
>>> First, let's for sake of argument say the language's NULL pointer
>>> (or equivalent) is actually of value zero, as it doesn't have to
>>> be, at least for the C language, but usually is implemented as a
>>> zero value. This eliminates the "thought" complexity over NULL in C
>>> etc being a non-zero value.
>>>
>>> I'd strongly argue that allowing b) is a language requirement.
>>
>> Do you mean as in C's
>>
>> if (p == 0)
>>
>> ?
>>
>> Why not, instead, require the comparison to be against NULL, instead
>> of zero, as in
>>
>> if (p == NULL)
>>
>> or, indeed, just allow
>>
>> if (p)
>>
>> ?
>
> Since I defined NULL as zero for argument's sake, these are equivalent.

Yes, though the "== NULL" variant would work always. It would work
whether NULL was zero or not. Further, in a language other than C the simple

if p

could, if p was of pointer type, be defined as

if bool(p)

where bool() of a pointer would return false if the pointer was to the
null address.

> There is no difference between using NULL and using zero. If we had
> chosen to use or allow a non-zero value for NULL, then we'd need to do
> the latter, i.e., use (p==NULL), to ensure the correct non-zero value
> is used for NULL in the comparison.
>

Yes.

>>> I'd argue that a) is up to the language implementation.
>>
>> I presume you mean
>>
>> p = 0;
>>
>> but is zero really needed when that could be, instead,
>>
>> p = NULL;
>>
>> ?
>
> Since I defined NULL as zero for argument's sake, these are equivalent.

That's the easy case!

> There is no difference between using NULL and using zero. If we had
> chosen to use or allow a non-zero value for NULL, then, yes, we'd
> really need to have a distinct zero value (since NULL would be
> non-zero), as there would be no other way to access address zero.

Yes.

>
>>> [snip]
>>
>> No, I was really thinking of paging reserving page 0 as inaccessible
>> - so that any dereference of a null pointer (null being 0, in this
>> case) whether read or write would generate a fault.
>>
>> But it's only really possible to prohibit access to address zero if
>> one is using paging. And as you know, that's not the only way to
>> design an OS. That's why I was thinking to make null an address
>> /above/ the user-accessible memory space. That would work whether one
>> was using paging or not. (The actual address for null would be set
>> when a program was loaded rather than being a constant known at
>> compile time. That's a bit different from normal but I think it could
>> be done.)
>
> Yes. Didn't I link you to one of the C specification's authors on a
> example non-zero implementation of NULL?
> https://groups.google.com/g/comp.std.c/c/ez822gwxxYA/m/Jt94XH7AVacJ

I've been reading that but I'm not sure I understand it or see how it helps.

>
>>> The writing to address zero is a dereference, not a pointer
>>> comparison, and will be allowed if the hardware is incapable of
>>> blocking the write to the address location, i.e., zero, e.g., via
>>> an invalid or unmapped page.
>>
>> Interesting. So you are suggesting that if p is zero then p will
>> compare as NULL (and, hence, invalid) but that
>>
>
> Let's at least initialize p. Should we declare it with some type?
>
> p = NULL;
>
>> *p = 99;
>>
>> would still work?
>>

That would be required to fail.

>
> In general, that's "Yes", but it's conditional. On Linux or Windows,
> I'd expect a page fault, since they use paging and probably mark the
> zero'th page as invalid. For DOS, I'd **usually** - but not always -
> expect a write to address location zero with value 99.

True, but for DOS the compiler could insert a check so that it, too,
would lead to an exception. (I would like attempts to dereference a null
pointer to throw an exception on all platforms so that program behaviour
is consistent.)

>
> If NULL is defined as zero, and writing to address zero isn't blocked
> due to paging etc, then the answer is: Yes. That is how most C
> compilers work/worked on environments a) without paging or prior to
> paging, and b) where NULL is defined as zero (which is most of them).
> For example, it works this way for many DOS C compilers, as most of
> them are set up for 16-bit DOS, or for most 32-bit DOS DPMI hosts which
> don't use paging. However, some 32-bit DOS DPMI hosts do use paging,
> where such a dereference would page fault, just like for Windows or
> Linux.
>
> If you remember, assignment of an integer to a pointer is explicitly
> undefined behavior (UB) for C, but most C compilers define this
> behavior to be valid in order to access memory locations outside of C's
> memory allocations.

Isn't it /implementation-defined/ behaviour rather than UB?

> This is especially useful for memory-mapped
> hardware or data structures, e.g., BIOS BDA, EBDA, IVT, etc. Typically,
> you'll need to declare such access to be "volatile" as well, as the
> compiler doesn't recognize the region as being allocated or in-use, and
> may optimize away access.

Agreed.

--
James Harris

Re: Pointer to address zero

<sdeh3b$ej9$1@z-news.wcss.wroc.pl>

https://www.rocksolidbbs.com/devel/article-flat.php?id=288&group=comp.lang.misc#288
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!4.us.feeder.erje.net!3.eu.feeder.erje.net!feeder.erje.net!newsfeed.pionier.net.pl!pwr.wroc.pl!news.wcss.wroc.pl!not-for-mail
From: antispam@math.uni.wroc.pl
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Fri, 23 Jul 2021 13:44:11 +0000 (UTC)
Organization: Politechnika Wroclawska
Lines: 66
Message-ID: <sdeh3b$ej9$1@z-news.wcss.wroc.pl>
References: <sd71dn$pil$1@dont-email.me>
NNTP-Posting-Host: hera.math.uni.wroc.pl
X-Trace: z-news.wcss.wroc.pl 1627047851 14953 156.17.86.1 (23 Jul 2021 13:44:11 GMT)
X-Complaints-To: abuse@news.pwr.wroc.pl
NNTP-Posting-Date: Fri, 23 Jul 2021 13:44:11 +0000 (UTC)
Cancel-Lock: sha1:EsT4hFObnYpF53MMai1jrUTr2uw=
User-Agent: tin/2.4.3-20181224 ("Glen Mhor") (UNIX) (Linux/4.19.0-10-amd64 (x86_64))
 by: antispam@math.uni.wroc.pl - Fri, 23 Jul 2021 13:44 UTC

James Harris <james.harris.1@gmail.com> wrote:
> For my OS project I have been looking at program loading and that has
> led me to query what would be required in a language to support address
> zero being accessible and a pointer to it being considered to be valid.
>
> If I use paging I can reserve the lowest page so that address zero is
> inaccessible. A dereference of a zeroed pointer would be caught by the
> CPU triggering a fault.
>
> However, if on x86 I don't use paging then a reference to address zero
> would not trigger a fault so it would not be caught and diagnosed. And
> it's not just that case. Other CPUs or microcontrollers may, presumably,
> allow access to address zero. Therefore there are cases where a program
> may have a pointer to address zero and that pointer could be legitimate.
>
> Hence the question: how should a language support access to address
> zero? Any ideas?

You mix many different things. As others noted, a null pointer in
a programming language may be address 0, but other representations
are also possible. It is for the programming language to
decide what happens when a null pointer is dereferenced. AFAIK
in VAX C page 0 was filled with zeros and the null pointer was
represented by 0, so a read access via a null pointer gave 0.
Effectively, the null pointer was a pointer to a canonical empty
string. I guess that trying to write gave a fault. On a machine
with true ROM, attempts to write to ROM are normally ignored. If
there is a 0 byte in ROM you could represent the null pointer as
the address of that zero byte. In such a case reads via a null
pointer would give 0 and writes would be ignored. In Lisp, instead
of a null pointer there is NIL. In Lisp, dereferencing (reading via)
NIL has a defined effect (IIRC writes are illegal).
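
The "null as a canonical empty string" idea can be imitated in portable C with a sentinel object. This is only a sketch of the concept, not how any particular compiler implemented it; `nil_string`, `text_or_nil` and `safe_length` are invented names:

```c
#include <assert.h>
#include <string.h>

/* A canonical empty string standing in for "null". Reading through
   it yields 0, much like a zero-filled page 0 would. */
static const char nil_string[1] = { '\0' };

/* Returns the text, or the sentinel when the caller has none. */
const char *text_or_nil(const char *s)
{
    return s ? s : nil_string;
}

/* Length that is safe to compute for any caller-supplied value. */
size_t safe_length(const char *s)
{
    const char *t = text_or_nil(s);
    assert(t != NULL);
    return strlen(t);
}
```

Because the sentinel is `const`, reads give a well-defined empty string while writes through it are not permitted, roughly mirroring the read-yields-zero, write-is-rejected behaviour described above.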

So it is really up to you what you decide. Most modern
languages make dereferencing null pointers illegal, but
there are notable examples that do differently.

Another thing is safety, namely the question of whether your
implementation will detect illegal program behaviour. There is the
"C attitude", which basically says "do what the machine does".
Actually, this attitude goes back at least to Fortran and to some
degree is present in Pascal. According to this attitude
it is the programmer's responsibility to avoid dereferencing
a null pointer (original Fortran had no pointers, but a
similar attitude applies in other places). If the hardware
has appropriate support the language may arrange to trap
dereferencing via a null pointer, but otherwise the programmer
has to handle this. Some languages, notably Ada, have a
different attitude: they promise to catch all illegal
actions. In the case of a null pointer this may require
extra checks before each memory access via a pointer.
A "safe" compiler is supposed to insert such checks.
That is expensive, but optimizing compilers can see that most
checks are unnecessary and insert checks only in
case of doubt. And when the hardware is capable of checking,
the compiler can depend on hardware checks.
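
As a rough picture of what such inserted checks amount to, here is a hand-written C sketch. It is not the output of any real compiler; `checked_store`, `store_to_local` and the trap counter are hypothetical stand-ins for generated code and for a runtime exception mechanism:

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many times the hypothetical runtime trap was taken. */
static int null_traps;

/* What a checking compiler might emit for `*p = v`: test first,
   and divert to a handler instead of touching memory if p is null. */
int checked_store(int *p, int v)
{
    if (p == NULL) {
        null_traps++;       /* stand-in for raising an exception */
        return -1;
    }
    *p = v;
    return 0;
}

/* When p is provably non-null (here: the address of a local), an
   optimizer can drop the check entirely. */
int store_to_local(int v)
{
    int local;
    int *p = &local;
    assert(p != NULL);      /* documents the fact the optimizer relies on */
    *p = v;
    return local;
}

/* Reports how many null stores were intercepted so far. */
int trap_count(void) { return null_traps; }
```

The cost is one compare and branch per unproven access, which is why the checks matter most on targets without paging or similar hardware support.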

Personally I am against making null pointer dereference
legal. IMO the gains from this are negligible, and the loss
(mainly the inability to catch errors) is large. I am
for having as much checking as possible; if you feel
that having checks always on is too expensive, at least
make them available for debugging.

--
Waldek Hebisch

Re: Pointer to address zero

<sdgs19$19q0$1@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=289&group=comp.lang.misc#289
Path: i2pn2.org!i2pn.org!aioe.org!JKehOyGOGgs2f2NKLRXdGg.user.46.165.242.75.POSTED!not-for-mail
From: noemail@basdxcqvbe.com (Rod Pemberton)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Sat, 24 Jul 2021 07:04:26 -0500
Organization: Aioe.org NNTP Server
Message-ID: <sdgs19$19q0$1@gioia.aioe.org>
References: <sd71dn$pil$1@dont-email.me>
<sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me>
<sd9jbm$efc$1@gioia.aioe.org>
<sda39h$kr7$1@dont-email.me>
<sdap3d$1llt$2@gioia.aioe.org>
<sde6tv$mdt$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="42816"; posting-host="JKehOyGOGgs2f2NKLRXdGg.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2
 by: Rod Pemberton - Sat, 24 Jul 2021 12:04 UTC

On Fri, 23 Jul 2021 11:50:38 +0100
James Harris <james.harris.1@gmail.com> wrote:

> On 22/07/2021 05:37, Rod Pemberton wrote:
> > On Wed, 21 Jul 2021 22:24:00 +0100
> > James Harris <james.harris.1@gmail.com> wrote:
> >> On 21/07/2021 18:53, Rod Pemberton wrote:
> >>> On Tue, 20 Jul 2021 21:40:51 +0100
> >>> James Harris <james.harris.1@gmail.com> wrote:
> >>>> On 20/07/2021 19:01, Dmitry A. Kazakov wrote:
> >>>>> On 2021-07-20 19:33, James Harris wrote:

> >>>>>> Hence the question: how should a language support access to
> >>>>>> address zero? Any ideas?
> >>>>>
> >>>>> Where is a problem?
> >>>>
> >>>> The problem is in the implementation. Should a language recognise
> >>>> such a thing as an invalid address and, if so, what value or
> >>>> range of values should indicate that a given address is invalid?
> >>>
> >>> What exactly do you mean by,
> >>> "recognize such a thing as an invalid address"?
> >>>
> >>> e.g., prohibiting any of the following,
> >>>
> >>> a) assignment of zero value to an address pointer
> >>> b) comparison of zero value with a address pointer
> >>> c) writing to location with address zero, i.e., dereference
> >>
> >> I'm not really sure what you mean by the above so my replies below
> >> may not be in line with what you were thinking of.
> >>
> >> I was really asking whether a language ought to have the concept of
> >> one address which is invalid.
> >
> > Invalid for what? i.e., comparison? writing? reading?
>
> By "invalid" I mean "guaranteed not to be the address of any object"

If you meant to say, "... any C object ..." or
" ... any object in my language ...", then

Yes.

Obviously, if you're linking to code from another language, or the
address being accessed belongs to a memory-mapped device or data, then
you can't make that claim of the address being "invalid", because it's
outside the scope of what the C compiler controls or what your language's
compiler controls.

> and that it would be an error to attempt to dereference an invalid
> pointer.

I guess this depends on how you happen to define "error" for this
statement.

If this is just a coding error to be avoided by the programmer, much
like a syntax error but without a warning, and nothing else, that's
acceptable.

If it's a compilation error, it could be problematic, as the pointer's
value may not be defined until run time, and could be changed to NULL
at some random point during execution. I.e., the compiler wouldn't
always be able to detect during compilation that the pointer would be
dereferenced after being set to NULL.

If it's a run time error, this would really require hardware support,
such as paging to be effective. Software checks on the pointer's value
could add excessive overhead to identify this case.

> >> Imagine a 64k machine. One might want
> >> to point to any of those 64k memory cells. In which case it's
> >> maybe a bad idea to reserve one address as invalid.
> >>
> >> But the rest of the topic is assuming that one address would be
> >> reserved as the null address.
> >>
> >
> > The fact that one address is reserved as the null address doesn't
> > prohibit writing to that address or dereferencing that address. The
> > address only ensures pointers don't compare equal, in order to
> > detect their initialization or lack thereof.
>
> That's an interesting idea. How would it work?

AISI, it would work just like any other pointer dereference.

I.e., the value zero isn't special, nor is NULL as zero, nor is NULL as
non-zero. For C, NULL is only special because valid C objects can't be
located at the same address. I.e., the address of C objects must
compare unequal to NULL.

> Say one had a routine
> which checked whether a pointer was valid before dereferencing it,
> along the lines of
>
> if (p != NULL) {
> /* work with the object at p */
> }

Sigh. Why would you (or anyone) ever do this? ...

As the programmer, you should be coding your program to not have
pointers set to NULL especially if the pointer is to be dereferenced.
I.e., initialize the pointer prior to use.

Also, the programmer should be tracking when and where your pointers
get set to NULL to prevent this, e.g., set to NULL by library functions.

While this is clearly defensive programming (CYA) against a potential
error, you're not doing your job properly, if you need to do this as a
matter of course, IMO.

> Most of the time that routine would work properly but if were to be
> passed the address of an object which just happened to sit at the
> null address then the routine would fail - a nasty bug which might
> not show up in testing.

I think your assumption that simply accessing junk data at the NULL
address would cause the routine to fail is incorrect.

If the routine fails for junk data, then it fails for junk data no
matter where it's located or retrieved from. The NULL address isn't
special in this regard. The data there (at NULL) may be invalid or
junk for the routine, but there are millions or billions of other
memory locations where the data could be invalid or junk for the
routine too, thereby causing it to fail.

How do you intend to filter out millions or billions of other "invalid"
addresses from being passed to this routine, knowing that they would
cause the routine to fail? If you're doing this for x86, shouldn't
you, at a minimum, filter out every pointer address below 1MB?

> True, but for DOS the compiler could insert a check so that it, too,
> would lead to an exception.

Why? Why would you generate an exception for a NULL dereference?

E.g., on x86, the RM IVT starts at zero (by default). It's a
memory-mapped data structure which would have a zero address in C,
which would likely match the NULL pointer address. Address zero
corresponds to the divide-by-zero interrupt vector, which may need to
be changed by DOS programs.

Let's get back to the question of why would you generate an exception
for that. Does the ability to detect a NULL pointer dereference
outweigh the ability to manipulate the interrupt vectors? What about a
memory-mapped device or data? If you answered "Yes" in favor of the
dereference, I'd have to ask you, "Since when?"

If you can't use the language to program the hardware, then there is no
point in using the language at all. If you don't understand this,
you're clearly not familiar with Pascal. Pascal effectively died because it
was unable to be used to directly program the hardware or access memory
outside of the scope of the language, e.g., memory-mapped devices,
because it lacked pointers.

> (I would like attempts to dereference a null pointer to throw an
> exception on all platforms [so] that program behaviour is consistent.)

I simply don't agree with this. That is only valid for platforms where
the NULL address doesn't correspond to valid memory-mapped data or a valid
memory-mapped device, which admittedly is most of them nowadays, but
clearly not all of them. I.e., you're going to break or restrict
something by doing this for every platform. Obviously, this would
affect DOS for x86, and old 8-bits like Apple II or C64 which use
6502/6510 micro-processors which use the zero-page for a register file.

> > If NULL is defined as zero, and writing to address zero isn't
> > blocked due to paging etc, then the answer is: Yes. That is how
> > most C compilers work/worked on environments a) without paging or
> > prior to paging, and b) where NULL is defined as zero (which is
> > most of them). For example, it works this way for many DOS C
> > compilers, as most of them are set up for 16-bit DOS, or for most
> > 32-bit DOS DPMI hosts which don't use paging. However, some 32-bit
> > DOS DPMI hosts do use paging, where such a dereference would page
> > fault, just like for Windows or Linux.
> >
> > If you remember, assignment of an integer to a pointer is explicitly
> > undefined behavior (UB) for C, but most C compilers define this
> > behavior to be valid in order to access memory locations outside of
> > C's memory allocations.
>
> Isn't it /implementation-defined/ behaviour rather than UB?

Sigh ...

--
Liberals preach diversity, equity, and inclusion, but engage in
misandry, are racist against whites, promote hatred of conservatives.

Re: Pointer to address zero

<sdh1on$m63$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=290&group=comp.lang.misc#290

From: james.harris.1@gmail.com (James Harris)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Sat, 24 Jul 2021 13:40:54 +0100
Organization: A noiseless patient Spider
Lines: 79
Message-ID: <sdh1on$m63$1@dont-email.me>
References: <sd71dn$pil$1@dont-email.me> <sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me> <sd9jbm$efc$1@gioia.aioe.org>
<sda39h$kr7$1@dont-email.me> <sdap3d$1llt$2@gioia.aioe.org>
<sde6tv$mdt$1@dont-email.me> <sdgs19$19q0$1@gioia.aioe.org>
 by: James Harris - Sat, 24 Jul 2021 12:40 UTC

On 24/07/2021 13:04, Rod Pemberton wrote:
> On Fri, 23 Jul 2021 11:50:38 +0100
> James Harris <james.harris.1@gmail.com> wrote:

....

Rod, your replies on this topic are puzzling me a bit. Perhaps I
misunderstand what you are saying but ISTM that you have the wrong idea
of how a null pointer is frequently used. I don't think that's likely to
be the case so to try to clear up any confusion I'll reply to this part
of your post specifically and will come back to the rest of your post
later.

>> Say one had a routine
>> which checked whether a pointer was valid before dereferencing it,
>> along the lines of
>>
>> if (p != NULL) {
>> /* work with the object at p */
>> }
>
> Sigh. Why would you (or anyone) ever do this? ...

It's standard programming which is used all the time. For example, if
you wanted to process a tree - let's say in preorder - you could write

void preorder(node *n)   /* caller must pass a non-null n */
{
    process(n->data);
    if (n->left != NULL) preorder(n->left);
    if (n->right != NULL) preorder(n->right);
}

or, if you prefer,

void preorder(node *n)   /* n may be null: that means "no node" */
{
    if (n != NULL)
    {
        process(n->data);
        preorder(n->left);
        preorder(n->right);
    }
}

In either case the point is that the code tests for a pointer being null
because null means that there is no node.
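
For anyone who wants to try it, here is a minimal, self-contained sketch of the second form. The node layout, the visit_log type and the LOG_CAP limit are assumptions made up for illustration (the thread never shows them), and process() here merely records each value it is given, so it takes an extra log argument.

```c
#include <assert.h>
#include <stddef.h>

#define LOG_CAP 16   /* arbitrary capacity chosen for this sketch */

/* Hypothetical node type; a null child means "no subtree here". */
typedef struct node {
    int data;
    struct node *left;
    struct node *right;
} node;

/* Hypothetical record of the values seen, in visiting order. */
typedef struct {
    int out[LOG_CAP];
    size_t count;
} visit_log;

/* Stand-in for process(): append the value to the log. */
void process(visit_log *log, int data)
{
    assert(log->count < LOG_CAP);
    log->out[log->count++] = data;
}

/* Preorder walk: the null test is what ends the recursion. */
void preorder(const node *n, visit_log *log)
{
    if (n != NULL) {
        process(log, n->data);
        preorder(n->left, log);
        preorder(n->right, log);
    }
}
```

Calling preorder() with a null root is legal here and simply visits nothing, which is the sense in which null stands for an empty tree.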

As with the theme of this topic, there is a particular address which is
guaranteed not to refer to an object.

>
> As the programmer, you should be coding your program to not have
> pointers set to NULL especially if the pointer is to be dereferenced.
> I.e., initialize the pointer prior to use.

Null doesn't mean uninitialised. A pointer can be initialised to null,
as it would be for the tree-walking code, above.

Similarly, someone might walk a linked list by

while (n != NULL)
{
    process(n->data);
    n = n->next;
}

Again, the pointer being null means that there is no object being
referred to. You can imagine that n->next will have explicitly been set
to NULL in the last node in the list.

Does that change your view of the topic and where a pointer may be null?

--
James Harris

Re: Pointer to address zero

<sdh2o6$rst$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=291&group=comp.lang.misc#291

From: bc@freeuk.com (Bart)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Sat, 24 Jul 2021 13:57:27 +0100
Organization: A noiseless patient Spider
Lines: 29
Message-ID: <sdh2o6$rst$1@dont-email.me>
References: <sd71dn$pil$1@dont-email.me> <sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me> <sd9jbm$efc$1@gioia.aioe.org>
<sda39h$kr7$1@dont-email.me> <sdap3d$1llt$2@gioia.aioe.org>
<sde6tv$mdt$1@dont-email.me> <sdgs19$19q0$1@gioia.aioe.org>
 by: Bart - Sat, 24 Jul 2021 12:57 UTC

On 24/07/2021 13:04, Rod Pemberton wrote:
> On Fri, 23 Jul 2021 11:50:38 +0100

>> Say one had a routine
>> which checked whether a pointer was valid before dereferencing it,
>> along the lines of
>>
>> if (p != NULL) {
>> /* work with the object at p */
>> }
>
> Sigh. Why would you (or anyone) ever do this? ...
>
> As the programmer, you should be coding your program to not have
> pointers set to NULL especially if the pointer is to be dereferenced.
> I.e., initialize the pointer prior to use.

You don't seem to understand what NULL is for.

How do /you/ use NULL in C, or do you never actually use it?

Actually, how would you code a linked list without using some sentinel
to mark the end of the list?

The use of NULL like this is EVERYWHERE and in every API.

For example the return value of C's fopen() function is NULL when the
operation failed.
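
A minimal sketch of that idiom, for illustration only. count_bytes() is a hypothetical helper, not a library function; the only standard calls it relies on are fopen(), fgetc() and fclose().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: number of bytes in the file, or -1 if it
   cannot be opened. The null test is the only failure report
   that fopen() gives its caller. */
long count_bytes(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;            /* no stream object exists */

    long n = 0;
    while (fgetc(f) != EOF)   /* safe: f is known to be non-null */
        n++;

    fclose(f);
    return n;
}
```

Without the test, the failure case would hand a null FILE pointer straight to fgetc().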

Re: Pointer to address zero

<sdh4jt$82o$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=292&group=comp.lang.misc#292

From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Sat, 24 Jul 2021 15:29:33 +0200
Organization: A noiseless patient Spider
Lines: 71
Message-ID: <sdh4jt$82o$1@dont-email.me>
References: <sd71dn$pil$1@dont-email.me> <sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me> <sd9jbm$efc$1@gioia.aioe.org>
<sda39h$kr7$1@dont-email.me> <sdap3d$1llt$2@gioia.aioe.org>
<sde6tv$mdt$1@dont-email.me> <sdgs19$19q0$1@gioia.aioe.org>
<sdh2o6$rst$1@dont-email.me>
 by: David Brown - Sat, 24 Jul 2021 13:29 UTC

On 24/07/2021 14:57, Bart wrote:
> On 24/07/2021 13:04, Rod Pemberton wrote:
>> On Fri, 23 Jul 2021 11:50:38 +0100
>
>>> Say one had a routine
>>> which checked whether a pointer was valid before dereferencing it,
>>> along the lines of
>>>
>>>     if (p != NULL) {
>>>       /* work with the object at p */
>>>     }
>>
>> Sigh.  Why would you (or anyone) ever do this? ...
>>
>> As the programmer, you should be coding your program to not have
>> pointers set to NULL especially if the pointer is to be dereferenced.
>> I.e., initialize the pointer prior to use.
>
> You don't seem to understand what NULL is for.
>
> How do /you/ use NULL in C, or do you never actually use it?
>
> Actually, how would you code a linked list without using some sentinel
> to mark the end of the list?
>
> The use of NULL like this is EVERYWHERE and in every API.
>
> For example the return value of C's fopen() function is NULL when the
> operation failed.
>

It is useful for a language to have a concept of pointers or references
that are guaranteed valid - then you don't need to check them like this.
(C does not have this concept, but C++ does - you can't create a
reference "pointer" without it being a reference to /something/.)

It is also useful for a language to have a concept of "optional" values.
That is, a way of saying "x is either NULL or a valid value of type T".
Sometimes this is so useful that it applies to all types (like SQL).
Sometimes you force the programmer to do this manually, like in C using
"struct maybe_int { bool valid; int x; };". Sometimes you make it a
convenient part of the standard library, like C++ "std::optional<int>".
Sometimes you have it through sum or algebraic types fully
supported in the language, like Haskell "data MaybeInt = Invalid | Valid Int".
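
As a rough sketch of the manual C approach: the struct is the one quoted above, while checked_div() and maybe_int_get() are hypothetical helpers invented here to show the flag being set and tested.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

struct maybe_int { bool valid; int x; };

/* Hypothetical helper: division that reports "no result" through
   the flag instead of sacrificing an int value as a marker. */
struct maybe_int checked_div(int a, int b)
{
    struct maybe_int r = { false, 0 };
    if (b != 0 && !(a == INT_MIN && b == -1)) {
        r.valid = true;
        r.x = a / b;
    }
    return r;
}

/* Hypothetical helper: unwrap, insisting that a value is present. */
int maybe_int_get(struct maybe_int m)
{
    assert(m.valid);
    return m.x;
}
```

The cost is the extra flag in every value, which is exactly what a null pointer avoids for pointer types.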

There is no simple and efficient way to do this for simple types - for
an integer, you either have to sacrifice a valid integer value, or you
have to add an extra boolean flag to go with it. Sacrificing a value
makes your arithmetic coding a lot more complicated and inefficient.
But for pointers, sacrificing a value to use as an "invalid" indicator
is cheap and easy, and the gains are certainly worth it. The most
efficient "invalid" value to use is 0, since it is quick and easy to
test. An alternative worth considering is to use the highest bit to
indicate invalid - cutting half your address space is often not a
problem, you can use a pointer to address 0, and you can represent many
different invalid values.
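
The high-bit alternative might look roughly like this. It is only a sketch: addresses are modelled as uintptr_t values rather than real C pointers (converting between the two is implementation-defined), and the names ref_t, ref_is_valid, ref_make_invalid and ref_reason are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uintptr_t ref_t;   /* hypothetical address-sized reference */

/* Value with only the most significant bit set. */
#define REF_INVALID_BIT (UINTPTR_MAX - (UINTPTR_MAX >> 1))

/* A reference is valid when the top bit is clear, so 0 is valid. */
bool ref_is_valid(ref_t r)
{
    return (r & REF_INVALID_BIT) == 0;
}

/* Build one of many distinct invalid references. */
ref_t ref_make_invalid(unsigned reason)
{
    assert(reason <= (UINTPTR_MAX >> 1));
    return REF_INVALID_BIT | (ref_t)reason;
}

/* Recover the reason code from an invalid reference. */
unsigned ref_reason(ref_t r)
{
    assert(!ref_is_valid(r));
    return (unsigned)(r & ~REF_INVALID_BIT);
}
```

Under these assumptions address 0 stays usable, at the price of the upper half of the address space.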

So IMHO it makes sense for a language to support /both/ concepts of an
always valid pointer, and of optionally valid pointers.

I'd also suggest pointers being viewed as a lot more general than just
holding an address, allowing for references, weak references, shared
pointers, and other ways of referring to objects.

I'd also avoid "if (p != NULL)", and prefer "if (p)" or "if (valid(p))",
making it clearer that you are checking for the validity of the pointer
rather than for it happening to match a particular value. (In C, an
implementation can have more than one null pointer, and "if (p != NULL)"
actually checks for any of them - something that is not apparent from
the syntax.)
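
A small sketch of the three spellings side by side. valid(), value_or() and deref_checked() are hypothetical helpers; C itself provides no valid().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper giving the validity test a name. */
bool valid(const void *p)
{
    return p != NULL;
}

/* Returns *p, or fallback when p refers to nothing. */
int value_or(const int *p, int fallback)
{
    if (p)                 /* same test as p != NULL and valid(p) */
        return *p;
    return fallback;
}

/* Variant for callers that promise a valid pointer. */
int deref_checked(const int *p)
{
    assert(valid(p));
    return *p;
}
```

All three forms compile to the same comparison; the difference is only in what the source says to the reader.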

Re: Pointer to address zero

<sdha8p$74v$1@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=293&group=comp.lang.misc#293

From: anw@cuboid.co.uk (Andy Walker)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Sat, 24 Jul 2021 16:06:01 +0100
Organization: Not very much
Message-ID: <sdha8p$74v$1@gioia.aioe.org>
References: <sd71dn$pil$1@dont-email.me> <sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me> <sd9jbm$efc$1@gioia.aioe.org>
<sda39h$kr7$1@dont-email.me> <sdap3d$1llt$2@gioia.aioe.org>
<sde6tv$mdt$1@dont-email.me> <sdgs19$19q0$1@gioia.aioe.org>
 by: Andy Walker - Sat, 24 Jul 2021 15:06 UTC

On 24/07/2021 13:04, Rod Pemberton wrote:
[...]
[Dereferencing "NULL":]
> If it's compilation error, it could be problematic, as the pointer's
> value may not be defined until run time, and could be changed to NULL
> at some random point during execution. I.e., the compiler wouldn't
> always be able to detect during compilation that the pointer would be
> dereferenced after being set to NULL.

A sufficiently-explicit dereference of "NULL" for it to be a
compile-time error is surely in hen's-teeth territory; as rare as
explicitly writing "i := 1/0". It's clearly an error, but how far
that propagates back up the compilation is to some extent a matter
of taste.

> If it's a run time error, this would really require hardware support,
> such as paging to be effective. Software checks on the pointer's value
> could add excessive overhead to identify this case.

It /is/ a run-time error, in any half-way sensible language.
/Not/ checking is the sort of thing that generates bugs that surface
years later. The time spent debugging and re-issuing software [often
after the bug has been exploited to install malware on millions of
computers] typically far exceeds the overhead of checking in the first
place. It's not just dereferencing "NULL", of course; it's also
buffer over-runs, uninitialised variables, storage used after "free",
and many others. In sensible languages, most of the checks can be
optimised away [see below] even by quite simple compilers.

[James:]
>> Say one had a routine
>> which checked whether a pointer was valid before dereferencing it,
>> along the lines of
>> if (p != NULL) {
>> /* work with the object at p */
>> }
> Sigh. Why would you (or anyone) ever do this? ...

Others have discussed why lots of us would do this. Note
that after the initial check, for as far into "/* work ... */" as
the first potential assignment to "p" [very commonly the whole
statement], no further checks need to be made; IOW, in typical
uses, /no/ run-time checks at all need be added by the compiler.

In similar vein, in Algol's "destination := source", it is
required [Revised Report 5.2.1.2b] that "destination" be not "NIL"
and that "source" be not newer in scope than "destination". Taken
literally, that implies that extravagant checking be done on every
assignment. In Real Life, both checks are commonly trivial to
optimise away at compilation time, and where they aren't they enable
some nasty bugs to be detected as they happen, not much later when
the program eventually fails.

[...]
> If you can't use the language to program the hardware, then there is no
> point in using the language at all. If you don't understand this,
> you're clearly not familiar with Pascal. Pascal effectively died because it
> was unable to be used to directly program the hardware or access memory
> outside of the scope of the language, e.g., memory-mapped devices,
> because it lacked pointers.

Almost by definition, no high-level language can be used
/portably/ to program the hardware. You need either or both of an
escape into a lower-level language or a [portable?] library of ways
to access the hardware. But many of us write programs that never
need to program the hardware. For the past ~40 years, I've been
quite content to let others worry about the hardware; I just write
programs in C, Algol, Java, yes even Pascal, and a dozen+ scripting
languages [HTML, Sh, Sed, Awk, CAL, ...]. The idea that there's
no point to any of this is manifestly absurd. Rather the opposite;
for as long as "we" [FSVO] had to worry about the hardware, it was
a sign that computers had not yet matured.

As for Pascal, I don't believe that's the reason it died.
See Brian Kernighan's polemic for better reasons.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Rubinstein

Re: Pointer to address zero

<sdhhr0$60b$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=294&group=comp.lang.misc#294

From: james.harris.1@gmail.com (James Harris)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Sat, 24 Jul 2021 18:15:12 +0100
Organization: A noiseless patient Spider
Lines: 38
Message-ID: <sdhhr0$60b$1@dont-email.me>
References: <sd71dn$pil$1@dont-email.me> <sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me> <sd9jbm$efc$1@gioia.aioe.org>
<sda39h$kr7$1@dont-email.me> <sdap3d$1llt$2@gioia.aioe.org>
<sde6tv$mdt$1@dont-email.me> <sdgs19$19q0$1@gioia.aioe.org>
<sdh2o6$rst$1@dont-email.me> <sdh4jt$82o$1@dont-email.me>
 by: James Harris - Sat, 24 Jul 2021 17:15 UTC

On 24/07/2021 14:29, David Brown wrote:

....

> So IMHO it makes sense for a language to support /both/ concepts of an
> always valid pointer, and of optionally valid pointers.

OK.

....

>
> I'd also avoid "if (p != NULL)", and prefer "if (p)" or "if (valid(p))",

Agreed. But if you reject the test

p != NULL

then presumably you also reject the assignment

p = NULL;

If so, what's your preferred way to make a pointer invalid?

> making it clearer that you are checking for the validity of the pointer
> rather than for it happening to match a particular value. (In C, an
> implementation can have more than one null pointer, and "if (p != NULL)"
> actually checks for any of them - something that is not apparent from
> the syntax.)
>

That's surprising!

--
James Harris

Re: Pointer to address zero

<sdhilf$b5f$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=295&group=comp.lang.misc#295

From: james.harris.1@gmail.com (James Harris)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Sat, 24 Jul 2021 18:29:18 +0100
Organization: A noiseless patient Spider
Lines: 56
Message-ID: <sdhilf$b5f$1@dont-email.me>
References: <sd71dn$pil$1@dont-email.me> <sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me> <sd8hqm$7c2$1@gioia.aioe.org>
<sd9l7n$geq$1@dont-email.me> <sd9mdv$aep$1@gioia.aioe.org>
 by: James Harris - Sat, 24 Jul 2021 17:29 UTC

On 21/07/2021 18:44, Dmitry A. Kazakov wrote:
> On 2021-07-21 19:24, James Harris wrote:
>
>> I said /compiled/ code, not program source. In the /source/ the
>> programmer would still be able to write tests akin to
>>
>>    if p != null
>>
>> I may even allow
>>
>>    if p
>>
>> as meaning the same as the above though I guess some folk (e.g.
>> Dmitry?) will not like the idea of treating a pointer as a boolean.
>>
>> As I say, compiled code often uses address zero as a null pointer,
>
> No, it uses the representation of null, whatever it be [*].

You are trying to correct a correct statement.

>
> Furthermore, in a decent language with memory pools support each pool
> could have its own null.

This is a rather mechanical view, coming from you. I would have thought
you would prefer each /type/ to have its own version of null, especially
given your focus on nominal rather than structural type systems!

>
>>> That thing is called the type.
>>
>> OK ... then what, to you, distinguishes them? Alignment? Range?
>> History? Something else?
>
> https://en.wikipedia.org/wiki/Nominal_type_system

Aka different name for the same thing. ;-)

>
> -------------------------
> In an advanced language pointer comparisons could be non-trivial, e.g.
> when two pointers indicate different classes of the same object under
> multiple inheritance. In that case memory representations of p and q
> could be different, yet semantically p = q because both ultimately point
> to the same object [provided, the language lets p and q be comparable].
> An implementation would convert both p and q to the pointers of specific
> type and then compare these.
>

OK.

--
James Harris

Re: Pointer to address zero

<sdhjfe$dp3$1@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=296&group=comp.lang.misc#296

From: mailbox@dmitry-kazakov.de (Dmitry A. Kazakov)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Sat, 24 Jul 2021 19:43:11 +0200
Organization: Aioe.org NNTP Server
Message-ID: <sdhjfe$dp3$1@gioia.aioe.org>
References: <sd71dn$pil$1@dont-email.me> <sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me> <sd8hqm$7c2$1@gioia.aioe.org>
<sd9l7n$geq$1@dont-email.me> <sd9mdv$aep$1@gioia.aioe.org>
<sdhilf$b5f$1@dont-email.me>
 by: Dmitry A. Kazakov - Sat, 24 Jul 2021 17:43 UTC

On 2021-07-24 19:29, James Harris wrote:
> On 21/07/2021 18:44, Dmitry A. Kazakov wrote:

>> Furthermore, in a decent language with memory pools support each pool
>> could have its own null.
>
> This is a rather mechanical view, coming from you. I would have thought
> you would prefer each /type/ to have its own version of null, especially
> given your focus on nominal rather than structural type systems!

Null is a value. Each type has values. Values of one type are not values
of another. So, yes, each pointer type has null values of its own.

But you asked about representations of such values. It is possible that
representations differ too.

>> https://en.wikipedia.org/wiki/Nominal_type_system
>
> Aka different name for the same thing. ;-)

How do you know if the thing is same? Is there a serial number on the
back side?

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Re: Pointer to address zero

<sdhjus$jm8$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=297&group=comp.lang.misc#297

From: james.harris.1@gmail.com (James Harris)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Sat, 24 Jul 2021 18:51:24 +0100
Organization: A noiseless patient Spider
Lines: 89
Message-ID: <sdhjus$jm8$1@dont-email.me>
References: <sd71dn$pil$1@dont-email.me> <sda87i$kmn$1@dont-email.me>
 by: James Harris - Sat, 24 Jul 2021 17:51 UTC

On 21/07/2021 23:48, Bart wrote:
> On 20/07/2021 18:33, James Harris wrote:
>> For my OS project I have been looking at program loading and that has
>> led me to query what would be required in a language to support
>> address zero being accessible and a pointer to it being considered to
>> be valid.
>>
>> If I use paging I can reserve the lowest page so that address zero is
>> inaccessible. A dereference of a zeroed pointer would be caught by the
>> CPU triggering a fault.
>>
>> However, if on x86 I don't use paging then a reference to address zero
>> would not trigger a fault so it would not be caught and diagnosed.
>
> How is that different from any other access to invalid memory? Or an
> access of address 1 (or 8 if aligned)?

I cannot quite work out what you are asking. Could you say more?

>
>> And it's not just that case. Other CPUs or microcontrollers may,
>> presumably, allow access to address zero. Therefore there are cases
>> where a program may have a pointer to address zero and that pointer
>> could be legitimate.
>>
>> Hence the question: how should a language support access to address
>> zero? Any ideas?
>
> If the hardware allows a meaningful dereference to address 0, and you
> need to have your HLL access that same location via a pointer, then you
> need to make it possible.

Noted.

>
> However if zero is also used for a null pointer value, so that for
> example P=null means that P has not been assigned to anything, then that
> might interfere with that.
>
> Then you might look at using an alternate representation for a null
> pointer value.
>
> Personally, I'd just make the first few bytes of memory special. Make
> sure address 0 never occurs as a heap allocation, and rarely comes up as
> the address of an object in the HLL.

There are two issues with that.

First, dereferences of address zero can be /automatically/ checked for
validity by the hardware if one is using paging but if paging is not
enabled then address zero cannot be checked automatically and, crucially
for this thread, that would mean that the same executable would run
differently on the two systems - which I want to avoid.

By contrast, and, again, with a given executable, an address /above/ the
program's accessible address space would trigger an exception whether
paging was enabled or not. IOW if I use a high address rather than a low
one then my compiled code can omit checks for null and would work the
same way in both environments.

Second, there are CPUs which, rather unwelcomely, use signed addresses.
For them, a 16-bit address, say, will use the range -32768 to 32767 and
address zero could easily be part of the heap.

>
> My latest language has  'nil' value for pointers (you can't use 0),
> whose value is not specified. But it is generally understood it is all
> zeros.

That sounds good. A minor point but why use the keyword nil rather than
null?

>
> That means that data structures that exist in the zero-data segment
> (BSS?) will be guaranteed to have any embedded pointers set to nil.

The issue with that is that it could run into problems if nil ever
happened to be something other than zero, couldn't it?
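
In C terms, at least, the distinction looks like the sketch below. This is an illustration in C, not a description of Bart's language, and struct item and item_reset() are invented names. C initialises pointers in static storage to a null pointer whatever its bit pattern, whereas memset() or calloc() only produce all-bits-zero, which the C standard does not guarantee to be a null pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure with an embedded pointer. */
struct item {
    int value;
    struct item *next;
};

/* Static storage: the language sets 'next' to a null pointer,
   even on an implementation where null is not all-bits-zero. */
struct item static_item;

/* Portable reset: assign NULL to the member explicitly instead of
   zeroing the bytes with memset(). */
void item_reset(struct item *it)
{
    assert(it != NULL);
    it->value = 0;
    it->next = NULL;
}
```

A zero-filled data segment gives the same guarantee only when the implementation's null really is all zeros.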

>
> Just stick something at address 0 that is not going to be dereferenced
> via a HLL pointer. But if you really need to access that location, then
> just do it.

--
James Harris

Re: Pointer to address zero

<sdi0ro$7lu$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=298&group=comp.lang.misc#298

From: bc@freeuk.com (Bart)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Sat, 24 Jul 2021 22:31:21 +0100
Organization: A noiseless patient Spider
Lines: 110
Message-ID: <sdi0ro$7lu$1@dont-email.me>
References: <sd71dn$pil$1@dont-email.me> <sda87i$kmn$1@dont-email.me>
<sdhjus$jm8$1@dont-email.me>
 by: Bart - Sat, 24 Jul 2021 21:31 UTC

On 24/07/2021 18:51, James Harris wrote:
> On 21/07/2021 23:48, Bart wrote:
>> On 20/07/2021 18:33, James Harris wrote:
>>> For my OS project I have been looking at program loading and that has
>>> led me to query what would be required in a language to support
>>> address zero being accessible and a pointer to it being considered to
>>> be valid.
>>>
>>> If I use paging I can reserve the lowest page so that address zero is
>>> inaccessible. A dereference of a zeroed pointer would be caught by
>>> the CPU triggering a fault.
>>>
>>> However, if on x86 I don't use paging then a reference to address
>>> zero would not trigger a fault so it would not be caught and diagnosed.
>>
>> How is that different from any other access to invalid memory? Or a
>> access of address 1 (or 8 if aligned)?
>
> I cannot quite work out what you are asking. Could you say more?

Suppose you are accessing a u16 value at address 0, occupying addresses
0 and 1. That access is illegal, but what about accessing the upper
byte separately, at address 1?

What I'm really saying is that there will be lots of memory addresses
that are not valid; why make address 0 special compared with any of
those (including address 1) when trying to detect an illegal access?

The difference is that an arbitrary address is usually due to some bug,
while address 0 can be deliberately stored in a pointer.

The purpose may be for the software to check that a pointer is in use; I
don't think it's that critical to have a runtime or hardware check for
accessing address zero. But it would be useful while debugging.

>> Personally, I'd just make the first few bytes of memory special. Make
>> sure address 0 never occurs as a heap allocation, and rarely comes up
>> as the address of an object in the HLL.
>
> There are two issues with that.
>
> First, dereferences of address zero can be /automatically/ checked for
> validity by the hardware if one is using paging but if paging is not
> enabled then address zero cannot be checked automatically and, crucially
> for this thread, that would mean that the same executable would run
> differently on the two systems - which I want to avoid.
>
> By contrast, and, again, with a given executable, an address /above/ the
> program's accessible address space would trigger an exception whether
> paging was enabled or not.

An address can be within the program's data space, but can still be
wrong: pointing at the wrong object, or inside an object, or spanning
two objects, or at some unused gap.

This is the same point I made above really. While a pointer value of
null can be easy to check even in software, an invalid one is harder.

But this also depends on the language: how easy is it for a user
program to allow some random number to be stored in a pointer? A
lower-level one like C makes it very easy. (Or like mine, but it makes
it a little bit harder!)

> IOW if I use a high address rather than a low
> one then my compiled code can omit checks for null and would work the
> same way in both environments.
>
> Second, there are CPUs which, rather unwelcomely, use signed addresses.
> For them, a 16-bit address, say, will use the range -32768 to 32767 and
> address zero could easily be part of the heap.

Which ones are those? (So I can make a note to never use them!)

A language could take care of that aspect (so programs see an address
space of 0 to 65535) but that comes at a cost.
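
A rough sketch of what that could look like, assuming a hypothetical
16-bit machine whose hardware addresses run from -32768 to 32767. The
language exposes logical addresses 0 to 65535 and applies a fixed bias
on each conversion; the function names are made up for illustration.
The cost is the extra arithmetic wherever a logical address is turned
into a hardware one or back.

```c
#include <stdint.h>

/* Hypothetical model: logical addresses 0..65535 as seen by the program,
   hardware addresses -32768..32767 as seen by the CPU. */
static int16_t logical_to_hw(uint16_t logical)
{
    /* Subtract the bias in a wider type so nothing overflows. */
    return (int16_t)((int32_t)logical - 32768);
}

static uint16_t hw_to_logical(int16_t hw)
{
    /* Add the same bias back to recover the logical address. */
    return (uint16_t)((int32_t)hw + 32768);
}
```

With this mapping, logical address 0 is the lowest hardware address, so
the hardware's own address 0 appears in the middle of the logical range
(at 32768).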

However, remember that C's pandering to weird hardware that no one is
ever going to encounter in real life is probably the cause of half of
its UBs.

>
>>
>> My latest language has  'nil' value for pointers (you can't use 0),
>> whose value is not specified. But it is generally understood it is all
>> zeros.
>
> That sounds good. A minor point but why use the keyword nil rather than
> null?

I think that was copied from Pascal which used 'nil'. I didn't encounter
C until over a decade later.

>>
>> That means that data structures that exist in the zero-data segment
>> (BSS?) will be guaranteed to have any embedded pointers set to nil.
>
> The issue with that is that it could run into problems if nil ever
> happened to be something other than zero, couldn't it?

That's why you should strive to have null as all zeros if possible. In
the same way, you try to have all zeros for integer 0 and float 0.0 too.
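
A small C sketch of the point, assuming the common case of a platform
where null pointers, integer 0 and 0.0 are all represented as
all-bits-zero. The struct and function names are illustrative only.

```c
#include <string.h>

struct node {
    struct node *next;
    int          count;
    double       weight;
};

/* Static storage: C guarantees next is a null pointer, count is 0 and
   weight is 0.0, whatever bit patterns those values have. */
static struct node in_bss;

/* Returns 1 if a byte-zeroed struct node compares equal, field by field,
   to the statically initialised one. That holds only where null, 0 and
   0.0 are all-bits-zero, which is the assumption made here. */
static int zero_fill_matches_static(void)
{
    struct node z;
    memset(&z, 0, sizeof z);
    return z.next == in_bss.next
        && z.count == in_bss.count
        && z.weight == in_bss.weight;
}
```

On a platform where nil was not all zeros, the static object would still
hold a proper null pointer, but the memset (or calloc) version would not.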

And I still, now, when creating sets of enums, often arrange for the
first to have a value of 0, meaning no-value or not-set, so that when it
is used as a tag (in a manually tagged union), zeroed data won't have
erroneous values. (As might happen if enum 0 means another field is
expected to have a certain set-up.)
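
As a sketch of that convention in C (the type and enumerator names are
invented for the example), the first enumerator is reserved for
"not set", so a zero-filled record never claims to carry a payload:

```c
#include <string.h>

enum tag { TAG_NONE = 0, TAG_INT, TAG_TEXT };  /* 0 means "not set" */

struct value {
    enum tag tag;
    union {
        int         i;
        const char *text;
    } u;
};

/* Returns a printable name for the tag. Zeroed data lands in the
   TAG_NONE case instead of being mistaken for a live payload. */
static const char *tag_name(const struct value *v)
{
    switch (v->tag) {
    case TAG_NONE: return "none";
    case TAG_INT:  return "int";
    case TAG_TEXT: return "text";
    }
    return "corrupt";
}

/* Builds a value the way zeroed memory would present it. */
static struct value zeroed_value(void)
{
    struct value v;
    memset(&v, 0, sizeof v);
    return v;
}
```

Had TAG_TEXT been given the value 0 instead, a zeroed record would look
like a text value whose pointer is null.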

Re: Pointer to address zero

<sdi8cv$jpj$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=299&group=comp.lang.misc#299

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: bc@freeuk.com (Bart)
Newsgroups: comp.lang.misc
Subject: Re: Pointer to address zero
Date: Sun, 25 Jul 2021 00:40:00 +0100
Organization: A noiseless patient Spider
Lines: 83
Message-ID: <sdi8cv$jpj$1@dont-email.me>
References: <sd71dn$pil$1@dont-email.me> <sd731k$1b50$1@gioia.aioe.org>
<sd7cck$btm$1@dont-email.me> <sd9jbm$efc$1@gioia.aioe.org>
<sda39h$kr7$1@dont-email.me> <sdap3d$1llt$2@gioia.aioe.org>
<sde6tv$mdt$1@dont-email.me> <sdgs19$19q0$1@gioia.aioe.org>
<sdh2o6$rst$1@dont-email.me> <sdh4jt$82o$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 24 Jul 2021 23:40:15 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="2db3e2bad5fe81ab2c04ae0af444b35b";
logging-data="20275"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19RLaBWMRursOecZAnhPFQmx3SCgoxNVsc="
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:pAntAF4apPLMNk8+vFLnqfRLwrs=
In-Reply-To: <sdh4jt$82o$1@dont-email.me>
X-Antivirus-Status: Clean
Content-Language: en-GB
X-Antivirus: AVG (VPS 210724-4, 24/7/2021), Outbound message
 by: Bart - Sat, 24 Jul 2021 23:40 UTC

On 24/07/2021 14:29, David Brown wrote:
> On 24/07/2021 14:57, Bart wrote:

>> Actually, how would you code a linked list without using some sentinel
>> to mark the end of the list?
>>
>> The use of NULL like this is EVERYWHERE and in every API.
>>
>> For example the return value of C's fopen() function is NULL when the
>> operation failed.
>>
>
> It is useful for a language to have a concept of pointers or references
> that are guaranteed valid - then you don't need to check them like this.

How does that work for my linked list example?
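
For concreteness, a minimal C sketch of the linked-list case in
question, with NULL as the end-of-list sentinel. The names are
illustrative only.

```c
#include <stddef.h>

struct item {
    int          value;
    struct item *next;   /* NULL marks the end of the list */
};

/* Walks the list until the NULL sentinel and returns the item count. */
static size_t list_length(const struct item *head)
{
    size_t n = 0;
    for (const struct item *p = head; p != NULL; p = p->next)
        n++;
    return n;
}
```

An empty list is just a NULL head, so the same test covers both "no
list" and "end of list".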

> It is also useful for a language to have a concept of "optional" values.
> That is, a way of saying "x is either NULL or a valid value of type T".
> Sometimes this is so useful that it applies to all types (like SQL).
> Sometimes you force the programmer to do this manually, like in C using
> "struct maybe_int { bool valid; int x; };". Sometimes you make it a
> convenient part of the standard library, like C++ "std::optional<int>".
> Sometimes you have it through summation or algebraic types fully
> supported in the language, like Haskell "data Maybeint = Invalid | Int".

So how does Haskell distinguish, internally, between an Int value, and
an Invalid type?

> There is no simple and efficient way to do this for simple types - for
> an integer, you either have to sacrifice a valid integer value, or you
> have to add an extra boolean flag to go with it.

Exactly.

> Sacrificing a value
> makes your arithmetic coding a lot more complicated and inefficient.
> But for pointers, sacrificing a value to use as an "invalid" indicator
> is cheap and easy, and the gains are certainly worth it. The most
> efficient "invalid" value to use is 0, since it is quick and easy to
> test. An alternative worth considering is to use the highest bit to
> indicate invalid - cutting half your address space is often not a
> problem, you can use a pointer to address 0, and you can represent many
> different invalid values.

The interesting invalid values are the ones in your 'valid' range. Just
randomly pointing somewhere in your memory doesn't mean the pointer is
any good!

>
> So IMHO it makes sense for a language to support /both/ concepts of an
> always valid pointer, and of optionally valid pointers.
>
> I'd also suggest pointers being viewed as a lot more general than just
> holding an address, allowing for references, weak references, shared
> pointers, and other ways of referring to objects.

I'd rather keep them as simple as possible, and preferably use them as
little as possible too.

(I implement some of those things, but transparently so you aren't even
aware of pointers being used.)

> I'd also avoid "if (p != NULL)", and prefer "if (p)" or "if (valid(p))",
> making it clearer that you are checking for the validity of the pointer

What pointer? Because if you write 'if (X)', then X can be anything.
Write 'if (X==NULL)', and you can assume (in C) or know (in mine) that X
is a pointer.

> rather than for it happening to match a particular value. (In C, an
> implementation can have more than one null pointer, and "if (p != NULL)"
> actually checks for any of them - something that is not apparent from
> the syntax.)

Never heard of that. And it also sounds a nightmare to implement. How
many kinds of pointer are we talking about? Which version of NULL do you
get when you do p = NULL?
