Re: C vs Haskell for XML parsing

<1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29520&group=comp.lang.c#29520

X-Received: by 2002:a05:620a:c4c:b0:765:a4f2:51ec with SMTP id u12-20020a05620a0c4c00b00765a4f251ecmr212445qki.4.1692802119388;
Wed, 23 Aug 2023 07:48:39 -0700 (PDT)
X-Received: by 2002:a17:903:22c1:b0:1bd:df9a:4fbd with SMTP id
y1-20020a17090322c100b001bddf9a4fbdmr5538375plg.11.1692802119087; Wed, 23 Aug
2023 07:48:39 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Wed, 23 Aug 2023 07:48:38 -0700 (PDT)
In-Reply-To: <uc4e4t$2rdlt$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2a00:23a8:400a:5601:2c26:aadb:e9db:4185;
posting-account=Dz2zqgkAAADlK5MFu78bw3ab-BRFV4Qn
NNTP-Posting-Host: 2a00:23a8:400a:5601:2c26:aadb:e9db:4185
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com>
<ubi7hd$38q7d$1@dont-email.me> <87o7j6fu74.fsf@bsb.me.uk> <37f1a926-972c-42c8-a276-8d3f6457ccb8n@googlegroups.com>
<877cptgbli.fsf@bsb.me.uk> <250cc72c-f682-4986-96bd-80011967c8dbn@googlegroups.com>
<87o7j4vt6r.fsf@bsb.me.uk> <cb35076d-f8ec-441c-a963-7077bd5f884cn@googlegroups.com>
<87jztqvhwf.fsf@bsb.me.uk> <7f9fbbd6-7f5c-4e12-a73b-c9abe91b7f5bn@googlegroups.com>
<87jztpu2iu.fsf@bsb.me.uk> <610a41a0-a3a3-4e01-a9a7-8b5e1fe31ec0n@googlegroups.com>
<87350dtive.fsf@bsb.me.uk> <ubvan6$1rb3s$1@dont-email.me> <3c87ec37-8fe1-4171-9500-609fad6701b7n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me> <e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com>
<uc28id$2dc7f$1@dont-email.me> <b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com>
<uc2dbv$2e4tg$1@dont-email.me> <d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me> <d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com>
<uc4e4t$2rdlt$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>
Subject: Re: C vs Haskell for XML parsing
From: malcolm.arthur.mclean@gmail.com (Malcolm McLean)
Injection-Date: Wed, 23 Aug 2023 14:48:39 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 7614

by: Malcolm McLean - Wed, 23 Aug 2023 14:48 UTC

On Wednesday, 23 August 2023 at 08:57:32 UTC+1, David Brown wrote:
> On 23/08/2023 06:59, Malcolm McLean wrote:
>
> All I can tell you here is that /I/ would not expect a "determinant"
> function to trash the matrix I pass to it. If /I/ were writing a
> determinant function, I would do so in a way that did not trash the
> caller's data, and did not take noticeably longer. And if I were
> convinced that some operations, such as this one, were significantly
> more efficient without copying, and destruction was fine for a
> significant proportion of use-cases, I'd make an explicit
> "determinant_destructive" version. The non-destructive version would,
> of course, take a const pointer.
I don't need the matrix after taking the determinant. And at the moment, I
don't have another use for the function. There's a trade-off between speed
and good interfaces, and I might change it to retain the matrix. There might
also be a faster way of obtaining the information I need. My own
mathematical ability is also a constraint.
>
> In no circumstances would I make a function that left the caller's data
> "corrupt" or "indeterminate".
>
> (Oh, and I'd write it in C++ to give a much better user API here. C's
> great for some things - but it's not the best choice for everything.)
>
No, it's far better in C. In C++ it would pull in a "Matrix" class, and then
it's totally unusable in another program unless someone wants to take
that entire apparatus.
>
> > No, it's a known problem.
> I note you haven't actually looked at references for it. It is a known
> /consideration/ - not a known /problem/. When you hit a row with a zero
> on the diagonal, you simply swap it with a row further down that does
> not have zero in that column. Swapping rows multiplies the determinant
> by -1. If there is no such row, the determinant is 0. (In fact, it is
> common to swap rows for Gaussian elimination anyway to improve numerical
> stability. But that is definitely off-topic here.)
>
My matrix will occasionally have a zero in the top left-hand cell. It's a Sylvester
matrix giving the coefficients of two simultaneous equations, and the
determinant represents the product of the differences of the roots, times
a factor based on the highest coefficients. If the determinant is zero, or so close
to zero that it represents machine precision problems, then the equations
have a root in common, which is what I'm interested in. However, because of
the factor based on the highest coefficients, it's useless to me if the top left
is zero. However I have to do something.
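
For readers who have not met the construction, here is a minimal sketch of filling a Sylvester matrix from two coefficient arrays. The function name, the row-major layout and the "highest power first" convention are assumptions made for illustration, not taken from the code under discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: p has degree dp with coefficients p[0..dp], highest
   power first; q likewise with degree dq. out must hold (dp+dq)*(dp+dq)
   doubles and is filled row-major. */
void sylvester_fill(const double *p, int dp, const double *q, int dq,
                    double *out)
{
    int n = dp + dq;

    assert(p != NULL && q != NULL && out != NULL && dp > 0 && dq > 0);
    for (int i = 0; i < n * n; i++)
        out[i] = 0.0;
    for (int r = 0; r < dq; r++)            /* dq shifted copies of p */
        for (int k = 0; k <= dp; k++)
            out[r * n + r + k] = p[k];
    for (int r = 0; r < dp; r++)            /* dp shifted copies of q */
        for (int k = 0; k <= dq; k++)
            out[(dq + r) * n + r + k] = q[k];
}
```

With this layout out[0] is the leading coefficient of p, so a zero leading coefficient puts a zero in the top left-hand cell. For p = x^2 - 3x + 2 and q = x - 1, which share the root 1, the 3x3 matrix has determinant zero.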
>
> Do you really believe you are making a sound argument here? Or do you
> realise that you are conflating completely different concepts? I'm
> trying to think of a single example in this thread where you have
> actually addressed the question, and actually justified your decisions.
> But it's just a field of straw men fishing for red herrings.
>
If you provide a boolean type then you encourage people to write functions
that take boolean parameters. Of course intelligent people like you and I
can see, as soon as the problem is pointed out, that this isn't a good idea,
and will have the self-discipline to write enums. But a lot of people won't.
So it's better not to have a boolean type. Of course they can still pass
"1" or "0" in an int, but at least this isn't being actively encouraged.
> > So bool is pretty useless and we're better off without it.
> No, bool is pretty useful and we are better off having it.
>
It's useful where a function returns a boolean true / false value whose
meaning is obvious from the function definition. And it is useful in
marking fields of structures which are logically true / false (though here
it can be a rather extravagant use of memory and you might be better
off with bitfields).
But these are minor aids to readability, whilst boolean parameters are
major impediments to readability.
Then there's the if (value == true) problem.
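
A minimal sketch of what the comparison does, assuming the flag is held in an int in one case and in a bool in the other. The function names are made up for illustration.

```c
#include <stdbool.h>

/* An int flag may hold any non-zero value for "true"; only 1 equals true. */
int int_equals_true(int value)
{
    return value == true;
}

/* Conversion to bool turns any non-zero value into 1 before the compare. */
int bool_equals_true(bool value)
{
    return value == true;
}

/* A plain truth test behaves the same way for either representation. */
int is_set(int value)
{
    return value ? 1 : 0;
}
```

Passing 2 gives 0 from int_equals_true but 1 from bool_equals_true and is_set, so the comparison only misbehaves when the flag is not actually a bool.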
>
> Are you /seriously/ suggesting that "readfilefromdisk" is easy to read?
> Better than, say, "read_file_from_disk" or "readFileFromDisk" ? (Not
> that I think the camel-case version is particularly easy to read here -
> but it is a world better than your choice of jumble.)
>
In programs, yes. Because you have many identifiers all jostling for attention.
As a single word in flowing lowercase text, the decorated version is
maybe easier to read.
> > In fact, as you
> > are obviously not aware, scriptio continua was the norm for ancient manuscripts.
> I am entirely aware of that. I have not studied such things
> academically, but I have a far above average interest in history and
> writing systems. Are you aware of /why/ ancient manuscripts (and other
> old writing) were regularly written without spacing? I'll give you a
> clue - it was /not/ in order to make the text easier to read.
>
I'm not arguing that text without spaces is easier to read than text with
spaces. However spaces in C identifiers are not allowed. The fact that
for many hundreds of years people wrote text without spaces tells you
that, though it's less readable than text with spaces, it's not that difficult
to read.
>
> > A const void * is not an opaque pointer. We can say something about how the
> > called function will handle the data it points to.
> Yes - that's a good thing.
But it's not an opaque pointer any more.

Re: C vs Haskell for XML parsing

<uc5783$2vd55$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29521&group=comp.lang.c#29521

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: bc@freeuk.com (Bart)
Newsgroups: comp.lang.c
Subject: Re: C vs Haskell for XML parsing
Date: Wed, 23 Aug 2023 16:05:39 +0100
Organization: A noiseless patient Spider
Lines: 19
Message-ID: <uc5783$2vd55$1@dont-email.me>
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com>
<877cptgbli.fsf@bsb.me.uk>
<250cc72c-f682-4986-96bd-80011967c8dbn@googlegroups.com>
<87o7j4vt6r.fsf@bsb.me.uk>
<cb35076d-f8ec-441c-a963-7077bd5f884cn@googlegroups.com>
<87jztqvhwf.fsf@bsb.me.uk>
<7f9fbbd6-7f5c-4e12-a73b-c9abe91b7f5bn@googlegroups.com>
<87jztpu2iu.fsf@bsb.me.uk>
<610a41a0-a3a3-4e01-a9a7-8b5e1fe31ec0n@googlegroups.com>
<87350dtive.fsf@bsb.me.uk> <ubvan6$1rb3s$1@dont-email.me>
<3c87ec37-8fe1-4171-9500-609fad6701b7n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me>
<e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com>
<uc28id$2dc7f$1@dont-email.me>
<b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com>
<uc2dbv$2e4tg$1@dont-email.me>
<d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me>
<d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com>
<uc4e4t$2rdlt$1@dont-email.me>
<1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 23 Aug 2023 15:05:39 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="021d6a6f9d48c197b9a50b52f6befac6";
logging-data="3126437"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19oE5gF9yvs2ljjNhoBt88dLMdZD+gPyCc="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.14.0
Cancel-Lock: sha1:TsoSDamsp45A83eSbxs7cZIqu3k=
In-Reply-To: <1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>

by: Bart - Wed, 23 Aug 2023 15:05 UTC

On 23/08/2023 15:48, Malcolm McLean wrote:
> On Wednesday, 23 August 2023 at 08:57:32 UTC+1, David Brown wrote:
>> On 23/08/2023 06:59, Malcolm McLean wrote:

> I'm not arguing that text without spaces is easier to read than text with
> spaces. However spaces in C identifiers are not allowed. The fact that
> for many hundreds of years people wrote text without spaces

Where did they do that? I guess you're not talking about programming syntax.

> tells you
> that, though it's less readable than text with spaces, it's not that difficult
> to read.

It can sometimes lead to ambiguities when word boundaries are not marked.

Eg. #amazonshitcarshow

Re: C vs Haskell for XML parsing

<af0dbbdb-3f59-46e3-9d61-36c327a7d141n@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29522&group=comp.lang.c#29522

X-Received: by 2002:a05:6214:ab4:b0:649:fc3d:7659 with SMTP id ew20-20020a0562140ab400b00649fc3d7659mr143633qvb.12.1692804114177;
Wed, 23 Aug 2023 08:21:54 -0700 (PDT)
X-Received: by 2002:a05:6a00:1791:b0:68a:49bc:e0b3 with SMTP id
s17-20020a056a00179100b0068a49bce0b3mr4884728pfg.2.1692804113907; Wed, 23 Aug
2023 08:21:53 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Wed, 23 Aug 2023 08:21:53 -0700 (PDT)
In-Reply-To: <uc5783$2vd55$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2a00:23a8:400a:5601:1c5f:c8af:ef99:3315;
posting-account=Dz2zqgkAAADlK5MFu78bw3ab-BRFV4Qn
NNTP-Posting-Host: 2a00:23a8:400a:5601:1c5f:c8af:ef99:3315
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com>
<877cptgbli.fsf@bsb.me.uk> <250cc72c-f682-4986-96bd-80011967c8dbn@googlegroups.com>
<87o7j4vt6r.fsf@bsb.me.uk> <cb35076d-f8ec-441c-a963-7077bd5f884cn@googlegroups.com>
<87jztqvhwf.fsf@bsb.me.uk> <7f9fbbd6-7f5c-4e12-a73b-c9abe91b7f5bn@googlegroups.com>
<87jztpu2iu.fsf@bsb.me.uk> <610a41a0-a3a3-4e01-a9a7-8b5e1fe31ec0n@googlegroups.com>
<87350dtive.fsf@bsb.me.uk> <ubvan6$1rb3s$1@dont-email.me> <3c87ec37-8fe1-4171-9500-609fad6701b7n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me> <e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com>
<uc28id$2dc7f$1@dont-email.me> <b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com>
<uc2dbv$2e4tg$1@dont-email.me> <d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me> <d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com>
<uc4e4t$2rdlt$1@dont-email.me> <1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>
<uc5783$2vd55$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <af0dbbdb-3f59-46e3-9d61-36c327a7d141n@googlegroups.com>
Subject: Re: C vs Haskell for XML parsing
From: malcolm.arthur.mclean@gmail.com (Malcolm McLean)
Injection-Date: Wed, 23 Aug 2023 15:21:54 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 3467

by: Malcolm McLean - Wed, 23 Aug 2023 15:21 UTC

On Wednesday, 23 August 2023 at 16:05:55 UTC+1, Bart wrote:
> On 23/08/2023 15:48, Malcolm McLean wrote:
> > On Wednesday, 23 August 2023 at 08:57:32 UTC+1, David Brown wrote:
> >> On 23/08/2023 06:59, Malcolm McLean wrote:
>
> > I'm not arguing that text without spaces is easier to read than text with
> > spaces. However spaces in C identifiers are not allowed. The fact that
> > for many hundreds of years people wrote text without spaces
> Where did they do that? I guess you're not talking about programming syntax.
> > tells you
>
The modern system of separating words by spaces was only introduced in
about the seventh century, by Irish monks. The earliest classical texts we
have use dots or lines to separate words, but fairly soon that was dropped
and all the words were simply run together. That was in use for about 2,000
years.
> > that, though it's less readable than text with spaces, it's not that difficult
> > to read.
> It can sometimes lead to ambiguities when word boundaries are not marked.
>
> Eg. #amazonshitcarshow
>
Yes. Spaces are better. Although spoken language doesn't usually have pauses
between words.
The question is what to do for C identifiers, where spaces are not allowed. The
scriptio continua system was used for a very long time, so it must have at least
something going for it.

Re: C vs Haskell for XML parsing

<uc5dd0$30jrk$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29523&group=comp.lang.c#29523

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: C vs Haskell for XML parsing
Date: Wed, 23 Aug 2023 18:50:39 +0200
Organization: A noiseless patient Spider
Lines: 174
Message-ID: <uc5dd0$30jrk$1@dont-email.me>
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com>
<877cptgbli.fsf@bsb.me.uk>
<250cc72c-f682-4986-96bd-80011967c8dbn@googlegroups.com>
<87o7j4vt6r.fsf@bsb.me.uk>
<cb35076d-f8ec-441c-a963-7077bd5f884cn@googlegroups.com>
<87jztqvhwf.fsf@bsb.me.uk>
<7f9fbbd6-7f5c-4e12-a73b-c9abe91b7f5bn@googlegroups.com>
<87jztpu2iu.fsf@bsb.me.uk>
<610a41a0-a3a3-4e01-a9a7-8b5e1fe31ec0n@googlegroups.com>
<87350dtive.fsf@bsb.me.uk> <ubvan6$1rb3s$1@dont-email.me>
<3c87ec37-8fe1-4171-9500-609fad6701b7n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me>
<e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com>
<uc28id$2dc7f$1@dont-email.me>
<b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com>
<uc2dbv$2e4tg$1@dont-email.me>
<d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me>
<d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com>
<uc4e4t$2rdlt$1@dont-email.me>
<1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 23 Aug 2023 16:50:40 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="73d2f5c07fe5b58e515376363a9f9029";
logging-data="3166068"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19Nc6eIk9oQESUNTvgAKfSKaD3+yISfTPg="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.9.0
Cancel-Lock: sha1:aP4CkGUlf15DBgJoirjUfm3JM+A=
Content-Language: en-GB
In-Reply-To: <1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>

by: David Brown - Wed, 23 Aug 2023 16:50 UTC

On 23/08/2023 16:48, Malcolm McLean wrote:
> On Wednesday, 23 August 2023 at 08:57:32 UTC+1, David Brown wrote:
>> On 23/08/2023 06:59, Malcolm McLean wrote:
>>
>> All I can tell you here is that /I/ would not expect a "determinant"
>> function to trash the matrix I pass to it. If /I/ were writing a
>> determinant function, I would do so in a way that did not trash the
>> caller's data, and did not take noticeably longer. And if I were
>> convinced that some operations, such as this one, were significantly
>> more efficient without copying, and destruction was fine for a
>> significant proportion of use-cases, I'd make an explicit
>> "determinant_destructive" version. The non-destructive version would,
>> of course, take a const pointer.
> I don't need the matrix after taking the determinant. And at the moment, I
> don't have another use for the function. There's a trade-off between speed
> and good interfaces, and I might change it to retain the matrix. There might
> also be a faster way of obtaining the information I need. My own
> mathematical ability is also a constraint.

An API that unexpectedly trashes user data is a bad API. If you feel
that people will never need their matrices after finding the
determinant, and that this is the most efficient way of implementing the
function, then fair enough - but make it absolutely clear what you are
doing. One part of that is the documentation (as you have rightly
pointed out). Another is the name of the function - "determinant" is
not a good name for a destructive function. And another important aid
is to distinguish between functions that change their parameters and
functions that do not by using "const" pointers appropriately.

No matter how you look at it, "const" pointers are important.
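
A minimal sketch of the split described above, assuming a row-major n*n array of doubles. The names determinant and determinant_destructive follow the suggestion quoted earlier; the bodies and the NaN return on allocation failure are illustrative assumptions, not anyone's actual code.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>
#include <string.h>

/* Overwrites a while working: Gaussian elimination with row swaps. */
double determinant_destructive(double *a, int n)
{
    double det = 1.0;

    assert(a != NULL && n > 0);
    for (int col = 0; col < n; col++) {
        int pivot = col;    /* row at or below the diagonal, largest entry */
        for (int row = col + 1; row < n; row++)
            if (fabs(a[row * n + col]) > fabs(a[pivot * n + col]))
                pivot = row;
        if (a[pivot * n + col] == 0.0)
            return 0.0;     /* no usable row: the determinant is zero */
        if (pivot != col) {
            for (int k = 0; k < n; k++) {
                double tmp = a[col * n + k];
                a[col * n + k] = a[pivot * n + k];
                a[pivot * n + k] = tmp;
            }
            det = -det;     /* a row swap multiplies the determinant by -1 */
        }
        det *= a[col * n + col];
        for (int row = col + 1; row < n; row++) {
            double f = a[row * n + col] / a[col * n + col];
            for (int k = col; k < n; k++)
                a[row * n + k] -= f * a[col * n + k];
        }
    }
    return det;
}

/* The const pointer promises that the caller's matrix is left untouched. */
double determinant(const double *m, int n)
{
    size_t bytes;
    double *work;
    double det;

    assert(m != NULL && n > 0);
    bytes = (size_t)n * (size_t)n * sizeof *m;
    work = malloc(bytes);
    if (work == NULL)
        return NAN;         /* assumption: NaN signals allocation failure */
    memcpy(work, m, bytes);
    det = determinant_destructive(work, n);
    free(work);
    return det;
}
```

The copy costs O(n^2) against O(n^3) for the elimination itself, and a caller who really can spare the matrix still has the destructive entry point.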

>>
>> In no circumstances would I make a function that left the caller's data
>> "corrupt" or "indeterminate".
>>
>> (Oh, and I'd write it in C++ to give a much better user API here. C's
>> great for some things - but it's not the best choice for everything.)
>>
> No, it's far better in C. In C++ it would pull in a "Matrix" class, and then
> it's totally unusable in another program unless someone wants to take
> that entire apparatus.

If you don't know how to make good APIs in C++, ask down the hall in
comp.lang.c++. I /do/ know how, and I know your disagreement here is
based on misunderstanding, but I really don't want to go more off-topic
here. I'll just give you a hint - free-standing functions exist in C++,
just like in C, and are "pulled in" to exactly the same extent.

>>
>>> No, it's a known problem.
>> I note you haven't actually looked at references for it. It is a known
>> /consideration/ - not a known /problem/. When you hit a row with a zero
>> on the diagonal, you simply swap it with a row further down that does
>> not have zero in that column. Swapping rows multiplies the determinant
>> by -1. If there is no such row, the determinant is 0. (In fact, it is
>> common to swap rows for Gaussian elimination anyway to improve numerical
>> stability. But that is definitely off-topic here.)
>>
> My matrix will occasionally have a zero in the top left-hand cell. It's a Sylvester
> matrix giving the coefficients of two simultaneous equations, and the
> determinant represents the product of the differences of the roots, times
> a factor based on the highest coefficients. If the determinant is zero, or so close
> to zero that it represents machine precision problems, then the equations
> have a root in common, which is what I'm interested in. However, because of
> the factor based on the highest coefficients, it's useless to me if the top left
> is zero. However I have to do something.

Gaussian elimination is an O(n³) algorithm for calculating determinants.
It is not, I believe, the most efficient known algorithm - but to get
much better you need a much more complicated algorithm. The recursive
formula - your fall-back - is O(n!). That quickly gets unusable if your
matrices are more than perhaps 6 rows/columns.
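
To put rough numbers on that, here is a small sketch that counts multiplications only. It assumes cofactor expansion along a row with no reuse of sub-determinants, and a straightforward elimination loop that updates each remaining row from the pivot column onwards. Divisions and the final product of the diagonal are ignored, and both function names are made up for illustration.

```c
#include <assert.h>

/* Cofactor expansion: T(1) = 0, T(n) = n * (T(n-1) + 1). */
unsigned long long cofactor_multiplications(int n)
{
    unsigned long long t = 0;   /* a 1x1 determinant needs none */

    assert(n >= 1 && n <= 20);  /* larger n would overflow 64 bits */
    for (int k = 2; k <= n; k++)
        t = (unsigned long long)k * (t + 1);
    return t;
}

/* Elimination: sum over columns c of (n-1-c)*(n-c), which is (n^3 - n)/3. */
unsigned long long elimination_multiplications(int n)
{
    unsigned long long m = (unsigned long long)n;

    assert(n >= 1);
    return (m * m * m - m) / 3;
}
```

Under those assumptions a 6x6 matrix needs 1236 multiplications by expansion against 70 by elimination, and a 10x10 matrix needs 6235300 against 330.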

>>
>> Do you really believe you are making a sound argument here? Or do you
>> realise that you are conflating completely different concepts? I'm
>> trying to think of a single example in this thread where you have
>> actually addressed the question, and actually justified your decisions.
>> But it's just a field of straw men fishing for red herrings.
>>
> If you provide a boolean type then you encourage people to write functions
> that take boolean parameters.

We were talking about an API - /you/ pick the parameters.

And boolean parameters are fine, as long as their use is clear - just
like "int" parameters, and "char * " parameters, and all other
parameters. Booleans are not somehow magically complicated or
error-prone - /any/ generic type is equally error-prone in the face of
poor APIs or unclear information.
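
For reference, the enum alternative raised in the quoted text could look like the sketch below. The function, its name and the enum are hypothetical, chosen only to show how the call site reads.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical API, for illustration only. */
enum newline_mode { NEWLINE_KEEP, NEWLINE_STRIP };

/* Copies src into dst (capacity cap, always terminated), optionally
   dropping one trailing newline. Returns the length written. */
size_t copy_line(char *dst, size_t cap, const char *src,
                 enum newline_mode mode)
{
    size_t len;

    assert(dst != NULL && src != NULL && cap > 0);
    len = strlen(src);
    if (mode == NEWLINE_STRIP && len > 0 && src[len - 1] == '\n')
        len--;
    if (len >= cap)
        len = cap - 1;      /* truncate to fit, leaving room for '\0' */
    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

A call such as copy_line(buf, sizeof buf, line, NEWLINE_STRIP) names the choice at the call site, where a bool version would read copy_line(buf, sizeof buf, line, true). Whether that gain is worth an extra type is the point in dispute here.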

> Of course intelligent people like you and I
> can see, as soon as the problem is pointed out, that this isn't a good idea,
> and will have the self-discipline to write enums. But a lot of people won't.
> So it's better not to have a boolean type. Of course they can still pass
> "1" or "0" in an int, but at least this isn't being actively encouraged.
>>> So bool is pretty useless and we're better off without it.
>> No, bool is pretty useful and we are better off having it.
>>
> It's useful where a function returns a boolean true / false value whose
> meaning is obvious from the function definition. And it is useful in
> marking fields of structures which are logically true / false (though here
> it can be a rather extravagant use of memory and you might be better
> off with bitfields).
> But these are minor aids to readability, whilst boolean parameters are
> major impediments to readability.

Booleans can be significant aids to readability, and any duplicated
parameter type can be unclear - booleans are not different there.

> Then there's the if (value == true) problem.

No there is not. Stop making up nonsense.

>>
>> Are you /seriously/ suggesting that "readfilefromdisk" is easy to read?
>> Better than, say, "read_file_from_disk" or "readFileFromDisk" ? (Not
>> that I think the camel-case version is particularly easy to read here -
>> but it is a world better than your choice of jumble.)
>>
> In programs, yes.

Such an extraordinary claim requires extraordinary evidence. Would you
like to provide references to studies showing this?

And I note that - from a quick look at your babyxrc code - you regularly
use functions with names like "bbx_utf8_skip" as well as camel-case such
as "trailingBytesForUTF8". If you really thought that "bbxutf8skip" and
"trailingbytesforutf8" were easier to read, why did you not use them?

> Because you have many identifiers all jostling for attention.

If that's a problem, try a better layout for your code, putting white
space in appropriately to keep parts clearer.

> As a single word in flowing lowercase text, the decorated version is
> maybe easier to read.

What do you mean by "decorated" ?

>>> In fact, as you
>>> are obviously not aware, scriptio continua was the norm for ancient manuscripts.
>> I am entirely aware of that. I have not studied such things
>> academically, but I have a far above average interest in history and
>> writing systems. Are you aware of /why/ ancient manuscripts (and other
>> old writing) were regularly written without spacing? I'll give you a
>> clue - it was /not/ in order to make the text easier to read.
>>
> I'm not arguing that text without spaces is easier to read than text with
> spaces.

Yes you were. But if you've changed your mind, that's good. Sometimes
a hypocrite is just someone who is learning.

> However spaces in C identifiers are not allowed. The fact that
> for many hundreds of years people wrote text without spaces tells you
> that, though it's less readable than text with spaces, it's not that difficult
> to read.

OK, so you /don't/ know why some kinds of manuscripts were written
without spaces (or other inter-word symbols).

>>
>>> A const void * is not an opaque pointer. We can say something about how the
>>> called function will handle the data it points to.
>> Yes - that's a good thing.
> But it's not an opaque pointer any more.

Yes, it is still an opaque pointer (that's a bad thing in general), and
yes, we know something about it (that's a good thing). It is now an
opaque pointer to data that is not changed by the function using it.
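
A minimal sketch of that combination, assuming a generic search routine in the style of the standard library's qsort and bsearch. All names here are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*match_fn)(const void *elem, const void *ctx);

/* The search never learns the element type (the pointers are opaque to
   it), but the const qualifiers promise it will not modify the data. */
const void *find_first(const void *base, size_t count, size_t size,
                       match_fn match, const void *ctx)
{
    const unsigned char *p = base;

    assert(match != NULL && size > 0);
    for (size_t i = 0; i < count; i++, p += size)
        if (match(p, ctx))
            return p;
    return NULL;
}

/* Only the callback knows the real types behind the pointers. */
int int_greater_than(const void *elem, const void *ctx)
{
    return *(const int *)elem > *(const int *)ctx;
}
```

With an array {1, 5, 9} and a limit of 4, find_first returns a pointer to the 5. Trying to write through p inside find_first would be rejected by the compiler.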

Re: C vs Haskell for XML parsing

<uc5foc$311jt$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29525&group=comp.lang.c#29525

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: C vs Haskell for XML parsing
Date: Wed, 23 Aug 2023 19:30:51 +0200
Organization: A noiseless patient Spider
Lines: 79
Message-ID: <uc5foc$311jt$1@dont-email.me>
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com>
<87o7j4vt6r.fsf@bsb.me.uk>
<cb35076d-f8ec-441c-a963-7077bd5f884cn@googlegroups.com>
<87jztqvhwf.fsf@bsb.me.uk>
<7f9fbbd6-7f5c-4e12-a73b-c9abe91b7f5bn@googlegroups.com>
<87jztpu2iu.fsf@bsb.me.uk>
<610a41a0-a3a3-4e01-a9a7-8b5e1fe31ec0n@googlegroups.com>
<87350dtive.fsf@bsb.me.uk> <ubvan6$1rb3s$1@dont-email.me>
<3c87ec37-8fe1-4171-9500-609fad6701b7n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me>
<e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com>
<uc28id$2dc7f$1@dont-email.me>
<b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com>
<uc2dbv$2e4tg$1@dont-email.me>
<d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me>
<d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com>
<uc4e4t$2rdlt$1@dont-email.me>
<1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>
<uc5783$2vd55$1@dont-email.me>
<af0dbbdb-3f59-46e3-9d61-36c327a7d141n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 23 Aug 2023 17:30:52 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="73d2f5c07fe5b58e515376363a9f9029";
logging-data="3180157"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/v0iazcUIUhrmgd/cd2t39w84km+4etWY="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.9.0
Cancel-Lock: sha1:VLaL7KEFQbe7HPyZfxlsAB/omW0=
Content-Language: en-GB
In-Reply-To: <af0dbbdb-3f59-46e3-9d61-36c327a7d141n@googlegroups.com>

by: David Brown - Wed, 23 Aug 2023 17:30 UTC

On 23/08/2023 17:21, Malcolm McLean wrote:
> On Wednesday, 23 August 2023 at 16:05:55 UTC+1, Bart wrote:
>> On 23/08/2023 15:48, Malcolm McLean wrote:
>>> On Wednesday, 23 August 2023 at 08:57:32 UTC+1, David Brown wrote:
>>>> On 23/08/2023 06:59, Malcolm McLean wrote:
>>
>>> I'm not arguing that text without spaces is easier to read than text with
>>> spaces. However spaces in C identifiers are not allowed. The fact that
>>> for many hundreds of years people wrote text without spaces
>> Where did they do that? I guess you're not talking about programming syntax.
>>> tells you
>>
> The modern system of separating words by spaces was only introduced in
> about the seventh century, by Irish monks.

Not true. Earlier writing systems used a variety of word separation
systems, going back long before Greek or Latin manuscripts.

But there was a period in Greek and Latin writing where unseparated
writing was common. However, it is important to note that such writing
was produced by different people (not the authors, but scribes -
typically slaves who did not attempt to understand the text but simply
transcribed verbatim). The texts were written for a different purpose - as
memory aids to speeches (usually given by the same author), not as texts
to be read regularly. Making the text hard to read was an advantage,
because it hindered competitors and made reading more exclusive. And
saving on paper or writing material was an advantage due to the cost.

When readability was important, interpuncts were often used in Latin.
However, "scriptio continua" was common enough that it was also used in
books (again, saving a little space might have been a factor). It was
only later that Europeans began to use writing as something other than a
recording of speech that it became important to make text more readable
- and word separation made an enormous difference. It turned writing
into a way to spread knowledge, rather than just a memory aid. And it
was perhaps one of the reasons Europe was gradually able to catch up
with the Islamic world - the Arabic script was much easier to read
because the word divisions were clear from the shape of the letters.

So yes, there was a period where manuscripts in Europe were often
written without inter-word divisions. It was not a good thing, and
readability hugely improved when it went out of fashion.

> The earliest classical texts we
> have use dots or lines to separate words, but fairly soon that was dropped
> and all the words were simply run together. That was in use for about 2,000
> years.

Well, closer to about 1000 years - and it was not universally practised.

Humorism - the "theory" that sickness was due to an imbalance of the
four humors and should be cured by blood-letting - held sway for about
2000 years. That does not make it a good idea.

>>> that, though it's less readable than text with spaces, it's not that difficult
>>> to read.
>> It can sometimes lead to ambiguities when word boundaries are not marked.
>>
>> Eg. #amazonshitcarshow
>>
> Yes. Spaces are better. Although spoken language doesn't usually have pauses
> between words.

Yes, it does. The pauses are often very small, but they are very significant
and easily distinguishable. If you speak a synthetic language
(i.e., a language where words are often combined into longer words), it
is entirely clear when you have separate words or a combined word,
because there is a pause in the sound. For example, in Norwegian
"dyrlege" is a vet, while "dyr lege" is an expensive doctor - Norwegian
speakers will hear the difference.

> The question is what to do for C identifiers, where spaces are not allowed. The
> scriptio continua system was used for a very long time, so it must have at least
> something going for it.

Re: C vs Haskell for XML parsing

<81879984-43e7-409a-a029-1ca6677f536dn@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29528&group=comp.lang.c#29528

X-Received: by 2002:a05:622a:198f:b0:40c:8ba5:33e6 with SMTP id u15-20020a05622a198f00b0040c8ba533e6mr138367qtc.6.1692813001082;
Wed, 23 Aug 2023 10:50:01 -0700 (PDT)
X-Received: by 2002:a17:903:22c1:b0:1b7:c803:4818 with SMTP id
y1-20020a17090322c100b001b7c8034818mr6273982plg.0.1692813000584; Wed, 23 Aug
2023 10:50:00 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Wed, 23 Aug 2023 10:49:59 -0700 (PDT)
In-Reply-To: <uc5dd0$30jrk$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <81879984-43e7-409a-a029-1ca6677f536dn@googlegroups.com>
Subject: Re: C vs Haskell for XML parsing
From: malcolm.arthur.mclean@gmail.com (Malcolm McLean)
Injection-Date: Wed, 23 Aug 2023 17:50:01 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

by: Malcolm McLean - Wed, 23 Aug 2023 17:49 UTC

On Wednesday, 23 August 2023 at 17:50:54 UTC+1, David Brown wrote:
> On 23/08/2023 16:48, Malcolm McLean wrote:
> > On Wednesday, 23 August 2023 at 08:57:32 UTC+1, David Brown wrote:
> >> On 23/08/2023 06:59, Malcolm McLean wrote:
> >>
> >> All I can tell you here is that /I/ would not expect a "determinant"
> >> function to trash the matrix I pass to it. If /I/ were writing a
> >> determinant function, I would do so in a way that did not trash the
> >> caller's data, and did not take noticeably longer. And if I were
> >> convinced that some operations, such as this one, were significantly
> >> more efficient without copying, and destruction was fine for a
> >> significant proportion of use-cases, I'd make an explicit
> >> "determinant_destructive" version. The non-destructive version would,
> >> of course, take a const pointer.
> > I don't need the matrix after taking the determinant. And at the moment, I
> > don't have another use for the function. There's a trade-off between speed
> > and good interfaces, and I might change it to retain the matrix. There might
> > also be a faster way of obtaining the information I need. My own
> > mathematical ability is also a constraint.
> An API that unexpectedly trashes user data is a bad API. If you feel
> that people will never need their matrices after finding the
> determinant, and that this is the most efficient way of implementing the
> function, then fair enough - but make it absolutely clear what you are
> doing. One part of that is the documentation (as you have rightly
> pointed out). Another is the name of the function - "determinant" is
> not a good name for a destructive function. And another important aid
> is to distinguish between functions that change their parameters and
> functions that do not by using "const" pointers appropriately.
>
> No matter how you look at it, "const" pointers are important.
>
The function was written specially for the specific problem of the Sylvester
matrix. The first version was obviously the recursive version, since it
is easy to get right, and is adequate for testing the logic. That version
doesn't corrupt the matrix. So by your logic, it should have taken a const *.
But then I moved to Gaussian elimination. Now to keep the same signature,
I would have had to take a local copy, involving a memory allocation
(and, to be strict about it, a failure condition). But there was no need, because
all I need is the determinant. I can throw away the matrix after I have that.
So the best decision seemed to be not to introduce the unnecessary
inefficiency. However I'm fully aware that it makes the function less usable
for other purposes.
> >>
> >> In no circumstances would I make a function that left the caller's data
> >> "corrupt" or "indeterminate".
> >>
> >> (Oh, and I'd write it in C++ to give a much better user API here. C's
> >> great for some things - but it's not the best choice for everything.)
> >>
> > No, it's far better in C. In C++ it would pull in a "Matrix" class, and then
> > it's totally unusable in another program unless someone wants to take
> > that entire apparatus.
> If you don't know how to make good API's in C++, ask down the hall in
> comp.lang.c++. I /do/ know how, and I know your disagreement here is
> based on misunderstanding, but I really don't want to go more off-topic
> here. I'll just give you a hint - free-standing functions exist in C++,
> just like in C, and are "pulled in" to exactly the same extent.
>
You might. But my experience of C++ is that it is much harder to take a function
from a C++ program and integrate it into another program than it is to take
a function written in C and do the same thing.
>
> > My matrix will occasionally have a zero in the top left hand cell. (It's a Sylvester
> > matrix giving the coefficients of two simultaneous equations, and the
> > determinant represents the product of the differences of the roots, plus
> > a term based on the highest coefficients. If the determinant is zero or so close
> > to zero that it represents machine precision problems, then the equations
> > have a root in common, which is what I'm interested in. However because of
> > the factor based on the highest coefficients, it's useless to me if the top left
> > is zero. However I have to do something.)
> Gaussian elimination is an O(n³) algorithm for calculating determinants.
> It is not, I believe, the most efficient known algorithm - but to get
> much better you need a much more complicated algorithm. The recursive
> formula - your fall-back - is O(n!). That quickly gets unusable if your
> matrices are more than perhaps 6 rows/columns.
>
The matrices are 6x6 so the fallback, though slow, isn't catastrophic.
There might be a faster way of doing 6x6es, but I don't know it.
>
> >> Do you really believe you are making a sound argument here? Or do you
> >> realise that you are conflating completely different concepts? I'm
> >> trying to think of a single example in this thread where you have
> >> actually addressed the question, and actually justified your decisions.
> >> But it's just a field of straw men fishing for red herrings.
> >>
> > If you provide a boolean type then you encourage people to write functions
> > that take boolean parameters.
> We were talking about an API - /you/ pick the parameters.
>
> And boolean parameters are fine, as long as their use is clear - just
> like "int" parameters, and "char * " parameters, and all other
> parameters. Booleans are not somehow magically complicated or
> error-prone - /any/ generic type is equally error-prone in the face of
> poor API's or unclear information.
>
No they are not. drawPath(mypath, false) conveys no information. The
first parameter is obviously a path, but the second one could mean almost
anything. Functions that take boolean parameters are a bad idea, and
should be discouraged. And one way to do that is not to have a boolean
type, so people are not tempted to use it.
>
>
> Booleans can be significant aids to readability, and any duplicated
> parameter type can be unclear - booleans are not different there.
>
No. Take qsort(). It is a bit of a nuisance that the two middle parameters
are both size_ts, and it's hard to remember which is which. But when you
read a call to qsort in code, it will look something like this:
qsort(data, sizeof(ELEMENT), Nelements, compfunc);
So you can tell that this is a bug, without any other information.

>
> And I note that - from a quick look at your babyxrc code - you regularly
> use functions with names like "bbx_utf8_skip" as well as camel-case such
> as "trailingBytesForUTF8". If you really thought that "bbxutf8skip" and
> "trailingbytesforutf8" were easier to read, why did you not use them?
>
That code was written ten years ago. It is still solid. The tables were
copied from some reference implementation and I kept the original
identifiers.
I do use bbx_ as a prefix. That's because C doesn't have namespaces, so
the sequence bbx_ marks it as part of Baby X. But it's meant to be invisible,
I want the reader to filter it out.
> > Because you have many identifiers all jostling for attention.
> If that's a problem, try a better layout for your code, putting white
> space in appropriately to keep parts clearer.
Oh come on. You've been programming for long enough to know what
typical code looks like.
> > As a single word in flowing lowercase text, the decorated version is
> > maybe easier to read.
> What do you mean by "decorated" ?
camelCase or under_scores, decorated. continuoustext, undecorated.
>
> > However spaces in C identifiers are not allowed. The fact that
> > for many hundreds of years people wrote text without spaces tells you
> > that, though it's less readable than text with spaces, it's not that difficult
> > to read.
> OK, so you /don't/ know why some kinds of manuscripts were written
> without spaces (or other inter-word symbols).
>
It's not really known. Hebrew texts do have spaces. It's probably something to do
with different Semitic and Indo-European concepts of word building. (That also
explains why the Greeks changed the direction of the text, it's to do with the
word entering the right visual field and therefore brain hemisphere). But this
sort of thing is a bit speculative. Saving expensive manuscript may have been a
factor, but probably not the real reason.
>

Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>On Wednesday, 23 August 2023 at 17:50:54 UTC+1, David Brown wrote:

>> >>
>> >> (Oh, and I'd write it in C++ to give a much better user API here. C's
>> >> great for some things - but it's not the best choice for everything.)
>> >>
>> > No, it's far better in C. In C++ it would pull in a "Matrix" class, and then
>> > it's totally unusable in another program unless someone wants to take
>> > that entire apparatus.
>> If you don't know how to make good API's in C++, ask down the hall in
>> comp.lang.c++. I /do/ know how, and I know your disagreement here is
>> based on misunderstanding, but I really don't want to go more off-topic
>> here. I'll just give you a hint - free-standing functions exist in C++,
>> just like in C, and are "pulled in" to exactly the same extent.
>>
>You might. But my experience of C++ is that it is much harder to take a function
>from a C++ program and integrate it into another program than it is to take
>a function written in C and do the same thing.

That has not been my experience. I routinely call C++ functions from
C, from assembler and from Python - all in the same application.

Re: C vs Haskell for XML parsing

<uc5mlk$32gl3$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29531&group=comp.lang.c#29531

From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: C vs Haskell for XML parsing
Date: Wed, 23 Aug 2023 21:28:51 +0200
Organization: A noiseless patient Spider
Lines: 248
Message-ID: <uc5mlk$32gl3$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 23 Aug 2023 19:28:52 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="73d2f5c07fe5b58e515376363a9f9029";
logging-data="3228323"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/zdqLIweHSMta401y3oAunMv39G1dIKkA="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.9.0
Cancel-Lock: sha1:6sHhnumlgjzYjg4Zjbs1SBTtgD0=
Content-Language: en-GB
In-Reply-To: <81879984-43e7-409a-a029-1ca6677f536dn@googlegroups.com>

by: David Brown - Wed, 23 Aug 2023 19:28 UTC

On 23/08/2023 19:49, Malcolm McLean wrote:
> On Wednesday, 23 August 2023 at 17:50:54 UTC+1, David Brown wrote:
>> On 23/08/2023 16:48, Malcolm McLean wrote:
>>> On Wednesday, 23 August 2023 at 08:57:32 UTC+1, David Brown wrote:
>>>> On 23/08/2023 06:59, Malcolm McLean wrote:
>>>>
>>>> All I can tell you here is that /I/ would not expect a "determinant"
>>>> function to trash the matrix I pass to it. If /I/ were writing a
>>>> determinant function, I would do so in a way that did not trash the
>>>> caller's data, and did not take noticeably longer. And if I were
>>>> convinced that some operations, such as this one, were significantly
>>>> more efficient without copying, and destruction was fine for a
>>>> significant proportion of use-cases, I'd make an explicit
>>>> "determinant_destructive" version. The non-destructive version would,
>>>> of course, take a const pointer.
>>> I don't need the matrix after taking the determinant. And at the moment, I
>>> don't have another use for the function. There's a trade-off between speed
>>> and good interfaces, and I might change it to retain the matrix. There might
>>> also be a faster way of obtaining the information I need. My own
>>> mathematical ability is also a constraint.
>> An API that unexpectedly trashes user data is a bad API. If you feel
>> that people will never need their matrices after finding the
>> determinant, and that this is the most efficient way of implementing the
>> function, then fair enough - but make it absolutely clear what you are
>> doing. One part of that is the documentation (as you have rightly
>> pointed out). Another is the name of the function - "determinant" is
>> not a good name for a destructive function. And another important aid
>> is to distinguish between functions that change their parameters and
>> functions that do not by using "const" pointers appropriately.
>>
>> No matter how you look at it, "const" pointers are important.
>>
> The function was written specially for the specific problem of the Sylvester
> matrix. The first version was obviously the recursive version, since it
> is easy to get right, and is adequate for testing the logic. That version
> doesn't corrupt the matrix. So by your logic, it should have taken a const *.

My logic is that if the function /specification/ is for read-only access
to the data, it should be "const". If the /specification/ says that the
pointer parameter is for output data (or modified data), then it should
be non-const. There is very little call for a function whose
specification is that it will corrupt or destroy the input data - but if
that's the specification, then it should be non-const.

The declaration is part of the specification, not the implementation -
the implementation is irrelevant once the API specification is in place.
(You may, of course, have implementation details in mind when writing
the specification and declaration.)
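
To make the point about declarations concrete, here is a minimal sketch of a read-only determinant using the recursive cofactor expansion mentioned earlier in the thread. The name determinant_recursive, the row-major layout and the MAX_N bound are assumptions made for illustration; none of them come from the code under discussion.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_N 8  /* assumed upper bound so that minors fit on the stack */

/* Determinant by cofactor expansion along the first row.
   The matrix is n x n, stored row-major, and is only read,
   so the declaration says "const". (Hypothetical function.) */
double determinant_recursive(const double *m, size_t n)
{
    double minor[(MAX_N - 1) * (MAX_N - 1)];
    double det = 0.0;
    double sign = 1.0;

    assert(n >= 1 && n <= MAX_N);
    if (n == 1)
        return m[0];

    for (size_t col = 0; col < n; col++) {
        /* Build the (n-1) x (n-1) minor that omits row 0 and column col. */
        size_t k = 0;
        for (size_t r = 1; r < n; r++)
            for (size_t c = 0; c < n; c++)
                if (c != col)
                    minor[k++] = m[r * n + c];
        det += sign * m[col] * determinant_recursive(minor, n - 1);
        sign = -sign;
    }
    return det;
}
```

The const in the prototype is the promise to the caller. A later change of implementation would have to keep that promise or come with a new name and a new declaration.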

> But then I moved to Gaussian elimination. Now to keep the same signature,
> I would have had to take a local copy, involving a memory allocation
> (and, to be strict about it, a failure condition). But there was no need, because
> all I need is the determinant. I can throw away the matrix after I have that.
> So the best decision seemed to be not to introduce the unnecessary
> inefficiency. However I'm fully aware that it makes the function less usable
> for other purposes.
>>>>
>>>> In no circumstances would I make a function that left the caller's data
>>>> "corrupt" or "indeterminate".
>>>>
>>>> (Oh, and I'd write it in C++ to give a much better user API here. C's
>>>> great for some things - but it's not the best choice for everything.)
>>>>
>>> No, it's far better in C. In C++ it would pull in a "Matrix" class, and then
>>> it's totally unusable in another program unless someone wants to take
>>> that entire apparatus.
>> If you don't know how to make good API's in C++, ask down the hall in
>> comp.lang.c++. I /do/ know how, and I know your disagreement here is
>> based on misunderstanding, but I really don't want to go more off-topic
>> here. I'll just give you a hint - free-standing functions exist in C++,
>> just like in C, and are "pulled in" to exactly the same extent.
>>
> You might. But my experience of C++ is that it is much harder to take a function
> from a C++ program and integrate it into another program than it is to take
> a function written in C and do the same thing.

It's true that C++ tends to use more proper types, and this means more
of the code hangs together. This goes along with better APIs for things
like matrices (or at least, the possibility of writing better APIs) that
are more modular, and clearer and safer to use. But done well the
classes and their functions are perfectly usable in other programs -
better, often, than functions written in C, because they are often more
flexible.

>>
>>> My matrix will occasionally have a zero in the top left hand cell. (It's a Sylvester
>>> matrix giving the coefficients of two simultaneous equations, and the
>>> determinant represents the product of the differences of the roots, plus
>>> a term based on the highest coefficients. If the determinant is zero or so close
>>> to zero that it represents machine precision problems, then the equations
>>> have a root in common, which is what I'm interested in. However because of
>>> the factor based on the highest coefficients, it's useless to me if the top left
>>> is zero. However I have to do something.)
>> Gaussian elimination is an O(n³) algorithm for calculating determinants.
>> It is not, I believe, the most efficient known algorithm - but to get
>> much better you need a much more complicated algorithm. The recursive
>> formula - your fall-back - is O(n!). That quickly gets unusable if your
>> matrices are more than perhaps 6 rows/columns.
>>
> The matrices are 6x6 so the fallback, though slow, isn't catastrophic.
> There might be a faster way of doing 6x6es, but I don't know it.

I'm telling you - Gaussian elimination, with row-swapping as needed.
It's not hard.
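
As a rough illustration of elimination with row swapping, the sketch below works on a stack copy, so the parameter can stay const and no allocation (or allocation failure) is involved. The name determinant_gauss, the DET_MAX_N limit and the row-major layout are assumptions, and the exact comparison with zero is a simplification; real code would probably want a tolerance there.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DET_MAX_N 6  /* assumed fixed bound; the matrices in question are 6x6 */

/* Absolute value helper, to avoid depending on the maths library. */
static double abs_value(double x)
{
    return x < 0.0 ? -x : x;
}

/* Determinant by Gaussian elimination with partial pivoting.
   The caller's matrix (n x n, row-major) is copied and left untouched.
   (Hypothetical function, not the code discussed in the thread.) */
double determinant_gauss(const double *m, size_t n)
{
    double a[DET_MAX_N * DET_MAX_N];
    double det = 1.0;

    assert(n >= 1 && n <= DET_MAX_N);
    memcpy(a, m, n * n * sizeof a[0]);

    for (size_t col = 0; col < n; col++) {
        /* Pick the row with the largest entry in this column as pivot. */
        size_t pivot = col;
        for (size_t r = col + 1; r < n; r++)
            if (abs_value(a[r * n + col]) > abs_value(a[pivot * n + col]))
                pivot = r;

        if (a[pivot * n + col] == 0.0)
            return 0.0;  /* the rest of the column is zero: singular */

        if (pivot != col) {
            /* Swapping two rows flips the sign of the determinant. */
            for (size_t c = 0; c < n; c++) {
                double tmp = a[col * n + c];
                a[col * n + c] = a[pivot * n + c];
                a[pivot * n + c] = tmp;
            }
            det = -det;
        }

        det *= a[col * n + col];

        /* Eliminate the entries below the pivot. */
        for (size_t r = col + 1; r < n; r++) {
            double factor = a[r * n + col] / a[col * n + col];
            for (size_t c = col; c < n; c++)
                a[r * n + c] -= factor * a[col * n + c];
        }
    }
    return det;
}
```

A zero in the top left hand cell needs no special case here: the pivot search simply swaps in a row with a non-zero entry, and the sign of the result is adjusted to match.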

>>
>>>> Do you really believe you are making a sound argument here? Or do you
>>>> realise that you are conflating completely different concepts? I'm
>>>> trying to think of a single example in this thread where you have
>>>> actually addressed the question, and actually justified your decisions.
>>>> But it's just a field of straw men fishing for red herrings.
>>>>
>>> If you provide a boolean type then you encourage people to write functions
>>> that take boolean parameters.
>> We were talking about an API - /you/ pick the parameters.
>>
>> And boolean parameters are fine, as long as their use is clear - just
>> like "int" parameters, and "char * " parameters, and all other
>> parameters. Booleans are not somehow magically complicated or
>> error-prone - /any/ generic type is equally error-prone in the face of
>> poor API's or unclear information.
>>
> No they are not. drawPath(mypath, false) conveys no information.

Nor does drawRectangle(100, 200, 300, 400).

And certainly nor does drawPath(mypath, 0).

It's all a matter of suitable function naming, declarations and typing,
along with documentation and consistent API design.

I fully agree that an enum can often be clearer - though without C++ or
good compiler warnings, enums are easily abused. My point is not that
"bool" is clearer than all alternatives - merely that it is clearer than
a 0/1 int.

> The
> first parameter is obviously a path, but the second one could mean almost
> anything. Functions that take boolean parameters are a bad idea, and
> should be discouraged. And one way to do that is not to have a boolean
> type, so people are not tempted to use it.
>>
>>
>> Booleans can be significant aids to readability, and any duplicated
>> parameter type can be unclear - booleans are not different there.
>>
> No. Take qsort(). It is a bit of a nuisance that the two middle parameters
> are both size_ts, and it's hard to remember which is which. But when you
> read a call to qsort in code, it will look something like this:
> qsort(data, sizeof(ELEMENT), Nelements, compfunc);
> So you can tell that this is a bug, without any other information.

You have the information about the parameters to the function. How is
that any different if the parameters are size_t or if they are bool?
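
For reference, the standard order of qsort's arguments is base, number of elements, size of one element, comparison function, so the quoted call has its two middle arguments swapped. A minimal sketch of a correct call follows; compare_ints and sort_ints are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison callback for qsort: orders ints ascending. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);  /* avoids the overflow risk of x - y */
}

/* Hypothetical helper: sorts an int array in place.
   Argument order is (base, element count, element size, compare). */
void sort_ints(int *data, size_t nelements)
{
    assert(data != NULL);
    qsort(data, nelements, sizeof *data, compare_ints);
}
```

Writing the size as sizeof *data rather than sizeof(ELEMENT) also makes the two size_t arguments look different at the call site, which helps a reader spot a swap.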

Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
[...]
> If you provide a boolean type then you encourage people to write functions
> that take boolean parameters. Of course intelligent people like you and I
> can see, as soon as the problem is pointed out, that this isn't a good idea,
> and will have the self-discipline to write enums. But a lot of people won't.
> So it's better not to have a boolean type. Of course they can still pass
> "1" or "0" in an int, but at least this isn't being actively encouraged.
>> > So bool is pretty useless and we're better off without it.
>> No, bool is pretty useful and we are better off having it.
>>
> It's useful where a function returns a boolean true / false value whose
> meaning is obvious from the function definition. And it is useful in
> marking fields of structures which are logically true / false (though here
> it can be a rather extravagant use of memory and you might be better
> off with bitfields).
> But these are minor aids to readability, whilst boolean parameters are
> major impediments to readability.

Passing literal true or false as a function argument can be a problem
for legibility. It's still better than passing a literal 0 or 1 and
letting the reader guess whether it's logically a Boolean or a count.

Some languages address that with named parameter associations.
C doesn't, and I don't expect it to any time soon.

But the idea that we shouldn't have a Boolean type just because some
function calls are unclear is extremely silly. I won't try to guess
whether you're being serious. Yes, `func(some_arg, true, false)`
can be difficult to read, but there are plenty of solutions.
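
One of those solutions, sketched under assumptions: replace the bool with a small enum so that the call site names the choice. The type PathClosure and the function draw_path_segments are hypothetical stand-ins, not part of any real graphics API; the function only counts segments so that the example stays self-contained.

```c
#include <assert.h>

/* Hypothetical meaning for what would otherwise be a bare true/false. */
typedef enum {
    PATH_OPEN,
    PATH_CLOSED
} PathClosure;

/* Hypothetical stand-in for a drawing routine: it only reports how
   many line segments would be drawn for npoints vertices. */
int draw_path_segments(int npoints, PathClosure closure)
{
    assert(npoints >= 0);
    if (npoints < 2)
        return 0;
    /* A closed path adds one segment from the last point back to the first. */
    return closure == PATH_CLOSED ? npoints : npoints - 1;
}
```

A call such as draw_path_segments(4, PATH_CLOSED) can be read without looking up the declaration, which is the property a bare true or false lacks.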

> Then there's the if (value == true) problem.

Which has a trivial solution: Don't do that.
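
For anyone unfamiliar with the problem, a short sketch: in C any non-zero int counts as true in a condition, but only the value 1 compares equal to true. The function names below are hypothetical and exist only to show the difference.

```c
#include <assert.h>
#include <stdbool.h>

/* Fragile: an int can be "true" without being equal to 1.
   Functions such as isalpha() return non-zero, not necessarily 1. */
int is_set_fragile(int value)
{
    return value == true;
}

/* Robust: test for non-zero directly. */
int is_set(int value)
{
    return value != 0;
}
```

With a value of 2 the fragile version reports false while the robust one reports true, so the usual advice is to write if (value) and leave the comparison out.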

>> Are you /seriously/ suggesting that "readfilefromdisk" is easy to read?
>> Better than, say, "read_file_from_disk" or "readFileFromDisk" ? (Not
>> that I think the camel-case version is particularly easy to read here -
>> but it is a world better than your choice of jumble.)
>>
> In programs, yes. Because you have many identifiers all jostling for
> attention.

Please explain how "many identifiers all jostling for attention" implies
that readfilefromdisk is easier to read than read_file_from_disk or
readFileFromDisk.

[...]

> I'm not arguing that text without spaces is easier to read than text
> with spaces. However spaces in C identifiers are not allowed. The fact
> that for many hundreds of years people wrote text without spaces
> tells you that, though it's less readable than text with spaces, it's
> not that difficult to read.

The fact that we stopped doing that suggests that separating words is
better than not separating words.

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Will write code for food.
void Void(void) { Void(); } /* The recursive call of the void */

Re: C vs Haskell for XML parsing

<8eec8404-4928-4bc3-8b00-c673ea22ab60n@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29534&group=comp.lang.c#29534

Newsgroups: comp.lang.c
Date: Wed, 23 Aug 2023 20:53:58 -0700 (PDT)
In-Reply-To: <uc5mlk$32gl3$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <8eec8404-4928-4bc3-8b00-c673ea22ab60n@googlegroups.com>
Subject: Re: C vs Haskell for XML parsing
From: malcolm.arthur.mclean@gmail.com (Malcolm McLean)
Injection-Date: Thu, 24 Aug 2023 03:53:59 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 9672

by: Malcolm McLean - Thu, 24 Aug 2023 03:53 UTC

On Wednesday, 23 August 2023 at 20:29:07 UTC+1, David Brown wrote:
>
> > No they are not. drawPath(mypath, false) conveys no information.
> Nor does drawRectangle(100, 200, 300, 400).
>
The function obviously draws a rectangle. And those values are almost
certainly pixel values. If they were RGB values they would be in the range
0-255, whereas a normal raster designed for human viewing will have pixel
co-ordinates in the range of the values passed.
So the call means either "draw a rectangle at 100,200 with width 300
pixels and height 400 pixels", or it means "draw a rectangle with top left
at 100,200 and bottom right at 300,400". We don't know which. (And
we don't know whether y represents row indices or height from origin,
or how the system gets context for colour, transformation, and so on,
but we'll have that sort of information from a general understanding of
the graphics system).

Of course you can implement things perversely or unusually, so
there might be no dimensions passed at all and it's all done by
matrix transformations in the graphics context. That's a logical
possibility. But we're talking about what the code probably means,
and what the reader can understand.
>
> And certainly nor does drawPath(mypath, 0).
>
> It's all a matter of suitable function naming, declarations and typing,
> along with documentation and consistent API design.
>
> I fully agree that an enum can often be clearer - though without C++ or
> good compiler warnings, enums are easily abused. My point is not that
> "bool" is clearer than all alternatives - merely that it is clearer than
> a 0/1 int.
>
Yes, that is true. But it's all about psychology. If a parameter is an int,
people are more likely to think "we should have a define for that". If it is
boolean, you already have true/false, so what more do you need?

>
> > qsort(data, sizeof(ELEMENT), Nelements, compfunc);
> > So you can tell that this is a bug, without any other information.
> You have the information about the parameters to the function. How is
> that any different if the parameters are size_t or if they are bool?
>
void mysort(ELEMENT *data, int N, bool casesensitive, bool embeddednumbers);

Ok, so it's pretty obvious what mysort() does, and most people ought to be able
to call it successfully and say what it is doing, based on that information.

But when we see a call

mysort(data, Nelements, false, true);

what are the last two parameters meant to mean? Even if we can remember that
mysort() can be case sensitive or case-insensitive, and has an option for
treating embedded numbers as values, we probably can't remember which one
is the first parameter and which one is the second.

But consider now the second parameter. Because it's an integer, it's unlikely
to be hard coded. And most people will give it a sensible name like Nelements.
So we know that it's the number of elements in the "data" array, not the
size of an element, and not some sort of flag to control the type of sort. If
it's hardcoded, it will have a value like "100", and we will probably know from
context that that is the hardcoded size of the data array. Even if we don't have
such context, "100" is quite likely as the size of a table.
>
> I do, yes - to the extent that there is such a thing as "typical code".
> Long identifiers, all in lower case, made of multiple words with no
> underscores, are not typical code.
>
Standard library functions are the most commonly called functions in C code,
and are always named in my style.

User functions can't use abbreviations as easily, so they tend to be a bit
longer. But most aren't very long, one English word or two words. There
are occasional exceptions where it's hard to think of a short name which
adequately describes the function. But it's better to remain consistent.
>
> And sometimes code looks jumbled and crushed - often better spacing can
> make a significant difference if you feel that's a problem for your
> code. (Too much spacing is also bad, of course.)
> >>> As a single word in flowing lowercase text, the decorated version is
> >>> maybe easier to read.
> >> What do you mean by "decorated" ?
> > camelCase or under_scores, decorated. continuoustext, undecorated.
> OK, so now you are saying that camelCase and under_scores makes things
> more readable? Or are you saying that this only applies to "flowing
> lowercase text", and that continuoustext identifiers are magically more
> readable than alternatives in other circumstances?
>
Yes. It's the highlighting paradox. A bit of highlighting makes text easier to
read. But if you over-do it, then the highlighted text becomes hard to read.
>
A program with one identifier in camelCase is easier to read than the same
program with that identifier in plain lowercase. But programs don't usually
contain only one identifier, they have hundreds of them, in close proximity to
each other. So you're in the "over-highlighted" area of the highlighting paradox.

> > It's not really known.
> There are some reasonable plausibilities, as I wrote in another post.
> You are, of course, correct that such things are rarely fully known -
> that's the nature of history.
>
Your source is probably looking at classical slavery through the lens of
American slavery. American slaves were menial workers and held in
contempt. But a slave scribe in the Roman Empire would have been a
fairly high status individual. He might have had a slave himself to attend
to his personal needs. Think of Bill Gates' personal pilot for the sort of
figure he would be.

> > Hebrew texts do have spaces. It's probably something to do
> > with different Semitic and Indo-European concepts of word building.
> It is primarily a matter of vowels. Hebrew was traditionally written
> with no vowels - at most, a few diacriticals as pronunciation hints.
> Such writing without any word-break indications would be hopelessly
> unreadable. Thus Hebrew has never (or at least, very rarely) been
> without word breaks. When writing in the Greek and Latin alphabets,
> however, vowels were always written - it is then feasible (though
> difficult) to interpret the text despite a lack of spaces.
>
But the spaces are not part of the Torah. They are regarded as man-made, not
God-given. So there must have been an early tradition of text without spaces.
But the Dead Sea scrolls are written with spaces, and they were composed at
the same time that most Latin and Greek texts were written continuously.
>
> > (That also
> > explains why the Greeks changed the direction of the text, it's to do with the
> > word entering the right visual field and therefore brain hemisphere).
> No, it is not. It is because the Greeks wrote notes in sand, and also
> in wax tablets (as did the Romans). Right-to-left writing would smudge
> that for right-handed writers. (Earlier Greek was written more slowly,
> and often carved - like a number of older languages, it was not uncommon
> to change direction for each line.)
>
> In contrast, Arabic was written using a fine brush, held below the text
> being written, and could therefore have developed in either direction.
> I don't know why it happened to be right-to-left - it may have been as
> much luck as design.
>
> Brain hemispheres have nothing to do with it - any bias you think you
> have here is a result of the direction you learned to read, not the
> cause of the direction of the writing.
>
You might be right here.

Re: C vs Haskell for XML parsing

<uc7l4o$3fp72$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29536&group=comp.lang.c#29536

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: C vs Haskell for XML parsing
Date: Thu, 24 Aug 2023 15:15:04 +0200
Organization: A noiseless patient Spider
Lines: 235
Message-ID: <uc7l4o$3fp72$1@dont-email.me>
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com>
<87jztqvhwf.fsf@bsb.me.uk>
<7f9fbbd6-7f5c-4e12-a73b-c9abe91b7f5bn@googlegroups.com>
<87jztpu2iu.fsf@bsb.me.uk>
<610a41a0-a3a3-4e01-a9a7-8b5e1fe31ec0n@googlegroups.com>
<87350dtive.fsf@bsb.me.uk> <ubvan6$1rb3s$1@dont-email.me>
<3c87ec37-8fe1-4171-9500-609fad6701b7n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me>
<e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com>
<uc28id$2dc7f$1@dont-email.me>
<b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com>
<uc2dbv$2e4tg$1@dont-email.me>
<d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me>
<d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com>
<uc4e4t$2rdlt$1@dont-email.me>
<1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>
<uc5dd0$30jrk$1@dont-email.me>
<81879984-43e7-409a-a029-1ca6677f536dn@googlegroups.com>
<uc5mlk$32gl3$1@dont-email.me>
<8eec8404-4928-4bc3-8b00-c673ea22ab60n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 24 Aug 2023 13:15:04 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="98c0a4bc5222ada7855dacae0a147b79";
logging-data="3663074"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19mcXIqCZipUch+Ffa9Rg8FMd6LzZa66GQ="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.9.0
Cancel-Lock: sha1:ZDB9ZJ4lp5LQjTMOAJIuAka3wc4=
In-Reply-To: <8eec8404-4928-4bc3-8b00-c673ea22ab60n@googlegroups.com>
Content-Language: en-GB

by: David Brown - Thu, 24 Aug 2023 13:15 UTC

On 24/08/2023 05:53, Malcolm McLean wrote:
> On Wednesday, 23 August 2023 at 20:29:07 UTC+1, David Brown wrote:
>>
>>> No they are not. drawPath(mypath, false) conveys no information.
>> Nor does drawRectangle(100, 200, 300, 400).
>>
> The function obviously draws a rectangle. And those values are almost
> certainly pixel values. If they were RGB values they would go from 0-255,
> and a normal raster designed for human viewing will have pixel co-ordinates
> in that range.

Is it (x1, y1, x2, y2) ? (x, y, w, h) ? (row, column, height, width) ?
Is the y coordinate from the top down, or bottom up? Are we in
pixels, millimetres, points, or something else? Are we relative to a
window, or a screen, or an abstract desktop? Are we doing something
very different, such as using indices into arrays of 3-D points in a
scene definition?

The point is, without context and more information (such as the
declaration of the function with parameter names, or a knowledge of its
use), you don't know enough to interpret the meaning of the parameters.
You can guess, but you could be wrong. This is exactly the same whether
it is a series of "int" parameters or some "bool" parameters, or other
generic types.

Such issues are well-known, as are solutions - strong typing (possible
even in C, though the language doesn't have great support for it - a
struct parameter with compound literals and designated initialisers is
one way), named parameters, chained function calls, etc. And of course,
clear and consistent APIs with good declarations and documentation are key.
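
For illustration, a minimal sketch of the struct-parameter approach in C99 or later. The names struct rect, rect_area and example_area are hypothetical, and rect_area merely stands in for a real drawing call so that the sketch has a checkable result:

```c
/* Hypothetical rectangle type; not taken from any real graphics library. */
struct rect {
    int x;       /* left edge */
    int y;       /* top edge */
    int width;
    int height;
};

/* Stand-in for a drawing function: returns the area covered. */
int rect_area(struct rect r)
{
    return r.width * r.height;
}

int example_area(void)
{
    /* A compound literal with designated initialisers names every value,
       so (x, y, w, h) cannot be misread as (x1, y1, x2, y2). */
    return rect_area((struct rect){ .x = 100, .y = 200, .width = 300, .height = 400 });
}
```

The units and origin still have to be documented somewhere, but the call site no longer depends on remembering the parameter order.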

>
>> I fully agree that an enum can often be clearer - though without C++ or
>> good compiler warnings, enums are easily abused. My point is not that
>> "bool" is clearer than all alternatives - merely that it is clearer than
>> a 0/1 int.
>>
> Yes, that is true. But it's all about psychology. If a parameter is an int,
> people are more likely to think "we should have a define for that". If it is
> boolean, you already have true/false, so what more do you need?

I think you are imagining things, or extrapolating them from your own
coding habits.

>
>>
>>> qsort(data, sizeof(ELEMENT), Nelements, compfunc);
>>> So you can tell that this is a bug, without any other information.
>> You have the information about the parameters to the function. How is
>> that any different if the parameters are size_t or if they are bool?
>>
> void mysort(ELEMENT *data, int N, bool casesensitive, bool embeddednumbers)
>
> Ok, so it's pretty obvious what mysort() does, and most people ought to be able
> to call it successfully and say what it is doing, based on that information.
>
> But when we see a call
>
> mysort(data, Nelements, false, true);
>
> what are the last two parameters meant to mean? Even if we can remember that
> mysort() can be case sensitive or case-insensitive, and has an option for
> treating embedded numbers as values, we probably can't remember which one
> is the first parameter and which one is the second.

I agree entirely that a call like this is hard to interpret unless you
know the details of the function. I agree entirely that an enum type
for each parameter would make it clearer. (C++'s strong enum types are
/hugely/ better than C's weak enums for such purposes.)

I simply disagree with the suggestion that "bool" is worse than other
types. "mysort(data, Nelements, 0, 1);" is even harder to interpret -
at least with "false" and "true" we know the parameters are on/off flags.
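
A minimal sketch of the enum alternative, assuming hypothetical names (enum case_mode, enum number_mode, compare_text). It shows a comparison function of the kind a mysort()-style routine might use, not anyone's actual implementation:

```c
#include <ctype.h>
#include <stdlib.h>

enum case_mode   { CASE_INSENSITIVE, CASE_SENSITIVE };
enum number_mode { NUMBERS_AS_TEXT, NUMBERS_AS_VALUES };

/* Returns negative, zero or positive, like strcmp().
   With NUMBERS_AS_VALUES, runs of digits compare by numeric value. */
int compare_text(const char *a, const char *b,
                 enum case_mode cm, enum number_mode nm)
{
    while (*a && *b) {
        if (nm == NUMBERS_AS_VALUES &&
            isdigit((unsigned char)*a) && isdigit((unsigned char)*b)) {
            char *end_a, *end_b;
            unsigned long na = strtoul(a, &end_a, 10);
            unsigned long nb = strtoul(b, &end_b, 10);
            if (na != nb)
                return na < nb ? -1 : 1;
            a = end_a;   /* equal numbers: skip past both digit runs */
            b = end_b;
            continue;
        }
        int ca = (unsigned char)*a;
        int cb = (unsigned char)*b;
        if (cm == CASE_INSENSITIVE) {
            ca = tolower(ca);
            cb = tolower(cb);
        }
        if (ca != cb)
            return ca < cb ? -1 : 1;
        a++;
        b++;
    }
    /* The shorter string sorts first when one is a prefix of the other. */
    return (*a != '\0') - (*b != '\0');
}
```

A call such as compare_text(s, t, CASE_INSENSITIVE, NUMBERS_AS_VALUES) documents itself in a way that (false, true) does not. Because C enums are weakly typed, a compiler is not required to complain if the two constants are passed in the wrong order.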

>>
>> I do, yes - to the extent that there is such a thing as "typical code".
>> Long identifiers, all in lower case, made of multiple words with no
>> underscores, are not typical code.
>>
> Standard library functions are the most commonly called functions in C code,
> and are always named in my style.

No, they are not. Names with all lower case and no word separation are
common in the C standard library for short identifiers (like "strlen"),
while underscores are common for longer identifiers (like
"memory_order_relaxed").

It would be nice if you were to check this kind of thing before making
pointlessly incorrect claims.

>
> User functions can't use abbreviations as easily,

Of course they can.

> so they tend to be a bit
> longer. But most aren't very long, one English word or two words. There
> are occasional exceptions where it's hard to think of a short name which
> adequately describes the function. But it's better to remain consistent.

Short names are useful for things that are used often. Longer and more
descriptive names are better for things that are used more rarely.
Either way, readability is the key - for common identifiers, brevity
improves reading speed, while for less common identifiers it is helpful
to be clearer and more explicit. For longer identifiers, underscores
are vital to readability (camel-case is fine in medium cases, but can be
a poor choice when the identifier consists of several words).

Consistency is a benefit, but it is not a priority - it does not trump
readability.

>>
>> And sometimes code looks jumbled and crushed - often better spacing can
>> make a significant difference if you feel that's a problem for your
>> code. (Too much spacing is also bad, of course.)
>>>>> As a single word in flowing lowercase text, the decorated version is
>>>>> maybe easier to read.
>>>> What do you mean by "decorated" ?
>>> camelCase or under_scores, decorated. continuoustext, undecorated.
>> OK, so now you are saying that camelCase and under_scores makes things
>> more readable? Or are you saying that this only applies to "flowing
>> lowercase text", and that continuoustext identifiers are magically more
>> readable than alternatives in other circumstances?
>>
> Yes. It's the highlighting paradox. A bit of highlighting makes text easier to
> read. But if you over-do it, then the highlighted text becomes hard to read.

Please stop your habit of using "Malcolm terms" as though they were real
things. There is no such thing as "the highlighting paradox". Before
you write something like that, next time check to see if there is a
Wikipedia entry for your term and if that matches your usage. There is
no "highlighting paradox", and what you describe is not a paradox.
(Look up that concept in Wikipedia too.)

If you are trying to say that trying to highlight everything results in
highlighting nothing, just write that. Or better - don't bother writing
it, because it is entirely obvious.

>>
> One identifier in camelCase is easier to read than the same program with
> that identifier in camelcase. But programs don't usually contain only one
> identifier, they have hundreds of them, in close proximity to each other.
> So you're in the "over-highlighted" area of the highlighting paradox.
>

That could conceivably be a valid argument, if camel-case were being
used to highlight identifiers. But it is not - it is used to make the
identifiers easier to read.

Have you ever used a modern editor or IDE? Do you know what "syntax
highlighting" is? That applies different appearances to different parts
of the language syntax - everything is "highlighted", but it makes the
code easier to read.

>
>>> It's not really known.
>> There are some reasonable plausibilities, as I wrote in another post.
>> You are, of course, correct that such things are rarely fully known -
>> that's the nature of history.
>>
> Your source is probably looking at classical slavery through the lens of
> American slavery.

What? I'm sorry, but you really have very little idea of what you are
talking about. I am well aware of the wide range of statuses that could
apply to slaves in the classical Roman and Greek societies - with some
having significant status, their own slaves, their own property, etc.
Most scribes were /not/ high status slaves, and were certainly not privy
to the details of politics, diplomacy and other aspects of running the
societies. There were a few who acted more as advisors and were of
higher status. But usually a good scribe was one who did not remember
or think about the things he wrote down - he was a dictation machine,
and perhaps also required to read aloud and track accounts. For many
uses, the less he understood about what was being written, the better.

> American slaves were menial workers and held in
> contempt. But a slave scribe in the Roman Empire would have been a
> fairly high status individual. He might have had a slave himself to attend
> to his personal needs. Think of Bill Gates' personal pilot for the sort of
> figure he would be.
>
>>> Hebrew texts do have spaces. It's probably something to do
>>> with different Semitic and Indo-European concepts of word building.
>> It is primarily a matter of vowels. Hebrew was traditionally written
>> with no vowels - at most, a few diacriticals as pronunciation hints.
>> Such writing without any word-break indications would be hopelessly
>> unreadable. Thus Hebrew has never (or at least, very rarely) been
>> without word breaks. When writing in the Greek and Latin alphabets,
>> however, vowels were always written - it is then feasible (though
>> difficult) to interpret the text despite a lack of spaces.
>>
> But the spaces are not part of the Torah. They are regarded as man-made, not
> God-given. So there must have been an early tradition of text without spaces.

Re: C vs Haskell for XML parsing

<639e8e6f-2729-476b-9a6e-0b3eb066b06an@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29539&group=comp.lang.c#29539

X-Received: by 2002:ad4:5591:0:b0:635:49d7:544f with SMTP id f17-20020ad45591000000b0063549d7544fmr202785qvx.4.1692888630808;
Thu, 24 Aug 2023 07:50:30 -0700 (PDT)
X-Received: by 2002:a17:90b:249:b0:26d:1f4c:a608 with SMTP id
fz9-20020a17090b024900b0026d1f4ca608mr3971093pjb.5.1692888630293; Thu, 24 Aug
2023 07:50:30 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border-2.nntp.ord.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Thu, 24 Aug 2023 07:50:29 -0700 (PDT)
In-Reply-To: <uc7l4o$3fp72$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2a00:23a8:400a:5601:2c26:aadb:e9db:4185;
posting-account=Dz2zqgkAAADlK5MFu78bw3ab-BRFV4Qn
NNTP-Posting-Host: 2a00:23a8:400a:5601:2c26:aadb:e9db:4185
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com>
<87jztqvhwf.fsf@bsb.me.uk> <7f9fbbd6-7f5c-4e12-a73b-c9abe91b7f5bn@googlegroups.com>
<87jztpu2iu.fsf@bsb.me.uk> <610a41a0-a3a3-4e01-a9a7-8b5e1fe31ec0n@googlegroups.com>
<87350dtive.fsf@bsb.me.uk> <ubvan6$1rb3s$1@dont-email.me> <3c87ec37-8fe1-4171-9500-609fad6701b7n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me> <e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com>
<uc28id$2dc7f$1@dont-email.me> <b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com>
<uc2dbv$2e4tg$1@dont-email.me> <d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me> <d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com>
<uc4e4t$2rdlt$1@dont-email.me> <1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>
<uc5dd0$30jrk$1@dont-email.me> <81879984-43e7-409a-a029-1ca6677f536dn@googlegroups.com>
<uc5mlk$32gl3$1@dont-email.me> <8eec8404-4928-4bc3-8b00-c673ea22ab60n@googlegroups.com>
<uc7l4o$3fp72$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <639e8e6f-2729-476b-9a6e-0b3eb066b06an@googlegroups.com>
Subject: Re: C vs Haskell for XML parsing
From: malcolm.arthur.mclean@gmail.com (Malcolm McLean)
Injection-Date: Thu, 24 Aug 2023 14:50:30 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 153

by: Malcolm McLean - Thu, 24 Aug 2023 14:50 UTC

On Thursday, 24 August 2023 at 14:15:21 UTC+1, David Brown wrote:
> On 24/08/2023 05:53, Malcolm McLean wrote:
> > On Wednesday, 23 August 2023 at 20:29:07 UTC+1, David Brown wrote:
> >>
> >>> No they are not. drawPath(mypath, false) conveys no information.
> >> Nor does drawRectangle(100, 200, 300, 400).
> >>
> > The function obviously draws a rectangle. And those values are almost
> > certainly pixel values. If they were RGB values they would go from 0-255,
> > and a normal raster designed for human viewing will have pixel co-ordinates
> > in that range.
> Is it (x1, y1, x2, y2) ? (x, y, w, h) ? (row, column, height, width) ?
> Is the y coordinate from the top down, or bottom up? Are we in
> pixels, millimetres, points, or something else? Are we relative to a
> window, or a screen, or an abstract desktop? Are we doing something
> very different, such as using indices into arrays of 3-D points in a
> scene definition?
>
We're probably not in millimetres, because the co-ordinates are integer
values. (I know they could be real but just written as integers, and of course
it's not physically impossible to build a device that draws on discrete millimetre
values. But we're probably not dealing with millimetres.) It could be points
rather than pixels, fair enough (on our system, "pixel" is a bit of a fuzzy concept
because the document can have several physical representations, and so
we use points by default. But again, the points are reals.)

Whilst we know that the function draws a rectangle, we don't know the details
of the graphics system. But we do know that it caches colour at least, because
the function takes four parameters to describe a rectangle.
>
> The point is, without context and more information (such as the
> declaration of the function with parameter names, or a knowledge of its
> use), you don't know enough to interpret the meaning of the parameters.
> You can guess, but you could be wrong. This is exactly the same whether
> it is a series of "int" parameters or some "bool" parameters, or other
> generic types.
>
Yes, you can be wrong. But you've a pretty good idea what the call probably
does, just from one line, and from an example chosen to illustrate the
opposite point. And of course whilst I've spoken in terms of one line, you
don't actually have only one line of context. If the graphics system was 3D
and used indices into external arrays, you would know, for example.

>
> Such issues are well-known, as are solutions - strong typing (possible
> even in C, though the language doesn't have great support for it - a
> struct parameter with compound literals and designated initialisers is
> one way), named parameters, chained function calls, etc. And of course,
> clear and consistent APIs with good declarations and documentation are key.
>
Languages that allow named parameters are nice, and if I was extending C
I would introduce that. The problem is that there are a few basic mathematical
functions like tan() which are conventionally written tan(value), and that was
then used as a model for what other functions would be like. However only a few
user-written functions calculate similar basic mathematical functions.
Generally it makes sense to write payroll(employees=emp, Nemployees=N,
taxcode=1234), not payroll(emp, N, 1234);
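
Standard C has no named parameters, but the payroll call above can be approximated in C99 with a variadic macro wrapping a compound literal. This is only a sketch: struct employee, struct payroll_args and payroll_impl are hypothetical, and the function simply totals salaries so that there is something to check:

```c
#include <stddef.h>

/* Hypothetical types, loosely following the payroll example. */
struct employee { double salary; };

struct payroll_args {
    const struct employee *employees;
    size_t Nemployees;
    int taxcode;
};

/* Returns the total salary bill; taxcode is carried but unused in this sketch. */
double payroll_impl(struct payroll_args args)
{
    double total = 0.0;
    for (size_t i = 0; i < args.Nemployees; i++)
        total += args.employees[i].salary;
    return total;
}

/* Wrap the arguments in a compound literal so that callers can name them. */
#define payroll(...) payroll_impl((struct payroll_args){ __VA_ARGS__ })
```

A call then reads payroll(.employees = emp, .Nemployees = N, .taxcode = 1234), and the arguments may be given in any order. Any member left unnamed is zero-initialised.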

> > Standard library functions are the most commonly called functions in C code,
> > and are always named in my style.
> No, they are not. Names with all lower case and no word separation are
> common in the C standard library for short identifiers (like "strlen"),
> while underscores are common for longer identifiers (like
> "memory_order_relaxed").
>
> It would be nice if you were to check this kind of thing before making
> pointlessly incorrect claims.
>
Ok. So I checked. Every single function in the standard library is written in
my style, with the exception of a handful which take an _r suffix (e.g. asctime_r).
I think these are recent additions.
It's always isdigit(). Never is_digit() or isDigit().
The difference between my style and theirs is that the standard library abbreviates
more aggressively than I would, e.g. strrchr, whereas I would name the same
function reversestringsearch. That's because strrchr rapidly becomes familiar
to anyone using C, whilst a function written by me would be unlikely to be used
very widely except in the program it was written for.

> Short names are useful for things that are used often. Longer and more
> descriptive names are better for things that are used more rarely.
> Either way, readability is the key - for common identifiers, brevity
> improves reading speed, while for less common identifiers it is helpful
> to be clearer and more explicit. For longer identifiers, underscores
> are vital to readability (camel-case is fine in medium cases, but can be
> a poor choice when the identifier consists of several words).
>
There's an element of truth in that. But the basic reality is that some identifiers
naturally have short words associated with them (determinant, diagonalize,
identity, etc), whilst for others there is strong convention (i, x, y, N, theta,
ptr). For others, there is no single short English word which describes them
and wouldn't be misleading. So it's "multiply matrix with vector", turned somehow
into a valid identifier.
>
> Consistency is a benefit, but it is not a priority - it does not trump
> readability.
>
You can become obsessed with consistency, I agree. But sometimes you
might reject a marginally better name because it's not consistent with
the system you've developed for the rest of the program.
>
> If you are trying to say that trying to highlight everything results in
> highlighting nothing, just write that. Or better - don't bother writing
> it, because it is entirely obvious.
>
It's obviously true, when pointed out. But not "entirely obvious".
> Have you ever used a modern editor or IDE? Do you know what "syntax
> highlighting" is? That applies different appearances to different parts
> of the language syntax - everything is "highlighted", but it makes the
> code easier to read.
>
That is actually true. Coloured code is easier to read than non-coloured.
>
> What? I'm sorry, but you really have very little idea of what you are
> talking about. I am well aware of the wide range of statuses that could
> apply to slaves in the classical Roman and Greek societies - with some
> having significant status, their own slaves, their own property, etc.
> Most scribes were /not/ high status slaves, and were certainly not privy
> to the details of politics, diplomacy and other aspects of running the
> societies. There were a few who acted more as advisors and were of
> higher status. But usually a good scribe was one who did not remember
> or think about the things he wrote down - he was a dictation machine,
> and perhaps also required to read aloud and track accounts. For many
> uses, the less he understood about what was being written, the better.
>
He would have had a higher status than a modern clerical worker, because
writing was a relatively rare skill, and of course associated with the
upper classes. He wouldn't necessarily have been anything better than
a secretary, of course.
The idea that you could prevent a scribe from reading what he was writing
down is ridiculous. Of course if there was a way of doing this, then the
principals would probably have taken it. But if a Roman senator got his
scribe to write down a letter of complaint to a magistrate, the scribe would
be privy to the matter. Unavoidably.
>
> > But the spaces are not part of the Torah. They are regarded as man-made, not
> > God-given. So there must have been an early tradition of text without spaces.
> I think you have no basis for that conclusion - even if we assume, for
> the sake of argument, that the letters were "God-given". The paper (or
> other material) and ink are all man-made, but found in every scroll of
> the Torah.
>
> As I understand it, Kabalists do not view spaces or punctuation as
> significant when trying to find hidden patterns in the letters of the
> Torah. But Hebrew texts were written with spaces - there is
> (apparently) no good evidence of Hebrew being written without spaces.
>
> However, my knowledge of these things is all indirect - I don't know any
> Hebrew, I only know a little about the language and its script.
>
Yes. What I am saying is that for the idea that the letters were God-given
whilst the spaces were man-made to take root, the society must have been
familiar with text written continuously.

Re: C vs Haskell for XML parsing

<uc7u52$3h9f6$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29540&group=comp.lang.c#29540

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: bc@freeuk.com (Bart)
Newsgroups: comp.lang.c
Subject: Re: C vs Haskell for XML parsing
Date: Thu, 24 Aug 2023 16:48:51 +0100
Organization: A noiseless patient Spider
Lines: 76
Message-ID: <uc7u52$3h9f6$1@dont-email.me>
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com>
<87jztpu2iu.fsf@bsb.me.uk>
<610a41a0-a3a3-4e01-a9a7-8b5e1fe31ec0n@googlegroups.com>
<87350dtive.fsf@bsb.me.uk> <ubvan6$1rb3s$1@dont-email.me>
<3c87ec37-8fe1-4171-9500-609fad6701b7n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me>
<e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com>
<uc28id$2dc7f$1@dont-email.me>
<b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com>
<uc2dbv$2e4tg$1@dont-email.me>
<d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me>
<d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com>
<uc4e4t$2rdlt$1@dont-email.me>
<1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>
<uc5dd0$30jrk$1@dont-email.me>
<81879984-43e7-409a-a029-1ca6677f536dn@googlegroups.com>
<uc5mlk$32gl3$1@dont-email.me>
<8eec8404-4928-4bc3-8b00-c673ea22ab60n@googlegroups.com>
<uc7l4o$3fp72$1@dont-email.me>
<639e8e6f-2729-476b-9a6e-0b3eb066b06an@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 24 Aug 2023 15:48:50 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="d3c9d1d94344e52273f8899204fe5252";
logging-data="3712486"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1818NtLiqyhvdZVfuBU69xInNuECT+DOJ8="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.14.0
Cancel-Lock: sha1:hWuuMNTz7/6b2Mdz21QhWoOJrn4=
In-Reply-To: <639e8e6f-2729-476b-9a6e-0b3eb066b06an@googlegroups.com>

by: Bart - Thu, 24 Aug 2023 15:48 UTC

On 24/08/2023 15:50, Malcolm McLean wrote:
> On Thursday, 24 August 2023 at 14:15:21 UTC+1, David Brown wrote:
>> On 24/08/2023 05:53, Malcolm McLean wrote:
>>> On Wednesday, 23 August 2023 at 20:29:07 UTC+1, David Brown wrote:
>>>>
>>>>> No they are not. drawPath(mypath, false) conveys no information.
>>>> Nor does drawRectangle(100, 200, 300, 400).
>>>>
>>> The function obviously draws a rectangle. And those values are almost
>>> certainly pixel values. If they were RGB values they would go from 0-255,
>>> and a normal raster designed for human viewing will have pixel co-ordinates
>>> in that range.
>> Is it (x1, y1, x2, y2) ? (x, y, w, h) ? (row, column, height, width) ?
>> Is the y coordinate from the top down, or bottom up? Are we in
>> pixels, millimetres, points, or something else? Are we relative to a
>> window, or a screen, or an abstract desktop? Are we doing something
>> very different, such as using indices into arrays of 3-D points in a
>> scene definition?
>>
> We're probably not in millimetres, because the co-ordinates are integer
> values. (I know they could be real but just written as integers, and of course
> it's not physically impossible to build a device that draws on discrete millimetre
> values. But we're probably not dealing with millimetres.) It could be points
> rather than pixels, fair enough (on our system, "pixel" is a bit of a fuzzy concept
> because the document can have several physical representations, and so
> we use points by default. But again, the points are reals.)
>
> Whilst we know that the function draws a rectangle, we don't know the details
> of the graphics system. But we do know that it caches colour at least, because
> the function takes four parameters to describe a rectangle.
>>
>> The point is, without context and more information (such as the
>> declaration of the function with parameter names, or a knowledge of its
>> use), you don't know enough to interpret the meaning of the parameters.
>> You can guess, but you could be wrong. This is exactly the same whether
>> it is a series of "int" parameters or some "bool" parameters, or other
>> generic types.
>>
> Yes, you can be wrong. But you've a pretty good idea what the call probably
> does, just from one line, and from an example chosen to illustrate the
> opposite point. And of course whilst I've spoken in terms of one line, you
> don't actually have only one line of context. If the graphics system was 3D
> and used indices into external arrays, you would know, for example.
>
>>
>> Such issues are well-known, as are solutions - strong typing (possible
>> even in C, though the language doesn't have great support for it - a
>> struct parameter with compound literals and designated initialisers is
>> one way), named parameters, chained function calls, etc. And of course,
>> clear and consistent APIs with good declarations and documentation are key.
>>
> Languages that allow named parameters are nice, and if I was extending C
> I would introduce that.

That would have been far more useful than designated initialisers, which
is the same class of feature.

With the latter, I've seen actual examples like this:

typedef struct {int x, y;} Point;

Point p = {.x = 100, .y = 200};
Point q = {.y = 200, .x = 100};

Without the feature, you would have to write:

Point p = {100, 200};
Point q = {100, 200};

Which for my example is clearer.

But I suppose people can go over the top with keyword parameters too,
and write strlen(.s = S).

I found keyword parameters were more useful when there are lots of
arguments, many optional. It needs to go hand-in-hand with default values.
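
A minimal sketch of how default values can be paired with designated initialisers in C99, assuming a hypothetical struct window_opts in which a zero field means "use the default". None of these names come from a real API:

```c
/* Hypothetical option block: a field left at zero means "use the default". */
struct window_opts {
    int width;
    int height;
    int border;
};

/* Fills in defaults for any option the caller did not name. */
struct window_opts resolve_window_opts(struct window_opts o)
{
    if (o.width == 0)
        o.width = 640;
    if (o.height == 0)
        o.height = 480;
    /* border needs no fix-up: zero is already a sensible default */
    return o;
}

/* Callers name only the options they want to override. */
#define make_window_opts(...) \
    resolve_window_opts((struct window_opts){ __VA_ARGS__ })
```

With many optional arguments, make_window_opts(.height = 600) stays readable where a long positional list would not. The cost of this particular scheme is that zero cannot be passed as a real value for width or height.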

On 2023-08-24, Bart <bc@freeuk.com> wrote:
> On 24/08/2023 15:50, Malcolm McLean wrote:
>> On Thursday, 24 August 2023 at 14:15:21 UTC+1, David Brown wrote:
>>> On 24/08/2023 05:53, Malcolm McLean wrote:
>>>> On Wednesday, 23 August 2023 at 20:29:07 UTC+1, David Brown wrote:
>>>>>
>>>>>> No they are not. drawPath(mypath, false) conveys no information.
>>>>> Nor does drawRectangle(100, 200, 300, 400).
>>>>>
>>>> The function obviously draws a rectangle. And those values are almost
>>>> certainly pixel values. If they were RGB values they would go from 0-255,
>>>> and a normal raster designed for human viewing will have pixel co-ordinates
>>>> in that range.
>>> Is it (x1, y1, x2, y2) ? (x, y, w, h) ? (row, column, height, width) ?
>>> Is the y coordinate from the top down, or bottom up? Are we in
>>> pixels, millimetres, points, or something else? Are we relative to a
>>> window, or a screen, or an abstract desktop? Are we doing something
>>> very different, such as using indices into arrays of 3-D points in a
>>> scene definition?
>>>
>> We're probably not in millimetres, because the co-ordinates are integer
>> values. (I know they could be real but just written as integers, and of course
>> it's not physically impossible to build a device that draws on discrete millimetre
>> values. But we're probably not dealing with millimetres.) It could be points
>> rather than pixels, fair enough (on our system, "pixel" is a bit of a fuzzy concept
>> because the document can have several physical representations, and so
>> we use points by default. But again, the points are reals.)
>>
>> Whilst we know that the function draws a rectangle, we don't know the details
>> of the graphics system. But we do know that it caches colour at least, because
>> the function takes four parameters to describe a rectangle.
>>>
>>> The point is, without context and more information (such as the
>>> declaration of the function with parameter names, or a knowledge of its
>>> use), you don't know enough to interpret the meaning of the parameters.
>>> You can guess, but you could be wrong. This is exactly the same whether
>>> it is a series of "int" parameters or some "bool" parameters, or other
>>> generic types.
>>>
>> Yes, you can be wrong. But you've a pretty good idea what the call probably
>> does, just from one line, and from an example chosen to illustrate the
>> opposite point. And of course whilst I've spoken in terms of one line, you
>> don't actually have only one line of context. If the graphics system was 3D
>> and used indices into external arrays, you would know, for example.
>>
>>>
>>> Such issues are well-known, as are solutions - strong typing (possible
>>> even in C, though the language doesn't have great support for it - a
>>> struct parameter with compound literals and designated literals one
>>> way), named parameters, chained function calls, etc. And of course,
>>> clear and consistent API's with good declarations and documentation are key.
>>>
>> Languages that allow named parameters are nice, and if I was extending C
>> I would introduce that.
>
> That would have been far more useful than designated initialisers, which
> is the same class of feature.
>
> With the latter, I've seen actual examples like this:
>
> typedef struct {int x, y;} Point;
>
> Point p = {.x = 100, .y = 200};
> Point q = {.y = 200, .x = 100};
>
> Without the feature, you would have to write:
>
> Point p = {100, 200};
> Point q = {100, 200};
>
> Which for my example is clearer.

Designated initializers allow you to do these:

int array[1000000] = { [999999] = 42 };

union u { int i; double d; } x = { .d = 3.0 };

Other than that, they aren't very useful.
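As a minimal sketch of those two uses (the accessor function names are hypothetical and exist only so the values can be inspected):

```c
/* Only the last element is named; every other element is zero-initialised. */
static int array[1000000] = { [999999] = 42 };

union u { int i; double d; };

/* Without a designator, a brace initializer can only set the first member (i). */
static union u x = { .d = 3.0 };

/* Hypothetical accessors, not from the post. */
int last_element(void)   { return array[999999]; }
int first_element(void)  { return array[0]; }
double union_value(void) { return x.d; }
```

Before C99 the array case would have needed either a very long positional list or a run-time assignment, and the union case had no initializer form at all for any member but the first.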

The problem is that they are like keyword parameters
which are all optional.

They do not address the following problem: you want to add an important
new member to a structure, such that every place which defines
an instance of the structure has to initialize it to some nonzero/nonnull
value.

To address this problem, you reach for macros:

#define init_Point(x, y) { x, y }

If we convert our system to 3D, we can change it to this:

#define init_Point(x, y, z) { x, y, z }

Because macro arguments are all required, all places which pass
only two arguments are diagnosed for us.
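A small sketch of that effect, assuming a hypothetical 3D Point type and a hypothetical helper make_point (neither is in the original post):

```c
/* Hypothetical 3D version of the Point type used earlier in the thread. */
typedef struct { int x, y, z; } Point;

/* All three macro arguments are mandatory. */
#define init_Point(x, y, z) { x, y, z }

/* Hypothetical helper that uses the macro as an initializer. */
Point make_point(int px, int py, int pz)
{
    Point p = init_Point(px, py, pz);
    return p;
}

/*
 * A leftover two-argument use such as
 *
 *     Point old = init_Point(1, 2);
 *
 * is rejected by the preprocessor, because a function-like macro must be
 * invoked with as many arguments as it has parameters.
 */
```

The equivalent positional initializer { 1, 2 } would still compile against the 3D struct and silently leave z as zero, which is the failure mode the macro is meant to prevent.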

Of course, we could use this with designated initializers:

#define init_Point(xi, yi) { .x = xi, .y = yi }

They are not providing much value here, because we only have
one initializer: the one in the macro. That initializer appears
close to the struct definition in the same file, and the two are
easily maintained together.

If the structure has some integrity checks, they will catch
most bad initialization mistakes:

struct big_complex_thing {
    unsigned head_magic;
    /* ... */

    unsigned tail_magic;
};

#define big_complex_thing_init(foo, bar, ...) \
    { BIG_COMPLEX_HEAD_MAGIC, foo, bar, ..., BIG_COMPLEX_TAIL_MAGIC }

If a big_complex_thing object does not have the correct tail magic,
something is wrong between the initializer macro and struct
declaration. Or else someone is not using the macro, like they should.
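One way such a check might look, as a sketch only: the magic values, the two int members standing in for the elided ones, and the function big_complex_thing_ok are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical magic values; the post does not give concrete ones. */
#define BIG_COMPLEX_HEAD_MAGIC 0xC0DEu
#define BIG_COMPLEX_TAIL_MAGIC 0xFACEu

struct big_complex_thing {
    unsigned head_magic;
    int foo;               /* stand-ins for the elided members */
    int bar;
    unsigned tail_magic;
};

/* Positional initializer: values must follow the declaration order. */
#define big_complex_thing_init(foo, bar) \
    { BIG_COMPLEX_HEAD_MAGIC, foo, bar, BIG_COMPLEX_TAIL_MAGIC }

/* Returns true when both sentinels hold the expected values. */
bool big_complex_thing_ok(const struct big_complex_thing *t)
{
    return t->head_magic == BIG_COMPLEX_HEAD_MAGIC
        && t->tail_magic == BIG_COMPLEX_TAIL_MAGIC;
}
```

If a member is added to the struct but not to the macro, the tail magic lands in the wrong slot and the check fails. The same happens when someone bypasses the macro and writes a shorter brace list by hand, since the tail sentinel is then left as zero.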

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

On 2023-08-24, Kaz Kylheku <864-117-4973@kylheku.com> wrote:
> On 2023-08-24, Bart <bc@freeuk.com> wrote:
[ large amount of quoted material ]

Here I go again. Sorry about the large amount of quoted material.

Re: C vs Haskell for XML parsing

<uc9m7v$3u797$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29561&group=comp.lang.c#29561

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: C vs Haskell for XML parsing
Date: Fri, 25 Aug 2023 09:46:06 +0200
Organization: A noiseless patient Spider
Lines: 182
Message-ID: <uc9m7v$3u797$1@dont-email.me>
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com>
<87jztpu2iu.fsf@bsb.me.uk>
<610a41a0-a3a3-4e01-a9a7-8b5e1fe31ec0n@googlegroups.com>
<87350dtive.fsf@bsb.me.uk> <ubvan6$1rb3s$1@dont-email.me>
<3c87ec37-8fe1-4171-9500-609fad6701b7n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me>
<e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com>
<uc28id$2dc7f$1@dont-email.me>
<b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com>
<uc2dbv$2e4tg$1@dont-email.me>
<d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me>
<d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com>
<uc4e4t$2rdlt$1@dont-email.me>
<1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>
<uc5dd0$30jrk$1@dont-email.me>
<81879984-43e7-409a-a029-1ca6677f536dn@googlegroups.com>
<uc5mlk$32gl3$1@dont-email.me>
<8eec8404-4928-4bc3-8b00-c673ea22ab60n@googlegroups.com>
<uc7l4o$3fp72$1@dont-email.me>
<639e8e6f-2729-476b-9a6e-0b3eb066b06an@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 25 Aug 2023 07:46:07 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="c244ea4cca8f9b4180792b6a4896d4d1";
logging-data="4136231"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/RwMHAWZjGND226hScUFLXFR+Ka0YwwzM="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.9.0
Cancel-Lock: sha1:bBbwuwKA7iq9qsUdAbhELh5/wyQ=
Content-Language: en-GB
In-Reply-To: <639e8e6f-2729-476b-9a6e-0b3eb066b06an@googlegroups.com>

by: David Brown - Fri, 25 Aug 2023 07:46 UTC

On 24/08/2023 16:50, Malcolm McLean wrote:
> On Thursday, 24 August 2023 at 14:15:21 UTC+1, David Brown wrote:
>> On 24/08/2023 05:53, Malcolm McLean wrote:
>>> On Wednesday, 23 August 2023 at 20:29:07 UTC+1, David Brown wrote:
>>>>

> We're not in millimeters probably, because the co-ordinates are integer
<snip>

The discussion was not about trying to guess what these parameters might
have been - it was to demonstrate that you don't know what parameters
like that are just by looking at the call. You might be able to make
some guesses or inferences, but in general you need to know more about
the function declaration, specification, API, documentation, etc. And
that applies to all kinds of parameter types - bool types are in no way
special. You have agreed with me here, so let's leave that and move on.

>
>>
>> Such issues are well-known, as are solutions - strong typing (possible
>> even in C, though the language doesn't have great support for it - a
>> struct parameter with compound literals and designated literals one
>> way), named parameters, chained function calls, etc. And of course,
>> clear and consistent API's with good declarations and documentation are key.
>>
> Languages that allow named parameters are nice, and if I was extending C
> I would introduce that. The problem is that there are a few basic mathematical
> functions like tan() which are conventionally written tan(value), and that was
> then used as a model for what other functions would be like. However only a few
> user-written functions calculate similar basic mathematical functions.
> Generally it makes sense to write payroll(employees=emp, Nemployees=N,
> taxcode=1234), not payroll(emp, N, 1234);

There are good historical reasons for why named parameters are not
common in older languages, but are more common in newer ones. And there
are good technical reasons why they would be difficult to introduce to C
as an afterthought. But I agree with you that they can be helpful and
improve code readability, in languages that support them.

>
>>> Standard library functions are the most commonly called functions in C code,
>>> and are always named in my style.
>> No, they are not. Names with all lower case and no word separation are
>> common in the C standard library for short identifiers (like "strlen"),
>> while underscores are common for longer identifiers (like
>> "memory_order_relaxed").
>>
>> It would be nice if you were to check this kind of thing before making
>> pointlessly incorrect claims.
>>
> Ok. So I checked. Every single function in the standard library is written in
> my style, with the exception of a handful which take an _r suffix (e.g. asctime_r).
> I think these are recent additions.

I am looking at Annex B of the C standards. There are lots of short
identifiers without word breaks, as I said - such as "isdigit". This is
especially true of older function names, which come from a background of
the days of assemblers and linkers with very limited characters and
identifier lengths. This is part of the reason for some abbreviations
being shorter than you (or I) might have liked. (We should be grateful
that they use small letters and not all-caps!) There are maybe a dozen
at most that have more than 10 characters, such as "isgreaterequal".

Once you get to longer names, they are split with underscore - it is
"atomic_flag_clear_explicit", not "atomicflagclearexplicit", which would
be illegible. The type identifiers have underscores, the macros for
limits, atomic memory access types, file IO constants - all have
underscores. They are all over the place!

No, you have /not/ checked - unless you think I was talking about K&R C
and restricting identifiers to function names only.

And the clear pattern of underscores being more common in newer
identifiers should be a clue to you - just as the world moved on and
embraced word divisions in Latin prose, so the programming world moved
on and embraced word divisions in identifier names. We no longer
restrict our filenames to 8.3 patterns - we no longer write highly
condensed identifiers without word divisions.

> It's always isdigit(). Never is_digit() or isDigit().
> The difference between my style and theirs is that the standard library abbreviates
> more aggressively than I would. e.g. strrchr whereas I would name the same
> function reversestringsearch. That's because strrchr rapidly becomes familiar
> to anyone using C, whilst a function written by me would be unlikely to be used
> very widely except in the program it was written for.
>
>> Short names are useful for things that are used often. Longer and more
>> descriptive names are better for things that are used more rarely.
>> Either way, readability is the key - for common identifiers, brevity
>> improves reading speed, while for less common identifiers it is helpful
>> to be clearer and more explicit. For longer identifiers, underscores
>> are vital to readability (camel-case is fine in medium cases, but can be
>> a poor choice when the identifier consists of several words).
>>
> There's an element of truth in that.

More than an element.

> But the basic reality is that some identifiers
> naturally have short words associated with them (determinant, diagonalize,
> identity, etc), whilst for others there is strong convention (i, x, y, N, theta,
> ptr). For others, there is no single short English word which describes them
> and wouldn't be misleading. So it's "multiply matrix with vector", turned somehow
> into a valid identifier.

That's all true enough, and in no way contradicts what I said.

If "det" is clear in the context (such as a local variable in a function
that works with matrices), then use "det". If you need "multiply matrix
with vector" to be clear (perhaps it's a rarely-used global function),
call it "multiply_matrix_with_vector".

The only thing /wrong/ - and pretty much everyone thinks it is wrong -
would be to call it "multiplymatrixwithvector".

>>
>> Consistency is a benefit, but it is not a priority - it does not trump
>> readability.
>>
> You can become obsessed with consistency, I agree. But sometimes you
> might reject a marginally better name because it's not consistent with
> the system you've developed for the rest of the program.

Think of it as an optimisation problem - different characteristics are
given different weights, and your job as programmer is to pick the
identifier that gives the best total outcome. Consistency has weight 1,
readability has weight 10.

>>> But the spaces are not part of the Torah. They are regarded as man-made, not
>>> God-given. So there must have been an early tradition of text without spaces.
>> I think you have no basis for that conclusion - even if we assume, for
>> the sake of argument, that the letters were "God-given". The paper (or
>> other material) and ink are all man-made, but found in every scroll of
>> the Torah.
>>
>> As I understand it, Kabbalists do not view spaces or punctuation as
>> significant when trying to find hidden patterns in the letters of the
>> Torah. But Hebrew texts were written with spaces - there is
>> (apparently) no good evidence of Hebrew being written without spaces.
>>
>> However, my knowledge of these things is all indirect - I don't know any
>> Hebrew, I only know a little about the language and its script.
>>
> Yes. What I am saying is that for the idea that the letters were God-given
> whilst the spaces were man-made to take root, the society must have been
> familiar with text written continuously.
>

No - again, you have no basis for that conclusion.

And you could equally logically have concluded that society must have
been familiar with text written without letters, only with spaces. When
the same logic can be used to "prove" something that silly, you should
be suspicious of your arguments.

I think part of your confusion comes from your idea that the Torah is a
written work, and that written works consist of letters and spaces.
This is wrong on both counts.

First, the Torah is a spoken work. It was spoken long before it was
written down. Rabbis memorize it - they read the scrolls as a memory
aid. The letters of the Torah come from the spoken word, and spoken
word does not have space characters. Kabbalah comes from the oral
tradition, not the written version of the Torah (though it moved and
adapted to the written work).

Secondly, the writing consists of putting the letters down on the page.
Spacing is like the manuscript material and ink, or the script - it is
not the text, but it is other things needed to make the text legible.
Indeed, spacing is "lower" than the paper and ink, because it consists
of nothing at all.

Re: C vs Haskell for XML parsing

<uc9n1r$3ubvm$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29562&group=comp.lang.c#29562

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: C vs Haskell for XML parsing
Date: Fri, 25 Aug 2023 09:59:54 +0200
Organization: A noiseless patient Spider
Lines: 145
Message-ID: <uc9n1r$3ubvm$1@dont-email.me>
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com>
<610a41a0-a3a3-4e01-a9a7-8b5e1fe31ec0n@googlegroups.com>
<87350dtive.fsf@bsb.me.uk> <ubvan6$1rb3s$1@dont-email.me>
<3c87ec37-8fe1-4171-9500-609fad6701b7n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me>
<e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com>
<uc28id$2dc7f$1@dont-email.me>
<b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com>
<uc2dbv$2e4tg$1@dont-email.me>
<d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me>
<d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com>
<uc4e4t$2rdlt$1@dont-email.me>
<1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>
<uc5dd0$30jrk$1@dont-email.me>
<81879984-43e7-409a-a029-1ca6677f536dn@googlegroups.com>
<uc5mlk$32gl3$1@dont-email.me>
<8eec8404-4928-4bc3-8b00-c673ea22ab60n@googlegroups.com>
<uc7l4o$3fp72$1@dont-email.me>
<639e8e6f-2729-476b-9a6e-0b3eb066b06an@googlegroups.com>
<uc7u52$3h9f6$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 25 Aug 2023 07:59:55 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="c244ea4cca8f9b4180792b6a4896d4d1";
logging-data="4141046"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+/gYrxZTNQVFBacU46cZgr6+Po8eA7j/Q="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.9.0
Cancel-Lock: sha1:JMU7gNa98B889pkOxnlbFxiB2XA=
Content-Language: en-GB
In-Reply-To: <uc7u52$3h9f6$1@dont-email.me>

by: David Brown - Fri, 25 Aug 2023 07:59 UTC

On 24/08/2023 17:48, Bart wrote:
> On 24/08/2023 15:50, Malcolm McLean wrote:
>> On Thursday, 24 August 2023 at 14:15:21 UTC+1, David Brown wrote:
>>> On 24/08/2023 05:53, Malcolm McLean wrote:
>>>> On Wednesday, 23 August 2023 at 20:29:07 UTC+1, David Brown wrote:
>>>>>
>>>>>> No they are not. drawPath(mypath, false) conveys no information.
>>>>> Nor does drawRectangle(100, 200, 300, 400).
>>>>>
>>>> The function obviously draws a rectangle. And those values are almost
>>>> certainly pixel values. If they were RGB values they would go from 0-255,
>>>> and a normal raster designed for human viewing will have pixel co-ordinates
>>>> in that range.
>>> Is it (x1, y1, x2, y2) ? (x, y, w, h) ? (row, column, height, width) ?
>>> Is the y coordinate from the top down, or bottom up? Are we in
>>> pixels, millimetres, points, or something else? Are we relative to a
>>> window, or a screen, or an abstract desktop? Are we doing something
>>> very different, such as using indices into arrays of 3-D points in a
>>> scene definition?
>>>
>> We're not in millimeters probably, because the co-ordinates are integer
>> values. (I know they could be real but just written as integers, and of course
>> it's not physically impossible to build a device that draws on discrete millimetre
>> values. But we're probably not dealing with millimeters.) It could be points
>> rather than pixels, fair enough (on our system, "pixel" is a bit of a fuzzy concept
>> because the document can have several physical representations, and so
>> we use points by default. But again, the points are reals.)
>>
>> Whilst we know that the function draws a rectangle, we don't know the details
>> of the graphics system. But we do know that it caches colour at least, because
>> the function takes four parameters to describe a rectangle.
>>>
>>> The point is, without context and more information (such as the
>>> declaration of the function with parameter names, or a knowledge of its
>>> use), you don't know enough to interpret the meaning of the parameters.
>>> You can guess, but you could be wrong. This is exactly the same whether
>>> it is a series of "int" parameters or some "bool" parameters, or other
>>> generic types.
>>>
>> Yes, you can be wrong. But you've a pretty good idea what the call probably
>> does, just from one line, and from an example chosen to illustrate the
>> opposite point. And of course whilst I've spoken in terms of one line, you
>> don't actually have only one line of context. If the graphics system was 3D
>> and used indices into external arrays, you would know, for example.
>>
>>>
>>> Such issues are well-known, as are solutions - strong typing (possible
>>> even in C, though the language doesn't have great support for it - a
>>> struct parameter with compound literals and designated literals one
>>> way), named parameters, chained function calls, etc. And of course,
>>> clear and consistent API's with good declarations and documentation are key.
>>>
>> Languages that allow named parameters are nice, and if I was extending C
>> I would introduce that.
>
> That would have been far more useful than designated initialisers, which
> is the same class of feature.
>
> With the latter, I've seen actual examples like this:
>
>     typedef struct {int x, y;} Point;
>
>     Point p = {.x = 100, .y = 200};
>     Point q = {.y = 200, .x = 100};
>

Re-ordering like that is not often done in real code, but it is
certainly legal in C - and I agree it leads to unclear code.

> Without the feature, you would have to write:
>
> Point p = {100, 200};
> Point q = {100, 200};
>
> Which for my example is clearer.

Agreed, in this case. But arguably it is even better to write :

Point p = {.x = 100, .y = 200};
Point q = {.x = 100, .y = 200};

Using designated initialisers, but without jumbling the order, gives the
clearest result. (Of course in the case of a Point, the x, y ordering
is so ingrained that the identifiers are not really helpful.)
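For anyone who wants to confirm that the three spellings really do produce the same object, here is a minimal sketch (the helper names same_point and designator_order_is_irrelevant are made up for this illustration):

```c
#include <stdbool.h>

typedef struct { int x, y; } Point;

/* Hypothetical helper: member-wise comparison of two points. */
bool same_point(Point a, Point b)
{
    return a.x == b.x && a.y == b.y;
}

/* Builds the same point three ways and checks that they all agree. */
bool designator_order_is_irrelevant(void)
{
    Point p = { .x = 100, .y = 200 };   /* designated, declaration order */
    Point q = { .y = 200, .x = 100 };   /* designated, jumbled order     */
    Point r = { 100, 200 };             /* positional                    */
    return same_point(p, q) && same_point(q, r);
}
```

So the choice between them is purely about how the initializer reads, not about what it does.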

>
> But I suppose people can go over the top with keyword parameters too,
> and write strlen(.s = S).
>
> I found keyword parameters were more useful when there are lots of
> arguments, many optional. It needs to go hand-in-hand with default values.

Agreed.

One of the objections people have to trying to add named parameters to C
or C++ is they feel it will encourage people to make functions with too
many parameters, which is generally a bad idea anyway. I think that
objection is silly - sometimes people write functions with lots of
parameters, for good reasons or bad reasons, and it would be nice to
make such cases easier to read and easier to get correct (and harder to
get wrong).

In C, you can write something like this :

int foo(int a, int b, int c);

int test_foo(int a, int b, int c) {
    return foo(a, b, c);
}

typedef struct { int a; int b; int c; } bar_params;
int bar(bar_params p);

int test_bar(int a, int b, int c) {
    return bar((bar_params) { .a = a, .b = b, .c = c });
}

int test_bar2(int a, int c) {
    return bar((bar_params) { .c = c, .a = a });
}

This gives you something akin to named parameters (and any parameters
you omit will default to 0). But it's ugly to write, and can be
significantly less efficient for the function call. (Usually if you've
got a function with lots of parameters, the function itself will be big,
so the call overhead is unlikely to matter in practice.)
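To make the "defaults to 0" behaviour concrete, here is a self-contained sketch. The body given to bar() is invented purely so the result can be observed, and the bar_named wrapper macro is an optional extra that is not part of the code above.

```c
typedef struct { int a; int b; int c; } bar_params;

/* Hypothetical body: encodes the three fields so each one is visible
   in the result. */
int bar(bar_params p)
{
    return 100 * p.a + 10 * p.b + p.c;
}

/* Optional sugar (an assumption, not from the post): hides the compound
   literal so a call reads bar_named(.a = 1, .c = 3). */
#define bar_named(...) bar((bar_params){ __VA_ARGS__ })

int test_bar2(int a, int c)
{
    /* b is not mentioned, so it is initialised to 0. */
    return bar_named(.c = c, .a = a);
}
```

With this body, test_bar2(1, 3) evaluates to 103, showing that the omitted b contributed nothing. Note that 0 is the only default this technique offers; any other default would have to be handled inside bar() itself.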

Re: C vs Haskell for XML parsing

<d3c6df71-7ef5-4fd3-83be-8b9a4315f0c4n@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29565&group=comp.lang.c#29565

X-Received: by 2002:a05:6214:1911:b0:64f:6971:fda7 with SMTP id er17-20020a056214191100b0064f6971fda7mr243219qvb.7.1692952658937;
Fri, 25 Aug 2023 01:37:38 -0700 (PDT)
X-Received: by 2002:ad4:4aed:0:b0:63c:f28e:3472 with SMTP id
cp13-20020ad44aed000000b0063cf28e3472mr400798qvb.10.1692952658730; Fri, 25
Aug 2023 01:37:38 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Fri, 25 Aug 2023 01:37:38 -0700 (PDT)
In-Reply-To: <uc9m7v$3u797$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2a00:23a8:400a:5601:1c5f:c8af:ef99:3315;
posting-account=Dz2zqgkAAADlK5MFu78bw3ab-BRFV4Qn
NNTP-Posting-Host: 2a00:23a8:400a:5601:1c5f:c8af:ef99:3315
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com>
<87jztpu2iu.fsf@bsb.me.uk> <610a41a0-a3a3-4e01-a9a7-8b5e1fe31ec0n@googlegroups.com>
<87350dtive.fsf@bsb.me.uk> <ubvan6$1rb3s$1@dont-email.me> <3c87ec37-8fe1-4171-9500-609fad6701b7n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me> <e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com>
<uc28id$2dc7f$1@dont-email.me> <b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com>
<uc2dbv$2e4tg$1@dont-email.me> <d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me> <d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com>
<uc4e4t$2rdlt$1@dont-email.me> <1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>
<uc5dd0$30jrk$1@dont-email.me> <81879984-43e7-409a-a029-1ca6677f536dn@googlegroups.com>
<uc5mlk$32gl3$1@dont-email.me> <8eec8404-4928-4bc3-8b00-c673ea22ab60n@googlegroups.com>
<uc7l4o$3fp72$1@dont-email.me> <639e8e6f-2729-476b-9a6e-0b3eb066b06an@googlegroups.com>
<uc9m7v$3u797$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <d3c6df71-7ef5-4fd3-83be-8b9a4315f0c4n@googlegroups.com>
Subject: Re: C vs Haskell for XML parsing
From: malcolm.arthur.mclean@gmail.com (Malcolm McLean)
Injection-Date: Fri, 25 Aug 2023 08:37:38 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 11272

by: Malcolm McLean - Fri, 25 Aug 2023 08:37 UTC

On Friday, 25 August 2023 at 08:46:22 UTC+1, David Brown wrote:
> On 24/08/2023 16:50, Malcolm McLean wrote:
> > On Thursday, 24 August 2023 at 14:15:21 UTC+1, David Brown wrote:
> >> On 24/08/2023 05:53, Malcolm McLean wrote:
> >>> On Wednesday, 23 August 2023 at 20:29:07 UTC+1, David Brown wrote:
> >>>>
>
>
> > We're not in millimeters probably, because the co-ordinates are integer
> <snip>
>
> The discussion was not about trying to guess what these parameters might
> have been - it was to demonstrate that you don't know what parameters
> like that are just by looking at the call. You might be able to make
> some guesses or inferences, but in general you need to know more about
> the function declaration, specification, API, documentation, etc. And
> that applies to all kinds of parameter types - bool types are in no way
> special. You have agreed with me here, so let's leave that and move on.
>
Ah but you do. As long as by "know" we mean "know beyond reasonable
doubt". Of course someone could write a function called "drawRectangle"
to draw, i.e. remove, a rectangle from a scene. But the chance of that is so
low that it can be dismissed.
In this case, there were two possibilities, either an x, y, width, height system or
a top left, bottom right system. And the co-ordinates were almost certainly
pixel values because they were integers, and because of their magnitude.
However there was no colour, so the system obviously has state. And therefore
the co-ordinates might be transformed.
You can tell all that, just by looking at one line, with no other context. Because
the parameters are scalars, therefore they either have names or they have
values, and you can draw reasonable conclusions from that.
>
> There are good historical reasons for why named parameters are not
> common in older languages, but are more common in newer ones. And there
> are good technical reasons why they would be difficult to introduce to C
> as an afterthought. But I agree with you that they can be helpful and
> improve code readability, in languages that support them.
>
I don't really see what the technical problem is. The header would have to
contain the names of the parameters (most do already). Then instead
of putting parameter 1 into register a and parameter 2 into register b,
the compiler matches them up. If it's void foo(int x, int y) and the
call is foo(y=1, x=2), then the compiler puts a 2 into register a and a 1
into register b.
I haven't tried to modify a compiler to do this so there might be something I
haven't thought of. But it would be a fairly simple, contained change with no
implications for the linker.
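One thing that may be worth checking before settling on that exact call syntax: foo(y=1, x=2) is already valid C today whenever variables named x and y are in scope, because each argument is an ordinary assignment expression. The sketch below assumes a variant of foo that returns a value (the original is void) so that the effect can be observed; call_with_assignments is a made-up name.

```c
/* Hypothetical variant of the post's foo(int x, int y), returning a value
   that reveals which argument arrived in which parameter. */
int foo(int x, int y)
{
    return 10 * x + y;
}

int call_with_assignments(void)
{
    /* Unrelated locals that merely happen to share the parameter names. */
    int x = 0, y = 0;

    /* In current C this assigns y = 1 and x = 2, then passes the values
       1 and 2 positionally: foo receives x == 1, y == 2. */
    return foo(y = 1, x = 2);
}
```

Under current rules the call yields 12. Under the name-matching rule described above it would yield 21. A designator-style spelling such as foo(.y = 1, .x = 2) would not collide with existing code in this way, though that is only one possible choice.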
> >>> Standard library functions are the most commonly called functions in C code,
> >>> and are always named in my style.
> >> No, they are not. Names with all lower case and no word separation are
> >> common in the C standard library for short identifiers (like "strlen"),
> >> while underscores are common for longer identifiers (like
> >> "memory_order_relaxed").
> >>
> >> It would be nice if you were to check this kind of thing before making
> >> pointlessly incorrect claims.
> >>
> > Ok. So I checked. Every single function in the standard library is written in
> > my style, with the exception of a handful which take an _r suffix (e.g. asctime_r).
> > I think these are recent additions.
> I am looking at Annex B of the C standards. There are lots of short
> identifiers without word breaks, as I said - such as "isdigit". This is
> especially true of older function names, which come from a background of
> the days of assemblers and linkers with very limited characters and
> identifier lengths. This is part of the reason for some abbreviations
> being shorter than you (or I) might have liked. (We should be grateful
> that they use small letters and not all-caps!) There are maybe a dozen
> at most that have more than 10 characters, such as "isgreaterequal".
>
> Once you get to longer names, they are split with underscore - it is
> "atomic_flag_clear_explicit", not "atomicflagclearexplicit", which would
> be illegible. The type identifiers have underscores, the macros for
> limits, atomic memory access types, file IO constants - all have
> underscores. They are all over the place!
>
> No, you have /not/ checked - unless you think I was talking about K&R C
> and restricting identifiers to function names only.
>
I said every function. You were talking about identifiers. Of course size_t has
an underscore. Everyone knows that. And most people agree that it is horrible.
And it wasn't part of C as originally designed.
>
> And the clear pattern of underscores being more common in newer
> identifiers should be a clue to you - just as the world moved on and
> embraced word divisions in Latin prose, so the programming world moved
> on and embraced word divisions in identifier names. We no longer
> restrict our filenames to 8.3 patterns - we no longer write highly
> condensed identifiers without word divisions.
>
Ritchie was a genius who knew how to design a programming language.
The people who came after him were lesser men.
>
> > There's an element of truth in that.
> More than an element.
You choose names which describe the thing you are naming. Primarily.
And that description might be a short English word, a long English word, or
a short English phrase. That bears no relation to whether the identifier is used
a lot or used relatively infrequently.
But the element of truth is that, if an identifier is used frequently, you can get
away with a short English word which, outside of context, would be ambiguous.
So there is an element of truth in what you say.

>
> The only thing /wrong/ - and pretty much everyone thinks it is wrong -
> would be to call it "multiplymatrixwithvector".
>
Well Caesar disagreed.
Dennis Ritchie disagreed.
Unfortunately I can't find it, but some research on human-readable URLs disagreed.

Whilst all you can offer is "I disagree". But you automatically disagree with things
some people say. So how much weight should we give to that?
>
> Think of it as an optimisation problem - different characteristics are
> given different weights, and your job as programmer is to pick the
> identifier that gives the best total outcome. Consistency has weight 1,
> readability has weight 10.
>
Consistency aids readability. Readability of a single identifier aids readability of
that identifier, but may damage the readability of the text as a whole, largely
because it breaks consistency.
But consistency isn't an absolute, I agree there.
>
> > Yes. What I am saying is that for the idea that the letters were God-given
> > whilst the spaces were man-made to take root, the society must have been
> > familiar with text written continuously.
> >
> No - again, you have no basis for that conclusion.
>
> And you could equally logically have concluded that society must have
> been familiar with text written without letters, only with spaces. When
> the same logic can be used to "prove" something that silly, you should
> be suspicious of your arguments.
>
Are you advancing an objection that weak?
>
> I think part of your confusion comes from your idea that the Torah is a
> written work, and that written works consist of letters and spaces.
> This is wrong on both accounts.
>
> First, the Torah is a spoken work. It was spoken long before it was
> written down. Rabbis memorize it - they read the scrolls as a memory
> aid. The letters of the Torah come from the spoken word, and spoken
> word does not have space characters. Kabbalah comes from the oral
> tradition, not the written version of the Torah (though it moved and
> adapted to the written work).
>
What was important to the Jews was that it was written. God's word, written
down. Of course modern scholars are sceptical of that. But Jews themselves
didn't believe that Torah had come about by a process of redaction by scribes
working from oral traditions in Babylon.
>
> Secondly, the writing consists of putting the letters down on the page.
> Spacing is like the manuscript material and ink, or the script - it is
> not the text, but it is other things needed to make the text legible.
> Indeed, spacing is "lower" than the paper and ink, because it consists
> of nothing at all.
>
> You do not need to have read texts with no nothingness in order to think
> that added nothingness is not of mystical importance. There are not
> aspects of the Torah scrolls divided into "God-given" and "man-made" -
> that is a false dichotomy. There is the "God-given" words and letters,
> and who cares about any of the rest of it?
>
The letters are given by God, the spaces added by man. So yes, there is a
distinction. It doesn't matter much, but it does matter. As you yourself
pointed out, when searching for "Torah codes" the spaces are excluded.
Since they were not given by God, a similar search which included spaces
would be invalid.


Re: C vs Haskell for XML parsing

<j=2ifExxFfXU05g0F@bongo-ra.co>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29566&group=comp.lang.c#29566

Path: i2pn2.org!i2pn.org!paganini.bofh.team!not-for-mail
From: spibou@gmail.com (Spiros Bousbouras)
Newsgroups: comp.lang.c
Subject: Re: C vs Haskell for XML parsing
Date: Fri, 25 Aug 2023 08:50:12 -0000 (UTC)
Organization: To protect and to server
Message-ID: <j=2ifExxFfXU05g0F@bongo-ra.co>
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com> <87jztpu2iu.fsf@bsb.me.uk> <610a41a0-a3a3-4e01-a9a7-8b5e1fe31ec0n@googlegroups.com>
<87350dtive.fsf@bsb.me.uk> <ubvan6$1rb3s$1@dont-email.me> <3c87ec37-8fe1-4171-9500-609fad6701b7n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me> <e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com> <uc28id$2dc7f$1@dont-email.me>
<b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com> <uc2dbv$2e4tg$1@dont-email.me> <d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me> <d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com> <uc4e4t$2rdlt$1@dont-email.me>
<1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com> <uc5dd0$30jrk$1@dont-email.me> <81879984-43e7-409a-a029-1ca6677f536dn@googlegroups.com>
<uc5mlk$32gl3$1@dont-email.me> <8eec8404-4928-4bc3-8b00-c673ea22ab60n@googlegroups.com> <uc7l4o$3fp72$1@dont-email.me>
<639e8e6f-2729-476b-9a6e-0b3eb066b06an@googlegroups.com> <uc9m7v$3u797$1@dont-email.me> <d3c6df71-7ef5-4fd3-83be-8b9a4315f0c4n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 25 Aug 2023 08:50:12 -0000 (UTC)
Injection-Info: paganini.bofh.team; logging-data="1405095"; posting-host="9H7U5kayiTdk7VIdYU44Rw.user.paganini.bofh.team"; mail-complaints-to="usenet@bofh.team"; posting-account="9dIQLXBM7WM9KzA+yjdR4A";
Cancel-Lock: sha256:qgxroRxJNVFkNaR4xCObPWfNSzzaEXP6iEwIb/B2dC8=
X-Organisation: Weyland-Yutani
X-Server-Commands: nowebcancel
X-Notice: Filtered by postfilter v. 0.9.3

by: Spiros Bousbouras - Fri, 25 Aug 2023 08:50 UTC

On Fri, 25 Aug 2023 01:37:38 -0700 (PDT)
Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:
> Of course size_t has an underscore. Everyone knows that. And most people
> agree that it is horrible.

"Most people" meaning you made it up.

Re: C vs Haskell for XML parsing

<ae3d2d4e-71bd-4c6c-99aa-cf8b6b652175n@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29567&group=comp.lang.c#29567

X-Received: by 2002:ad4:55d0:0:b0:649:9ae9:2924 with SMTP id bt16-20020ad455d0000000b006499ae92924mr371773qvb.11.1692953637993;
Fri, 25 Aug 2023 01:53:57 -0700 (PDT)
X-Received: by 2002:a05:6214:b29:b0:649:e869:ec6f with SMTP id
w9-20020a0562140b2900b00649e869ec6fmr335004qvj.9.1692953637801; Fri, 25 Aug
2023 01:53:57 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Fri, 25 Aug 2023 01:53:57 -0700 (PDT)
In-Reply-To: <j=2ifExxFfXU05g0F@bongo-ra.co>
Injection-Info: google-groups.googlegroups.com; posting-host=2a00:23a8:400a:5601:a5f9:166e:840e:d1f9;
posting-account=Dz2zqgkAAADlK5MFu78bw3ab-BRFV4Qn
NNTP-Posting-Host: 2a00:23a8:400a:5601:a5f9:166e:840e:d1f9
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com>
<87jztpu2iu.fsf@bsb.me.uk> <610a41a0-a3a3-4e01-a9a7-8b5e1fe31ec0n@googlegroups.com>
<87350dtive.fsf@bsb.me.uk> <ubvan6$1rb3s$1@dont-email.me> <3c87ec37-8fe1-4171-9500-609fad6701b7n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me> <e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com>
<uc28id$2dc7f$1@dont-email.me> <b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com>
<uc2dbv$2e4tg$1@dont-email.me> <d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me> <d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com>
<uc4e4t$2rdlt$1@dont-email.me> <1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>
<uc5dd0$30jrk$1@dont-email.me> <81879984-43e7-409a-a029-1ca6677f536dn@googlegroups.com>
<uc5mlk$32gl3$1@dont-email.me> <8eec8404-4928-4bc3-8b00-c673ea22ab60n@googlegroups.com>
<uc7l4o$3fp72$1@dont-email.me> <639e8e6f-2729-476b-9a6e-0b3eb066b06an@googlegroups.com>
<uc9m7v$3u797$1@dont-email.me> <d3c6df71-7ef5-4fd3-83be-8b9a4315f0c4n@googlegroups.com>
<j=2ifExxFfXU05g0F@bongo-ra.co>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ae3d2d4e-71bd-4c6c-99aa-cf8b6b652175n@googlegroups.com>
Subject: Re: C vs Haskell for XML parsing
From: malcolm.arthur.mclean@gmail.com (Malcolm McLean)
Injection-Date: Fri, 25 Aug 2023 08:53:57 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 2779

by: Malcolm McLean - Fri, 25 Aug 2023 08:53 UTC

On Friday, 25 August 2023 at 09:50:32 UTC+1, Spiros Bousbouras wrote:
> On Fri, 25 Aug 2023 01:37:38 -0700 (PDT)
> Malcolm McLean <malcolm.ar...@gmail.com> wrote:
> > Of course size_t has an underscore. Everyone knows that. And most people
> > agree that it is horrible.
> "Most people" meaning you made it up.
>
We could do a straw poll.
How many people like the underscore in size_t and how many think it is horrible?
There are some people who think that the committee can do no wrong, so it
won't be unanimous. But I think I know where the majority will be.

Underscores in type names (was : C vs Haskell for XML parsing)

<BltvTW49qZtb8Sjsd@bongo-ra.co>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29568&group=comp.lang.c#29568

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: spibou@gmail.com (Spiros Bousbouras)
Newsgroups: comp.lang.c
Subject: Underscores in type names (was : C vs Haskell for XML parsing)
Date: Fri, 25 Aug 2023 09:17:02 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 22
Message-ID: <BltvTW49qZtb8Sjsd@bongo-ra.co>
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com> <87jztpu2iu.fsf@bsb.me.uk> <610a41a0-a3a3-4e01-a9a7-8b5e1fe31ec0n@googlegroups.com>
<87350dtive.fsf@bsb.me.uk> <ubvan6$1rb3s$1@dont-email.me> <3c87ec37-8fe1-4171-9500-609fad6701b7n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me> <e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com> <uc28id$2dc7f$1@dont-email.me>
<b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com> <uc2dbv$2e4tg$1@dont-email.me> <d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me> <d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com> <uc4e4t$2rdlt$1@dont-email.me>
<1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com> <uc5dd0$30jrk$1@dont-email.me> <81879984-43e7-409a-a029-1ca6677f536dn@googlegroups.com>
<uc5mlk$32gl3$1@dont-email.me> <8eec8404-4928-4bc3-8b00-c673ea22ab60n@googlegroups.com> <uc7l4o$3fp72$1@dont-email.me>
<639e8e6f-2729-476b-9a6e-0b3eb066b06an@googlegroups.com> <uc9m7v$3u797$1@dont-email.me> <d3c6df71-7ef5-4fd3-83be-8b9a4315f0c4n@googlegroups.com>
<j=2ifExxFfXU05g0F@bongo-ra.co> <ae3d2d4e-71bd-4c6c-99aa-cf8b6b652175n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 25 Aug 2023 09:17:02 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="8ebd16a256b2490518de4f551bf0edd8";
logging-data="4170387"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+j8IJBNpUtEgh4QxwYJnCw"
Cancel-Lock: sha1:5WPpvtipWPlhj4QMbA4vFyOyjbg=
X-Server-Commands: nowebcancel
X-Organisation: Weyland-Yutani
In-Reply-To: <ae3d2d4e-71bd-4c6c-99aa-cf8b6b652175n@googlegroups.com>

by: Spiros Bousbouras - Fri, 25 Aug 2023 09:17 UTC

On Fri, 25 Aug 2023 01:53:57 -0700 (PDT)
Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:
> On Friday, 25 August 2023 at 09:50:32 UTC+1, Spiros Bousbouras wrote:
> > On Fri, 25 Aug 2023 01:37:38 -0700 (PDT)
> > Malcolm McLean <malcolm.ar...@gmail.com> wrote:
> > > Of course size_t has an underscore. Everyone knows that. And most people
> > > agree that it is horrible.
> > "Most people" meaning you made it up.
> >
> We could do a straw poll.
> How many people like the underscore in size_t and how many think it is horrible?

I like it. sizetype would also have been reasonable but not sizet which
would be mystifying. Note that the POSIX standard also uses the same template
for naming types like off_t or pid_t .

> There are some people who think that the committee can do no wrong,

Can you name one who thinks so ?

> so it
> won't be unanimous. But I think I know where the majority will be.

Re: Underscores in type names (was : C vs Haskell for XML parsing)

<uca05k$328$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29572&group=comp.lang.c#29572

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: richard.nospam@gmail.com (Richard Harnden)
Newsgroups: comp.lang.c
Subject: Re: Underscores in type names (was : C vs Haskell for XML parsing)
Date: Fri, 25 Aug 2023 11:35:29 +0100
Organization: A noiseless patient Spider
Lines: 31
Message-ID: <uca05k$328$1@dont-email.me>
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me>
<e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com>
<uc28id$2dc7f$1@dont-email.me>
<b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com>
<uc2dbv$2e4tg$1@dont-email.me>
<d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me>
<d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com>
<uc4e4t$2rdlt$1@dont-email.me>
<1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>
<uc5dd0$30jrk$1@dont-email.me>
<81879984-43e7-409a-a029-1ca6677f536dn@googlegroups.com>
<uc5mlk$32gl3$1@dont-email.me>
<8eec8404-4928-4bc3-8b00-c673ea22ab60n@googlegroups.com>
<uc7l4o$3fp72$1@dont-email.me>
<639e8e6f-2729-476b-9a6e-0b3eb066b06an@googlegroups.com>
<uc9m7v$3u797$1@dont-email.me>
<d3c6df71-7ef5-4fd3-83be-8b9a4315f0c4n@googlegroups.com>
<j=2ifExxFfXU05g0F@bongo-ra.co>
<ae3d2d4e-71bd-4c6c-99aa-cf8b6b652175n@googlegroups.com>
<BltvTW49qZtb8Sjsd@bongo-ra.co>
Reply-To: nospam.harnden@gmail.com
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 25 Aug 2023 10:35:32 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f0a2744cc52e8366932cdee9b2cf3858";
logging-data="3144"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+jLBdy2x6hyo8qFpYk7s4Q3bCX7LbR0a0="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.14.0
Cancel-Lock: sha1:8UwHM11jWy907mOVu/otErSM74s=
In-Reply-To: <BltvTW49qZtb8Sjsd@bongo-ra.co>

by: Richard Harnden - Fri, 25 Aug 2023 10:35 UTC

On 25/08/2023 10:17, Spiros Bousbouras wrote:
> On Fri, 25 Aug 2023 01:53:57 -0700 (PDT)
> Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:
>> On Friday, 25 August 2023 at 09:50:32 UTC+1, Spiros Bousbouras wrote:
>>> On Fri, 25 Aug 2023 01:37:38 -0700 (PDT)
>>> Malcolm McLean <malcolm.ar...@gmail.com> wrote:
>>>> Of course size_t has an underscore. Everyone knows that. And most people
>>>> agree that it is horrible.
>>> "Most people" meaning you made it up.
>>>
>> We could do a straw poll.
>> How many people like the underscore in size_t and how many think it is horrible?
>
> I like it. sizetype would also have been reasonable but not sizet which
> would be mystifying. Note that the POSIX standard also uses the same template
> for naming types like off_t or pid_t .

'size' has to be a common variable name, sizet is horrid, so size_t for
the typedef is pretty much required.
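A small sketch of that point, assuming nothing beyond <stddef.h>. The struct and function names (buffer, buffer_bytes) are hypothetical, made up for illustration. Because the type is spelled size_t, the plain word "size" stays available as a member or variable name:

```c
#include <stddef.h>

/* A hypothetical buffer descriptor: "size" is the natural member name,
 * and it does not clash with the type name size_t. */
struct buffer {
    size_t count;   /* number of elements */
    size_t size;    /* size of one element, in bytes */
};

/* Total number of bytes described by the buffer (overflow not checked). */
size_t buffer_bytes(const struct buffer *b)
{
    return b->count * b->size;
}
```

Had the type itself been called size, every declaration of this kind would need a different name for the member.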

>
>> There are some people who think that the committee can do no wrong,
>
> Can you name one who thinks so ?
>
>> so it
>> won't be unanimous. But I think I know where the majority will be.

I like '_t's for typedefs, I don't like '_s's for structs (or _u for
unions, _e for enums).

Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
> On Friday, 25 August 2023 at 08:46:22 UTC+1, David Brown wrote:
[...]
>> The only thing /wrong/ - and pretty much everyone thinks it is wrong -
>> would be to call it "multiplymatrixwithvector".
>>
> Well Caesar disagreed.

Let's drop the lengthy discussions of ancient writing systems, shall we?
(And David, please stop encouraging them.)

> Dennis Ritchie disagreed.

I doubt that. Dennis Ritchie worked in environments that limited
external identifiers to 6 characters. That's why we have "strcpy",
not because it's easier to read.
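To illustrate what such a limit means in practice, here is a rough sketch of how a linker of that kind would see names. The six-character figure is taken from the paragraph above; ignoring case is an added assumption (C89 allowed implementations to do so for external names), and the function name names_collide is hypothetical:

```c
#include <ctype.h>
#include <stddef.h>

/* Assumed number of significant characters in an external name. */
#define EXTERNAL_SIGNIFICANT 6

/* Returns 1 if the two names would be indistinguishable to a linker
 * that keeps only the first six characters and ignores case,
 * and 0 otherwise. */
int names_collide(const char *a, const char *b)
{
    size_t i;

    for (i = 0; i < EXTERNAL_SIGNIFICANT; i++) {
        int ca = toupper((unsigned char)a[i]);
        int cb = toupper((unsigned char)b[i]);

        if (ca != cb)
            return 0;       /* differ within the significant part */
        if (ca == '\0')
            return 1;       /* both names ended at the same place */
    }
    return 1;               /* first six characters all match */
}
```

Under this model "strcpy" and "strcat" remain distinct, whereas two longer names such as "string_copy" and "string_compare" would collide, since both begin with "string".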

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Will write code for food.
void Void(void) { Void(); } /* The recursive call of the void */

Re: C vs Haskell for XML parsing

<uca3sh$p9c$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29580&group=comp.lang.c#29580

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: C vs Haskell for XML parsing
Date: Fri, 25 Aug 2023 13:38:57 +0200
Organization: A noiseless patient Spider
Lines: 239
Message-ID: <uca3sh$p9c$1@dont-email.me>
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com>
<87350dtive.fsf@bsb.me.uk> <ubvan6$1rb3s$1@dont-email.me>
<3c87ec37-8fe1-4171-9500-609fad6701b7n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me>
<e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com>
<uc28id$2dc7f$1@dont-email.me>
<b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com>
<uc2dbv$2e4tg$1@dont-email.me>
<d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me>
<d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com>
<uc4e4t$2rdlt$1@dont-email.me>
<1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>
<uc5dd0$30jrk$1@dont-email.me>
<81879984-43e7-409a-a029-1ca6677f536dn@googlegroups.com>
<uc5mlk$32gl3$1@dont-email.me>
<8eec8404-4928-4bc3-8b00-c673ea22ab60n@googlegroups.com>
<uc7l4o$3fp72$1@dont-email.me>
<639e8e6f-2729-476b-9a6e-0b3eb066b06an@googlegroups.com>
<uc9m7v$3u797$1@dont-email.me>
<d3c6df71-7ef5-4fd3-83be-8b9a4315f0c4n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 25 Aug 2023 11:38:57 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="c244ea4cca8f9b4180792b6a4896d4d1";
logging-data="25900"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+j96KAZKOF3tFoOXD/a6SYOTAQdRqXPcQ="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.9.0
Cancel-Lock: sha1:eN0TttzURR02rxyWJ9Z2X26J1aI=
Content-Language: en-GB
In-Reply-To: <d3c6df71-7ef5-4fd3-83be-8b9a4315f0c4n@googlegroups.com>

by: David Brown - Fri, 25 Aug 2023 11:38 UTC

On 25/08/2023 10:37, Malcolm McLean wrote:
> On Friday, 25 August 2023 at 08:46:22 UTC+1, David Brown wrote:
>> On 24/08/2023 16:50, Malcolm McLean wrote:
>>> On Thursday, 24 August 2023 at 14:15:21 UTC+1, David Brown wrote:
>>>> On 24/08/2023 05:53, Malcolm McLean wrote:
>>>>> On Wednesday, 23 August 2023 at 20:29:07 UTC+1, David Brown wrote:
>>>>>>
>>
>> There are good historical reasons for why named parameters are not
>> common in older languages, but are more common in newer ones. And there
>> are good technical reasons why they would be difficult to introduce to C
>> as an afterthought. But I agree with you that they can be helpful and
>> improve code readability, in languages that support them.
>>
> I don't really see what the technical problem is. The header would have to
> contain the names of the parameters (most do already). Then instead
> of putting parameter 1 into register a and parameter 2 into register b,
> the compiler matches them up. If it's void foo(int x, int y) and the
> call is foo(y=1, x=2), then the compiler puts a 2 into register a and a 1
> into register b.
> I haven't tried to modify a compiler to do this so there might be something I
> haven't thought of. But it would be a fairly simple, contained change with no
> implications for the linker.

The key technical problem for C is that the parameter names are not part
of the signature of the function. It is not uncommon for declarations
to have missing parameter names, or different names from those used in
the definition of the function. (Sometimes there are good reasons for
this, sometimes it is laziness.) If a function is defined as "void
foo(int a, int b) { .. }", but the declaration used to call it has "void
foo(int b, int a);", then how should named parameters be resolved?

Any resolution here would place restrictions on which functions could
support named parameters - such as consistent declarations and
definitions, but it would be impossible (in general) to check them.
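To make the mismatch concrete, here is a minimal sketch in standard C. The function name subtract and its parameter names are hypothetical, chosen only for illustration. All three declarations below are compatible, because only the parameter types count:

```c
/* Declaration as a caller might see it in a header. The parameter
 * names are swapped relative to the definition below. */
int subtract(int b, int a);

/* Definition: returns its first argument minus its second,
 * using its own choice of parameter names. */
int subtract(int a, int b)
{
    return a - b;
}

/* A declaration with no parameter names at all is also compatible. */
int subtract(int, int);

/* A positional call is unambiguous whichever declaration is visible. */
int positional_call(void)
{
    return subtract(5, 3);   /* first minus second: 5 - 3 */
}
```

With positional arguments the result is 2 regardless of the names. A call in the thread's hypothetical named notation, with a as 5 and b as 3, would give 2 if resolved against the definition and -2 if resolved against the first declaration, and a compiler that sees only the header cannot tell which was intended.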

>>>>> Standard library functions are the most commonly called functions in C code,
>>>>> and are always named in my style.
>>>> No, they are not. Names with all lower case and no word separation are
>>>> common in the C standard library for short identifiers (like "strlen"),
>>>> while underscores are common for longer identifiers (like
>>>> "memory_order_relaxed").
>>>>
>>>> It would be nice if you were to check this kind of thing before making
>>>> pointlessly incorrect claims.
>>>>
>>> Ok. So I checked. Every single function in the standard library is written in
>>> my style, with the exception of a handful which take an _r suffix (e.g. asctime_r).
>>> I think these are recent additions.
>> I am looking at Annex B of the C standards. There are lots of short
>> identifiers without word breaks, as I said - such as "isdigit". This is
>> especially true of older function names, which come from a background of
>> the days of assemblers and linkers with very limited characters and
>> identifier lengths. This is part of the reason for some abbreviations
>> being shorter than you (or I) might have liked. (We should be grateful
>> that they use small letters and not all-caps!) There are maybe a dozen
>> at most that have more than 10 characters, such as "isgreaterequal".
>>
>> Once you get to longer names, they are split with underscore - it is
>> "atomic_flag_clear_explicit", not "atomicflagclearexplicit", which would
>> be illegible. The type identifiers have underscores, the macros for
>> limits, atomic memory access types, file IO constants - all have
>> underscores. They are all over the place!
>>
>> No, you have /not/ checked - unless you think I was talking about K&R C
>> and restricting identifiers to function names only.
>>
> I said every function. You were talking about identifiers.

I was always talking about identifiers - why on earth would we be
restricting the kind of identifier? Function names are not the only
identifiers that need to be read!
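For anyone who has not met the longer C11 names mentioned above, a small sketch of how they look in use. It assumes a C11 compiler that provides <stdatomic.h>; the names busy, try_acquire and release are hypothetical:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* A single flag used as a very simple try-lock. */
static atomic_flag busy = ATOMIC_FLAG_INIT;

/* Returns true if the caller obtained the flag, false if it was
 * already set by someone else. */
bool try_acquire(void)
{
    return !atomic_flag_test_and_set_explicit(&busy, memory_order_acquire);
}

/* Clears the flag so that a later try_acquire() can succeed. */
void release(void)
{
    atomic_flag_clear_explicit(&busy, memory_order_release);
}
```

Every library identifier here other than the header names contains underscores: the type, the initialiser macro, the functions and the memory-order constants.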

> Of course size_t has
> an underscore. Everyone knows that. And most people agree that it is horrible.

No, most people do /not/ agree that. /Some/ people think it is
horrible, others like it, and others don't really care.

> And it wasn't part of C as originally designed.

This is a C group, for people who program in C. History can be
fascinating - both you and I are interested in it. People program in
C99 or C11, not pre-K&R C. History is only of relevance when comparing
older identifiers that had no underscores with newer ones that /have/
underscores to improve readability.

It is incomprehensible to me that you still don't understand this. Even
if you were right (which you are not) in your belief that the C standard
does not use underscores, it would not change the very simple fact that
long run-together names are harder to read than those with clear word
divisions. They are so much easier to read that it totally outweighs
any real or imagined thoughts of consistency.

>>
>> And the clear pattern of underscores being more common in newer
>> identifiers should be a clue to you - just as the world moved on and
>> embraced word divisions in Latin prose, so the programming world moved
>> on and embraced word divisions in identifier names. We no longer
>> restrict our filenames to 8.3 patterns - we no longer write highly
>> condensed identifiers without word divisions.
>>
> Ritchie was a genius who knew how to design a programming language.
> The people who came after him were lesser men.

Hero worship is almost always a sign that you haven't thought things
through yourself.

Ritchie made a pretty good language for his needs at the time, with the
technology of the time, the understanding of programming at the time,
and the fads and habits of the time. A combination of good design,
luck, and the work of people afterwards, means that we still use the
distant descendant of his original language. C has never been, and
never will be, a "perfect" language, for any purpose. It succeeds
because it is good enough for many things, maintaining its usefulness
today through momentum despite alternative languages that are better for
at least some purposes.

>>
>> The only thing /wrong/ - and pretty much everyone thinks it is wrong -
>> would be to call it "multiplymatrixwithvector".
>>
> Well Caesar disagreed.

Caesar didn't write programs.

> Dennis Ritchie disagreed.

Ritchie didn't use long identifiers much - and when he did, he used
underscores.

Not that either of these appeals to authority has any relevance.

> Unfortunately I can't find it, but some research on human-readable URLs disagreed.

You'll not find it, because long URLs without any breaks are not
human-readable. And even if they were, they are not program identifiers.

It's true that there are some domain names that have longer run-together
names (some have dots or hyphens as word separators). For the most
part, you don't type these - you click on them. And even if you type
them once, your browser remembers them for later. They are generally
considered risky if the site is popular, because they are easy prey for
scammers and malware peddlers to register domains that are a typo away.

>
> Whilst all you can offer is "I disagree". But you automatically disagree with things
> some people say. So how much weight should we give to that?

I don't automatically disagree with other people. But I rarely write
"me too!" posts about things that are generally agreed upon -
discussions thrive on disagreement. Would you rather we chatted about
the weather or the price of bread, or complained about the youth of today?

>>
>> Think of it as an optimisation problem - different characteristics are
>> given different weights, and your job as programmer is to pick the
>> identifier that gives the best total outcome. Consistency has weight 1,
>> readability has weight 10.
>>
> Consistency aids readability. Readability of a single identifier aids readability of
> that identifier, but may damage the readability of the text as a whole, largely
> because it breaks consistency.
> But consistency isn't an absolute, I agree there.
>>
>>> Yes. What I am saying is that for the idea that the letters were God-given
>>> whilst the spaces were man-made to take root, the society must have been
>>> familiar with text written continuously.
>>>
>> No - again, you have no basis for that conclusion.
>>
>> And you could equally logically have concluded that society must have
>> been familiar with text written without letters, only with spaces. When
>> the same logic can be used to "prove" something that silly, you should
>> be suspicious of your arguments.
>>
> Are you advancing an objection that weak?


Re: C vs Haskell for XML parsing

<uca43t$p9c$2@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=29581&group=comp.lang.c#29581

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: C vs Haskell for XML parsing
Date: Fri, 25 Aug 2023 13:42:52 +0200
Organization: A noiseless patient Spider
Lines: 30
Message-ID: <uca43t$p9c$2@dont-email.me>
References: <576801fa-2842-40dc-bf19-221a5b1cf660n@googlegroups.com>
<ubvo4d$1tm0p$1@dont-email.me>
<e9853969-42ce-48db-81e1-d37c8e4da59dn@googlegroups.com>
<uc28id$2dc7f$1@dont-email.me>
<b21393a6-c4f5-436a-9975-8ffedd6bf20bn@googlegroups.com>
<uc2dbv$2e4tg$1@dont-email.me>
<d734d616-b18e-4e67-b858-f0eb0a636a87n@googlegroups.com>
<uc2qnl$2gh96$1@dont-email.me>
<d651e08e-033d-4a90-8477-6a5fa13d30f3n@googlegroups.com>
<uc4e4t$2rdlt$1@dont-email.me>
<1e79f8a1-b707-4074-b272-ce4327ee7bc0n@googlegroups.com>
<uc5dd0$30jrk$1@dont-email.me>
<81879984-43e7-409a-a029-1ca6677f536dn@googlegroups.com>
<uc5mlk$32gl3$1@dont-email.me>
<8eec8404-4928-4bc3-8b00-c673ea22ab60n@googlegroups.com>
<uc7l4o$3fp72$1@dont-email.me>
<639e8e6f-2729-476b-9a6e-0b3eb066b06an@googlegroups.com>
<uc9m7v$3u797$1@dont-email.me>
<d3c6df71-7ef5-4fd3-83be-8b9a4315f0c4n@googlegroups.com>
<j=2ifExxFfXU05g0F@bongo-ra.co>
<ae3d2d4e-71bd-4c6c-99aa-cf8b6b652175n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 25 Aug 2023 11:42:53 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="c244ea4cca8f9b4180792b6a4896d4d1";
logging-data="25900"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18RLOAUD6PHvCzcLgEOvawKjw1VPEBgudE="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.9.0
Cancel-Lock: sha1:o37fx25em07oijYs1yjBI0wUghI=
In-Reply-To: <ae3d2d4e-71bd-4c6c-99aa-cf8b6b652175n@googlegroups.com>
Content-Language: en-GB

by: David Brown - Fri, 25 Aug 2023 11:42 UTC

