
SubjectAuthor
* String LiteralsStefan Ram
+* Re: String LiteralsBart
|`* Re: String LiteralsJames Harris
| `* Re: String LiteralsStefan Ram
|  +- Re: String LiteralsBart
|  `- Re: String LiteralsJames Harris
+- Re: String LiteralsDavid Brown
`- Re: String LiteralsRod Pemberton

String Literals

<String-Literals-20210929202427@ram.dialup.fu-berlin.de>

https://www.rocksolidbbs.com/devel/article-flat.php?id=924&group=comp.lang.misc#924

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: ram@zedat.fu-berlin.de (Stefan Ram)
Newsgroups: comp.lang.misc
Subject: String Literals
Date: 29 Sep 2021 19:46:07 GMT
Organization: Stefan Ram
Lines: 43
Expires: 1 Dec 2021 11:59:58 GMT
Message-ID: <String-Literals-20210929202427@ram.dialup.fu-berlin.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de heVP5CYCpp/0HZ02TJSqAARDCsUYq0c03eRtwEP937zDmd
X-Copyright: (C) Copyright 2021 Stefan Ram. All rights reserved.
Distribution through any means other than regular usenet
channels is forbidden. It is forbidden to publish this
article in the Web, to change URIs of this article into links,
and to transfer the body without this notice, but quotations
of parts in other Usenet posts are allowed.
X-No-Archive: Yes
Archive: no
X-No-Archive-Readme: "X-No-Archive" is set, because this prevents some
services to mirror the article in the web. But the article may
be kept on a Usenet archive server with only NNTP access.
X-No-Html: yes
Content-Language: en-US
Accept-Language: de-DE, en-US, it, fr-FR
 by: Stefan Ram - Wed, 29 Sep 2021 19:46 UTC

I have the following ideas for string literals in a new language
(first the string, then the string literal is given):

String literals start with an opening bracket and end with
a closing bracket.

abc
[abc]

Brackets within the string literal are allowed when properly
nested.

abc[def]ghi
[abc[def]ghi]

A single opening or closing bracket is written as "[`]" or
"[]`", respectively. This rule has higher precedence than the
preceding rule: whenever there is a "[`]" or "[]`" within
a string literal, it means "[" and "]", with no exceptions.

abc[def
[abc[`]def]

abc]def
[abc[]`def]

abc[`]def
[abc[`]`[]`def]

abc[]`def
[abc[`][]``def]

The notation for "[`]" and "[]`" within a string is awkward,
but is anticipated to be required only rarely. Most texts will
contain brackets that are properly nested, and this was made
to be easy.

So, are there any problems with this specification I have missed?
Strings that are impossible to encode or string literals whose
interpretation is ambiguous? Cases where frequent strings are
cumbersome to encode? TIA!
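
One way to probe these questions is to write the rules down as code. The Python sketch below is only an illustration under stated assumptions: the names `encode` and `decode` are made up here, the outer delimiters are assumed not to take part in escape matching, and the encoder takes the bluntest route of escaping every bracket instead of leaving nested pairs alone.

```python
def encode(text: str) -> str:
    """Encode text as a bracket literal by escaping every bracket.

    Assumption: escaping all brackets is allowed, even properly nested ones.
    """
    out = []
    for ch in text:
        if ch == "[":
            out.append("[`]")   # escape for an opening bracket
        elif ch == "]":
            out.append("[]`")   # escape for a closing bracket
        else:
            out.append(ch)
    return "[" + "".join(out) + "]"


def decode(literal: str) -> str:
    """Decode a complete bracket literal, including its outer delimiters."""
    if len(literal) < 2 or literal[0] != "[" or literal[-1] != "]":
        raise ValueError("not a bracket literal")
    body = literal[1:-1]
    out = []
    depth = 0   # nesting depth of unescaped brackets inside the body
    i = 0
    while i < len(body):
        tok = body[i:i + 3]
        if tok == "[`]":        # escapes take precedence over nesting
            out.append("[")
            i += 3
        elif tok == "[]`":
            out.append("]")
            i += 3
        else:
            ch = body[i]
            if ch == "[":
                depth += 1
            elif ch == "]":
                depth -= 1
                if depth < 0:
                    raise ValueError("unbalanced closing bracket")
            out.append(ch)
            i += 1
    if depth != 0:
        raise ValueError("unbalanced opening bracket")
    return "".join(out)


# The examples from the post decode as stated there.
assert decode("[abc[def]ghi]") == "abc[def]ghi"
assert decode("[abc[`]`[]`def]") == "abc[`]def"
assert decode("[abc[`][]``def]") == "abc[]`def"
```

Under these assumptions every string round-trips through `encode` and `decode`, because in the escape-everything encoding each "[" begins a three-character escape and the scanner never starts reading in the middle of one.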

Re: String Literals

<sj2ifb$cju$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=925&group=comp.lang.misc#925

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: bc@freeuk.com (Bart)
Newsgroups: comp.lang.misc
Subject: Re: String Literals
Date: Wed, 29 Sep 2021 21:31:30 +0100
Organization: A noiseless patient Spider
Lines: 90
Message-ID: <sj2ifb$cju$1@dont-email.me>
References: <String-Literals-20210929202427@ram.dialup.fu-berlin.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 29 Sep 2021 20:31:39 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="df84eaf1d188b1c0c8059190bf4dd358";
logging-data="12926"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1++rkBRlh+jTNnoA3oagyb2Du0o97k7Z3s="
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:qAW4JpvlechVQ7/zCBO+m7gNR8I=
In-Reply-To: <String-Literals-20210929202427@ram.dialup.fu-berlin.de>
X-Antivirus-Status: Clean
Content-Language: en-GB
X-Antivirus: AVG (VPS 210929-2, 29/9/2021), Outbound message
 by: Bart - Wed, 29 Sep 2021 20:31 UTC

On 29/09/2021 20:46, Stefan Ram wrote:
> I have the following ideas for string literals in a new language
> (first the string, then the string literal is given):
>
> String literals start with an opening bracket and end with
> a closing bracket.
>
> abc
> [abc]
>
> Brackets within the string literal are allowed when properly
> nested.
>
> abc[def]ghi
> [abc[def]ghi]
>
> A single opening or closing bracket is written as "[`]" or
> "[]`", respectively. This rule has higher precedence than the
> preceding rule: whenever there is a "[`]" or "[]`" within
> a string literal, it means "[" and "]", with no exceptions.
>
> abc[def
> [abc[`]def]
>
> abc]def
> [abc[]`def]
>
> abc[`]def
> [abc[`]`[]`def]
>
> abc[]`def
> [abc[`][]``def]
>
> The notation for "[`]" and "[]`" within a string is awkward,
> but is anticipated to be required only rarely. Most texts will
> contain brackets that are properly nested, and this was made
> to be easy.
>
> So, are there any problems with this specification I have missed?
> Strings that are impossible to encode or string literals whose
> interpretation is ambiguous? Cases where frequent strings are
> cumbersome to encode? TIA!

I don't know if some strings are impossible to code. But it looks
near-impossible to write or read any strings that contain square
brackets or single quotes.

How do you deal with the usual non-printable characters that need escape
sequences such as CR, LF, TAB, BELL, etc?

With the usual "..." delimiters, your examples reduce to:

abc
"abc"

abc"def"ghi
"abc""def""ghi" # or the more common:
"abc\"def\"ghi" # (I allow both)

I'm not sure how you came up with these puzzling 3-character sequences:

[ [`]
] []`

If introducing ` as some sort of escape symbol, why not have it precede
the escaped character:

[ `[
] `]

Your examples, if still using [..] to delimit strings, and allowing
embedded ...[...]... without needing escapes, become:

abc[def]ghi
[abc[def]ghi]

abc[def
[abc`[def]

abc]def
[abc`]def]

abc[`]def
[abc[``]def]

abc[]`def
[abc[]``def]

Here it needs `` to represent one `.
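
The prefix-escape variant described above can be sketched in a few lines of Python. This is a hypothetical illustration, not anyone's actual implementation: the name `decode_prefix` is invented, and it assumes that a backtick may only precede "[", "]" or another backtick.

```python
def decode_prefix(literal: str) -> str:
    """Decode a [..] literal in which ` escapes the character after it."""
    if len(literal) < 2 or literal[0] != "[" or literal[-1] != "]":
        raise ValueError("not a bracket literal")
    body = literal[1:-1]
    out = []
    depth = 0   # nesting depth of unescaped brackets
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "`":
            # The escape symbol comes first, so one character of
            # look-ahead decides what is being escaped.
            if i + 1 >= len(body) or body[i + 1] not in "[]`":
                raise ValueError("bad escape")
            out.append(body[i + 1])
            i += 2
            continue
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced closing bracket")
        out.append(ch)
        i += 1
    if depth != 0:
        raise ValueError("unbalanced opening bracket")
    return "".join(out)


assert decode_prefix("[abc`[def]") == "abc[def"
assert decode_prefix("[abc[``]def]") == "abc[`]def"
assert decode_prefix("[abc``]") == "abc`"   # string ending in a backtick
```

With the doubled backtick, a string that ends in a backtick is written with `` just before the closing delimiter, as the last assertion shows.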

Re: String Literals

<sj72vf$5kh$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=926&group=comp.lang.misc#926

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: james.harris.1@gmail.com (James Harris)
Newsgroups: comp.lang.misc
Subject: Re: String Literals
Date: Fri, 1 Oct 2021 14:37:50 +0100
Organization: A noiseless patient Spider
Lines: 162
Message-ID: <sj72vf$5kh$1@dont-email.me>
References: <String-Literals-20210929202427@ram.dialup.fu-berlin.de>
<sj2ifb$cju$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 1 Oct 2021 13:37:51 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="061336b880c8f47975e479ac9fc01433";
logging-data="5777"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18TZKh/aBXR41KvriXu5mXxhxi2cKefQkA="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.13.0
Cancel-Lock: sha1:FKlz7VeDGT0PjoAKvaNhEs6j3jA=
In-Reply-To: <sj2ifb$cju$1@dont-email.me>
Content-Language: en-GB
 by: James Harris - Fri, 1 Oct 2021 13:37 UTC

On 29/09/2021 21:31, Bart wrote:
> On 29/09/2021 20:46, Stefan Ram wrote:
>>    I have the following ideas for string literals in a new language
>>    (first the string, then the string literal is given):

Stefan, I'll respond to your idea here as Bart has already made some of
the points I would have made.

>>
>>    String literals start with an opening bracket and end with
>>    a closing bracket.
>>
>> abc
>> [abc]

I'd be interested to see where you get to with this as I experimented
with braces (rather than brackets) which have the same feature of the
closing delimiter (and, hence, the string terminator) being different
from the opening delimiter.

>>
>>    Brackets within the string literal are allowed when properly
>>    nested.
>>
>> abc[def]ghi
>> [abc[def]ghi]
>>
>>    A single opening or closing bracket is written as "[`]" or
>>    "[]`", respectively. This rule has higher precedence than the
>>    preceding rule: whenever there is a "[`]" or "[]`" within
>>    a string literal, it means "[" and "]", with no exceptions.
>>
>> abc[def
>> [abc[`]def]
>>
>> abc]def
>> [abc[]`def]
>>
>> abc[`]def
>> [abc[`]`[]`def]
>>
>> abc[]`def
>> [abc[`][]``def]
>>
>>    The notation for "[`]" and "[]`" within a string is awkward,

Yes, it's very awkward.

>>    but is anticipated to be required only rarely. Most texts will
>>    contain brackets that are properly nested, and this was made
>>    to be easy.
>>
>>    So, are there any problems with this specification I have missed?
>>    Strings that are impossible to encode or string literals whose
>>    interpretation is ambiguous? Cases where frequent strings are
>>    cumbersome to encode?                                        TIA!
>
>
> I don't know if some strings are impossible to code. But it looks
> near-impossible to write or read any strings that contain square
> brackets or single quotes.
>
> How do you deal with the usual non-printable characters that need escape
> sequences such as CR, LF, TAB, BELL, etc?

That was my main question. AISI, if Stefan uses an escape sequence for
LF etc then a string's opening and closing delimiters could be escaped
in order to embed them.

>
> With the usual "..." delimiters, your examples reduce to:
>
>  abc
>  "abc"
>
>  abc"def"ghi
>  "abc""def""ghi"        # or the more common:
>  "abc\"def\"ghi"        # (I allow both)

I chose to match an opening \ with a closing / so that string would be
one of these

"abc\Q/def\Q/ghi"
"abc\q/def\q/ghi"
"abc\"/def\"/ghi"

Not sure which, yet, but because what comes after \ is not limited to
one character other quote marks could be specified by name, e.g.

\q66/ opening slanted speech mark
\q99/ closing slanted speech mark
\q9/ normal slanted apostrophe
\q<</ France etc opening speech mark
etc

https://en.wikipedia.org/wiki/Guillemet

>
> I'm not sure how you came up with these puzzling 3-character sequences:
>
>  [    [`]
>  ]    []`
>
> If introducing ` as some sort of escape symbol, why not have it precede
> the escaped character:
>
>  [    `[
>  ]    `]
>
> Your examples, if still using [..] to delimit strings, and allowing
> embedded ...[...]... without needing escapes, become:
>
>  abc[def]ghi
>  [abc[def]ghi]
>
>  abc[def
>  [abc`[def]
>
>  abc]def
>  [abc`]def]
>
>  abc[`]def
>  [abc[``]def]
>
>  abc[]`def
>  [abc[]``def]
>
> Here it needs `` to represent one `.

However and whenever I try to encode such strings, they end up being
similarly difficult to read.

One option, perhaps, is to allow greater spacing. Considering the last one,

abc[]`def

if the punctuation characters need special treatment how about spacing
them out. For example,

"abc" + LBRACKET + RBRACKET + BACKAPOSTROPHE + "def"

or

"abc\ [ /\ ] /\ ` /def"

or

"abc" + "\[/" + "\]/" + "\`/" + "def"

or

"abc\ [ ] ` /def"

That last one's arguably not too bad a way to embed three consecutive
special characters.
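
To make the \.../ idea concrete, here is a rough Python sketch. It rests on several assumptions that go beyond what is written above: the name table is hypothetical and holds only the names mentioned in this post, whitespace inside an escape is taken to separate names, and there is no provision for a literal backslash or for a "/" inside an escape.

```python
import re

# Hypothetical name table; only names mentioned in the post are listed.
NAMES = {
    "[": "[", "]": "]", "`": "`",
    '"': '"', "q": '"', "Q": '"',
    "q66": "\u201c",   # opening slanted speech mark
    "q99": "\u201d",   # closing slanted speech mark
    "q9": "\u2019",    # slanted apostrophe
    "q<<": "\u00ab",   # opening guillemet
}


def expand(body: str) -> str:
    """Expand \\.../ escapes in the text between the double quotes."""
    def repl(match: re.Match) -> str:
        # Names inside one escape are separated by whitespace,
        # which is how the spaced-out form above is read here.
        return "".join(NAMES[name] for name in match.group(1).split())

    # A backslash opens an escape and the next slash closes it.
    return re.sub(r"\\([^/]*)/", repl, body)


assert expand(r'abc\ [ ] ` /def') == "abc[]`def"
assert expand(r'abc\ [ /\ ] /\ ` /def') == "abc[]`def"
assert expand(r'abc\q/def\q/ghi') == 'abc"def"ghi'
```

An unknown name raises `KeyError` in this sketch; a real lexer would presumably report a proper error instead.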

--
James Harris

Re: String Literals

<sj73a2$845$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=927&group=comp.lang.misc#927

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: david.brown@hesbynett.no (David Brown)
Newsgroups: comp.lang.misc
Subject: Re: String Literals
Date: Fri, 1 Oct 2021 15:43:29 +0200
Organization: A noiseless patient Spider
Lines: 14
Message-ID: <sj73a2$845$1@dont-email.me>
References: <String-Literals-20210929202427@ram.dialup.fu-berlin.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 1 Oct 2021 13:43:30 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="b5f0ab6d56179f9e01a087c9f2a3444a";
logging-data="8325"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1//a8eHIFcWrfwoZIplJI5iDVMAxIEFHOc="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:5BYkjbL2rlE/xMncGPIl782sE0A=
In-Reply-To: <String-Literals-20210929202427@ram.dialup.fu-berlin.de>
Content-Language: en-GB
 by: David Brown - Fri, 1 Oct 2021 13:43 UTC

On 29/09/2021 21:46, Stefan Ram wrote:
> I have the following ideas for string literals in a new language
> (first the string, then the string literal is given):
>
> String literals start with an opening bracket and end with
> a closing bracket.
>

Others have answered here, but have missed the elephant in the room -
/why/? What possible advantages would this brackets mess have over
quotation marks that are used by almost every programming language (and
many human languages)?

Re: String Literals

<brackets-20211001150643@ram.dialup.fu-berlin.de>

https://www.rocksolidbbs.com/devel/article-flat.php?id=928&group=comp.lang.misc#928

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: ram@zedat.fu-berlin.de (Stefan Ram)
Newsgroups: comp.lang.misc
Subject: Re: String Literals
Date: 1 Oct 2021 14:12:42 GMT
Organization: Stefan Ram
Lines: 41
Expires: 1 Dec 2021 11:59:58 GMT
Message-ID: <brackets-20211001150643@ram.dialup.fu-berlin.de>
References: <String-Literals-20210929202427@ram.dialup.fu-berlin.de> <sj2ifb$cju$1@dont-email.me> <sj72vf$5kh$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de wFdyDDsoWSVx0jYaRhF7YQ633m3pR5gfFdZXPK/N4Y0/X4
X-Copyright: (C) Copyright 2021 Stefan Ram. All rights reserved.
Distribution through any means other than regular usenet
channels is forbidden. It is forbidden to publish this
article in the Web, to change URIs of this article into links,
and to transfer the body without this notice, but quotations
of parts in other Usenet posts are allowed.
X-No-Archive: Yes
Archive: no
X-No-Archive-Readme: "X-No-Archive" is set, because this prevents some
services to mirror the article in the web. But the article may
be kept on a Usenet archive server with only NNTP access.
X-No-Html: yes
Content-Language: en-US
Accept-Language: de-DE, en-US, it, fr-FR
 by: Stefan Ram - Fri, 1 Oct 2021 14:12 UTC

James Harris <james.harris.1@gmail.com> writes:
>On 29/09/2021 21:31, Bart wrote:
>>On 29/09/2021 20:46, Stefan Ram wrote:
>>How do you deal with the usual non-printable characters that need escape
>>sequences such as CR, LF, TAB, BELL, etc?
>That was my main question. AISI, if Stefan uses an escape sequence for
>LF etc then a string's opening and closing delimiters could be escaped
>in order to embed them.

CR, LF, TAB, and BELL do not need escape sequences in my
notation as they can be included either literally or via
the embedding language if need be.

[ a bracketed string
can span several lines,
and it may
contain literal tab
characters if need be.
BELL signs are anticipated to be rarely needed.]

>"abc" + LBRACKET + RBRACKET + BACKAPOSTROPHE + "def"

If these strings are part of a language with string
concatenation operators (which is intended indeed) this
would be possible. I plan to realize concatenation of
strings by mere concatenation of expressions, so
"abc\adef" could be written [abc]*BELL[def], that is
a sequence of a string literal, a name, and another
string literal (names would have to be marked in this
language, I used an asterisk in this post as an example
for a marker for a reference by name).

I decided to use []` for the closing bracket as part of the
text, as I wrote. If I had decided to use `] for the closing
bracket as part of the text, this would mean that a backtick
cannot be the last character in a string. So, I could have
used ]` instead, but using []` instead means that my strings
always have properly nested brackets, which helps when using
editors with functions to find matching brackets.
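
Concatenation by juxtaposition, as in [abc]*BELL[def], might be evaluated roughly as follows. This Python sketch is an illustration only: the functions `read_literal` and `evaluate`, the contents of the name table, and the rule that a name is a run of letters, digits, and underscores are all assumptions made for the example.

```python
# Hypothetical table of named strings.
NAMES = {"BELL": "\a", "n": "\n"}


def read_literal(src: str, start: int):
    """Read the literal opening at src[start]; return (text, next index)."""
    out, depth, i = [], 0, start + 1
    while i < len(src):
        tok = src[i:i + 3]
        if tok == "[`]":                    # escaped opening bracket
            out.append("[")
            i += 3
        elif tok == "[]`":                  # escaped closing bracket
            out.append("]")
            i += 3
        elif src[i] == "]" and depth == 0:  # the closing delimiter
            return "".join(out), i + 1
        else:
            if src[i] == "[":
                depth += 1
            elif src[i] == "]":
                depth -= 1
            out.append(src[i])
            i += 1
    raise ValueError("unterminated string literal")


def evaluate(expr: str) -> str:
    """Concatenate the literals and *names that appear one after another."""
    parts, i = [], 0
    while i < len(expr):
        ch = expr[i]
        if ch.isspace():
            i += 1
        elif ch == "[":
            text, i = read_literal(expr, i)
            parts.append(text)
        elif ch == "*":
            j = i + 1
            while j < len(expr) and (expr[j].isalnum() or expr[j] == "_"):
                j += 1
            parts.append(NAMES[expr[i + 1:j]])
            i = j
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return "".join(parts)


assert evaluate("[abc]*BELL[def]") == "abc\adef"
```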

Re: String Literals

<sj77ru$aig$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=929&group=comp.lang.misc#929

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: bc@freeuk.com (Bart)
Newsgroups: comp.lang.misc
Subject: Re: String Literals
Date: Fri, 1 Oct 2021 16:01:16 +0100
Organization: A noiseless patient Spider
Lines: 52
Message-ID: <sj77ru$aig$1@dont-email.me>
References: <String-Literals-20210929202427@ram.dialup.fu-berlin.de>
<sj2ifb$cju$1@dont-email.me> <sj72vf$5kh$1@dont-email.me>
<brackets-20211001150643@ram.dialup.fu-berlin.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 1 Oct 2021 15:01:18 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="3c8cc88710cf1674d2bfd3ac7a214761";
logging-data="10832"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX191j1sKuPiJYzkEP/BxFd9txBUcxNgQ+fI="
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:TPsTDe4wzYX8KutYSr6DA45uPYw=
In-Reply-To: <brackets-20211001150643@ram.dialup.fu-berlin.de>
X-Antivirus-Status: Clean
Content-Language: en-GB
X-Antivirus: AVG (VPS 211001-2, 1/10/2021), Outbound message
 by: Bart - Fri, 1 Oct 2021 15:01 UTC

On 01/10/2021 15:12, Stefan Ram wrote:
> James Harris <james.harris.1@gmail.com> writes:
>> On 29/09/2021 21:31, Bart wrote:
>>> On 29/09/2021 20:46, Stefan Ram wrote:
>>> How do you deal with the usual non-printable characters that need escape
>>> sequences such as CR, LF, TAB, BELL, etc?
>> That was my main question. AISI, if Stefan uses an escape sequence for
>> LF etc then a string's opening and closing delimiters could be escaped
>> in order to embed them.
>
> CR, LF, TAB, and BELL do not need escape sequences in my
> notation as they can be included either literally or via
> the embedding language if need be.
>
> [ a bracketed string
> can span several lines,
> and it may
> contain literal tab
> characters if need be.
> BELL signs are anticipated to be rarely needed.]

That won't work well in general because newline sequences depend on both
the OS and the editor, or even on the source of the text if it was
pasted elsewhere.

Newlines may be CR, CRLF, LF, something else entirely, or may not even
exist. (In my editor, newlines do not exist while editing and displaying
text, which is a list of strings. They are discarded when reading from
disk, and added back again when writing to a file.)

It means that the string can contain unknown sequences, and what
are superficially the same strings in two source files may not compare
equal.
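
The effect is easy to show in a small Python sketch. The same two visible lines, saved with different line endings, give different raw strings. A compiler could sidestep this by normalizing line endings inside literals; that step is an assumption made here for illustration and was not proposed in the thread, and `normalize_newlines` is a made-up name.

```python
def normalize_newlines(text: str) -> str:
    """Map CRLF and lone CR line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


# The same visible text as three different files might store it.
lf_text = "a bracketed string\ncan span several lines"
crlf_text = "a bracketed string\r\ncan span several lines"
cr_text = "a bracketed string\rcan span several lines"

# Taken literally, the strings differ even though they look the same.
assert lf_text != crlf_text
assert lf_text != cr_text

# After normalization they compare equal.
assert normalize_newlines(crlf_text) == lf_text
assert normalize_newlines(cr_text) == lf_text
```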

Literal tabs are another problem, as they are so often expanded. Then
they turn into spaces, but not a fixed number of spaces.

Yet another, is that without delimiters before the editor's natural
end-of-line, there can be trailing spaces (and tabs) that are now invisible.

Two bonus problems: this makes it impossible to have those intermediate
lines ending with a comment, and you can't indent this text to bring it
(literally) into line with the surrounding code.

>> "abc" + LBRACKET + RBRACKET + BACKAPOSTROPHE + "def"
>
> If these strings are part of a language with string
> concatenation operators (which is intended indeed) this
> would be possible.

In this case why bother with trying to represent embedded [ and ] at all?

Re: String Literals

<sj788l$dhn$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=930&group=comp.lang.misc#930

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: james.harris.1@gmail.com (James Harris)
Newsgroups: comp.lang.misc
Subject: Re: String Literals
Date: Fri, 1 Oct 2021 16:08:04 +0100
Organization: A noiseless patient Spider
Lines: 80
Message-ID: <sj788l$dhn$1@dont-email.me>
References: <String-Literals-20210929202427@ram.dialup.fu-berlin.de>
<sj2ifb$cju$1@dont-email.me> <sj72vf$5kh$1@dont-email.me>
<brackets-20211001150643@ram.dialup.fu-berlin.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 1 Oct 2021 15:08:05 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="061336b880c8f47975e479ac9fc01433";
logging-data="13879"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18uE8i6o2yfLohvEndqgigKHLLNFhcXeKI="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.13.0
Cancel-Lock: sha1:dTPfl2dllNeFiXD10t2zk+lCuBY=
In-Reply-To: <brackets-20211001150643@ram.dialup.fu-berlin.de>
Content-Language: en-GB
 by: James Harris - Fri, 1 Oct 2021 15:08 UTC

On 01/10/2021 15:12, Stefan Ram wrote:
> James Harris <james.harris.1@gmail.com> writes:
>> On 29/09/2021 21:31, Bart wrote:
>>> On 29/09/2021 20:46, Stefan Ram wrote:
>>> How do you deal with the usual non-printable characters that need escape
>>> sequences such as CR, LF, TAB, BELL, etc?
>> That was my main question. AISI, if Stefan uses an escape sequence for
>> LF etc then a string's opening and closing delimiters could be escaped
>> in order to embed them.
>
> CR, LF, TAB, and BELL do not need escape sequences in my
> notation as they can be included either literally or via
> the embedding language if need be.
>
> [ a bracketed string
> can span several lines,
> and it may
> contain literal tab
> characters if need be.
> BELL signs are anticipated to be rarely needed.]

Those four may be covered but do you not need to handle any other
nonprinting characters such as backspace or del?

You may also want to have a plan for ending lines with something other
than the line endings which happen to be present in the particular
editor you are using (which is what the above text would naturally
include).

What if someone writing one of your strings wanted to include a trailing
space on one line but not another? In the above, trailing blanks would
not be evident in the source.

An escape arrangement would allow such issues to be addressed as well as
providing a way of embedding (or, de-signifying) string delimiters.

Something else to consider is where text has to be entered in lines but
the encoded text should omit the line breaks.

>
>> "abc" + LBRACKET + RBRACKET + BACKAPOSTROPHE + "def"
>
> If these strings are part of a language with string
> concatenation operators (which is intended indeed) this
> would be possible. I plan to realize concatenation of
> strings by mere concatenation of expressions, so
> "abc\adef" could be written [abc]*BELL[def], that is
> a sequence of a string literal, a name, and another
> string literal (names would have to be marked in this
> language, I used an asterisk in this post as an example
> for a marker for a reference by name).

That's interesting. I tried the same. I found it would work especially
well and usefully for a trailing newline. In your syntax:

[abc] ;Just the three letters abc
[abc]*n ;abc and newline

>
> I decided to use []` for the closing bracket as part of the
> text, as I wrote. If I had decided to use `] for the closing
> bracket as part of the text, this would mean that a backtick
> cannot be the last character in a string. So, I could have
> used ]` instead, but using []` instead means that my strings
> always have properly nested brackets, which helps when using
> editors with functions to find matching brackets.

Understood, but AIUI your idea of having

[`

for a de-signified opening bracket would also make it hard to put such a
backtick at the /beginning/ of a string.

All told, escapes are not the worst idea in the world.

--
James Harris

Re: String Literals

<sju568$ead$1@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1041&group=comp.lang.misc#1041

Path: i2pn2.org!i2pn.org!aioe.org!JKehOyGOGgs2f2NKLRXdGg.user.46.165.242.75.POSTED!not-for-mail
From: noemail@basdxcqvbe.com (Rod Pemberton)
Newsgroups: comp.lang.misc
Subject: Re: String Literals
Date: Sun, 10 Oct 2021 03:38:04 -0500
Organization: Aioe.org NNTP Server
Message-ID: <sju568$ead$1@gioia.aioe.org>
References: <String-Literals-20210929202427@ram.dialup.fu-berlin.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="14669"; posting-host="JKehOyGOGgs2f2NKLRXdGg.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2
 by: Rod Pemberton - Sun, 10 Oct 2021 08:38 UTC

On 29 Sep 2021 19:46:07 GMT
ram@zedat.fu-berlin.de (Stefan Ram) wrote:

> I have the following ideas for string literals in a new language
> (first the string, then the string literal is given):
>
> String literals start with an opening bracket and end with
> a closing bracket.
>
> abc
> [abc]

Having different initial and terminal delimiters makes it slightly
easier to parse the string than using quotes, but this typically
requires escapes too.

My advice would be to pick delimiters that would not normally be needed
within typical typed text e.g., for ASCII, possibly a backquote `,
backslash \, caret ^, tilde ~, or quote ". I would avoid brackets [],
braces {}, parens (), angle brackets <>, as string delimiters due to their
usefulness in pairing items within the language. The other ASCII
symbols are used for punctuation, mathematics, or accounting.

> Brackets within the string literal are allowed when properly
> nested.
>
> abc[def]ghi
> [abc[def]ghi]
>

Why would you need to nest string delimiters? ...

In other words, why are you nesting a string within a string?
(IMO, that's the biggest elephant in the room ...)

So, I'm beginning to think that you may mean something different by the
term "string literal" than what I understand a "string literal" to be:
https://en.wikipedia.org/wiki/String_literal

Or, is the usage of nesting just a way to embed non-delimiter brackets
within the string without using escapes? ... If so, your choice of
brackets as delimiters is probably non-optimal. Pick something else.

> A single opening or closing bracket is written as "[`]" or
> "[]`", respectively. This rule has higher precedence than the
> preceding rule: whenever there is a "[`]" or "[]`" within
> a string literal, it means "[" and "]", with no exceptions.

The backquote ` is acting as an escape, but since it comes after the
character being escaped, your lexer would need look-back. AIUI, the
majority of lexers use look-ahead. What does yours do? Is this a
concern?
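
Whether look-back is really needed may depend on how the scanner is written. As a rough Python sketch (the function name `scan` and the token labels are invented here), a left-to-right scanner can stop at each "[" and peek at the two characters that follow, which is ordinary look-ahead:

```python
from typing import Iterator, Tuple


def scan(body: str) -> Iterator[Tuple[str, str]]:
    """Yield (kind, character) pairs for the text between the delimiters."""
    i = 0
    while i < len(body):
        ahead = body[i + 1:i + 3]   # two characters of look-ahead
        if body[i] == "[" and ahead == "`]":
            yield ("escaped", "[")
            i += 3
        elif body[i] == "[" and ahead == "]`":
            yield ("escaped", "]")
            i += 3
        else:
            yield ("plain", body[i])
            i += 1


# The scanner never revisits a character it has already emitted.
assert "".join(ch for _, ch in scan("abc[`]`[]`def")) == "abc[`]def"
```

This sketch only classifies characters; it does not check that the remaining plain brackets are properly nested.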

> abc[def
> [abc[`]def]
>
> abc]def
> [abc[]`def]
>
> abc[`]def
> [abc[`]`[]`def]
>
> abc[]`def
> [abc[`][]``def]
>
> The notation for "[`]" and "[]`" within a string is awkward,
> but is anticipated to be required only rarely. Most texts will
> contain brackets that are properly nested, and this was made
> to be easy.
>
> So, are there any problems with this specification I have missed?
> Strings that are impossible to encode or string literals whose
> interpretation is ambiguous? Cases where frequent strings are
> cumbersome to encode? TIA!
>

I'm really not sure why the nesting of strings is needed, assuming
(probably incorrectly) that's what is being done here, so I'd personally
eliminate the nesting, or change the delimiters, if not. That would
eliminate some or all of the need for escapes (like C) or string
concatenation (like BASIC). If you need an escape, use an escape, or
select different terminators to reduce/eliminate the need for escapes.

--
Things are only going to become worse for Joe Biden. His only chance
at salvation will come from the thing he hates the most: Donald Trump.
