RetroBBS - comp.lang.tcl - thread or fork for an network server, what is the better way ?

<nnd$6857ba2e$5daa124d@4b2b7e80c9126ac9>

https://www.rocksolidbbs.com/devel/article-flat.php?id=19451&group=comp.lang.tcl#19451

From: michael@niehren.de (Michael Niehren)
Subject: thread or fork for an network server, what is the better way ?
Newsgroups: comp.lang.tcl
Reply-To: michael@niehren.de
Date: Wed, 01 Jun 2022 08:38:08 +0200
User-Agent: KNode/0.10.9
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7Bit
Message-ID: <nnd$6857ba2e$5daa124d@4b2b7e80c9126ac9>
Organization: www.abavia.com
Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!feeder.usenetexpress.com!tr2.eu1.usenetexpress.com!94.232.112.244.MISMATCH!feed.abavia.com!abe004.abavia.com!abp001.abavia.com!reseller!not-for-mail
Lines: 26
Injection-Date: Wed, 01 Jun 2022 08:38:08 +0200
Injection-Info: reseller; mail-complaints-to="abuse@abavia.com"

by: Michael Niehren - Wed, 1 Jun 2022 06:38 UTC

Hi all,

I am currently running a Tcl network server under Linux that forks for
every new connection; the child process then handles the request.

Having seen ET's Tasks module here on the list, I'm thinking about
switching to threads, handling requests in a thread pool, in the hope of
getting speed improvements.

So, what do you think, which is the better way?
Can I expect speed improvements from a thread pool compared with forking
on every new incoming connection, and would they be big enough to justify
the effort of switching?

Currently I have one binary that includes all the procedures for handling
requests, and it is this binary that forks. As far as I know, if I switch to
threads, I have to import all my defined procedures into every newly started
thread, so I would have to split my binary into two parts. Is that right, or
is there a simple way to define all procedures of the currently running
process in a thread that this process starts?

best regards
Michael

Re: thread or fork for an network server, what is the better way ?

<t77p6h$3fe$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=19455&group=comp.lang.tcl#19455

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: rich@example.invalid (Rich)
Newsgroups: comp.lang.tcl
Subject: Re: thread or fork for an network server, what is the better way ?
Date: Wed, 1 Jun 2022 13:22:25 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 46
Message-ID: <t77p6h$3fe$1@dont-email.me>
References: <nnd$6857ba2e$5daa124d@4b2b7e80c9126ac9>
Injection-Date: Wed, 1 Jun 2022 13:22:25 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="133e24ee072356229ee0745be63d42de";
logging-data="3566"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+6DOVdgws1W43mgwyjT8z+"
User-Agent: tin/2.0.1-20111224 ("Achenvoir") (UNIX) (Linux/3.10.17 (x86_64))
Cancel-Lock: sha1:euopXMzRakG77ux2NWQgGpXgTNQ=

by: Rich - Wed, 1 Jun 2022 13:22 UTC

Michael Niehren <michael@niehren.de> wrote:
> Hi all,
>
> Having seen ET's Tasks module here on the list, I'm thinking about
> switching to threads, handling requests in a thread pool, in the hope
> of getting speed improvements.

The only way to really know is to put in the effort for a rewrite and
measure the difference.

However, on Linux the time difference between launching a new thread
and forking (at the C level) is minimal (unlike, say, Windows, where the
Windows equivalent to fork is an order of magnitude slower than launching
a thread). Because the difference is so small on Linux, Tcl interpreter
overhead will likely dominate both versions, to the extent that you see
little measurable difference in speed.

> So, what do you think, which is the better way?
> Can I expect speed improvements from a thread pool compared with forking
> on every new incoming connection, and would they be big enough to justify
> the effort of switching?

Without putting in the effort to rewrite, and then measuring, no one
can know. But given my paragraph above, I predict you'd not see a huge
difference, with one exception: if requirements change such that the
currently forked processes need to share a data structure. Shared
access to a common data structure will likely be much faster in Tcl
with the Thread package and its TSV (Thread Shared Variables) commands
than trying to share that data among forked processes.
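
To make the TSV point concrete, here is a minimal sketch, assuming the Thread package is installed. The shared array name `stats`, the key `hits`, and the pool size of 4 are made up for the illustration; nothing here comes from the original poster's server.

```tcl
package require Thread

# Shared counter visible to every thread in the process.
# "stats" and "hits" are hypothetical names chosen for this sketch.
tsv::set stats hits 0

# Fixed-size pool. The init command loads Thread in each worker so the
# tsv commands are certain to be available there.
set pool [tpool::create -minworkers 4 -maxworkers 4 \
              -initcmd {package require Thread}]

# Post 20 jobs; each one increments the shared counter.
set jobs {}
for {set i 0} {$i < 20} {incr i} {
    lappend jobs [tpool::post $pool {tsv::incr stats hits}]
}

# Wait for every job to finish and collect (here: discard) its result.
foreach job $jobs {
    tpool::wait $pool $job
    tpool::get $pool $job
}

set total [tsv::get stats hits]
tpool::release $pool
puts "hits = $total"
```

With forked processes the same counter would need some form of inter-process communication (pipes, sockets, files, or shared memory via an extension), which is the cost being contrasted above.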

> Currently I have one binary that includes all the procedures for handling
> requests, and it is this binary that forks. As far as I know, if I switch to
> threads, I have to import all my defined procedures into every newly started
> thread,

Thread pools allow you to supply an init script that is executed each
time a new pool member is spun up. It should initialize the thread
to the point that it is a productive member of the pool.

If you are using the raw thread::create call, then you have to do
whatever initialization (module loading, defining procs/objects, etc.)
is necessary yourself. thread::create just hands you a fresh interpreter
without any of the initialization/module loading that you might have
done in another thread. That said, thread::create does take a "script"
argument that is meant to be the "initialize this thread" script that
performs that work. But you do have to repeat the work with each new
thread.
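
A minimal sketch of both approaches, assuming the Thread package is available. The proc `handle_request` and its body are hypothetical stand-ins for real request-handling code. Note that a script passed to thread::create needs to end with thread::wait if the thread is supposed to stay alive and accept work afterwards.

```tcl
package require Thread

# Init script shared by both variants. handle_request is a hypothetical
# placeholder for the server's real procedures.
set init {
    proc handle_request {payload} {
        return [string toupper $payload]
    }
}

# Variant 1: thread pool. The -initcmd script runs once in every worker
# the pool starts, so each worker knows handle_request.
set pool [tpool::create -minworkers 2 -maxworkers 2 -initcmd $init]
set job [tpool::post $pool {handle_request "hello"}]
tpool::wait $pool $job
set pool_result [tpool::get $pool $job]
tpool::release $pool

# Variant 2: raw thread. Run the same init script, then enter the event
# loop with thread::wait so the thread can receive thread::send calls.
set tid [thread::create "$init\nthread::wait"]
set thread_result [thread::send $tid {handle_request "world"}]
thread::release $tid

puts "$pool_result $thread_result"
```

Keeping the procedures in one script (or one file that each thread sources) means the "binary" does not have to be split; the main thread and the workers can evaluate the same definitions.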

Re: thread or fork for an network server, what is the better way ?

<t78coi$tn$1@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=19461&group=comp.lang.tcl#19461

Path: i2pn2.org!i2pn.org!aioe.org!YN2ulY6LKp1eoOUw2OJ8ig.user.46.165.242.91.POSTED!not-for-mail
From: tclnews@rocketship1.me (et4)
Newsgroups: comp.lang.tcl
Subject: Re: thread or fork for an network server, what is the better way ?
Date: Wed, 1 Jun 2022 11:56:18 -0700
Organization: Aioe.org NNTP Server
Message-ID: <t78coi$tn$1@gioia.aioe.org>
References: <nnd$6857ba2e$5daa124d@4b2b7e80c9126ac9>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="951"; posting-host="YN2ulY6LKp1eoOUw2OJ8ig.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.7.0
X-Notice: Filtered by postfilter v. 0.9.2
Content-Language: en-US

by: et4 - Wed, 1 Jun 2022 18:56 UTC

On 5/31/2022 11:38 PM, Michael Niehren wrote:
> Hi all,
>
> Currently I have one binary that includes all the procedures for handling
> requests, and it is this binary that forks. As far as I know, if I switch to
> threads, I have to import all my defined procedures into every newly started
> thread, so I would have to split my binary into two parts. Is that right, or
> is there a simple way to define all procedures of the currently running
> process in a thread that this process starts?
>

The first question I'd ask is: are you faced with performance issues
now? Is your program running out of steam, or are you just looking to
improve something that's already working?

One difference between thread pools and tasks is that tpool has an
upper and lower bound on the number of threads in the pool. Tasks
allocate just one set at startup. While it would be possible to add more
or reduce the number of them, there is no support for that at present and
none likely in the future. That was one complication I decided was not
worth the trouble, but that's just my opinion.

If you decide to use tpool, you can set the upper and lower bounds to the
same number and it will not (afaik) allocate any more or kill off any of
the existing threads, and so there won't be any new importing of code into
a new thread, since you'll just be reusing the ones you have. Then it
should work like tasks. And what sort of importing do you think you are
going to need?

Tasks have a proc re-constructor, and it can take several patterns. If you
specify just * as one of the elements in the import list argument, it
will use [info procs *] and reconstruct each proc. The same goes for procs
in namespaces, so you could use name::* as an element. If you have
other inits to do, say TclOO, then you would have to import them
differently. I've often wondered if TclOO can be completely introspected
so it can be imported into a thread. I don't know enough about it, and I
personally don't use TclOO, so I can't speculate on that.
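
For readers who want to see the general idea, here is a sketch of rebuilding procs from introspection. It is not the Tasks module's actual code, only the standard technique built on [info procs], [info args], [info default] and [info body]. The helper name `procs_to_script` and the example proc `handle_add` are invented for the illustration, and the sketch only covers procs in the global namespace.

```tcl
package require Thread

# Return a script that, when evaluated, redefines every global proc
# whose name matches the glob pattern. (Hypothetical helper.)
proc procs_to_script {pattern} {
    set script ""
    foreach name [info procs $pattern] {
        set arglist {}
        foreach arg [info args $name] {
            if {[info default $name $arg def]} {
                # Argument with a default value: {name default}
                lappend arglist [list $arg $def]
            } else {
                lappend arglist [list $arg]
            }
        }
        append script [list proc $name $arglist [info body $name]] \n
    }
    return $script
}

# Example proc in the main thread (hypothetical).
proc handle_add {a {b 10}} {
    expr {$a + $b}
}

# A thread created without a script simply waits in its event loop.
set tid [thread::create]

# Ship the reconstructed definitions over, then call one of them.
thread::send $tid [procs_to_script handle_*]
set sum [thread::send $tid {handle_add 5}]
thread::release $tid

puts "handle_add 5 -> $sum"
```

Using a narrow pattern such as `handle_*` rather than `*` avoids also copying procs that Tcl itself defines in every interpreter. Procs in namespaces would additionally need the namespace created in the target thread first.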

With tasks, you can have a script variable, e.g. set script {...}, and
then specify -$script as one of the initializers. Tasks allow any number
of these, along with any number of wildcards that [info procs pattern]
can take. tpool has a single argument for that, but you could probably
easily combine several scripts into a single one.

I've not used ttrace, but it would appear that its purpose is similar,
though it seems to do other things as well.

As to performance, do you fork off a process for each connection or do
you keep them around for additional ones? What does each fork do? Do
they talk to each other?

As to resources, I've estimated the cost of a new thread by a rather
crude method: on 32-bit Windows, I could only create about 150 given the
2 GB address space limit. So, on the order of 10-20 MB per thread. You
could do some easy tpool tests. On 64-bit this likely won't matter, what
with cheap RAM these days.

I know that tsv is reasonably fast, because I've measured the amount of
time it takes to give a task work (and it does it via tsv), and it's on
the order of 50 microseconds, where a proc call is about 1 microsecond
(on my 4 GHz 4090k Intel chip). How reasonable this is depends on how much
work you do in each call. It would not be worthwhile to use tasks to
compute anything that can be done with a single proc call in, say, 100
microseconds or less. I also found that using a synchronous thread::send
was about 1/3 the cost of doing task calls.
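
Anyone wanting to reproduce this kind of comparison on their own machine can start from a sketch like the following, which uses Tcl's built-in [time] command. It assumes the Thread package is installed; the proc name `noop` and the iteration counts are arbitrary choices, and the numbers will differ from system to system.

```tcl
package require Thread

# An empty proc, defined in the main thread and in the worker thread.
proc noop {} { return }
set tid [thread::create {
    proc noop {} { return }
    thread::wait
}]

# [time] returns "N microseconds per iteration"; keep just the number.
set t_proc [lindex [time {noop} 100000] 0]
set t_send [lindex [time {thread::send $tid noop} 10000] 0]

thread::release $tid

puts "local proc call:          $t_proc microseconds"
puts "synchronous thread::send: $t_send microseconds"
```

The difference between the two figures is the per-call overhead of crossing threads; it only matters if the work done per request is of a similar size.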

One thing you might do if it won't cause your program to crash is to
have your forked processes simply bypass any workload. Sort of like
putting a return at the top of some proc you want to measure, and run it
both ways, one time to do the real job and another to reduce it to no
work, so you can measure the overhead.

This can work as long as you don't need to compute anything. Another way
is to just have a canned answer to simulate your workload.

Anyway, you likely need to know the cost of each connection vs. the cost
of what you do in each connection. With that, you can then probably tell
if it's worth switching to threads or tasks. If you do a lot of work in
each connection, I'd stay with what you've got, since it works. And as
Rich said, it will also depend on any inter-thread/process communication
you are doing, if any.

Re: thread or fork for an network server, what is the better way ?

<nnd$3df323fb$0c8cadc8@84b62bd6cf0a1a0b>

https://www.rocksolidbbs.com/devel/article-flat.php?id=19463&group=comp.lang.tcl#19463

From: michael@niehren.de (Michael Niehren)
Subject: Re: thread or fork for an network server, what is the better way ?
Newsgroups: comp.lang.tcl
Reply-To: michael@niehren.de
Date: Wed, 01 Jun 2022 23:23:56 +0200
References: <nnd$6857ba2e$5daa124d@4b2b7e80c9126ac9> <t78coi$tn$1@gioia.aioe.org>
User-Agent: KNode/0.10.9
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7Bit
Message-ID: <nnd$3df323fb$0c8cadc8@84b62bd6cf0a1a0b>
Organization: www.abavia.com
Path: i2pn2.org!i2pn.org!aioe.org!feeder1.feed.usenet.farm!feed.usenet.farm!feeder.usenetexpress.com!tr3.eu1.usenetexpress.com!94.232.112.246.MISMATCH!abe006.abavia.com!abp001.abavia.com!reseller!not-for-mail
Lines: 99
Injection-Date: Wed, 01 Jun 2022 23:23:56 +0200
Injection-Info: reseller; mail-complaints-to="abuse@abavia.com"

by: Michael Niehren - Wed, 1 Jun 2022 21:23 UTC

Hi ET and Rich,

many thanks for your remarks.

My network server with fork currently runs very well, without performance
issues, so there is no need to switch to threads or tasks. It was only a
thought for an improvement that came up as I read about your tasks module.

I think, if I find the time, I will take my netserver code with fork, slim it
down to a minimum and then measure 5 connections. After that I will try to
change my code to use tasks or threads and measure 5 connections again. That
way I can really compare both.

Re: thread or fork for an network server, what is the better way ?

<ygabkvbmq6o.fsf@akutech.de>

https://www.rocksolidbbs.com/devel/article-flat.php?id=19465&group=comp.lang.tcl#19465

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: ralfixx@gmx.de (Ralf Fassel)
Newsgroups: comp.lang.tcl
Subject: Re: thread or fork for an network server, what is the better way ?
Date: Thu, 02 Jun 2022 10:21:03 +0200
Lines: 12
Message-ID: <ygabkvbmq6o.fsf@akutech.de>
References: <nnd$6857ba2e$5daa124d@4b2b7e80c9126ac9>
<t78coi$tn$1@gioia.aioe.org> <nnd$3df323fb$0c8cadc8@84b62bd6cf0a1a0b>
Mime-Version: 1.0
Content-Type: text/plain
X-Trace: individual.net jfWkYqQmaYmgM4LMlQkXoQlP5J56A3O54hXSAx2qeC+kkH9Ag=
Cancel-Lock: sha1:vu97RakqPjHDUpVSc9ADuOulSZw= sha1:1zrg2KC3KjsMErJsz8kfBgZzOo0=
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux)

by: Ralf Fassel - Thu, 2 Jun 2022 08:21 UTC
