Subject                                       Author
* System crashing: I need help                root
+* Re: System crashing: I need help           Rich
|`* Re: System crashing: I need help          root
| `* Re: System crashing: I need help         Rich
|  `* Re: System crashing: I need help        root
|   `* Re: System crashing: I need help       Rich
|    `- Re: System crashing: I need help      root
`* Re: System crashing: I need help           Henrik Carlqvist
 `* Re: System crashing: I need help          root
  `- Re: System crashing: I need help         Rich

System crashing: I need help

<sr9mpb$378$1@dont-email.me>

https://www.rocksolidbbs.com/computers/article-flat.php?id=831&group=alt.os.linux.slackware#831

Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!news.freedyn.de!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: NoEMail@home.org (root)
Newsgroups: alt.os.linux.slackware
Subject: System crashing: I need help
Date: Fri, 7 Jan 2022 15:38:19 -0000 (UTC)
Organization: Linux Advocacy
Lines: 43
Message-ID: <sr9mpb$378$1@dont-email.me>
Injection-Date: Fri, 7 Jan 2022 15:38:19 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="f6d1677708a6ca7a65d1fa476ce2d526";
logging-data="3304"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19/19drlLNUxy/FcMlxt/pwZXytUyhB93s="
User-Agent: slrn/1.0.2 (Linux)
Cancel-Lock: sha1:cW8mLQlVEjAatT3fxS2oGHOwCdc=
Mail-Copies-To: nobody
 by: root - Fri, 7 Jan 2022 15:38 UTC

I have a server that runs Slack64 14.2 and has done so since
before 14.2. A few weeks ago the system started crashing.
For most of the crashes the kernel was still running
and would respond to pings, and there was a display
but the server would not accept keyboard or mouse input.

The system would run for a few days and crash again.

I swapped out the power supply with a brand new 750w unit.
The crashes continued.

I swapped out the motherboard/cpu/memory with one
from a working machine. The crashes continued.

I updated the 10 year old bios on the motherboard.
I tried different kernels.
I updated everything with slackpkg.
I updated Chrome to the latest version. Chrome runs all the time.

Only the computer case and NVidia graphics card remain
from the original system, and still the crashes persist.

When I got up this morning, the system had crashed
during the night. After rebooting I looked at the syslog
and I found a stream of:
rcu_sched self-detected stall on CPU
errors which continued until I rebooted the system

This seems to be related to a kernel overload as if
there were too many tasks for the system to keep up.
The cpu is Intel Core I7 3.4GHz with 16GB of memory.

Among other Call Traces in the syslog I see something
that must have originated within Chrome, and another
crash from kswapd, when I have no swap partition.

I am pretty much out of ideas and would appreciate
any suggestions.

Thanks.

Re: System crashing: I need help

<sr9qnn$v8f$1@dont-email.me>

https://www.rocksolidbbs.com/computers/article-flat.php?id=832&group=alt.os.linux.slackware#832

Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: rich@example.invalid (Rich)
Newsgroups: alt.os.linux.slackware
Subject: Re: System crashing: I need help
Date: Fri, 7 Jan 2022 16:45:43 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 34
Message-ID: <sr9qnn$v8f$1@dont-email.me>
References: <sr9mpb$378$1@dont-email.me>
Injection-Date: Fri, 7 Jan 2022 16:45:43 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="1065ac07e1ed0f386a89486036d2d4e0";
logging-data="32015"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/fNmiwavGpRjT4jUFx4okx"
User-Agent: tin/2.0.1-20111224 ("Achenvoir") (UNIX) (Linux/3.10.17 (x86_64))
Cancel-Lock: sha1:B5yBbkohSCnqBpQSJgmXClcq2Eo=
 by: Rich - Fri, 7 Jan 2022 16:45 UTC

root <NoEMail@home.org> wrote:
> The system would run for a few days and crash again.
>
> I swapped out the power supply with a brand new 750w unit.
> The crashes continued.
>
> I swapped out the motherboard/cpu/memory with one
> from a working machine. The crashes continued.
>
> I updated the 10 year old bios on the motherboard.
> I tried different kernels.
> I updated everything with slackpkg.
> I updated Chrome to the latest version. Chrome runs all the time.
>
> Only the computer case and NVidia graphics card remain
> from the original system, and still the crashes persist.

You've given us very little info with which to help. But...

Are you running the NVidia closed-source driver or the open source
driver?

If closed-source driver, then try the open source driver.

> Among other Call Traces in the syslog I see something
> that must have originated within Chrome, and another
> crash from kswapd, when I have no swap partition.

Hmm... How much RAM?

Chrome is a known memory hog, and if you have no swap, then anytime
chrome tries to grow beyond the free memory left in the system's ram
after everything else that is loaded, things will go bad very fast.
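
A rough way to check how close the box is to that situation, as a minimal sketch rather than a definitive tool: the Python below parses the text of /proc/meminfo and reports total memory, available memory, and configured swap. The function names `parse_meminfo` and `memory_summary` are made up for this illustration, and the fallback to MemFree is an assumption for kernels older than 3.14, which do not report MemAvailable.

```python
def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to their numeric values.

    Most fields are reported in kB; a few (such as HugePages_Total) are counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def memory_summary(info):
    """Pick out the fields that matter for an out-of-memory check (kB)."""
    total = info.get("MemTotal", 0)
    # MemAvailable appeared in kernel 3.14; fall back to MemFree without it.
    available = info.get("MemAvailable", info.get("MemFree", 0))
    return {
        "total_kb": total,
        "available_kb": available,
        "swap_total_kb": info.get("SwapTotal", 0),
        "available_pct": round(100.0 * available / total, 1) if total else 0.0,
    }
```

On the machine itself this would be called as `memory_summary(parse_meminfo(open("/proc/meminfo").read()))`, for example from a cron job that appends the result to a file. A swap_total_kb of 0 together with a steadily shrinking available_pct would be consistent with the no-swap scenario described above, though it would not prove it.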

Re: System crashing: I need help

<sr9r0o$pl6$2@dont-email.me>

https://www.rocksolidbbs.com/computers/article-flat.php?id=833&group=alt.os.linux.slackware#833

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: Henrik.Carlqvist@deadspam.com (Henrik Carlqvist)
Newsgroups: alt.os.linux.slackware
Subject: Re: System crashing: I need help
Date: Fri, 7 Jan 2022 16:50:32 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 40
Message-ID: <sr9r0o$pl6$2@dont-email.me>
References: <sr9mpb$378$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 7 Jan 2022 16:50:32 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="40460fc54f12cbb5fbed3edb5438a61f";
logging-data="26278"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+NID1Q8+DKSA0/EAk8MieM"
User-Agent: Pan/0.139 (Sexual Chocolate; GIT bf56508
git://git.gnome.org/pan2)
Cancel-Lock: sha1:Ea4fOFqFPpQMY2zrXo0F+xGHBMs=
 by: Henrik Carlqvist - Fri, 7 Jan 2022 16:50 UTC

On Fri, 07 Jan 2022 15:38:19 +0000, root wrote:
> Only the computer case and NVidia graphics card remain from the original
> system, and still the crashes persist.

From those two I would first try to replace the nVidia card. :-)

> When I got up this morning, the system had crashed during the night.
> After rebooting I looked at the syslog and I found a stream of:
> rcu_sched self-detected stall on CPU
> errors which continued until I rebooted the system

Could you in the syslog see anything special just before those messages
started flooding the syslog?

> This seems to be related to a kernel overload as if there were too many
> tasks for the system to keep up.

Do you have any other machines in the network? If so, you might be able
to use one of those to monitor your problematic machine.

Mostly, when a machine crashes it will lose the interesting last part of
system logs. With syslog configured to send logs to a log server those
interesting log messages can sometimes be saved.

Having snmpd running on the problematic machine and something like mrtg
monitor the machine might give useful graphs to inspect even though mrtg
only samples the machine every 5 minutes. On those graphs you can see if
any partition fills up, machine load, CPU usage, memory usage. If you
also provide data from lmsensors to snmpd you can make graphs of
temperatures and fan speeds. I wrote the project
https://sourceforge.net/projects/nvgpu-smi-snmp/ to also be able to
monitor nVidia GPU usage with snmp/mrtg.

Another thing to try when the system responds to ping but keyboard and
mouse have hung is to see if you can ssh into the system. Once logged in
you might get clues about what is going on by studying the output from
commands like dmesg and top.
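
For the question of what was logged just before the stall messages began, a small Python sketch along these lines might help. It is only an illustration: the function name `context_before_first_match` is invented here, and the log path in the usage note is an assumption about where this machine writes its syslog.

```python
from collections import deque


def context_before_first_match(lines, needle, context=20):
    """Return up to `context` lines preceding the first line containing `needle`.

    Returns None if `needle` never appears in `lines`.
    """
    recent = deque(maxlen=context)  # keeps only the most recent lines
    for line in lines:
        if needle in line:
            return list(recent)
        recent.append(line.rstrip("\n"))
    return None
```

It could be run as `context_before_first_match(open("/var/log/syslog"), "rcu_sched self-detected stall")`. Because only the first match is used, a log holding several crashes would need to be split per boot first.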

regards Henrik

Re: System crashing: I need help

<sr9vbj$3sl$1@dont-email.me>

https://www.rocksolidbbs.com/computers/article-flat.php?id=834&group=alt.os.linux.slackware#834

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: NoEMail@home.org (root)
Newsgroups: alt.os.linux.slackware
Subject: Re: System crashing: I need help
Date: Fri, 7 Jan 2022 18:04:35 -0000 (UTC)
Organization: Linux Advocacy
Lines: 49
Message-ID: <sr9vbj$3sl$1@dont-email.me>
References: <sr9mpb$378$1@dont-email.me> <sr9qnn$v8f$1@dont-email.me>
Injection-Date: Fri, 7 Jan 2022 18:04:35 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="f6d1677708a6ca7a65d1fa476ce2d526";
logging-data="3989"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18pX0CHwrml32HbkzKGztSy8p2gyqshC8M="
User-Agent: slrn/1.0.2 (Linux)
Cancel-Lock: sha1:1/Ghhg3fo/o+FwLoon5VVu7RFeg=
 by: root - Fri, 7 Jan 2022 18:04 UTC

Rich <rich@example.invalid> wrote:
> root <NoEMail@home.org> wrote:
>> The system would run for a few days and crash again.
>>
>> I swapped out the power supply with a brand new 750w unit.
>> The crashes continued.
>>
>> I swapped out the motherboard/cpu/memory with one
>> from a working machine. The crashes continued.
>>
>> I updated the 10 year old bios on the motherboard.
>> I tried different kernels.
>> I updated everything with slackpkg.
>> I updated Chrome to the latest version. Chrome runs all the time.
>>
>> Only the computer case and NVidia graphics card remain
>> from the original system, and still the crashes persist.
>
> You've given us very little info with which to help. But...
>
> Are you running the NVidia closed-source driver or the open source
> driver?
>
> If closed-source driver, then try the open source driver.
>
>> Among other Call Traces in the syslog I see something
>> that must have originated within Chrome, and another
>> crash from kswapd, when I have no swap partition.
>
> Hmm... How much RAM?
>
> Chrome is a known memory hog, and if you have no swap, then anytime
> chrome tries to grow beyond the free memory left in the system's ram
> after everything else that is loaded, things will go bad very fast.
>

Thanks for responding.

I am running the NVidia driver. Nouveau, the alternative, does not
support driving two different displays, did not support HDMI sound
when I last checked, and may not support 4K video.

The system has 16GB of ram. Up until a few weeks ago the system
ran 24/7 and only was taken down to change SATA drives. Uptime
of several months was the norm.

I am beginning to think that this isn't a hardware problem.
The server functions at two levels: to gather, clean-up, and
source online data, and as an A/V server.

Re: System crashing: I need help

<sr9vvd$3sl$2@dont-email.me>

https://www.rocksolidbbs.com/computers/article-flat.php?id=835&group=alt.os.linux.slackware#835

Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: NoEMail@home.org (root)
Newsgroups: alt.os.linux.slackware
Subject: Re: System crashing: I need help
Date: Fri, 7 Jan 2022 18:15:09 -0000 (UTC)
Organization: Linux Advocacy
Lines: 58
Message-ID: <sr9vvd$3sl$2@dont-email.me>
References: <sr9mpb$378$1@dont-email.me> <sr9r0o$pl6$2@dont-email.me>
Injection-Date: Fri, 7 Jan 2022 18:15:09 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="f6d1677708a6ca7a65d1fa476ce2d526";
logging-data="3989"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+V0Qz6CRVJG9WU5nJ65d9N6TDzzElpJIQ="
User-Agent: slrn/1.0.2 (Linux)
Cancel-Lock: sha1:s6EA13fWKU11qXX30rwmzGrBCgU=
 by: root - Fri, 7 Jan 2022 18:15 UTC

Henrik Carlqvist <Henrik.Carlqvist@deadspam.com> wrote:
> On Fri, 07 Jan 2022 15:38:19 +0000, root wrote:
>> Only the computer case and NVidia graphics card remain from the original
>> system, and still the crashes persist.
>
> From those two I would first try to replace the nVidia card. :-)
>
>> When I got up this morning, the system had crashed during the night.
>> After rebooting I looked at the syslog and I found a stream of:
>> rcu_sched self-detected stall on CPU
>> errors which continued until I rebooted the system
>
> Could you in the syslog see anything special just before those messages
> started flooding the syslog?
>
>> This seems to be related to a kernel overload as if there were too many
>> tasks for the system to keep up.
>
> Do you have any other machines in the network? If so, you might be able
> to use one of those to monitor your problematic machine.
>
> Mostly, when a machine crashes it will lose the interesting last part of
> system logs. With syslog configured to send logs to a log server those
> interesting log messages can sometimes be saved.
>
> Having snmpd running on the problematic machine and something like mrtg
> monitor the machine might give useful graphs to inspect even though mrtg
> only samples the machine every 5 minutes. On those graphs you can see if
> any partition fills up, machine load, CPU usage, memory usage. If you
> also provide data from lmsensors to snmpd you can make graphs of
> temperatures and fan speeds. I wrote the project
> https://sourceforge.net/projects/nvgpu-smi-snmp/ to also be able to
> monitor nVidia GPU usage with snmp/mrtg.
>
> Another thing to try when the system responds to ping but keyboard and
> mouse have hung is to see if you can ssh into the system. Once logged in
> you might get clues about what is going on by studying the output from
> commands like dmesg and top.
>
> regards Henrik
>

Thanks for responding. I was able to ssh in and verify that the kernel
was still running at the last "crash". As I said, syslog revealed
that the kernel was overloaded. I have been focussed entirely on
hardware, now I think I have to look into software. I'm not
entirely sure, but I think the trouble started after a change was
made to my data gathering/cleaning software. Some routines to
process the online data were written by me in C many years ago.
About the time the problems started, I replaced what I had written with
software written by an accomplished JavaScript programmer. That
opened me up to vulnerabilities in the JavaScript as well as in node.js.

For the time being, I have shut down the data gathering/processing
tasks to see if the problems go away. I will give that a few
days to see if the problems persist. In case the problems
persist I will swap out the graphics card. The Nouveau driver
is a non-starter.

Re: System crashing: I need help

<sra105$dq4$1@dont-email.me>

https://www.rocksolidbbs.com/computers/article-flat.php?id=836&group=alt.os.linux.slackware#836

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: rich@example.invalid (Rich)
Newsgroups: alt.os.linux.slackware
Subject: Re: System crashing: I need help
Date: Fri, 7 Jan 2022 18:32:37 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 81
Message-ID: <sra105$dq4$1@dont-email.me>
References: <sr9mpb$378$1@dont-email.me> <sr9qnn$v8f$1@dont-email.me> <sr9vbj$3sl$1@dont-email.me>
Injection-Date: Fri, 7 Jan 2022 18:32:37 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="1065ac07e1ed0f386a89486036d2d4e0";
logging-data="14148"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/dw/PobYnohSbhaMiJyx8T"
User-Agent: tin/2.0.1-20111224 ("Achenvoir") (UNIX) (Linux/3.10.17 (x86_64))
Cancel-Lock: sha1:8Y/vuh4vZyMTem0i3mF5MIv3tlA=
 by: Rich - Fri, 7 Jan 2022 18:32 UTC

root <NoEMail@home.org> wrote:
> Rich <rich@example.invalid> wrote:
>> root <NoEMail@home.org> wrote:
>>> The system would run for a few days and crash again.
>>>
>>> I swapped out the power supply with a brand new 750w unit.
>>> The crashes continued.
>>>
>>> I swapped out the motherboard/cpu/memory with one
>>> from a working machine. The crashes continued.
>>>
>>> I updated the 10 year old bios on the motherboard.
>>> I tried different kernels.
>>> I updated everything with slackpkg.
>>> I updated Chrome to the latest version. Chrome runs all the time.
>>>
>>> Only the computer case and NVidia graphics card remain
>>> from the original system, and still the crashes persist.
>>
>> You've given us very little info with which to help. But...
>>
>> Are you running the NVidia closed-source driver or the open source
>> driver?
>>
>> If closed-source driver, then try the open source driver.
>>
>>> Among other Call Traces in the syslog I see something
>>> that must have originated within Chrome, and another
>>> crash from kswapd, when I have no swap partition.
>>
>> Hmm... How much RAM?
>>
>> Chrome is a known memory hog, and if you have no swap, then anytime
>> chrome tries to grow beyond the free memory left in the system's ram
>> after everything else that is loaded, things will go bad very fast.
>>
>
> Thanks for responding.
>
> I am running the NVidia driver. Nouveau, the alternative, does not
> support driving two different displays,

I am presently typing this on a system running Nouveau with an Nvidia
card driving two displays, so Nouveau does support multiple displays.

> The system has 16GB of ram. Up until a few weeks ago the system
> ran 24/7 and only was taken down to change SATA drives. Uptime
> of several months was the norm.

But, a Chrome process consuming 12+GB of ram is not unheard of. You
could, possibly, still have an "out of memory, with no swap" situation.

> I am beginning to think that this isn't a hardware problem. The
> server functions at two levels: to gather, clean-up, and source
> online data, and as an A/V server.

You state you've replaced everything except the Nvidia card. So if it
is hardware, the only common hardware is the Nvidia card itself.
Which, of course, because it is in common, /could/ be the culprit
(i.e., you have not ruled it out, neither have you confirmed it to be
the culprit).

One way this /could/ be a hardware problem, and only just now manifest
itself, is:

1) filter capacitors on the on-board voltage generators for the Nvidia
card have been slowly degrading, and have now reached the point
where their filtering is allowing just a bit too much ripple through
- which would cause a "runs fine for years, then starts failing"
situation.

2) cooling for the nvidia card has gotten poor (i.e., dust clogging
card) - which would also cause a "runs for years, then starts
failing situation".

Both are guesses. And it could be an out of memory with no swap issue
(which would not be the Nvidia card). Which is also a guess.

We depend upon you to test, and tell us which guess is ruled out.
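
One possible way to test the out-of-memory guess from the logs, sketched in Python: when the kernel's OOM killer runs it typically logs lines containing "invoked oom-killer" and "Out of memory: Kill process" (or "Killed process" on newer kernels). The exact wording varies between kernel versions, so the pattern below is an assumption, and `find_oom_events` is a name made up for this example.

```python
import re

# Phrases the Linux kernel typically logs when the OOM killer runs.
# The wording differs between kernel versions, so treat this as a guess.
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory: Kill(ed)? process", re.IGNORECASE
)


def find_oom_events(lines):
    """Return the log lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]
```

Running `find_oom_events(open("/var/log/syslog"))` (the path is an assumption) and getting an empty list would weaken the out-of-memory guess, but would not rule it out, since a hung machine may never get its last messages onto disk.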

Re: System crashing: I need help

<sra1hk$dq4$2@dont-email.me>

https://www.rocksolidbbs.com/computers/article-flat.php?id=837&group=alt.os.linux.slackware#837

Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: rich@example.invalid (Rich)
Newsgroups: alt.os.linux.slackware
Subject: Re: System crashing: I need help
Date: Fri, 7 Jan 2022 18:41:56 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 18
Message-ID: <sra1hk$dq4$2@dont-email.me>
References: <sr9mpb$378$1@dont-email.me> <sr9r0o$pl6$2@dont-email.me> <sr9vvd$3sl$2@dont-email.me>
Injection-Date: Fri, 7 Jan 2022 18:41:56 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="1065ac07e1ed0f386a89486036d2d4e0";
logging-data="14148"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+pmyq651ToD4JPO1oHplm2"
User-Agent: tin/2.0.1-20111224 ("Achenvoir") (UNIX) (Linux/3.10.17 (x86_64))
Cancel-Lock: sha1:+DFPQj9gjJll1AvME52kiVlvPsk=
 by: Rich - Fri, 7 Jan 2022 18:41 UTC

root <NoEMail@home.org> wrote:
> Thanks for responding. I was able to ssh in and verify that the
> kernel was still running at the last "crash". As I said, syslog
> revealed that the kernel was overloaded. I have been focussed
> entirely on hardware, now I think I have to look into software. I'm
> not entirely sure, but I think the trouble started after a change was
> made to my data gathering/cleaning software. Some routines to
> process the online data were written by me in C many years ago.
> About the time the problems started, I replaced what I had written with
> software written by an accomplished JavaScript programmer. That opened
> me up to vulnerabilities in the JavaScript as well as in node.js.

Hmm, this would give more weight to an "out of memory -- with no swap"
situation being what you are seeing. A JS variant, running in Node, of
your C "data gathering" routines would likely be much more memory
hungry, and that combined with the memory hog that is chrome, might
have pushed the system past its present memory amount.
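
To see whether Node or Chrome really is the big consumer, something like the following Python sketch could be left running periodically. It reads the VmRSS line from each /proc/<pid>/status file and lists the largest processes. The names `parse_status` and `top_memory_users` are invented for this illustration, and the `proc_root` parameter exists only so the function can be pointed at a test directory.

```python
import os


def parse_status(text):
    """Extract (name, rss_kb) from the text of /proc/<pid>/status.

    Kernel threads have no VmRSS line; they are reported with rss_kb == 0.
    """
    name, rss_kb = "?", 0
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            rss_kb = int(line.split()[1])
    return name, rss_kb


def top_memory_users(limit=10, proc_root="/proc"):
    """Return the `limit` largest processes as (rss_kb, pid, name) tuples."""
    usage = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                name, rss_kb = parse_status(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        usage.append((rss_kb, int(entry), name))
    return sorted(usage, reverse=True)[:limit]
```

Note that Chrome runs as many separate processes, so its total footprint is the sum over every entry named chrome, not the single largest one. Resident set sizes also count shared pages more than once, so the sum is an upper estimate.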

Re: System crashing: I need help

<sra69o$pd8$1@dont-email.me>

https://www.rocksolidbbs.com/computers/article-flat.php?id=838&group=alt.os.linux.slackware#838

Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: NoEMail@home.org (root)
Newsgroups: alt.os.linux.slackware
Subject: Re: System crashing: I need help
Date: Fri, 7 Jan 2022 20:03:05 -0000 (UTC)
Organization: Linux Advocacy
Lines: 41
Message-ID: <sra69o$pd8$1@dont-email.me>
References: <sr9mpb$378$1@dont-email.me> <sr9qnn$v8f$1@dont-email.me>
<sr9vbj$3sl$1@dont-email.me> <sra105$dq4$1@dont-email.me>
Injection-Date: Fri, 7 Jan 2022 20:03:05 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="f6d1677708a6ca7a65d1fa476ce2d526";
logging-data="26024"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/8aV1wP/KXn9OMVlT+28GMOOLWfiEM9H0="
User-Agent: slrn/1.0.2 (Linux)
Cancel-Lock: sha1:qPpbsSIGOXwpC4Bm98m1sFqJXiE=
 by: root - Fri, 7 Jan 2022 20:03 UTC

Rich <rich@example.invalid> wrote:
> But, a Chrome process consuming 12+GB of ram is not unheard of. You
> could, possibly, still have an "out of memory, with no swap" situation.
>
>
> You state you've replaced everything except the Nvidia card. So if it
> is hardware, the only common hardware is the Nvidia card itself.
> Which, of course, because it is in common, /could/ be the culprit
> (i.e., you have not ruled it out, neither have you confirmed it to be
> the culprit).
>
> One way this /could/ be a hardware problem, and only just now manifest
> itself, is:
>
> 1) filter capacitors on the on-board voltage generators for the Nvidia
> card have been slowly degrading, and have now reached the point
> where their filtering is allowing just a bit too much ripple through
> - which would cause a "runs fine for years, then starts failing"
> situation.

>
> 2) cooling for the nvidia card has gotten poor (i.e., dust clogging
> card) - which would also cause a "runs for years, then starts
> failing situation".

The video card is only 2 years old. It is a fanless card with a
large finned heat sink.

>
> Both are guesses. And it could be an out of memory with no swap issue
> (which would not be the Nvidia card). Which is also a guess.
>
> We depend upon you to test, and tell us which guess is ruled out.
>
>

Thanks for responding. I am currently working on the hypothesis that
it isn't a hardware problem. I am looking into the software side.

Re: System crashing: I need help

<sralce$8hj$1@dont-email.me>

https://www.rocksolidbbs.com/computers/article-flat.php?id=839&group=alt.os.linux.slackware#839

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: rich@example.invalid (Rich)
Newsgroups: alt.os.linux.slackware
Subject: Re: System crashing: I need help
Date: Sat, 8 Jan 2022 00:20:30 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 42
Message-ID: <sralce$8hj$1@dont-email.me>
References: <sr9mpb$378$1@dont-email.me> <sr9qnn$v8f$1@dont-email.me> <sr9vbj$3sl$1@dont-email.me> <sra105$dq4$1@dont-email.me> <sra69o$pd8$1@dont-email.me>
Injection-Date: Sat, 8 Jan 2022 00:20:30 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="624459f164b872a06591d303cf203435";
logging-data="8755"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/Lxyb9/n7e45MwFMfx+NWo"
User-Agent: tin/2.0.1-20111224 ("Achenvoir") (UNIX) (Linux/3.10.17 (x86_64))
Cancel-Lock: sha1:ZZGo5IgdvOj7Ki9z4zdLyRk9rlM=
 by: Rich - Sat, 8 Jan 2022 00:20 UTC

root <NoEMail@home.org> wrote:
> Rich <rich@example.invalid> wrote:
>> But, a Chrome process consuming 12+GB of ram is not unheard of. You
>> could, possibly, still have an "out of memory, with no swap" situation.
>>
>>
>> You state you've replaced everything except the Nvidia card. So if it
>> is hardware, the only common hardware is the Nvidia card itself.
>> Which, of course, because it is in common, /could/ be the culprit
>> (i.e., you have not ruled it out, neither have you confirmed it to be
>> the culprit).
>>
>> One way this /could/ be a hardware problem, and only just now manifest
>> itself, is:
>>
>> 1) filter capacitors on the on-board voltage generators for the Nvidia
>> card have been slowly degrading, and have now reached the point
>> where their filtering is allowing just a bit too much ripple through
>> - which would cause a "runs fine for years, then starts failing"
>> situation.
>>
>> 2) cooling for the nvidia card has gotten poor (i.e., dust clogging
>> card) - which would also cause a "runs for years, then starts
>> failing situation".
>
> The video card is only 2 years old. It is a fanless card with a
> large finned heat sink.

In that case, does it get enough airflow inside the case for its
cooling needs? A clogged PSU fan, reducing airflow through the box in
total, would also reduce airflow for the fanless card.

>> Both are guesses. And it could be an out of memory with no swap issue
>> (which would not be the Nvidia card). Which is also a guess.
>>
>> We depend upon you to test, and tell us which guess is ruled out.
>
> Thanks for responding. I am currently working on the hypothesis that
> it isn't a hardware problem. I am looking into the software side.

Your other post about switching to a node.js application from your own
C code implies that this switch might be the cause.

Re: System crashing: I need help

<srb3dl$oep$1@dont-email.me>

https://www.rocksolidbbs.com/computers/article-flat.php?id=840&group=alt.os.linux.slackware#840

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: NoEMail@home.org (root)
Newsgroups: alt.os.linux.slackware
Subject: Re: System crashing: I need help
Date: Sat, 8 Jan 2022 04:20:05 -0000 (UTC)
Organization: Linux Advocacy
Lines: 55
Message-ID: <srb3dl$oep$1@dont-email.me>
References: <sr9mpb$378$1@dont-email.me> <sr9qnn$v8f$1@dont-email.me>
<sr9vbj$3sl$1@dont-email.me> <sra105$dq4$1@dont-email.me>
<sra69o$pd8$1@dont-email.me> <sralce$8hj$1@dont-email.me>
Injection-Date: Sat, 8 Jan 2022 04:20:05 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="d2286fa10e6b8e2d0f1472cf43769645";
logging-data="25049"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19j1f30AD6RJkkTyzU1HIptjUEPcLN9WE0="
User-Agent: slrn/1.0.2 (Linux)
Cancel-Lock: sha1:Gnq+Pj7EovkGuTP1DseSx7m+2UQ=
 by: root - Sat, 8 Jan 2022 04:20 UTC

Rich <rich@example.invalid> wrote:
> root <NoEMail@home.org> wrote:
>> Rich <rich@example.invalid> wrote:
>>> But, a Chrome process consuming 12+GB of ram is not unheard of. You
>>> could, possibly, still have an "out of memory, with no swap" situation.
>>>
>>>
>>> You state you've replaced everything except the Nvidia card. So if it
>>> is hardware, the only common hardware is the Nvidia card itself.
>>> Which, of course, because it is in common, /could/ be the culprit
>>> (i.e., you have not ruled it out, neither have you confirmed it to be
>>> the culprit).
>>>
>>> One way this /could/ be a hardware problem, and only just now manifest
>>> itself, is:
>>>
>>> 1) filter capacitors on the on-board voltage generators for the Nvidia
>>> card have been slowly degrading, and have now reached the point
>>> where their filtering is allowing just a bit too much ripple through
>>> - which would cause a "runs fine for years, then starts failing"
>>> situation.
>>>
>>> 2) cooling for the nvidia card has gotten poor (i.e., dust clogging
>>> card) - which would also cause a "runs for years, then starts
>>> failing situation".
>>
>> The video card is only 2 years old. It is a fanless card with a
>> large finned heat sink.
>
> In that case, does it get enough airflow inside the case for its
> cooling needs? A clogged PSU fan, reducing airflow through the box in
> total, would also reduce airflow for the fanless card.

The box has four 4-inch fans and one 3-inch fan, along
with the cpu fan. During the day the cpu temps stay about
37 degrees.

>
>>> Both are guesses. And it could be an out of memory with no swap issue
>>> (which would not be the Nvidia card). Which is also a guess.
>>>
>>> We depend upon you to test, and tell us which guess is ruled out.
>>
>> Thanks for responding. I am currently working on the hypothesis that
>> it isn't a hardware problem. I am looking into the software side.
>
> Your other post about switching to a node.js application from your own
> C code implies that this switch might be the cause.

I'm looking into that possibility even as we speak. I have disabled
the .js code and now have to wait for the system to crash. It
may take several days.

Thanks for following.
