poor performance while processing a file one byte a time

<ssbqrm$46r$1@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=416&group=comp.lang.php#416

Path: i2pn2.org!i2pn.org!aioe.org!299gYy2nqWB43X4cCBV6zg.user.46.165.242.75.POSTED!not-for-mail
From: mateusz@xyz.invalid (Mateusz Viste)
Newsgroups: comp.lang.php
Subject: poor performance while processing a file one byte a time
Date: Thu, 20 Jan 2022 15:16:22 +0100
Organization: . . .
Message-ID: <ssbqrm$46r$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="4315"; posting-host="299gYy2nqWB43X4cCBV6zg.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2

by: Mateusz Viste - Thu, 20 Jan 2022 14:16 UTC

Hello,

I am processing some files using PHP. Basically, I read every byte of
the file and perform a simple operation on it to compute a sum.

My initial implementation was in C, but now I am trying to redo the
same in PHP. This is what my PHP code looks like:

function fn($fname) {
  $fd = fopen($fname, 'rb');
  if ($fd === false) return(0);

  $result = 0;

  while (!feof($fd)) {
    $buff = fread($fd, 1024 * 1024);

    foreach (str_split($buff) as $b) {
      $result += ord($b);
      $result &= 0xffff;
    }
  }

  fclose($fd);
  return($result);
}

It works, but it is really slow (approximately 100x slower than the
original C code). I know that I should not expect much performance
from interpreted PHP code, but still - is there any trick I could use to
speed this up?

I have also tried to replace str_split() and ord() with unpack('C*'),
but it was even slower. Anything else I could try?

Mateusz

Re: poor performance while processing a file one byte a time

<j4tfnoFu1qqU1@mid.individual.net>

https://www.rocksolidbbs.com/devel/article-flat.php?id=417&group=comp.lang.php#417

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: jpstewart@personalprojects.net (John-Paul Stewart)
Newsgroups: comp.lang.php
Subject: Re: poor performance while processing a file one byte a time
Date: Thu, 20 Jan 2022 11:08:23 -0500
Lines: 51
Message-ID: <j4tfnoFu1qqU1@mid.individual.net>
References: <ssbqrm$46r$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Trace: individual.net 55I63kZtpJCUvOTF5Uu7LASBh1Wud3KwHX2/6gwyQKKLlNpIaH
Cancel-Lock: sha1:pPRQEVeqYvL0mCnRVMjaS1QBVPg=
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Content-Language: en-CA
In-Reply-To: <ssbqrm$46r$1@gioia.aioe.org>

by: John-Paul Stewart - Thu, 20 Jan 2022 16:08 UTC

On 2022-01-20 09:16, Mateusz Viste wrote:
> Hello,
>
> I am processing some files using PHP. Basically, I read every byte of
> the file and perform a simple operation on it to compute a sum.
>
> My initial implementation was in C, but now I am trying to redo the
> same in PHP. This is what my PHP code looks like:
>
>
> function fn($fname) {
>   $fd = fopen($fname, 'rb');
>   if ($fd === false) return(0);
>
>   $result = 0;
>
>   while (!feof($fd)) {
>     $buff = fread($fd, 1024 * 1024);
>
>     foreach (str_split($buff) as $b) {
>       $result += ord($b);
>       $result &= 0xffff;
>     }
>   }
>
>   fclose($fd);
>   return($result);
> }
>
> It works, but it is really slow (approximately 100x slower than the
> original C code). I know that I should not expect much performance
> from interpreted PHP code, but still - is there any trick I could use to
> speed this up?

Your foreach(str_split) line is an obvious place to start. str_split()
creates an array from a string. In your case, the input buffer is
1024*1024 bytes long, so you're splitting that megabyte string and
(re-)creating an array of more than a million elements for _each
iteration of the loop_. (Which will be a million+ times.) Why? The
very first thing you should consider is pulling that out and doing it
just once:

$buff = fread($fd, 1024 * 1024);
$whatever = str_split($buff);
foreach ($whatever as $b)
....

(There's nothing specific to PHP about that advice either. It's equally
applicable to C, although modern C compilers _may_ make that
optimization for you.)

Re: poor performance while processing a file one byte a time

<ssc2a3$1v7f$1@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=418&group=comp.lang.php#418

Path: i2pn2.org!i2pn.org!aioe.org!299gYy2nqWB43X4cCBV6zg.user.46.165.242.75.POSTED!not-for-mail
From: mateusz@xyz.invalid (Mateusz Viste)
Newsgroups: comp.lang.php
Subject: Re: poor performance while processing a file one byte a time
Date: Thu, 20 Jan 2022 17:23:31 +0100
Organization: . . .
Message-ID: <ssc2a3$1v7f$1@gioia.aioe.org>
References: <ssbqrm$46r$1@gioia.aioe.org>
<j4tfnoFu1qqU1@mid.individual.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="64751"; posting-host="299gYy2nqWB43X4cCBV6zg.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2

by: Mateusz Viste - Thu, 20 Jan 2022 16:23 UTC

On Thu, 20 Jan 2022 11:08:23 -0500
John-Paul Stewart <jpstewart@personalprojects.net> wrote:

> > function fn($fname) {
> >   $fd = fopen($fname, 'rb');
> >   if ($fd === false) return(0);
> >
> >   $result = 0;
> >
> >   while (!feof($fd)) {
> >     $buff = fread($fd, 1024 * 1024);
> >
> >     foreach (str_split($buff) as $b) {
> >       $result += ord($b);
> >       $result &= 0xffff;
> >     }
> >   }
> >
> >   fclose($fd);
> >   return($result);
> > }
>
> Your foreach(str_split) line is an obvious place to start.
> str_split() creates an array from a string. In your case, the input
> buffer is 1024*1024 bytes long, so you're splitting that megabyte
> string and (re-)creating an array of more than a million elements for
> _each iteration of the loop_. (Which will be a million+ times.)

Are you really sure about that? My understanding is that the foreach()
argument is processed only once... Isn't that the case?

> Why? The very first thing you should consider is pulling that out
> and doing it just once:
>
> $buff = fread($fd, 1024 * 1024);
> $whatever = str_split($buff);
> foreach ($whatever as $b)

Initially I had such a version of the code, but it threw the following
error on the str_split() line:

"PHP Fatal error: Allowed memory size of 134217728 bytes exhausted
(tried to allocate 4096 bytes)"

I reduced the fread() buffer to 64K and the error went away. But the
speed is still the same. The code is now:

while (!feof($fd)) {
  $buffstr = fread($fd, 64 * 1024);
  $buffarr = str_split($buffstr);

  foreach ($buffarr as $b) {
    $result += ord($b);
    $result &= 0xffff;
  }
}
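
The memory error is plausible given how much larger a PHP array is than the string it was split from. A rough way to observe this is sketched below; the helper name str_split_overhead() is hypothetical, and the exact figure depends on the PHP version and build, so no particular number is claimed.

```php
<?php
// Hypothetical helper: reports how many bytes memory_get_usage() grows by
// when a string of $size bytes is split into a one-element-per-byte array.
function str_split_overhead(int $size): int {
    $buff = str_repeat("\x41", $size);   // $size bytes of test data
    $before = memory_get_usage();
    $arr = str_split($buff);             // one array element per byte
    $after = memory_get_usage();
    return $after - $before;
}

$size = 64 * 1024;
printf("%d bytes of string -> %d bytes of array\n",
       $size, str_split_overhead($size));
```

Each array element carries per-element bookkeeping on top of the single byte it represents, which is why shrinking the fread() buffer lowers the peak memory use.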

> (There's nothing specific to PHP about that advice either. It's
> equally applicable to C, although modern C compilers _may_ make that
> optimization for you.)

That's hardly applicable in C, since there is no "foreach" in C. There
is a "for", and its initialization argument is processed exactly once.

Mateusz

Re: poor performance while processing a file one byte a time

<ssc33f$1v7f$2@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=419&group=comp.lang.php#419

Path: i2pn2.org!i2pn.org!aioe.org!299gYy2nqWB43X4cCBV6zg.user.46.165.242.75.POSTED!not-for-mail
From: mateusz@xyz.invalid (Mateusz Viste)
Newsgroups: comp.lang.php
Subject: Re: poor performance while processing a file one byte a time
Date: Thu, 20 Jan 2022 17:37:03 +0100
Organization: . . .
Message-ID: <ssc33f$1v7f$2@gioia.aioe.org>
References: <ssbqrm$46r$1@gioia.aioe.org>
<j4tfnoFu1qqU1@mid.individual.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="64751"; posting-host="299gYy2nqWB43X4cCBV6zg.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2

by: Mateusz Viste - Thu, 20 Jan 2022 16:37 UTC

On Thu, 20 Jan 2022 11:08:23 -0500
John-Paul Stewart <jpstewart@personalprojects.net> wrote:

> Your foreach(str_split) line is an obvious place to start.
> str_split() creates an array from a string. In your case, the input
> buffer is 1024*1024 bytes long, so you're splitting that megabyte
> string and (re-)creating an array of more than a million elements for
> _each iteration of the loop_. (Which will be a million+ times.)

Okay, I have checked it, and I can confirm now that you are mistaken in
your belief. See this program:

<?php

function getArr() {
  echo "getArr() call\n";
  return array(1, 2, 3);
}

foreach (getArr() as $i) {
  echo "{$i}\n";
}

And here is its output:

$ php t.php
getArr() call
1
2
3

The foreach() initialization is clearly processed only once.

Any other ideas?

Mateusz

Re: poor performance while processing a file one byte a time

<j4tkvvFikcU1@mid.individual.net>

https://www.rocksolidbbs.com/devel/article-flat.php?id=420&group=comp.lang.php#420

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: jpstewart@personalprojects.net (John-Paul Stewart)
Newsgroups: comp.lang.php
Subject: Re: poor performance while processing a file one byte a time
Date: Thu, 20 Jan 2022 12:38:06 -0500
Lines: 12
Message-ID: <j4tkvvFikcU1@mid.individual.net>
References: <ssbqrm$46r$1@gioia.aioe.org> <j4tfnoFu1qqU1@mid.individual.net>
<ssc2a3$1v7f$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Trace: individual.net sN+oASXejI9B7307piMxkA7zm65+FqgHKwdxu5wnLOSgPbqqT5
Cancel-Lock: sha1:DbCxj1g+MGqvhQwqKaIgDUw89Wc=
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Content-Language: en-CA
In-Reply-To: <ssc2a3$1v7f$1@gioia.aioe.org>

by: John-Paul Stewart - Thu, 20 Jan 2022 17:38 UTC

On 2022-01-20 11:23, Mateusz Viste wrote:
>
>> (There's nothing specific to PHP about that advice either. It's
>> equally applicable to C, although modern C compilers _may_ make that
>> optimization for you.)
>
> That's hardly applicable in C, since there is no "foreach" in C. There
> is a "for", and its initialization argument is processed exactly once.

I was speaking more broadly about pulling invariant code out of any and
all loops regardless of where it appears in said loop, rather than
relying on the interpreter or compiler to handle it for you.
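
As a small illustration of that general point in PHP: unlike the subject of a foreach, the condition of a for loop is re-evaluated on every iteration, so a call placed there runs repeatedly unless it is hoisted. This is only a sketch; counted_len() and count_length_calls() are hypothetical helpers that exist to count the calls.

```php
<?php
// Hypothetical helper: strlen() wrapper that counts how often it is called.
function counted_len(string $s, int &$calls): int {
    $calls++;
    return strlen($s);
}

// Returns [calls with the call left in the loop condition,
//          calls with the call hoisted out of the loop].
function count_length_calls(string $s): array {
    $inLoop = 0;
    for ($i = 0; $i < counted_len($s, $inLoop); $i++) {
        // loop body intentionally empty
    }

    $hoisted = 0;
    $len = counted_len($s, $hoisted);    // evaluated once, before the loop
    for ($i = 0; $i < $len; $i++) {
        // loop body intentionally empty
    }

    return [$inLoop, $hoisted];
}

[$inLoop, $hoisted] = count_length_calls("abcdef");
echo "in condition: {$inLoop} calls, hoisted: {$hoisted} call\n";
```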

Re: poor performance while processing a file one byte a time

<j4tl77FjplU1@mid.individual.net>

https://www.rocksolidbbs.com/devel/article-flat.php?id=421&group=comp.lang.php#421

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: jpstewart@personalprojects.net (John-Paul Stewart)
Newsgroups: comp.lang.php
Subject: Re: poor performance while processing a file one byte a time
Date: Thu, 20 Jan 2022 12:41:58 -0500
Lines: 12
Message-ID: <j4tl77FjplU1@mid.individual.net>
References: <ssbqrm$46r$1@gioia.aioe.org> <j4tfnoFu1qqU1@mid.individual.net>
<ssc33f$1v7f$2@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Trace: individual.net Sgrh2LfRim8Qr3tG/vCX/Q2IHqIy+9GLsJGVv0C8Pt3PkVh6wQ
Cancel-Lock: sha1:Z4qBJ7n+HFMbBQgWbMq8rTZJu7g=
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Content-Language: en-CA
In-Reply-To: <ssc33f$1v7f$2@gioia.aioe.org>

by: John-Paul Stewart - Thu, 20 Jan 2022 17:41 UTC

On 2022-01-20 11:37, Mateusz Viste wrote:
>
> The foreach() initialization is clearly processed only once.
>
> Any other ideas?

My next question is why create the array at all?

You can just use a simple for loop to iterate over the string, with
$buff[$i] to access it character by character. That would avoid the
overhead (of both memory use and computation time) that's involved in
creating the associative array.

Re: poor performance while processing a file one byte a time

<j4tnljF135hU1@mid.individual.net>

https://www.rocksolidbbs.com/devel/article-flat.php?id=422&group=comp.lang.php#422

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: usenet@arnowelzel.de (Arno Welzel)
Newsgroups: comp.lang.php
Subject: Re: poor performance while processing a file one byte a time
Date: Thu, 20 Jan 2022 19:23:49 +0100
Lines: 79
Message-ID: <j4tnljF135hU1@mid.individual.net>
References: <ssbqrm$46r$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Trace: individual.net pCuOus/luns5Fb0UFN6tjgCkKN4rWNbPDF26m34dT7EiIcV/Id
Cancel-Lock: sha1:lajlpTB/MTJV29zwyQYVApJqV/o=
Content-Language: de-DE
In-Reply-To: <ssbqrm$46r$1@gioia.aioe.org>

by: Arno Welzel - Thu, 20 Jan 2022 18:23 UTC

Mateusz Viste:

> Hello,
>
> I am processing some files using PHP. Basically, I read every byte of
> the file and perform a simple operation on it to compute a sum.
>
> My initial implementation was in C, but now I am trying to redo the
> same in PHP. This is what my PHP code looks like:
>
>
> function fn($fname) {
>   $fd = fopen($fname, 'rb');
>   if ($fd === false) return(0);
>
>   $result = 0;
>
>   while (!feof($fd)) {
>     $buff = fread($fd, 1024 * 1024);
>
>     foreach (str_split($buff) as $b) {
>       $result += ord($b);
>       $result &= 0xffff;
>     }
>   }
>
>   fclose($fd);
>   return($result);
> }
>
> It works, but it is really slow (approximately 100x slower than the
> original C code). I know that I should not expect much performance
> from interpreted PHP code, but still - is there any trick I could use to
> speed this up?

By *not* using arrays.

str_split() creates an array based on a string:
<https://www.php.net/manual/en/function.str-split.php>

> I have also tried to replace str_split() and ord() with unpack('C*'),
> but it was even slower. Anything else I could try?

Just use the string as it is:

function fn($fname) {
  $fd = fopen($fname, 'rb');
  if ($fd === false) return(0);

  $result = 0;

  while (!feof($fd)) {
    $buff = fread($fd, 1024 * 1024);

    $len = strlen($buff);
    $pos = 0;
    while ($pos < $len) {
      $result += ord($buff[$pos]);
      $result &= 0xffff;
      $pos++;
    }
  }

  fclose($fd);
  return($result);
}

However, this may still be quite slow on larger files. Since it seems you
want to create some kind of checksum based on the file content, you may
want to use something else like hash_file(), sha1_file() or md5_file()
and use the result of these calls instead of processing the whole file
content in a loop.

--
Arno Welzel
https://arnowelzel.de
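
For checksums that PHP does ship, the suggestion above comes down to a single call. A minimal sketch, assuming a built-in algorithm such as 'crc32b' is acceptable (which may not hold for the checksum in question); builtin_checksum() is a hypothetical wrapper name.

```php
<?php
// Hypothetical wrapper around hash_file(): returns the hex digest of the
// file, or false if the algorithm is not available in this PHP build.
function builtin_checksum(string $fname, string $algo = 'crc32b') {
    if (!in_array($algo, hash_algos(), true)) {
        return false;
    }
    // The whole file is read and hashed inside the extension, not in PHP code.
    return hash_file($algo, $fname);
}

// Example: checksum of this script itself.
echo builtin_checksum(__FILE__), "\n";
```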

Re: poor performance while processing a file one byte a time

<sscak4$kme$1@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=423&group=comp.lang.php#423

Path: i2pn2.org!i2pn.org!aioe.org!299gYy2nqWB43X4cCBV6zg.user.46.165.242.75.POSTED!not-for-mail
From: mateusz@xyz.invalid (Mateusz Viste)
Newsgroups: comp.lang.php
Subject: Re: poor performance while processing a file one byte a time
Date: Thu, 20 Jan 2022 19:45:24 +0100
Organization: . . .
Message-ID: <sscak4$kme$1@gioia.aioe.org>
References: <ssbqrm$46r$1@gioia.aioe.org>
<j4tnljF135hU1@mid.individual.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="21198"; posting-host="299gYy2nqWB43X4cCBV6zg.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2

by: Mateusz Viste - Thu, 20 Jan 2022 18:45 UTC

On Thu, 20 Jan 2022 19:23:49 +0100
Arno Welzel <usenet@arnowelzel.de> wrote:
> > It works, but it is really slow (approximately 100x slower than the
> > original C code). I know that I should not expect much performance
> > from interpreted PHP code, but still - is there any trick I could
> > use to speed this up?
>
> By *not* using arrays.

That's what I figured, but I did not find any way to avoid using them. :)

> Just use the string as it is:
>
> $len = strlen($buff);
> $pos = 0;
> while ($pos < $len) {
>   $result += ord($buff[$pos]);
>   $result &= 0xffff;
>   $pos++;
> }

Yes, this is faster, but only by 10%. If I understand it right,
addressing a string like an array is costly in PHP, and it must emulate
an array-like structure under the hood anyway. I was hoping there might
be a different approach that would be faster by at least an order of
magnitude... But if PHP doesn't have any faster construct for this kind
of thing, I will live with it.

> However, this may still be quite slow on larger files. Since it seems
> you want to create some kind of checksum based on the file content,
> you may want to use something else like hash_file(), sha1_file() or
> md5_file() and use the result of these calls instead of processing the
> whole file content in a loop.

Sadly, that won't work. The kind of checksum I am computing is not
supported by PHP, hence why I do it byte by byte myself. Another
solution is to do it with my C code by system()-calling it from PHP,
but that's really ugly. At this point I'd rather stick with a slow, 100%
PHP solution.

Mateusz
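
If the checksum really is the plain, order-independent byte sum shown in the posted code (an assumption; the actual checksum is described as unsupported by PHP and may depend on byte order), the per-byte loop can be replaced by a per-buffer histogram. count_chars() in mode 1 returns, for each byte value present, how often it occurs, so the PHP-level loop runs at most 256 times per buffer. The name byte_sum_histogram() is hypothetical, chosen because fn is a reserved keyword as of PHP 7.4.

```php
<?php
// Sketch: same result as summing ord() of every byte modulo 0x10000,
// but the counting of bytes happens inside count_chars().
function byte_sum_histogram(string $fname): int {
    $fd = fopen($fname, 'rb');
    if ($fd === false) return 0;

    $result = 0;

    while (!feof($fd)) {
        $buff = fread($fd, 1024 * 1024);
        if ($buff === false || $buff === '') break;

        // Mode 1: array of byte value => number of occurrences (only > 0).
        foreach (count_chars($buff, 1) as $byte => $count) {
            $result = ($result + $byte * $count) & 0xffff;
        }
    }

    fclose($fd);
    return $result;
}
```

This only works because addition is commutative; a checksum in which each byte's contribution depends on its position or on the running state cannot be computed from a histogram.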

Re: poor performance while processing a file one byte a time

<sscanp$kme$2@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=424&group=comp.lang.php#424

Path: i2pn2.org!i2pn.org!aioe.org!299gYy2nqWB43X4cCBV6zg.user.46.165.242.75.POSTED!not-for-mail
From: mateusz@xyz.invalid (Mateusz Viste)
Newsgroups: comp.lang.php
Subject: Re: poor performance while processing a file one byte a time
Date: Thu, 20 Jan 2022 19:47:21 +0100
Organization: . . .
Message-ID: <sscanp$kme$2@gioia.aioe.org>
References: <ssbqrm$46r$1@gioia.aioe.org>
<j4tfnoFu1qqU1@mid.individual.net>
<ssc33f$1v7f$2@gioia.aioe.org>
<j4tl77FjplU1@mid.individual.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="21198"; posting-host="299gYy2nqWB43X4cCBV6zg.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2

by: Mateusz Viste - Thu, 20 Jan 2022 18:47 UTC

On Thu, 20 Jan 2022 12:41:58 -0500
John-Paul Stewart <jpstewart@personalprojects.net> wrote:

> On 2022-01-20 11:37, Mateusz Viste wrote:
> >
> > The foreach() initialization is clearly processed only once.
> >
> > Any other ideas?
>
> My next question is why create the array at all?
>
> You can just use a simple for loop to iterate over the string, with
> $buff[$i] to access it character by character. That would avoid the
> overhead (of both memory use and computation time) that's involved in
> creating the associative array.

Yes, I am aware, and that is indeed also something I had tested, but it
doesn't appear to be significantly faster than the array version.
Apparently accessing a string's bytes in an array-like fashion is very
costly.

Mateusz
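
To compare the variants discussed in this thread on one's own machine, a small timing harness along the following lines could be used. It is a sketch only: the function names are hypothetical, the buffer is random data rather than a real file, and no particular timings are claimed. hrtime() requires PHP 7.3 or later.

```php
<?php
// Variant from the original post: split into an array, then ord() each item.
function sum_str_split(string $buff): int {
    $result = 0;
    foreach (str_split($buff) as $b) {
        $result = ($result + ord($b)) & 0xffff;
    }
    return $result;
}

// Variant suggested in the replies: index the string directly.
function sum_index(string $buff): int {
    $result = 0;
    $len = strlen($buff);
    for ($i = 0; $i < $len; $i++) {
        $result = ($result + ord($buff[$i])) & 0xffff;
    }
    return $result;
}

// Variant mentioned in the original post: unpack() into unsigned bytes.
function sum_unpack(string $buff): int {
    $result = 0;
    foreach (unpack('C*', $buff) as $v) {
        $result = ($result + $v) & 0xffff;
    }
    return $result;
}

// Runs one variant and returns [sum, elapsed milliseconds].
function time_variant(callable $f, string $buff): array {
    $start = hrtime(true);               // monotonic clock, nanoseconds
    $sum = $f($buff);
    return [$sum, (hrtime(true) - $start) / 1e6];
}

$buff = random_bytes(64 * 1024);
foreach (['sum_str_split', 'sum_index', 'sum_unpack'] as $name) {
    [$sum, $ms] = time_variant($name, $buff);
    printf("%-14s sum=%5d  %.2f ms\n", $name, $sum, $ms);
}
```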

Re: poor performance while processing a file one byte a time

<sscar2$kme$3@gioia.aioe.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=425&group=comp.lang.php#425

Path: i2pn2.org!i2pn.org!aioe.org!299gYy2nqWB43X4cCBV6zg.user.46.165.242.75.POSTED!not-for-mail
From: mateusz@xyz.invalid (Mateusz Viste)
Newsgroups: comp.lang.php
Subject: Re: poor performance while processing a file one byte a time
Date: Thu, 20 Jan 2022 19:49:05 +0100
Organization: . . .
Message-ID: <sscar2$kme$3@gioia.aioe.org>
References: <ssbqrm$46r$1@gioia.aioe.org>
<j4tfnoFu1qqU1@mid.individual.net>
<ssc2a3$1v7f$1@gioia.aioe.org>
<j4tkvvFikcU1@mid.individual.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="21198"; posting-host="299gYy2nqWB43X4cCBV6zg.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2

by: Mateusz Viste - Thu, 20 Jan 2022 18:49 UTC

On Thu, 20 Jan 2022 12:38:06 -0500
John-Paul Stewart <jpstewart@personalprojects.net> wrote:

> On 2022-01-20 11:23, Mateusz Viste wrote:
> >
> >> (There's nothing specific to PHP about that advice either. It's
> >> equally applicable to C, although modern C compilers _may_ make
> >> that optimization for you.)
> >
> > That's hardly applicable in C, since there is no "foreach" in C.
> > There is a "for", and its initialization argument is processed
> > exactly once.
>
> I was speaking more broadly about pulling invariant code out of any
> and all loops regardless of where it appears in said loop, rather than
> relying on the interpreter or compiler to handle it for you.

Ah, so you were answering a question that wasn't asked, and that is
irrelevant to the case at hand. Okay, that's fair. It's what Usenet
is all about, after all. ;-)

Mateusz

Re: poor performance while processing a file one byte a time

<j4trd2F1p5dU1@mid.individual.net>

https://www.rocksolidbbs.com/devel/article-flat.php?id=426&group=comp.lang.php#426

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: usenet@arnowelzel.de (Arno Welzel)
Newsgroups: comp.lang.php
Subject: Re: poor performance while processing a file one byte a time
Date: Thu, 20 Jan 2022 20:27:32 +0100
Lines: 24
Message-ID: <j4trd2F1p5dU1@mid.individual.net>
References: <ssbqrm$46r$1@gioia.aioe.org> <j4tnljF135hU1@mid.individual.net>
<sscak4$kme$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Trace: individual.net SgtrQG2mrsQ9xwmnfP+9yQwLSNc7WciYq12aU255aZKte6MTl5
Cancel-Lock: sha1:biERTFl1cD9eOXASpm5LVluqPe4=
Content-Language: de-DE
In-Reply-To: <sscak4$kme$1@gioia.aioe.org>

by: Arno Welzel - Thu, 20 Jan 2022 19:27 UTC

Mateusz Viste:

> On Thu, 20 Jan 2022 19:23:49 +0100
> Arno Welzel <usenet@arnowelzel.de> wrote:
[...]
>> However, this may still be quite slow on larger files. Since it seems
>> you want to create some kind of checksum based on the file content,
>> you may want to use something else like hash_file(), sha1_file() or
>> md5_file() and use the result of these calls instead of processing the
>> whole file content in a loop.
>
> Sadly, that won't work. The kind of checksum I am computing is not
> supported by PHP, hence why I do it byte by byte myself. Another
> solution is to do it with my C code by system()-calling it from PHP,
> but that's really ugly. At this point I'd rather stick with a slow, 100%
> PHP solution.

Or you create your own PHP extension which can then be used in the
script ;-)

--
Arno Welzel
https://arnowelzel.de
