Best practice for caching hash

From: Marco.Sulla.Python@gmail.com (Marco Sulla)
Newsgroups: comp.lang.python
Subject: Best practice for caching hash
Date: Sat, 12 Mar 2022 21:45:56 +0100
Message-ID: <mailman.280.1647117995.2329.python-list@python.org>
 by: Marco Sulla - Sat, 12 Mar 2022 20:45 UTC

I have a custom immutable object, and I added a cache for its hash
value. The problem is that the object can contain both mutable and
immutable values, so computing the hash can raise TypeError.

In this case I currently cache the value -1. Subsequent calls to
__hash__() check whether the cached value is -1 and, if so, raise a
TypeError immediately.

The problem is that the first time I get an error with details, for
example:

TypeError: unhashable type: 'list'

On subsequent calls I simply raise a generic error:

TypeError

OK, I can improve this by raising, for example, TypeError: not all
values are hashable. But do you think this is acceptable? Now that I
think about it, it seems a little hacky to me.
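
For readers who want to see the behaviour described above, here is a
minimal pure-Python sketch. It is not the frozendict code: the class
name CachedHashBox and its attributes are hypothetical, and only the
_hash_calculated flag is modelled on ma_hash_calculated from the post.

```python
class CachedHashBox:
    """Hypothetical immutable container that caches its hash,
    including the fact that hashing failed."""

    __slots__ = ("_items", "_hash", "_hash_calculated")

    def __init__(self, *items):
        self._items = tuple(items)
        self._hash = 0
        # Plays the role of ma_hash_calculated: True once hashing was tried.
        self._hash_calculated = False

    def __hash__(self):
        if self._hash_calculated:
            if self._hash == -1:
                # Cached failure: the original detail is no longer available.
                raise TypeError("not all values are hashable")
            return self._hash
        try:
            h = hash(self._items)
        except TypeError:
            # Remember the failure, then let the detailed error propagate.
            self._hash = -1
            self._hash_calculated = True
            raise
        self._hash = h
        self._hash_calculated = True
        return h


if __name__ == "__main__":
    box = CachedHashBox(1, [2, 3])
    for attempt in (1, 2):
        try:
            hash(box)
        except TypeError as exc:
            print(attempt, exc)
```

The first attempt reports the detailed message coming from the list;
the second one only reports the generic cached failure.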

Furthermore, in the C extension I have to define another field in
the struct, ma_hash_calculated, to track whether the hash value is
cached, since there is no bogus value I can store in the cache field,
ma_hash, to signal this. If I don't cache unhashable values, -1 can be
used to signal that ma_hash contains no cached value.

So if I don't cache anything when the object is unhashable, I save a
little memory per object (one int) and I get a better error message
every time.
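
The alternative can be sketched the same way. Again this is only an
illustration with hypothetical names (LazyHashBox, _hash), not the
extension code. It relies on the fact that in CPython a successfully
computed hash is never -1 (hash() maps a -1 result to -2), so -1 is
free to mean "nothing cached yet".

```python
class LazyHashBox:
    """Hypothetical immutable container that caches only successful hashes."""

    __slots__ = ("_items", "_hash")

    def __init__(self, *items):
        self._items = tuple(items)
        # -1 means "no cached value"; no separate flag is needed.
        self._hash = -1

    def __hash__(self):
        if self._hash != -1:
            return self._hash
        # If this raises TypeError, nothing is cached, so the next call
        # recomputes and raises the same detailed error again.
        h = hash(self._items)
        self._hash = h
        return h


if __name__ == "__main__":
    box = LazyHashBox(1, [2, 3])
    for attempt in (1, 2):
        try:
            hash(box)
        except TypeError as exc:
            print(attempt, exc)
```

Here both attempts report the detailed message, at the cost of
walking the values again on every failed call.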

On the other hand, if I leave things as they are, testing the
unhashability of the object multiple times is faster. The code:

    try:
        hash(o)
    except TypeError:
        pass

executes in nanoseconds when called more than once, even if o is not
hashable. I'm not sure this is a big advantage.
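
One way to judge whether the speed difference matters is to measure
it. The sketch below uses timeit on two hypothetical stand-in classes
(CachesFailure and RetriesEveryTime); they are not the frozendict
types, and the numbers depend on the machine and on how many values
precede the unhashable one.

```python
import timeit


class CachesFailure:
    """Hypothetical: remembers that hashing failed."""

    def __init__(self, items):
        self._items = tuple(items)
        self._hash = None
        self._unhashable = False

    def __hash__(self):
        if self._unhashable:
            raise TypeError("not all values are hashable")
        if self._hash is None:
            try:
                self._hash = hash(self._items)
            except TypeError:
                self._unhashable = True
                raise
        return self._hash


class RetriesEveryTime:
    """Hypothetical: recomputes the hash on every call."""

    def __init__(self, items):
        self._items = tuple(items)

    def __hash__(self):
        return hash(self._items)


def probe(o):
    # The snippet from the post: test hashability and ignore the failure.
    try:
        hash(o)
    except TypeError:
        pass


def seconds_per_probe(o, number=10_000):
    """Average wall-clock seconds for one probe(o) call."""
    return timeit.timeit(lambda: probe(o), number=number) / number


if __name__ == "__main__":
    # 1000 hashable values followed by one unhashable list.
    items = list(range(1000)) + [[]]
    for cls in (CachesFailure, RetriesEveryTime):
        print(cls.__name__, seconds_per_probe(cls(items)))
```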

What do you think? Here is the code (from the C extension):
https://github.com/Marco-Sulla/python-frozendict/blob/35611f4cd869383678104dc94f82aa636c20eb24/frozendict/src/3_10/frozendictobject.c#L652-L697

