Ok, big shout out to monort [0] for the link to the video [1].
This is just a quick overview from a single viewing of the video, but it's called "funnel hashing". The idea is to split the table into exponentially smaller subarrays, so the first chunk is n/m, the second is n/(m^2), etc., until you get down to a single element. Call them A0, A1, etc., so |A0| = n/m, |A1| = n/(m^2), etc., k levels in total.
Try inserting into A0 c times. If it fails, try inserting into A1 c times. If it fails, go down the "funnel" until you find a free slot.
Call \delta the fraction of slots that are empty (I'm unclear if this is a parameter that gets set at hash table creation or one that's dynamically updated). Setting c = log(1/\delta) and k = log(1/\delta) gives worst-case complexity O(log^2(1/\delta)).
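To make the cascade concrete, here's a toy sketch of that scheme as I understand it (my own code, not from the paper; the shrink factor m=2 and c=3 probes per level are arbitrary illustrative choices, and real funnel hashing has more structure than this):

```python
class FunnelHashSketch:
    """Toy funnel-hashing sketch: exponentially shrinking levels,
    c probe attempts per level, cascading down on failure."""

    def __init__(self, n, c=3, m=2):
        # Level sizes: n/m, n/m^2, ..., down to a single slot.
        self.levels = []
        size = n
        while size >= 1:
            self.levels.append([None] * max(1, size // m))
            size //= m
        self.c = c

    def insert(self, key):
        # Try c probes in each level; on failure, fall to the next level.
        for depth, level in enumerate(self.levels):
            for attempt in range(self.c):
                slot = hash((key, depth, attempt)) % len(level)
                if level[slot] is None:
                    level[slot] = key
                    return True
        return False  # funnel exhausted; a real table would resize or rehash

    def contains(self, key):
        # Search retraces the same probe sequence as insertion.
        for depth, level in enumerate(self.levels):
            for attempt in range(self.c):
                slot = hash((key, depth, attempt)) % len(level)
                if level[slot] == key:
                    return True
        return False
```

Note that search and insert walk the same probe sequence, which matters for the greedy/non-greedy discussion below.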
This circumvents Yao's result by not being greedy. Yao's result holds true for greedy insertion and search policies and the above is non-greedy, as it's cascading down the funnels.
There are probably many little hairy details to work out but that's the idea, as far as I've been able to understand it. People should let me know if I'm way off base.
This very much reminds me of the "Distinct Elements in Streams" idea by Chakraborty, Vinodchandran and Meel[2].
Actually they propose two algorithms: Funnel Hashing and Elastic Hashing. Funnel Hashing is "greedy" and defeats Yao's conjecture, which concerns greedy hash mechanisms. Elastic Hashing is "non-greedy" and provides a better amortized time than greedy algorithms.
That it circumvents Yao's conjecture by being non-greedy contradicts the article. Is the article wrong, or is your understanding of the paper? I don't know; just want to see if you're noticing something the article's authors don't.
> Farach-Colton, Krapivin and Kuszmaul wanted to see if that same limit also applied to non-greedy hash tables. They showed that it did not by providing a counterexample, a non-greedy hash table with an average query time that’s much, much better than log x.
One thing I don't understand from watching the video is what happens in the (very rare) case that you get collisions all the way down the funnel. I assume this is related to the "One special final level to catch a few keys" (around 14:41 in the video), but given that it has to be fixed size, this can also get full. What do you do in that case?
Thanks! So I guess the best recourse then is to resize the table? Seems like it should be part of the analysis, even if it's low probability of it happening. I haven't read the paper, though, so no strong opinion here...
(By the way, the text fragment does work somewhat in Firefox. Not on the first load, but if you load it, then focus the URL field and press enter.)
Yeah, I presume so. At least that's what Swiss Tables do. The paper is focused more on the asymptotics than on real-world hardware performance, so I can see why they chose not to handle such edge cases.
This bothered me too when reading it, and the sample implementations I've found so far just bail out. I thought one of the benefits of hash tables was that they don't have a predefined size?
The hash tables a programmer interacts with generally do have a fixed size at any given moment, but resize on demand. A fixed size is inherent to open addressing style hash tables -- how else could one even talk about how full a hash table is?
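For concreteness, here's a minimal open-addressing set along those lines: its array is fixed-size at every moment, but it doubles once the load factor passes a threshold (a sketch; the 0.5 threshold and linear probing are my own illustrative choices, not anything from the paper):

```python
class ResizingOpenAddressing:
    """Minimal open-addressing set with linear probing that doubles
    its backing array when the load factor would exceed 0.5."""

    def __init__(self, capacity=8):
        self.slots = [None] * capacity
        self.count = 0

    def _probe(self, key, slots):
        # Linear probing: scan from the home slot until we hit
        # the key or an empty slot.
        i = hash(key) % len(slots)
        while slots[i] is not None and slots[i] != key:
            i = (i + 1) % len(slots)
        return i

    def add(self, key):
        if (self.count + 1) / len(self.slots) > 0.5:
            self._resize()
        i = self._probe(key, self.slots)
        if self.slots[i] is None:
            self.slots[i] = key
            self.count += 1

    def _resize(self):
        # Allocate a bigger fixed-size array and rehash everything.
        old = self.slots
        self.slots = [None] * (2 * len(old))
        for key in old:
            if key is not None:
                self.slots[self._probe(key, self.slots)] = key

    def __contains__(self, key):
        return self.slots[self._probe(key, self.slots)] == key
```

The load-factor bound is what the paper's \delta (fraction of empty slots) is talking about: the analysis holds for a table at a given fullness, and resizing is how practical tables stay below that fullness.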
Quite a neat idea that could be useful for memory-constrained environments.
[Shameless plug]:
If you are into hashtables, you might want to check out Dandelion Hashtable [0]. We use it in our next-generation databases, and it was published in HPDC'24. It is currently the fastest in-memory hashtable in practice.
It improves closed addressing via bounded-cacheline chaining to surpass 1B in-memory requests/second on a commodity server.
What the author says there is "What we just showed was that we can achieve a worst-case expected probe complexity of log squared one over delta with a greedy algorithm. And we don't have too much time to go over the non-greedy algorithm but...".
The funnel hashing described in the video is greedy. The video doesn't cover the non-greedy elastic hashing.
"Greedy" means that the search and insertion do the same probe sequence, and insertion just uses the first free slot in that sequence.
[0] https://news.ycombinator.com/item?id=43007860
[1] https://www.youtube.com/watch?v=ArQNyOU1hyE
[2] https://arxiv.org/pdf/2301.10191