
holy crap, this is so good. How did it get buried?


Too technical for HN


real


Are you guys affiliated with Meta’s ex-CTO in any way? I remember he famously implied that LLMs are overhyped. The demos are very impressive. Does this use an attention-based mechanism too? Just trying to understand (as a layman) how these models handle context and whether long contexts lead to weaker results. That could be catastrophic in the real world!
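For anyone else wondering what "attention-based" means here: most current LLMs use scaled dot-product attention, where each token's output is a softmax-weighted average over all tokens in the context. A toy sketch (illustrative only, not any particular model's implementation; all names are made up) also hints at why very long contexts can dilute focus, since the same probability mass gets spread over more positions:

```python
import numpy as np

def attention(q, K, V):
    """One query attending over n keys/values: softmax(q K^T / sqrt(d)) @ V."""
    d = q.shape[-1]
    scores = K @ q / np.sqrt(d)        # (n,) similarity of query to each key
    w = np.exp(scores - scores.max())  # numerically stable softmax
    w = w / w.sum()                    # attention weights, sum to 1
    return w @ V, w                    # weighted average of values

rng = np.random.default_rng(0)
d = 16
q = rng.standard_normal(d)
for n in (8, 512):                     # short vs. long "context"
    K = rng.standard_normal((n, d))
    V = rng.standard_normal((n, d))
    out, w = attention(q, K, V)
    # With more tokens, the softmax mass is spread over more positions,
    # so the largest single attention weight tends to shrink.
    print(n, round(float(w.max()), 4))
```

Whether that dilution actually weakens results depends on training, positional encoding, and retrieval tricks, so treat this as intuition rather than an answer.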


I think in the long run, we may need something like a batch job that compresses context from the last N conversations (in LLMs) and applies that as an update to weights. A looser form of delayed automated reinforcement learning.

Or make something like LoRA mainstream for individual users (per-user adapters probably scale better for general-purpose models shared by everyone).
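For reference, the core idea of LoRA is small enough to sketch: instead of updating a full weight matrix W, you train a low-rank correction B @ A, so only r * (d_in + d_out) parameters change per adapted layer. A minimal numpy sketch (illustrative; variable names and sizes are made up, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 32, 4            # rank r << min(d_in, d_out)

W = rng.standard_normal((d_out, d_in))      # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, small init
B = np.zeros((d_out, r))                    # trainable, zero init: no-op at start
alpha = 8.0                                 # LoRA scaling hyperparameter

def adapted_forward(x):
    """Forward pass with the low-rank update: (W + (alpha/r) * B @ A) @ x."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Because B starts at zero, the adapter initially leaves the model unchanged:
assert np.allclose(adapted_forward(x), W @ x)
```

The per-user appeal is that each adapter is just the tiny (A, B) pair, which can be stored and swapped per conversation while the big shared W stays frozen.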



