
Hacker News: Front Page shared a link post in group #Stream of Goodies

LoMA: Lossless Compressed Memory Attention (arxiv.org)
The ability to handle long texts is one of the most important capabilities of Large Language Models (LLMs), but as text length increases, resource consumption grows dramatically.