
Hacker News: Front Page shared a link post in group #Stream of Goodies

zeux.io
LLM inference speed of light
In the process of working on calm, a minimal from-scratch fast CUDA implementation of transformer-based language model inference, a critical consideration was establishing the speed of light for the i…