
Hacker News: Front Page

venki.dev
Replicate & Fly cold-start latency
Replicate has been my default serverless GPU choice in the past, and I’ve been trying to use it to set up some embedding models, like SPLADE and a Q&A-optimized bi-encoder. On the other hand, I’m a hu