
Hacker News: Front Page
shared a link post in group #Stream of Goodies
qwenlm.github.io
Qwen1.5-MoE: Matching 7B Model Performance with 1/3 Activated Parameters
Introduction: Since the surge in interest sparked by Mixtral, research on mixture-of-experts (MoE) models has gained significant momentum. Both researchers and…