Nicolas Mallison’s Post

Expert AI & Data Science Advisor

Game changer? Llama 3.1 405B is now running on Cerebras!

  • 969 tokens/s: frontier AI now runs at instant speed
  • 12x faster than GPT-4o, 18x faster than Claude, and 12x faster than the fastest GPU cloud
  • 128K context length, 16-bit weights
  • Industry’s fastest time to first token, at 240 ms

https://lnkd.in/e5zuEnM7
