Add corrections, implementation notes, pricing changes, or usage caveats for other readers.
Input modalities
Output modalities
Capabilities
32,000 tokens
Recent tweets and retweets from Together AI
Frontier model performance on an open model, post-trained in under 24 hours. @trajectorylabs is showing what's possible when great open models meet the right training infrastructure. Proud to power the compute behind this work alongside @nvidia.
M3’s architecture makes long-context inference more efficient. Serving it at production scale required systems work.
Together’s kernel and inference teams built KV-block-major sparse attention, integrated MSA with paged KV cache, optimized decode index scoring, and moved…
I asked 8 AI models (including Fable 5) for their World Cup predictions.
Going to keep an updated leaderboard based on match results to see which AI model performed the best!
Launching tomorrow, right before the World Cup kicks off!
Learn how @cursor_ai partnered with Together AI to deliver real-time inference for AI-powered coding in this article from @ce_zhang and @realDanFu.
Cursor's in-editor agents generate code while developers actively edit — requiring responses inside the editor's feedback loop.…