Input modalities
Output modalities
Capabilities
131,072 tokens
Recent tweets and retweets from Fireworks AI
Move from test to production by running high-performance inference directly on Foundry.
At #MSBuild, we demoed an end-to-end workflow showing how unified infrastructure improves latency, reduces cost, and simplifies deployment for real enterprise AI use cases.
Video will be…
Microsoft MAI models. Coming soon to Fireworks.
Intelligence you control. End-to-end lineage you can prove.
Fine-tune MAI reasoning models for your enterprise tasks.
Your data. Your custom models. Your specialized intelligence.
Learn more: microsoft.ai/news/building-a…
Super excited to announce seven new world-class MAI models today. They represent what we consider a new era in AI designed to keep you in control and on the frontier.
First is our text foundation model, MAI-Thinking-1, exceptionally strong on reasoning and SWE tasks.
- It’s…
We’re looking forward to seeing how developers and enterprises use Fireworks AI on @Microsoft Foundry to power the next generation of intelligent applications.
Catch us at booth F111 at #MSBuild and see Fireworks AI + Microsoft Foundry in action.
Many research labs only consider inference efficiency after the fact. Step 3.7 Flash, a 196B MoE model from @StepFun_ai, was built for inference from the start.
Multi-Matrix Factorization Attention (MFA) → KV-cache at ~22% of DeepSeek.
Attention-FFN Disaggregation (AFD) →…