Knowledge cutoff: 2025-05
Input modalities
Output modalities
Capabilities
1,000,000 tokens
Recent tweets and retweets from Fireworks AI
Coding agents break when models are "almost" bug-free. But almost-valid JSON is just not the same as valid JSON. Fun piece here from @akshay_pachaar shows why SFT can't fix this, and how GRPO trains against correctness directly.
Worth noting: the reason this works is inference…
Fireworks Training Platform keeps expanding.
Leading US open-weight model Nemotron 3 Ultra is now ready for post-training: SFT and DPO via LoRA or full-parameter, on the same infrastructure that serves it.
The model you train is the model you ship: fireworks.ai/train
Fireworks was named to @Redpoint's InfraRed 100 which recognizes the companies building the foundation for the next wave of AI.
We're just getting started.
Come build with us: fireworks.ai/careers
We spent the week at #MSBuild talking about one thing: fine-tuning has gone from "maybe not worth it" to your actual competitive moat.
@lqiao sat down with @yina_arenas to break down why and what Fireworks + @Microsoft Foundry makes possible when you stop treating models as…
NVIDIA Nemotron 3 Ultra is on Fireworks, day zero.
Nemotron Ultra is an open model for frontier reasoning and orchestration in long-running autonomous agents.
Think use cases like coding agents, deep research, and complex enterprise workflows.
Read on:…