DeepSeek just launched its fourth generation of flagship models with DeepSeek-V4-Pro and DeepSeek-V4-Flash, both targeted at enabling highly efficient million-token context processing.
Model details
DeepSeek, a Chinese AI company, released DeepSeek-V4 on April 24, 2026, in two versions: DeepSeek-V4-Pro and DeepSeek-V4-Flash. DeepSeek-V4-Pro has exceeded Claude Opus 4.6 on multiple benchmarks, and the models are published on Hugging Face under the deepseek-ai organization (deepseek-ai/DeepSeek-V4-Pro).
DeepSeek-V4-Flash is a Mixture-of-Experts (MoE) language model with 284 billion total parameters and 13 billion activated parameters, optimized for fast coding and agentic workloads over a 1M-token context. Developed by DeepSeek as part of the DeepSeek-V4 collection, it is released for commercial and non-commercial use.
DeepSeek V4 Flash is a Mixture-of-Experts language model built to balance high-performance reasoning with operational efficiency. With 284 billion total parameters and 13 billion activated parameters, it is engineered to handle complex tasks while maintaining rapid response times. The architecture features a hybrid attention mechanism that combines Compressed Sparse Attention and Heavily Compressed Attention, allowing the model to process a one-million-token context window with significant improvements in inference efficiency and reduced memory overhead. Additionally, the integration of Manifold-Constrained Hyper-Connections helps stabilize signal propagation throughout the network, ensuring reliable performance during intensive computational tasks.
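The gap between total parameters (284B) and activated parameters (13B) comes from the MoE design: a router selects only a few experts per token, so most weights sit idle on any given forward pass. The sketch below is a minimal toy illustration of top-k expert routing in NumPy; the dimensions, expert count, and routing rule are illustrative assumptions, not DeepSeek's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 64         # toy hidden size (assumption, for illustration only)
N_EXPERTS = 8  # toy expert count
TOP_K = 2      # experts activated per token

# Each expert is a simple linear map; only TOP_K of them run per token,
# which is why activated parameters can be far below total parameters.
experts = [rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(N_EXPERTS)]
router = rng.standard_normal((D, N_EXPERTS)) / np.sqrt(D)

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route a single token vector x through its top-k experts."""
    logits = x @ router
    top = np.argsort(logits)[-TOP_K:]        # indices of the chosen experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over chosen experts only
    # Weighted sum of the selected experts' outputs; the other
    # N_EXPERTS - TOP_K experts are never evaluated for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D)
out = moe_layer(token)
print(out.shape)  # (64,)
```

With TOP_K = 2 of 8 experts, only a quarter of the expert weights participate per token; the same principle, at much larger scale, yields V4-Flash's 13B-active / 284B-total split.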
Designed as a streamlined counterpart to the larger V4-Pro, this model is optimized for developers and organizations that require a cost-effective solution for coding assistants, agentic workflows, and large-scale data analysis. Its design lineage emphasizes practical utility, offering reasoning capabilities that closely approach its larger sibling while excelling in simple agentic tasks. By prioritizing high throughput and efficient long-context management, the model serves as a versatile tool for real-world applications ranging from automated coding to interactive chat systems, providing a scalable path for users who need to balance deep analytical power with the speed required for production environments.
Why teams adopt it