2,540 6 hours ago

DeepSeek-V4-Flash is a preview of the DeepSeek-V4 series, a Mixture-of-Experts model with 284B total parameters and 13B activated, built for efficient reasoning across a 1M-token context window.

tools thinking cloud
ollama run deepseek-v4-flash:cloud

Readme

DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence

DeepSeek-V4-Flash is a preview of the DeepSeek-V4 series, a Mixture-of-Experts model with 284B total parameters and 13B activated, built for efficient reasoning across a 1M-token context window.

3 Thinking modes:

  • No thinking: used for fast, intuitive answers
  • Thinking: used for careful logical analysis
  • Max thinking: maximum reasoning effort on the hardest problems benchmark
Benchmark (Metric) V4-Flash Non-Think V4-Flash High V4-Flash Max V4-Pro Non-Think V4-Pro High V4-Pro Max
Knowledge & Reasoning
MMLU-Pro (EM) 83.0 86.4 86.2 82.9 87.1 87.5
SimpleQA-Verified (Pass@1) 23.1 28.9 34.1 45.0 46.2 57.9
Chinese-SimpleQA (Pass@1) 71.5 73.2 78.9 75.8 77.7 84.4
GPQA Diamond (Pass@1) 71.2 87.4 88.1 72.9 89.1 90.1
HLE (Pass@1) 8.1 29.4 34.8 7.7 34.5 37.7
LiveCodeBench (Pass@1) 55.2 88.4 91.6 56.8 89.8 93.5
Codeforces (Rating) - 2816 3052 - 2919 3206
HMMT 2026 Feb (Pass@1) 40.8 91.9 94.8 31.7 94.0 95.2
IMOAnswerBench (Pass@1) 41.9 85.1 88.4 35.3 88.0 89.8
Apex (Pass@1) 1.0 19.1 33.0 0.4 27.4 38.3
Apex Shortlist (Pass@1) 9.3 72.1 85.7 9.2 85.5 90.2
Long Context
MRCR 1M (MMR) 37.5 76.9 78.7 44.7 83.3 83.5
CorpusQA 1M (ACC) 15.5 59.3 60.5 35.6 56.5 62.0
Agentic
Terminal Bench 2.0 (Acc) 49.1 56.6 56.9 59.1 63.3 67.9
SWE Verified (Resolved) 73.7 78.6 79.0 73.6 79.4 80.6
SWE Pro (Resolved) 49.1 52.3 52.6 52.1 54.4 55.4
SWE Multilingual (Resolved) 69.7 70.2 73.3 69.8 74.1 76.2
BrowseComp (Pass@1) - 53.5 73.2 - 80.4 83.4
HLE w/ tools (Pass@1) - 40.3 45.1 - 44.7 48.2
MCPAtlas (Pass@1) 64.0 67.4 69.0 69.4 74.2 73.6
GDPval-AA (Elo) - - 1395 - - 1554
Toolathlon (Pass@1) 40.7 43.5 47.8 46.3 49.0 51.8

Reference

DeepSeek-V4 technical report