A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
671b
197.6K Pulls Updated 3 weeks ago
5da0e2d4a9e0 · 404GB
model
arch deepseek2 · parameters 671B · quantization Q4_K_M
404GB
params
{
"stop": [
"<|begin▁of▁sentence|>",
"<|end▁of▁sentence|>",
148B
template
{{- range $i, $_ := .Messages }}
{{- if eq .Role "user" }}<|User|>
{{- else if eq .Role "assista
359B
license
DEEPSEEK LICENSE AGREEMENT
Version 1.0, 23 October 2023
Copyright (c) 2023 DeepSeek
Section I: PR
14kB
Readme
Note: this model requires Ollama 0.5.5 or later.
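The version requirement above can be checked before pulling a 404GB model. The following is a minimal sketch, not an official tool: it assumes a local Ollama server at the default address `http://localhost:11434` and reads the version from the `/api/version` endpoint. The helper names (`parse_version`, `meets_minimum`, `installed_version`) are hypothetical and exist only for this illustration.

```python
import json
import urllib.request

MIN_VERSION = (0, 5, 5)               # minimum stated in the note above
BASE_URL = "http://localhost:11434"   # assumption: default local Ollama address


def parse_version(text):
    """Turn '0.5.7', 'v0.5.7' or '0.5.7-rc1' into a tuple of ints, e.g. (0, 5, 7)."""
    core = text.strip().lstrip("v").split("-")[0].split("+")[0]
    return tuple(int(part) for part in core.split("."))


def meets_minimum(text, minimum=MIN_VERSION):
    """Return True if the version string is at least the required minimum."""
    return parse_version(text) >= minimum


def installed_version(base_url=BASE_URL):
    """Ask a running Ollama server for its version string (needs the server up)."""
    with urllib.request.urlopen(f"{base_url}/api/version", timeout=5) as resp:
        return json.load(resp)["version"]
```

With a server running, `meets_minimum(installed_version())` reports whether the installation is new enough. Versions are compared as integer tuples rather than strings so that, for example, `0.10.0` correctly ranks above `0.5.5`.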
DeepSeek-V3 achieves a significant breakthrough in inference speed over previous models. It tops the leaderboard among open-source models and rivals the most advanced closed-source models globally.
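The figures listed on this page (671B total parameters, 37B activated per token, a 404GB file at Q4_K_M) allow a quick back-of-the-envelope check. The sketch below assumes that "GB" means 10^9 bytes and that the whole file consists of weights, which slightly overstates the true per-weight size; the function names are made up for this example.

```python
TOTAL_PARAMS = 671e9      # total parameters, from the model description
ACTIVE_PARAMS = 37e9      # parameters activated per token, from the description
FILE_SIZE_BYTES = 404e9   # assumption: 404GB means 404 * 10^9 bytes


def active_fraction(total=TOTAL_PARAMS, active=ACTIVE_PARAMS):
    """Share of the parameters that take part in computing a single token."""
    return active / total


def bits_per_weight(size_bytes=FILE_SIZE_BYTES, params=TOTAL_PARAMS):
    """Average storage per parameter, treating the whole file as weights."""
    return size_bytes * 8 / params


print(f"active per token: {active_fraction():.1%}")
print(f"average bits per weight: {bits_per_weight():.2f}")
```

Under these assumptions, roughly 5.5% of the parameters are used for each token, and the file averages a little under 5 bits per parameter. Only a small fraction of the weights is computed per token, but all 671B parameters still have to be stored.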