A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
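The gap between total and activated parameters comes from top-k expert routing: for each token, a gating network scores every expert and only the k highest-scoring feed-forward experts actually run, so only a fraction of the weights participate per token. A minimal sketch of that routing pattern (the dimensions, expert count, and k below are illustrative toys, not this model's actual configuration):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_forward(token, gate_w, experts, k=2):
    """Route one token through only its top-k experts.

    token:   (d,) hidden state for a single token
    gate_w:  (n_experts, d) gating weights
    experts: list of callables, each one feed-forward expert
    """
    scores = softmax(gate_w @ token)           # score every expert
    top = np.argsort(scores)[-k:]              # indices of the k best
    weights = scores[top] / scores[top].sum()  # renormalize over the chosen experts
    # Only k expert networks execute; the rest stay idle for this token,
    # which is why activated parameters << total parameters.
    return sum(w * experts[i](token) for i, w in zip(top, weights))

# Toy usage: 8 tiny experts, 2 active per token.
d, n_experts = 16, 8
rng = np.random.default_rng(0)
experts = [(lambda W: (lambda x: W @ x))(rng.normal(size=(d, d)))
           for _ in range(n_experts)]
gate_w = rng.normal(size=(n_experts, d))
out = moe_forward(rng.normal(size=d), gate_w, experts)
print(out.shape)  # (16,)
```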
671b
752K Pulls • Updated 8 weeks ago

5 Tags:
5da0e2d4a9e0 • 404GB • 8 weeks ago
5da0e2d4a9e0 • 404GB • 8 weeks ago
7770bf5a5ed8 • 1.3TB • 8 weeks ago
5da0e2d4a9e0 • 404GB • 8 weeks ago
96061c74c1a5 • 713GB • 8 weeks ago
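Once a tag has been pulled, the model can be queried through Ollama's local HTTP API. A minimal sketch, assuming an Ollama server running on its default port; the model reference is a placeholder, so substitute this page's actual model name plus one of the tags above:

```python
import json
import urllib.request

# Placeholder reference: replace "<model>" with this page's model name,
# keeping one of the tags listed above (e.g. "<model>:671b").
MODEL = "<model>:671b"

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": MODEL,
        "prompt": "Why is the sky blue?",
        "stream": False,  # return one JSON object instead of a token stream
    }).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["response"])
```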