State-of-the-art large embedding model from mixedbread.ai
embedding
335m
747.4K Pulls Updated 7 months ago
468836162de7 · 670MB
model: arch bert · parameters 334M · quantization F16 · 670MB
params: {"num_ctx": 512} · 16B
license: Apache License, Version 2.0, January 2004 · 11kB
Readme
mxbai-embed-large
As of March 2024, this model achieves SOTA performance for BERT-large sized models on the MTEB. It outperforms commercial models like OpenAI's text-embedding-3-large model and matches the performance of models 20x its size.
mxbai-embed-large was trained with no overlap with the MTEB data, which indicates that the model generalizes well across several domains, tasks and text lengths.
Usage
REST API
curl http://localhost:11434/api/embeddings -d '{
"model": "mxbai-embed-large",
"prompt": "Represent this sentence for searching relevant passages: The sky is blue because of Rayleigh scattering"
}'
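For readers who want to send the same request without curl, here is a minimal sketch using only Python's standard library. The helper names `build_request` and `fetch_embedding` are hypothetical, and the sketch assumes a local Ollama server on the default port and that the response JSON carries the vector under an "embedding" key.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def build_request(prompt: str, model: str = "mxbai-embed-large",
                  url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build the same POST request that the curl command sends (hypothetical helper)."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embedding(prompt: str) -> list[float]:
    """Send the request and return the embedding vector.

    Assumes a running local Ollama server and an "embedding" key
    in the JSON response.
    """
    with urllib.request.urlopen(build_request(prompt)) as response:
        return json.load(response)["embedding"]
```

Calling `fetch_embedding("Represent this sentence for searching relevant passages: The sky is blue because of Rayleigh scattering")` should return a list of floats when the server is running and the model has been pulled.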
Python library
import ollama
ollama.embeddings(model='mxbai-embed-large', prompt='Represent this sentence for searching relevant passages: The sky is blue because of Rayleigh scattering')
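The prompt in these examples starts with a retrieval instruction, which hints at how the embeddings might be used for search. The following is a rough sketch of that workflow, not a definitive recipe: `embed`, `cosine_similarity`, and `rank_passages` are hypothetical helpers, and adding the prefix only to queries (not to the passages being searched) is an assumption, since the examples here show it on a single sentence only.

```python
import math

# Prefix copied from the usage examples; applying it to queries only is an assumption.
QUERY_PREFIX = "Represent this sentence for searching relevant passages: "


def embed(text: str, is_query: bool = False) -> list[float]:
    """Embed text with mxbai-embed-large through the ollama Python library."""
    import ollama  # imported lazily so the pure helpers below work without it
    prompt = QUERY_PREFIX + text if is_query else text
    return ollama.embeddings(model="mxbai-embed-large", prompt=prompt)["embedding"]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_vec: list[float],
                  passage_vecs: list[list[float]]) -> list[int]:
    """Return passage indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, vec) for vec in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

With a running server, one could embed a question with `embed(question, is_query=True)`, embed each candidate passage with `embed(passage)`, and pass the results to `rank_passages` to get the passages in order of relevance.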
JavaScript library
import ollama from 'ollama'
ollama.embeddings({ model: 'mxbai-embed-large', prompt: 'Represent this sentence for searching relevant passages: The sky is blue because of Rayleigh scattering' })