nuextract:3.8b-fp16

A 3.8B model fine-tuned on a private high-quality synthetic dataset for information extraction, based on Phi-3.

Details

Updated 1 year ago

d8bc760a64ed · 7.6GB

model

arch: phi3

parameters: 3.82B

quantization: F16

7.6GB

license

Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR US

11kB

params

{ "stop": [ "<|end|>", "<|user|>", "<|assistant|>" ] }

78B

template

{{- range .Messages }} {{- if eq .Role "user" }}<|user|> {{ .Content }}<|end|> <|assistant|> {{- els

172B

Structure Extraction Model by NuMind 🔥

NuExtract is a version of phi-3-mini, fine-tuned on a private high-quality synthetic dataset for information extraction. To use the model, provide an input text (fewer than 2,000 tokens) and a JSON template describing the information you need to extract.

Note: This model is purely extractive, so every piece of text it outputs appears verbatim in the input text. You can also provide an example of the output formatting to help the model understand your task more precisely.
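Because the output is meant to be purely extractive, it can be checked against the input. The following is a minimal sketch, assuming the model's output has already been obtained as a JSON string; the helper names `string_leaves` and `non_extractive_values` are hypothetical and not part of NuExtract.

```python
import json


def string_leaves(node):
    """Yield every non-empty string value in a nested JSON structure."""
    if isinstance(node, str):
        if node:
            yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from string_leaves(value)
    elif isinstance(node, list):
        for item in node:
            yield from string_leaves(item)


def non_extractive_values(output_json, source_text):
    """Return the extracted strings that do not appear verbatim in source_text.

    Hypothetical helper: an empty list means every extracted value was
    copied as is from the input, as the note above describes.
    """
    data = json.loads(output_json)
    return [value for value in string_leaves(data) if value not in source_text]
```

An empty result suggests the output respects the extractive constraint; any returned value is worth inspecting by hand, since the comparison is an exact, case-sensitive substring match.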

Usage

Prompt Format

The model works best when the prompt follows this format:

### Template:
{
    "Model": {
        "Name": "",
        "Number of parameters": ""
    },
    "Usage": {
        "Use case": [],
        "Licence": ""
    }
}
### Example:
{
    "Model": {
        "Name": "Llama3",
        "Number of parameters": "8 billion"
    },
    "Usage": {
        "Use case": [
            "chat",
            "code completion"
        ],
        "Licence": "Meta Llama3"
    }
}
### Text:
We introduce Mistral 7B, a 7–billion-parameter language model engineered for superior performance and efficiency. Mistral 7B outperforms the best open 13B model (Llama 2) across all evaluated benchmarks, and the best released 34B model (Llama 1) in reasoning, mathematics, and code generation. Our model leverages grouped-query attention (GQA) for faster inference, coupled with sliding window attention (SWA) to effectively handle sequences of arbitrary length with a reduced inference cost. We also provide a model fine-tuned to follow instructions, Mistral 7B – Instruct, that surpasses Llama 2 13B – chat model both on human and automated benchmarks. Our models are released under the Apache 2.0 license. 

Code: https://github.com/mistralai/mistral-src 
Webpage: https://mistral.ai/news/announcing-mistral-7b/
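The prompt layout above can be assembled programmatically. This is a minimal sketch, assuming the three section headings (`### Template:`, `### Example:`, `### Text:`) are all the format requires; the function name `build_prompt` and its parameters are hypothetical, and any special tokens the chat template adds around the prompt are not handled here.

```python
import json


def build_prompt(template, text, examples=()):
    """Assemble a prompt in the Template / Example / Text layout.

    template: dict with empty values describing the fields to extract.
    text:     the input text to extract from.
    examples: optional filled-in dicts showing the desired output format.
    """
    parts = ["### Template:", json.dumps(template, indent=4)]
    for example in examples:
        parts += ["### Example:", json.dumps(example, indent=4)]
    parts += ["### Text:", text]
    return "\n".join(parts)


if __name__ == "__main__":
    template = {"Model": {"Name": "", "Number of parameters": ""}}
    example = {"Model": {"Name": "Llama3", "Number of parameters": "8 billion"}}
    print(build_prompt(template, "We introduce Mistral 7B ...", [example]))
```

Serializing the template with `json.dumps` guarantees valid JSON (no trailing commas), and the resulting string can be sent to the model as the user message.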
