starling-lm:7b-alpha-q4_1

928.3K Downloads · Updated 2 years ago

Starling is a large language model trained by reinforcement learning from AI feedback, with a focus on improving chatbot helpfulness.

ollama run starling-lm:7b-alpha-q4_1

curl http://localhost:11434/api/chat \
  -d '{
    "model": "starling-lm:7b-alpha-q4_1",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='starling-lm:7b-alpha-q4_1',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'starling-lm:7b-alpha-q4_1',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)
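The curl request above can also be sent from Python without the `ollama` package. The following is a minimal sketch using only the standard library, assuming an Ollama server is listening on the same local endpoint. The helper names `build_chat_payload` and `chat_once` are hypothetical, introduced here for illustration. The sketch sets `"stream": false` so that the server returns a single JSON object rather than a stream of chunks.

```python
import json
import urllib.request

# Endpoint taken from the curl example; assumes a local Ollama server.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_payload(content, model="starling-lm:7b-alpha-q4_1", stream=False):
    """Build the JSON body for /api/chat (hypothetical helper)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        # With stream=False the server replies with one JSON object.
        "stream": stream,
    }


def chat_once(content, url=OLLAMA_CHAT_URL):
    """Send one user message and return the assistant's reply text (hypothetical helper)."""
    body = json.dumps(build_chat_payload(content)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return reply["message"]["content"]
```

With the server running, `print(chat_once("Hello!"))` would print the model's reply. Nothing in the block contacts the server until `chat_once` is called.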

Details

e3dd9bd1826c · 4.6GB

model · 4.6GB

arch: llama
parameters: 7.24B
quantization: Q4_1

params · 87B

{ "stop": [ "<|endoftext|>", "<|end_of_turn|>", "Human:", "Assis

template · 97B

{{ .System }}<|end_of_turn|>GPT4 Correct User: {{ .Prompt}}<|end_of_turn|>GPT4 Correct Assistant:
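To make the template and stop parameters concrete, here is a minimal Python sketch of how a prompt could be assembled from the template text and how generated text could be cut at a stop string. It is an illustration of the listed strings only, not Ollama's actual rendering code. The names `render_prompt` and `trim_at_stop` are hypothetical, and the stop list includes only the entries fully visible in the listing, since the params line is truncated.

```python
# Template text copied from the listing; {system} and {prompt} stand in for
# the Go template fields {{ .System }} and {{ .Prompt }}.
TEMPLATE = (
    "{system}<|end_of_turn|>GPT4 Correct User: {prompt}"
    "<|end_of_turn|>GPT4 Correct Assistant:"
)

# Only the stop strings that are fully visible in the truncated params line.
VISIBLE_STOPS = ["<|endoftext|>", "<|end_of_turn|>", "Human:"]


def render_prompt(prompt, system=""):
    """Fill the template with a user prompt and an optional system message."""
    return TEMPLATE.format(system=system, prompt=prompt)


def trim_at_stop(text, stops=VISIBLE_STOPS):
    """Cut text at the earliest occurrence of any stop string."""
    cut = len(text)
    for stop in stops:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]
```

For example, `render_prompt("Hello!")` produces a string that ends in `GPT4 Correct Assistant:`, which is the point where the model begins its reply. `trim_at_stop` mimics how a stop string ends the generated text.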

Readme

Starling-7B is an open (non-commercial) large language model (LLM) trained by reinforcement learning from AI feedback (RLAIF).

The model is trained on our new GPT-4-labeled ranking dataset, Nectar, using our new reward training and policy tuning pipeline. Starling-7B-alpha scores 8.09 on MT-Bench with GPT-4 as a judge, outperforming every model to date on MT-Bench except OpenAI’s GPT-4 and GPT-4 Turbo.*

*Based on MT-Bench evaluations using GPT-4 scoring. Further human evaluation is needed.

Authors: Banghua Zhu, Evan Frick, Tianhao Wu, Hanlin Zhu, and Jiantao Jiao.

For correspondence, please contact Banghua Zhu.

Reference

Starling-7B: Increasing LLM Helpfulness & Harmlessness with RLAIF

HuggingFace