Instructions to use Laurie/opt1.3b-deepspeed-chat with libraries, inference providers, notebooks, and local apps.

Libraries

How to use Laurie/opt1.3b-deepspeed-chat with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Laurie/opt1.3b-deepspeed-chat")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Laurie/opt1.3b-deepspeed-chat")
model = AutoModelForCausalLM.from_pretrained("Laurie/opt1.3b-deepspeed-chat")
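
Neither snippet above actually generates text, so here is a minimal sketch of one way to do it with the directly loaded model. The `Human:`/`Assistant:` prompt wrapper is an assumption borrowed from the DeepSpeed-Chat convention and is not confirmed by this model card; `format_prompt`, `generate_reply`, and the `max_new_tokens` default are hypothetical names and values chosen for illustration.

```python
MODEL_ID = "Laurie/opt1.3b-deepspeed-chat"


def format_prompt(user_input: str) -> str:
    # Assumed prompt layout (DeepSpeed-Chat style); adjust if the model
    # was trained with a different template.
    return f"Human: {user_input}\n\nAssistant:"


def generate_reply(user_input: str, max_new_tokens: int = 128) -> str:
    # Imported here so the prompt helper can be used without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(format_prompt(user_input), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling `generate_reply("What is RLHF?")` downloads the weights on first use and returns the model's continuation as a string.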

Local Apps

vLLM

How to use Laurie/opt1.3b-deepspeed-chat with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Laurie/opt1.3b-deepspeed-chat"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Laurie/opt1.3b-deepspeed-chat",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
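
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl payload; it assumes the server is reachable at `http://localhost:8000` and returns an OpenAI-style completions response with the text under `choices[0]["text"]`. The helper names `build_completion_request` and `complete` are hypothetical.

```python
import json
import urllib.request

MODEL_ID = "Laurie/opt1.3b-deepspeed-chat"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    # Same JSON body as the curl example.
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    # Sends the request; requires a running server.
    with urllib.request.urlopen(build_completion_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` returns the generated continuation. Passing `base_url="http://localhost:30000"` should target a server listening on that port instead.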

Use Docker

docker model run hf.co/Laurie/opt1.3b-deepspeed-chat

SGLang

How to use Laurie/opt1.3b-deepspeed-chat with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Laurie/opt1.3b-deepspeed-chat" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Laurie/opt1.3b-deepspeed-chat",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Laurie/opt1.3b-deepspeed-chat" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Laurie/opt1.3b-deepspeed-chat",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use Laurie/opt1.3b-deepspeed-chat with Docker Model Runner:
```
docker model run hf.co/Laurie/opt1.3b-deepspeed-chat
```

license: apache-2.0
language: am

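DeepSpeed-RLHF system training: DeepSpeed-HE switches seamlessly between inference and training modes within RLHF. This lets it use optimizations from DeepSpeed-Inference, such as tensor parallelism and high-performance CUDA kernels, for language generation, while the training side benefits from ZeRO- and LoRA-based memory optimization strategies. DeepSpeed-HE also automatically handles memory management and data caching across the different stages of RLHF.
Train Data (English): --data_path Dahoas/rm-static Dahoas/full-hh-rlhf Dahoas/synthetic-instruct-gptj-pairwise yitingxie/rlhf-reward-datasets openai/webgpt_comparisons stanfordnlp/SHP
Train Data (Chinese): --data_path wangrui6/Zhihu-KOL Cohere/miracl-zh-queries-22-12 Hello-SimpleAI/HC3-Chinese mkqa-Chinese
The actor model and reward model can be customized, and the RLHF model can also be trained on its own.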

Usage:

git clone https://github.com/microsoft/DeepSpeedExamples

cd DeepSpeedExamples/applications/DeepSpeed-Chat

pip install -r requirements.txt

python chat.py --path Laurie/opt1.3b-deepspeed-chat
