Instructions to use Laurie/opt1.3b-deepspeed-chat with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Laurie/opt1.3b-deepspeed-chat with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="Laurie/opt1.3b-deepspeed-chat")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("Laurie/opt1.3b-deepspeed-chat")
    model = AutoModelForCausalLM.from_pretrained("Laurie/opt1.3b-deepspeed-chat")

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Laurie/opt1.3b-deepspeed-chat with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "Laurie/opt1.3b-deepspeed-chat"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "Laurie/opt1.3b-deepspeed-chat",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'

Use Docker
docker model run hf.co/Laurie/opt1.3b-deepspeed-chat
- SGLang
How to use Laurie/opt1.3b-deepspeed-chat with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
      --model-path "Laurie/opt1.3b-deepspeed-chat" \
      --host 0.0.0.0 \
      --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "Laurie/opt1.3b-deepspeed-chat",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'

Use Docker images

    docker run --gpus all \
      --shm-size 32g \
      -p 30000:30000 \
      -v ~/.cache/huggingface:/root/.cache/huggingface \
      --env "HF_TOKEN=<secret>" \
      --ipc=host \
      lmsysorg/sglang:latest \
      python3 -m sglang.launch_server \
        --model-path "Laurie/opt1.3b-deepspeed-chat" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "Laurie/opt1.3b-deepspeed-chat",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'

- Docker Model Runner
How to use Laurie/opt1.3b-deepspeed-chat with Docker Model Runner:
docker model run hf.co/Laurie/opt1.3b-deepspeed-chat
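The curl calls shown for vLLM and SGLang can also be made from Python. The following is a minimal sketch using only the standard library. The helper names `build_completion_payload` and `complete` are hypothetical, the default `base_url` assumes a vLLM server already listening on port 8000 (use port 30000 for the SGLang examples), and the code assumes the server returns the usual OpenAI-style completions response with the text under `choices[0].text`.

```python
import json
import urllib.request


def build_completion_payload(prompt, model="Laurie/opt1.3b-deepspeed-chat",
                             max_tokens=512, temperature=0.5):
    """Return the same JSON body that the curl examples send."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def complete(prompt, base_url="http://localhost:8000", timeout=60.0, **kwargs):
    """POST the payload to <base_url>/v1/completions and return the generated text.

    Assumption: a server is already running at base_url and replies with an
    OpenAI-style completions object.
    """
    body = json.dumps(build_completion_payload(prompt, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.load(response)
    return result["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` would send the same request as the vLLM curl example, and `complete("Once upon a time,", base_url="http://localhost:30000")` would target the SGLang port.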
license: apache-2.0
language: am
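Trained with the DeepSpeed-RLHF system: DeepSpeed-HE switches seamlessly between inference and training modes during RLHF. For language generation it can therefore use the optimizations from DeepSpeed-Inference, such as tensor parallelism and high-performance CUDA kernels, while the training side benefits from ZeRO- and LoRA-based memory optimization strategies. DeepSpeed-HE also handles memory management and data caching automatically across the different RLHF stages.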
Train data (English): --data_path Dahoas/rm-static Dahoas/full-hh-rlhf Dahoas/synthetic-instruct-gptj-pairwise yitingxie/rlhf-reward-datasets openai/webgpt_comparisons stanfordnlp/SHP
Train data (Chinese): --data_path wangrui6/Zhihu-KOL Cohere/miracl-zh-queries-22-12 Hello-SimpleAI/HC3-Chinese mkqa-Chinese
The actor model and the reward model can be customized, and the RLHF model can also be trained on its own.
Usage:
    git clone https://github.com/microsoft/DeepSpeedExamples
    cd DeepSpeedExamples/applications/DeepSpeed-Chat
    pip install -r requirements.txt
    python chat.py --path Laurie/opt1.3b-deepspeed-chat
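For a quick test without cloning the repository, something like the following sketch may work with Transformers alone. It is not the `chat.py` implementation. The `Human: ... Assistant:` prompt layout is an assumption about how DeepSpeed-Chat models are usually prompted, and `build_prompt` and `generate_reply` are hypothetical helper names.

```python
def build_prompt(user_message, history=""):
    """Format one turn as 'Human: ... Assistant: ' (assumed prompt layout)."""
    return f"{history}Human: {user_message}\n Assistant: "


def generate_reply(user_message, max_new_tokens=128):
    """Load the model through a text-generation pipeline and return one reply."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model="Laurie/opt1.3b-deepspeed-chat")
    prompt = build_prompt(user_message)
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=False,          # greedy decoding for repeatable output
        return_full_text=False,   # return only the continuation, not the prompt
    )
    return outputs[0]["generated_text"].strip()
```

Calling `generate_reply("What is RLHF?")` downloads the model weights on first use, so it needs network access and enough memory for a 1.3B-parameter model.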