JunHowie committed on
Commit
eebe395
·
verified ·
1 Parent(s): f79f272

Update README.md


[BugFix] Fix compatibility issues with vLLM 0.10.1

Files changed (1)
  1. README.md +17 -14
README.md CHANGED
@@ -13,15 +13,16 @@ base_model:
 base_model_relation: quantized
 ---
 # Qwen3-Coder-30B-A3B-Instruct-GPTQ-Int8
-基础型 [Qwen3-Coder-30B-A3B-Instruct](https://www.modelscope.cn/models/Qwen3-Coder-30B-A3B-Instruct)
-
-### 【Vllm 启动命令】
-<i>注: 4卡启动该模型一定要跟`--enable-expert-parallel`,否则其专家张量TP整除除不尽;2卡则不需要。</i>
+Base model: [Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct)
+
+### 【vLLM 4-GPU Single-Node Launch Command】
+<i>Note: When launching on 4 GPUs, you must include `--enable-expert-parallel`, because the expert tensors cannot be divided evenly across the tensor-parallel ranks; with 2 GPUs this is not necessary.</i>
 ```
 CONTEXT_LENGTH=32768
 
 vllm serve \
-tclf90/Qwen3-Coder-30B-A3B-Instruct-GPTQ-Int8 \
+QuantTrio/Qwen3-Coder-30B-A3B-Instruct-GPTQ-Int8 \
 --served-model-name Qwen3-Coder-30B-A3B-Instruct-GPTQ-Int8 \
 --enable-expert-parallel \
 --swap-space 16 \
@@ -36,33 +37,35 @@ vllm serve \
 --port 8000
 ```
 
-### 【依赖】
+### 【Dependencies】
 
 ```
 vllm==0.10.0
 ```
 
-### 【模型更新日期】
+### 【Model Update Date】
 ```
+2025-08-19
+1. [BugFix] Fix compatibility issues with vLLM 0.10.1
 2025-08-01
-1. 首次commit
+1. Initial commit
 ```
 
-### 【模型列表】
+### 【Model Files】
 
-| 文件大小 | 最近更新时间 |
+| File Size | Last Updated |
 |--------|--------------|
 | `30GB` | `2025-08-01` |
 
 
-### 【模型下载】
+### 【Model Download】
 
 ```python
-from modelscope import snapshot_download
-snapshot_download('tclf90/Qwen3-Coder-30B-A3B-Instruct-GPTQ-Int8', cache_dir="本地路径")
+from huggingface_hub import snapshot_download
+snapshot_download('QuantTrio/Qwen3-Coder-30B-A3B-Instruct-GPTQ-Int8', cache_dir="your_local_path")
 ```
 
-### 【介绍】
+### 【Overview】
 # Qwen3-Coder-30B-A3B-Instruct
 <a href="https://chat.qwen.ai/" target="_blank" style="margin: 2px;">
 <img alt="Chat" src="https://img.shields.io/badge/%F0%9F%92%9C%EF%B8%8F%20Qwen%20Chat%20-536af5" style="display: inline-block; vertical-align: middle;"/>
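The divisibility constraint behind the `--enable-expert-parallel` note in the launch command can be sketched numerically. The numbers below are illustrative placeholders, not values from this model's actual config: the point is that a tensor-parallel shard of a group-quantized expert weight is only valid when the per-GPU slice of the sharded dimension is a whole number of quantization groups.

```python
def tp_shard_ok(dim: int, tp_size: int, group_size: int) -> bool:
    """Check whether a group-quantized weight dimension of size `dim`
    can be split across `tp_size` GPUs without cutting a quantization
    group in half."""
    if dim % tp_size != 0:
        return False
    # Each GPU's slice must itself be a whole number of groups.
    return (dim // tp_size) % group_size == 0

# Hypothetical expert dimension 768 with GPTQ group size 128:
print(tp_shard_ok(768, 2, 128))  # True:  384 per GPU = 3 full groups
print(tp_shard_ok(768, 4, 128))  # False: 192 per GPU = 1.5 groups
```

When the 4-way split fails this check, expert parallelism sidesteps it by distributing whole experts across GPUs instead of slicing each expert's weight dimension.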
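Once the serve command above is running, the model answers on vLLM's OpenAI-compatible HTTP route. A minimal sketch of a chat request, assuming the default `/v1/chat/completions` path together with the `--port 8000` and `--served-model-name` values from the command; the request is only constructed here, since actually sending it requires the live server.

```python
import json
from urllib import request

# Build a chat-completions request for the server launched above.
# The model field must match --served-model-name; the port matches --port.
payload = {
    "model": "Qwen3-Coder-30B-A3B-Instruct-GPTQ-Int8",
    "messages": [
        {"role": "user",
         "content": "Write a Python function that reverses a linked list."},
    ],
    "max_tokens": 256,
    "temperature": 0.2,
}
req = request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# With the server up, uncomment to send and print the reply:
# with request.urlopen(req) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
print(req.get_method(), req.full_url)  # POST http://localhost:8000/v1/chat/completions
```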