[Request] Great work! Do you have plans to also create GLM-5.1-AWQ?

#6
by ag1988 - opened

GLM-5.1 has been released: https://huggingface.co/zai-org/GLM-5.1. Are you planning on also creating an AWQ version of this?

QuantTrio org

downloading it 🥹

Hey, sorry for the naive question, but what calibration data did you use? The default in the AutoAWQ library is pile-val. Given that this model has a chat template, did you use any chat data (e.g. smoltalk), or did you just use the default AutoAWQ settings? This info would be immensely helpful.

Default calibration data used in AutoAWQ: https://github.com/casper-hansen/AutoAWQ/blob/88e4c76b20755db275574e6a03c83c84ba3bece5/awq/models/base.py#L150
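For context, AutoAWQ's `model.quantize()` accepts a `calib_data` argument, so chat-formatted text can be supplied instead of the pile-val default. A minimal sketch of preparing such data (the role-prefix formatting and the sample conversation below are illustrative assumptions, not QuantTrio's actual settings):

```python
# Sketch: building chat-formatted calibration strings for AutoAWQ.
# NOTE: illustration only — not the settings QuantTrio actually used.

def format_chat_for_calibration(conversations):
    """Flatten chat conversations into plain strings usable as calib data.

    In practice you would use tokenizer.apply_chat_template(...) so the
    calibration text matches the model's real chat format; this simple
    role-prefix layout is only a stand-in.
    """
    samples = []
    for messages in conversations:
        text = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
        samples.append(text)
    return samples

# Hypothetical smoltalk-style conversation:
convs = [
    [
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello!"},
    ],
]
calib_data = format_chat_for_calibration(convs)

# The resulting list of strings can then be passed to AutoAWQ, e.g.:
# model.quantize(tokenizer, quant_config=quant_config, calib_data=calib_data)
```

Whether chat-formatted calibration actually improves an AWQ quant over the pile-val default is exactly the open question here.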

QuantTrio org

As noted in the README, we used data-free quantization (no calibration dataset required).

Thanks for the clarification 👍

I was hoping to find a QuantTrio GLM-5.1-AWQ quantization, as we were quite happy with your GLM-5-AWQ variant.
Today I realized that someone else was quicker: https://huggingface.co/cyankiwi/GLM-5.1-AWQ-4bit
It would be nice to know whether you plan to use similar settings or a different configuration.

This is also why I've been waiting for you to release a 5.1 version, as I'm happy with your GLM-5 version. Yesterday I started putting together some code to do it myself based on llm-compressor, since I hadn't heard back from you, but then I thought it might be better to wait, as you have more experience.

QuantTrio org

Soon to be released
