How to run this with 8-bit (or lower) quantization

#1
by lexat - opened

I have only 12GB of VRAM. Is it possible to run it?

Yes! With 8-bit quantization it works fine for short texts, but long contexts will likely cause an out-of-memory (OOM) error.
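To see why short inputs fit and long ones may not, here is a rough back-of-the-envelope estimate. It assumes a transformer that keeps a key/value cache in fp16. The parameter count and layer shapes below are hypothetical placeholders, not this model's real numbers, so substitute the values from the model's config. It also ignores activations and framework overhead.

```python
GIB = 1024 ** 3

def weight_gib(n_params: float, bits: int) -> float:
    """Memory taken by the weights alone at a given bit width."""
    return n_params * bits / 8 / GIB

def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 seq_len: int, bytes_per_value: int = 2) -> float:
    """Memory for cached keys and values (hence the factor 2) for one sequence."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value / GIB

if __name__ == "__main__":
    # Hypothetical model: 7e9 parameters, 32 layers, 32 KV heads of size 128.
    n_params, layers, heads, dim = 7e9, 32, 32, 128
    for bits in (16, 8, 4):
        print(f"weights at {bits}-bit: {weight_gib(n_params, bits):.1f} GiB")
    for seq_len in (1024, 4096, 32768):
        cache = kv_cache_gib(layers, heads, dim, seq_len)
        total = weight_gib(n_params, 8) + cache
        print(f"{seq_len} tokens: cache {cache:.1f} GiB, 8-bit total about {total:.1f} GiB")
```

With these placeholder numbers the 8-bit weights stay constant while the cache grows linearly with context length, which is the part that eventually pushes past 12GB.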

Can you please provide a command or instructions for running it with 12GB of VRAM?
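One possible way to do this, as a minimal sketch rather than a confirmed recipe: it assumes the checkpoint loads through `transformers` as a causal language model and that `bitsandbytes` and `accelerate` are installed, none of which is confirmed in this thread. `load_quantized` is a hypothetical helper name, and the model ID is left for you to fill in with this repository's ID.

```python
def load_quantized(model_id: str, bits: int = 8):
    """Load a checkpoint with 8-bit or 4-bit bitsandbytes weights.

    Assumption: the model is compatible with AutoModelForCausalLM.
    """
    if bits not in (4, 8):
        raise ValueError("bits must be 4 or 8")

    # Imported lazily so the bit-width check above runs without the GPU stack.
    import torch
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              BitsAndBytesConfig)

    if bits == 8:
        quant = BitsAndBytesConfig(load_in_8bit=True)
    else:
        # 4-bit is the "or lower" option; compute still runs in fp16.
        quant = BitsAndBytesConfig(load_in_4bit=True,
                                   bnb_4bit_compute_dtype=torch.float16)

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant,
        device_map="auto",  # let accelerate place the layers on the GPU
    )
    return tokenizer, model
```

Usage would look like `tokenizer, model = load_quantized(model_id, bits=8)`. If 8-bit still runs out of memory on 12GB, trying `bits=4` and keeping inputs short are the obvious next steps.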
