GGUFs for use in llama.cpp. As of b8110 a specific chat template is required, otherwise the model crashes. The chat template is included in the repo files.
The chat template is finicky and should probably not be used. Instead, for now, prepend the special tag <__media__> to the actual user prompt string. I can confirm this gives better results.
The base files are BF16.
Basic usage with llama-server:
llama-server -m PaddleOCR-VL-1.5.gguf --mmproj mmproj-PaddleOCR-VL-1.5.gguf -c 20000 --fit off
Then make a request with an image attached and a prompt such as: <__media__>OCR:
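A minimal sketch of such a request in Python, using llama-server's OpenAI-compatible chat endpoint. The port, image path, and helper names are assumptions for illustration; note the <__media__> tag is prepended to the prompt text as described above:

```python
import base64
import json
import urllib.request

# Hypothetical server address -- adjust to your llama-server setup.
SERVER = "http://localhost:8080/v1/chat/completions"

def build_payload(image_path: str, prompt: str = "<__media__>OCR:") -> dict:
    """Build an OpenAI-style chat payload with the image attached as a
    base64 data URL and the <__media__>-prefixed prompt as text."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
                {"type": "text", "text": prompt},
            ],
        }],
    }

def ocr(image_path: str) -> str:
    """POST the payload to llama-server and return the model's reply."""
    req = urllib.request.Request(
        SERVER,
        data=json.dumps(build_payload(image_path)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The same payload shape works with curl if you inline the base64 string yourself.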
PaddleOCR-VL is expected to be used with a separate Paddle layout analysis model (NOT included in llama.cpp - it's in the PaddlePaddle library). However, for simpler use cases PaddleOCR-VL can work alone.
So far it performs well on short multilingual line texts even at low quants (Q4_K_M LM & Q4_1 mmproj). Performance on more complex tasks remains to be seen.
Base model: baidu/ERNIE-4.5-0.3B-Paddle