GGUFs for use in llama.cpp. As of b8110 a specific chat template is required, otherwise the model crashes. The chat template is included in the repo files.
The chat template is finicky and should probably not be used. Instead, for now, prepend the special tag <__media__> to the actual user prompt string. I can confirm this gives better results.
The base files are BF16.
Basic usage with llama-server:
llama-server -m PaddleOCR-VL-1.5.gguf --mmproj mmproj-PaddleOCR-VL-1.5.gguf -c 20000 --fit off
Then make a request with an image attached and a prompt such as: <__media__>OCR:
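A minimal sketch of such a request in Python, using llama-server's OpenAI-compatible chat endpoint. The port, image path, and helper names are assumptions for illustration; note the <__media__> tag is prepended to the prompt text as described above:

```python
import base64
import json
import urllib.request

# Hypothetical server address -- adjust to your llama-server setup.
SERVER = "http://localhost:8080/v1/chat/completions"

def build_payload(image_path: str, prompt: str = "<__media__>OCR:") -> dict:
    """Build an OpenAI-style chat payload with the image attached as a
    base64 data URL and the <__media__>-prefixed prompt as text."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
                {"type": "text", "text": prompt},
            ],
        }],
    }

def ocr(image_path: str) -> str:
    """POST the payload to llama-server and return the model's reply."""
    req = urllib.request.Request(
        SERVER,
        data=json.dumps(build_payload(image_path)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The same payload shape works with curl if you inline the base64 string yourself.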
PaddleOCR-VL is expected to be used with a separate Paddle layout analysis model (NOT included in llama.cpp - it's in the PaddlePaddle library). However, for simpler use cases PaddleOCR-VL can work alone.
So far it performs well on short multilingual line texts even at low quants (Q4_K_M LM & Q4_1 mmproj). Performance on more complex tasks remains to be seen.
Base model: baidu/ERNIE-4.5-0.3B-Paddle