TableScope Structure Extractor 8B

Korean table structure extraction LoRA adapter for Qwen3-VL-8B, trained on 15k synthetic Korean tables with row-strip chunking.

A QLoRA fine-tuned adapter that automatically extracts **table structure (TableSchema JSON)** from Korean table images.

✨ Key Features

  • QLoRA on Qwen3-VL-8B (4-bit NF4, r=64, alpha=128)
  • Specialized for Korean tables: trained on 15,000 synthetic Korean table samples
  • Row-strip chunking: large tables that exceed 4096 tokens are split by rows and processed strip by strip
  • Anti-forgetting 2-stage training: acquires large-table handling while preserving existing performance
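
The card does not spell out how row-strip chunking picks its split points. As a rough illustration only, the sketch below greedily groups consecutive rows so that each chunk stays under a token budget. The helper `chunk_rows`, the per-row token estimates, and the choice to charge the header cost once per chunk are all assumptions made for this example, not the adapter's actual preprocessing.

```python
from typing import List, Tuple


def chunk_rows(row_tokens: List[int], header_tokens: int, budget: int = 4096) -> List[Tuple[int, int]]:
    """Greedily group consecutive rows into (start, end) index ranges, end exclusive.

    Hypothetical helper: every chunk is assumed to repeat the header,
    so header_tokens is charged once per chunk.
    """
    if header_tokens >= budget:
        raise ValueError("header alone exceeds the token budget")
    chunks = []
    start, used = 0, header_tokens
    for i, cost in enumerate(row_tokens):
        # Close the current chunk when the next row would overflow the budget.
        # A single oversized row still gets a chunk of its own (i == start).
        if used + cost > budget and i > start:
            chunks.append((start, i))
            start, used = i, header_tokens
        used += cost
    if start < len(row_tokens):
        chunks.append((start, len(row_tokens)))
    return chunks


# Ten rows of roughly 900 tokens each, with a 300-token header.
strips = chunk_rows([900] * 10, header_tokens=300)
print(strips)
```

With these made-up numbers the ten rows fall into three strips, each of which fits in the 4096-token budget together with the repeated header.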

📊 Performance

Chunked inference mode (full test set, 1,500 samples)

| Complexity | Count | TEDS | TEDS-S | CellAcc | ValidRate |
|---|---|---|---|---|---|
| Simple | 492 | 0.561 | 0.832 | 0.399 | 93.5% |
| Medium | 505 | 0.620 | 0.830 | 0.437 | 89.7% |
| Complex | 301 | 0.370 | 0.512 | 0.201 | 62.8% |
| Extreme | 202 | 0.108 | 0.147 | 0.079 | 36.6% |
| Overall | 1500 | 0.481 | 0.675 | 0.329 | 78.4% |
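
The overall row looks like a sample-weighted mean of the per-complexity rows, although the card does not say so explicitly. A minimal check under that assumption, using only the counts and TEDS values from the table above:

```python
# Per-complexity sample counts and TEDS scores copied from the table.
counts = {"Simple": 492, "Medium": 505, "Complex": 301, "Extreme": 202}
teds = {"Simple": 0.561, "Medium": 0.620, "Complex": 0.370, "Extreme": 0.108}

total = sum(counts.values())
# Weight each bucket's score by its share of the test set.
overall_teds = sum(counts[k] * teds[k] for k in counts) / total

print(total)
print(f"{overall_teds:.4f}")
```

The weighted mean comes out at roughly 0.4815, which agrees with the reported 0.481 to within the rounding of the inputs. The same calculation applied to the other metric columns should land equally close to their overall values.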

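Standard vs Chunked comparison (TEDS)

| Complexity | Standard | Chunked | Relative improvement |
|---|---|---|---|
| Simple | 0.561 | 0.561 | ±0 |
| Medium | 0.611 | 0.620 | +1.4% |
| Complex | 0.282 | 0.370 | +31% |
| Extreme | 0.048 | 0.108 | +125% |
| Overall | 0.453 | 0.481 | +6.2% |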
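
The improvement column reads as a relative gain over the Standard score, not an absolute difference. A short sketch that recomputes it from the rounded table values (the `relative_gain` helper is introduced here for illustration only):

```python
# TEDS scores copied from the comparison table.
standard = {"Simple": 0.561, "Medium": 0.611, "Complex": 0.282, "Extreme": 0.048, "Overall": 0.453}
chunked = {"Simple": 0.561, "Medium": 0.620, "Complex": 0.370, "Extreme": 0.108, "Overall": 0.481}


def relative_gain(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


gains = {name: relative_gain(standard[name], chunked[name]) for name in standard}
for name, gain in gains.items():
    print(f"{name}: {gain:+.1f}%")
```

Because the inputs are already rounded to three decimals, the recomputed figures can differ from the reported ones in the last digit. Medium, for example, comes out near +1.5% here against the reported +1.4%.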

🚀 Usage

Installation

pip install transformers peft bitsandbytes qwen-vl-utils

Inference

from peft import PeftModel
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

# ์ด ๋ฆฌํฌ๋Š” LoRA ์–ด๋Œ‘ํ„ฐ(weight)๋งŒ ํฌํ•จํ•˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.
# base_model(Qwen/Qwen3-VL-8B-Instruct)์€ ์•„๋ž˜ ์ฝ”๋“œ์—์„œ ์ž๋™์œผ๋กœ ๋‹ค์šด๋กœ๋“œ๋ฉ๋‹ˆ๋‹ค.

# ๋ฒ ์ด์Šค ๋ชจ๋ธ ๋กœ๋“œ (4-bit ์–‘์žํ™”)
from transformers import BitsAndBytesConfig
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="bfloat16",
)

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen3-VL-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA ์–ด๋Œ‘ํ„ฐ ๋กœ๋“œ
model = PeftModel.from_pretrained(model, "cywellai/tablescope-structure-extractor-8b")
processor = AutoProcessor.from_pretrained("cywellai/tablescope-structure-extractor-8b")

# ์ถ”๋ก 
messages = [
    {"role": "system", "content": "๋‹น์‹ ์€ ํ…Œ์ด๋ธ” ์ด๋ฏธ์ง€์—์„œ ๊ตฌ์กฐ๋ฅผ ์ถ”์ถœํ•˜๋Š” ์ „๋ฌธ๊ฐ€์ž…๋‹ˆ๋‹ค. ์ฃผ์–ด์ง„ ํ…Œ์ด๋ธ” ์ด๋ฏธ์ง€๋ฅผ ๋ถ„์„ํ•˜์—ฌ TableSchema JSON์„ ์ƒ์„ฑํ•˜์„ธ์š”."},
    {"role": "user", "content": [
        {"type": "image", "image": "path/to/table.png"},
        {"type": "text", "text": "์ด ํ…Œ์ด๋ธ” ์ด๋ฏธ์ง€์˜ ๊ตฌ์กฐ๋ฅผ JSON์œผ๋กœ ์ถ”์ถœํ•˜์„ธ์š”."},
    ]},
]

text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(text=[text], images=image_inputs, videos=video_inputs, return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=4096)
output = processor.batch_decode(output_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True)[0]
print(output)  # TableSchema JSON

🏗️ Output Format (TableSchema JSON)

{
  "col_headers": [
    {"labels": ["이름", "나이", "직급"], "spans": {}}
  ],
  "row_headers": [],
  "data": [
    [{"value": "김철수"}, {"value": "35"}, {"value": "대리"}],
    [{"value": "이영희"}, {"value": "42"}, {"value": "과장"}]
  ],
  "merged_regions": []
}
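
Since the ValidRate figures show that not every generation yields usable output, it can help to check the result before passing it on. The sketch below is one possible sanity check, not part of the project: `parse_table_schema` is a hypothetical helper, the required key set is inferred only from the example above, and the rule that every data row must match the width of the last header level is an assumption that may not hold for tables with merged regions.

```python
import json

# Key set inferred from the example output; the real schema may allow more.
REQUIRED_KEYS = ("col_headers", "row_headers", "data", "merged_regions")


def parse_table_schema(text: str) -> dict:
    """Parse model output and run basic shape checks (hypothetical helper)."""
    schema = json.loads(text)  # raises json.JSONDecodeError on malformed output
    if not isinstance(schema, dict):
        raise ValueError("top-level value must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in schema]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    # Assumed convention: the last header level defines the column count.
    if schema["col_headers"]:
        width = len(schema["col_headers"][-1]["labels"])
        for i, row in enumerate(schema["data"]):
            if len(row) != width:
                raise ValueError(f"row {i} has {len(row)} cells, expected {width}")
    return schema


sample = """{"col_headers": [{"labels": ["name", "age"], "spans": {}}],
 "row_headers": [], "data": [[{"value": "Kim"}, {"value": "35"}]],
 "merged_regions": []}"""
schema = parse_table_schema(sample)
print(len(schema["data"]), "row(s) parsed")
```

Because `json.JSONDecodeError` is a subclass of `ValueError`, a caller can catch `ValueError` alone to handle both malformed JSON and failed shape checks.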

🔧 Training Details

Anti-Forgetting 2-Stage Training

| Item | Stage 1 | Stage 2 |
|---|---|---|
| Data | 12,000 full-table samples | 70% full + 30% chunked (17,142 samples) |
| Starting point | Continued training from the v0.0.1 adapter | Stage 1 best adapter |
| LR | 5e-6 | 3e-6 |
| Epochs | 1 | 2 |
| Best eval_loss | 1.0154 | 1.0070 |
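
The Stage 2 sample count is consistent with reusing all 12,000 full-table samples and adding chunked samples until they make up 30% of the mix. The card does not state this outright, so treat the arithmetic below as a plausibility check under that assumption:

```python
full_samples = 12_000   # Stage 1 full-table samples, from the table
full_share = 0.70       # share of full-table samples in Stage 2, from the table

# Total Stage 2 size if the 12,000 full samples are exactly 70% of the mix,
# truncated to whole samples.
stage2_total = int(full_samples / full_share)
chunked_samples = stage2_total - full_samples

print(stage2_total, chunked_samples)
print(f"chunked share: {chunked_samples / stage2_total:.1%}")
```

Under this reading, Stage 2 would contain about 5,142 chunked samples alongside the original full-table set.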

QLoRA Settings

| Item | Value |
|---|---|
| Quantization | NF4 4-bit |
| LoRA rank | 64 |
| LoRA alpha | 128 |
| LoRA dropout | 0.05 |
| Trainable params | 174M / 8.9B (1.95%) |
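
Two numbers follow directly from these settings. In standard LoRA the low-rank update is scaled by alpha / r, and the trainable share is the ratio of adapter parameters to all parameters. A small worked calculation from the rounded figures in the table:

```python
lora_rank = 64
lora_alpha = 128

# Standard LoRA multiplies the low-rank update by alpha / r.
scaling = lora_alpha / lora_rank

trainable_params = 174e6   # rounded figure from the table
total_params = 8.9e9       # rounded figure from the table
trainable_fraction = trainable_params / total_params

print(scaling)
print(f"{trainable_fraction:.2%}")
```

The fraction computed from the rounded counts is close to the reported 1.95%. Any difference in the last digit most likely comes from the card using unrounded parameter counts.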

Training Infrastructure

  • GPU: NVIDIA H200 (143GB VRAM)
  • Framework: transformers + peft + trl (SFTTrainer)
  • Total training time: Stage 1 (1h 43m) + Stage 2 (5h 13m) = about 7 hours

📦 Related Resources

  • Base model: Qwen/Qwen3-VL-8B-Instruct
  • Project: TableScope (Korean Table Vision Agent)

⚠️ Limitations

  • Trained only on synthetic Korean table data, so performance may degrade on photographed or scanned tables
  • Complex and Extreme tables still leave room for improvement
  • Row-strip chunking splits by rows, so it offers limited help for tables with very many columns