kowndinya23/Kvasir-SEG
How to use Mayank022/sam-vit-base-kvasir-polyp-segmentation with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("mask-generation", model="Mayank022/sam-vit-base-kvasir-polyp-segmentation")

# Load model directly
from transformers import AutoProcessor, AutoModelForMaskGeneration
processor = AutoProcessor.from_pretrained("Mayank022/sam-vit-base-kvasir-polyp-segmentation")
model = AutoModelForMaskGeneration.from_pretrained("Mayank022/sam-vit-base-kvasir-polyp-segmentation")

Fine-tuned facebook/sam-vit-base on Kvasir-SEG for gastrointestinal polyp segmentation.
| Setting | Value |
|---|---|
| Base Model | facebook/sam-vit-base (93.7M params) |
| Task | Binary polyp segmentation from endoscopy images |
| Strategy | Fine-tune mask decoder, freeze vision encoder + prompt encoder |
| Prompts | Bounding box (with random ±20px perturbation) |
| Dataset | Kvasir-SEG: 880 train / 120 val images |
| Loss | DiceCELoss (MONAI) |
| Optimizer | AdamW (lr=1e-05, wd=0.01) |
| Scheduler | Cosine Annealing |
| Epochs | 30 |
| Best Val Dice | 0.8355 |
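
The "Prompts" row says training used bounding boxes with a random ±20px perturbation. The card does not show how this was done, so the following is only a minimal sketch of one common approach: take the tight box around the ground-truth mask and shift each coordinate independently. The function name `box_prompt_from_mask`, the uniform integer sampling, and the clipping to image bounds are all assumptions, not the author's code.

```python
import numpy as np

def box_prompt_from_mask(mask, max_shift=20, rng=None):
    """Return a jittered [x_min, y_min, x_max, y_max] box around a binary mask.

    Hypothetical helper: each coordinate of the tight bounding box is shifted
    by a random integer in [-max_shift, max_shift], then clipped to the image.
    """
    rng = np.random.default_rng() if rng is None else rng
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask has no foreground pixels")
    height, width = mask.shape

    # Tight box around the foreground pixels
    box = np.array([xs.min(), ys.min(), xs.max(), ys.max()], dtype=np.int64)

    # Independent shift per coordinate (assumed sampling scheme)
    box += rng.integers(-max_shift, max_shift + 1, size=4)

    # Keep the box inside the image
    box[[0, 2]] = np.clip(box[[0, 2]], 0, width - 1)
    box[[1, 3]] = np.clip(box[[1, 3]], 0, height - 1)

    # For very small objects the shifts could swap min and max, so re-order
    x0, y0, x1, y1 = box.tolist()
    return [min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1)]
```

A box produced this way can be wrapped as `[[box]]` to match the nested-list `input_boxes` format that `SamProcessor` expects. Perturbing the box during training is usually meant to make the model tolerant of imprecise boxes at inference time.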
from transformers import SamModel, SamProcessor
from PIL import Image
import numpy as np
import torch

model = SamModel.from_pretrained("Mayank022/sam-vit-base-kvasir-polyp-segmentation")
processor = SamProcessor.from_pretrained("Mayank022/sam-vit-base-kvasir-polyp-segmentation")

image = Image.open("polyp.jpg").convert("RGB")
input_boxes = [[[100, 100, 400, 400]]]  # one [x_min, y_min, x_max, y_max] box prompt for the single image
inputs = processor(image, input_boxes=input_boxes, return_tensors="pt")

# Inference only, so skip gradient tracking
with torch.no_grad():
    outputs = model(**inputs, multimask_output=False)

# Apply a sigmoid to the mask logits and threshold at 0.5 to get a binary (0/1) mask
mask = (torch.sigmoid(outputs.pred_masks.squeeze()) > 0.5).cpu().numpy().astype(np.uint8)
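
To compare a predicted mask against a ground-truth annotation, the Dice coefficient reported in the table can be computed per image. The sketch below is a plain NumPy version of the standard formula 2|A∩B| / (|A| + |B|). The function name `dice_score` and the smoothing term `eps` are assumptions, and this is not necessarily how the reported validation score was computed.

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks of the same shape.

    Hypothetical helper: `eps` avoids division by zero when both masks are empty.
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    if pred.shape != target.shape:
        raise ValueError(f"shape mismatch: {pred.shape} vs {target.shape}")

    # Number of pixels that are foreground in both masks
    intersection = np.logical_and(pred, target).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))
```

Both masks must have the same shape, so a ground-truth mask at the original image size would need to be brought to the same resolution as the prediction (or the prediction upscaled) before calling `dice_score(mask, gt_mask)`, where `gt_mask` is a placeholder for your own annotation array.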