Instructions to use PolarSeeker/LongSeeker-30B-SFT with libraries, inference providers, notebooks, and local apps.
- Libraries
- Transformers
How to use PolarSeeker/LongSeeker-30B-SFT with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="PolarSeeker/LongSeeker-30B-SFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("PolarSeeker/LongSeeker-30B-SFT")
model = AutoModelForCausalLM.from_pretrained("PolarSeeker/LongSeeker-30B-SFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use PolarSeeker/LongSeeker-30B-SFT with vLLM:
Install from pip and serve model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "PolarSeeker/LongSeeker-30B-SFT"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "PolarSeeker/LongSeeker-30B-SFT",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
Use Docker
docker model run hf.co/PolarSeeker/LongSeeker-30B-SFT
- SGLang
How to use PolarSeeker/LongSeeker-30B-SFT with SGLang:
Install from pip and serve model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "PolarSeeker/LongSeeker-30B-SFT" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "PolarSeeker/LongSeeker-30B-SFT",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
Use Docker images
```bash
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "PolarSeeker/LongSeeker-30B-SFT" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "PolarSeeker/LongSeeker-30B-SFT",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
- Docker Model Runner
How to use PolarSeeker/LongSeeker-30B-SFT with Docker Model Runner:
docker model run hf.co/PolarSeeker/LongSeeker-30B-SFT
LongSeeker: Elastic Context Orchestration for Long-Horizon Search Agents
Update — May 27: The model has been updated. Please use the latest version for evaluation and deployment.
LongSeeker is a long-horizon search agent that introduces Context-ReAct, a novel paradigm for elastic context orchestration. Unlike standard ReAct agents that passively accumulate observations, LongSeeker dynamically reshapes its working context using five atomic meta-operations: Skip, Compress, Rollback, Snippet, and Delete. This allows the agent to preserve critical evidence, summarize resolved information, discard unhelpful branches, and control context size—achieving reliable and efficient long-horizon reasoning.
Highlights
- Strong long-horizon search performance: LongSeeker achieves 61.5 on BrowseComp, 62.5 on BrowseComp-ZH, 78.0 on xbench-2505, and 77.7 on GAIA-text, demonstrating competitive capability across both web search and general agent benchmarks.
- Elastic context orchestration for search agents: We introduce Context-ReAct, a new agentic paradigm that jointly generates reasoning, context meta-operations, and tool calls, enabling agents to dynamically decide when, where, and how to reshape their working context during long-horizon search.
- Comprehensive and fine-grained context control: Context-ReAct defines five atomic operations—Skip, Compress, Rollback, Snippet, and Delete—forming an expressively complete yet efficient operation set for multi-resolution context management.
- Efficient context management at extended horizons: LongSeeker maintains a stable working context of around 15k tokens even across long trajectories, roughly 6% of its 256k context window, while avoiding the rapid context growth of standard ReAct agents.
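The five meta-operations named above can be pictured as edits on a list of context steps. The following is only a minimal sketch for intuition, not LongSeeker's actual implementation: the `Step` and `WorkingContext` classes, the method signatures, and the exact semantics assigned to each operation are assumptions made for illustration, and the model card does not specify them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    """One entry in the working context (hypothetical structure)."""
    step_id: int
    content: str


@dataclass
class WorkingContext:
    """Toy working context; operation semantics here are assumed, not official."""
    steps: List[Step] = field(default_factory=list)
    next_id: int = 0

    def append(self, content: str) -> int:
        """Add a new observation or thought and return its id."""
        step_id = self.next_id
        self.steps.append(Step(step_id, content))
        self.next_id += 1
        return step_id

    def skip(self) -> None:
        """Skip: leave the context unchanged."""

    def compress(self, start_id: int, end_id: int, summary: str) -> None:
        """Compress: replace steps start_id..end_id with one summary step."""
        kept: List[Step] = []
        inserted = False
        for s in self.steps:
            if start_id <= s.step_id <= end_id:
                if not inserted:
                    kept.append(Step(s.step_id, summary))
                    inserted = True
            else:
                kept.append(s)
        self.steps = kept

    def rollback(self, to_id: int) -> None:
        """Rollback: discard every step recorded after to_id."""
        self.steps = [s for s in self.steps if s.step_id <= to_id]

    def snippet(self, step_id: int, excerpt: str) -> None:
        """Snippet: keep only an excerpt of one step's content."""
        for s in self.steps:
            if s.step_id == step_id:
                s.content = excerpt

    def delete(self, step_id: int) -> None:
        """Delete: remove a single step."""
        self.steps = [s for s in self.steps if s.step_id != step_id]


ctx = WorkingContext()
a = ctx.append("query: founding year of X")
b = ctx.append("long page about X ... founded in 1999 ... more text")
c = ctx.append("irrelevant page about Y")
d = ctx.append("dead-end branch following Y")

ctx.snippet(b, "founded in 1999")   # keep only the critical evidence
ctx.rollback(c)                     # drop the dead-end branch (step d)
ctx.delete(c)                       # remove the unhelpful page
ctx.compress(a, b, "X was founded in 1999")  # summarize resolved steps

print([s.content for s in ctx.steps])
```

In this toy run the four recorded steps collapse into a single summary step, which illustrates how a working context could stay small while the trajectory keeps growing.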
Performance
For more details, please refer to our GitHub repository. Paper: arXiv:2603.15594

