Models

26,847

Active filters: 8-bit

openai/gpt-oss-120b

Text Generation • 120B • Updated Aug 26, 2025 • 2.88M • 4.4k

openai/gpt-oss-20b

Text Generation • 22B • Updated Aug 26, 2025 • 6.43M • 4.26k

GadflyII/GLM-4.7-Flash-NVFP4

Text Generation • 18B • Updated 9 days ago • 183k • 48

nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4

Text Generation • 18B • Updated about 17 hours ago • 7.1k • 26

mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-8bit

Text-to-Speech • 0.5B • Updated 4 days ago • 1.67k • 9

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated Dec 17, 2025 • 5.88k • 1.26k

openai/gpt-oss-safeguard-20b

Text Generation • 22B • Updated 15 days ago • 17.1k • 184

mlx-community/GLM-4.7-Flash-8bit

Text Generation • 30B • Updated 4 days ago • 8.16k • 17

MultiverseComputingCAI/HyperNova-60B

Text Generation • 60B • Updated 21 days ago • 1.56k • 48

mlx-community/GLM-4.7-Flash-8bit-gs32

Text Generation • 30B • Updated 4 days ago • 531 • 5

GadflyII/GLM-4.7-Flash-MXFP4

Text Generation • 18B • Updated 3 days ago • 661 • 5

FabioSarracino/VibeVoice-Large-Q8

Text-to-Audio • 9B • Updated Oct 1, 2025 • 2.62k • 80

Salyut1/GLM-4.7-NVFP4

Text Generation • 177B • Updated Dec 23, 2025 • 5.82k • 11

nvidia/Qwen3-8B-NVFP4

Text Generation • 5B • Updated Sep 9, 2025 • 6.83k • 13

ig1/Qwen3-VL-30B-A3B-Instruct-NVFP4

Image-Text-to-Text • 18B • Updated 18 days ago • 2.28k • 6

lukealonso/MiniMax-M2.1-NVFP4

115B • Updated 23 days ago • 26.9k • 19

nvidia/DeepSeek-V3.2-NVFP4

Text Generation • 394B • Updated 8 days ago • 1.54k • 3

lmstudio-community/GLM-4.7-Flash-MLX-8bit

Text Generation • 30B • Updated 7 days ago • 393k • 4

mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-8bit

Text-to-Speech • 0.8B • Updated 4 days ago • 1.02k • 3

ragraph-ai/stable-cypher-instruct-3b

Text Generation • 3B • Updated Jun 12, 2025 • 359 • 31

MaziyarPanahi/Qwen2.5-1.5B-Instruct-GGUF

Text Generation • 2B • Updated Sep 18, 2024 • 151k • 10

nvidia/DeepSeek-R1-NVFP4

Text Generation • 397B • Updated Jun 6, 2025 • 10.2k • 269

tiiuae/Falcon-E-3B-Instruct

Text Generation • 0.9B • Updated Oct 7, 2025 • 284 • 36

MaziyarPanahi/Qwen3-1.7B-GGUF

Text Generation • 2B • Updated Apr 28, 2025 • 229k • 6

nvidia/Qwen3-235B-A22B-NVFP4

Text Generation • 133B • Updated Jul 8, 2025 • 5.22k • 13

NVFP4/Qwen3-Coder-30B-A3B-Instruct-FP4

Text Generation • 16B • Updated Aug 5, 2025 • 4.16k • 6

mlx-community/DeepSeek-OCR-8bit

Image-Text-to-Text • 1B • Updated about 24 hours ago • 1.39k • 30

kldzj/gpt-oss-120b-heretic-v2

Text Generation • 117B • Updated Nov 18, 2025 • 304 • 17

MaziyarPanahi/NVIDIA-Nemotron-Nano-12B-v2-GGUF

Text Generation • 12B • Updated Nov 28, 2025 • 73.4k • 2

Disty0/Z-Image-Turbo-SDNQ-int8

Text-to-Image • Updated Dec 2, 2025 • 1.89k • 17