Nemotron-Personas Collection A collection of multilingual, region-specific synthetic persona datasets that support sovereign AI development across many countries and regions. • 5 items • Updated 10 days ago • 21
Running Featured 1.29k FineWeb: decanting the web for the finest text data at scale 🍷 1.29k Download a trillion-token web text dataset for LLM training
Running 6 Responsible AI Benchmark 6 Evaluating safety, robustness & fairness for real use-cases
A Flexible Large Language Models Guardrail Development Methodology Applied to Off-Topic Prompt Detection Paper • 2411.12946 • Published Nov 20, 2024 • 22
Runtime error Featured 515 Florence2 + SAM2 🔥 515 Segment and caption objects in images and videos
protectai/distilroberta-base-rejection-v1 Text Classification • 82.1M • Updated Mar 11, 2024 • 4.16k • 8