Sutra Pedagogical Datasets

updated 10 days ago

High-quality synthetic educational datasets designed for LLM pretraining with structured pedagogical content across 9 knowledge domains.

  • codelion/sutra-10B (updated 19 days ago; 5M rows, 3.02k downloads, 4 likes)
  • codelion/sutra-improved-100M (updated 10 days ago; 23 downloads, 1 like)
  • codelion/sutra-1B (updated 19 days ago; 429k rows, 84 downloads, 2 likes)
  • codelion/sutra-100M (updated 19 days ago; 70.4k rows, 41 downloads, 2 likes)
  • codelion/sutra-10M (updated 19 days ago; 7.25k rows, 28 downloads, 3 likes)
  • codelion/sutra-30k-seeds (updated 19 days ago; 30.3k rows, 20 downloads, 2 likes)
  • codelion/sutra-magpie-sft (updated 19 days ago; 20.7k rows, 31 downloads, 2 likes)
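
For readers who want to inspect one of the datasets listed above, the following is a minimal sketch using the Hugging Face `datasets` library. The repository IDs are copied from the listing; the helper name `peek` is hypothetical, and the split name `"train"` is an assumption, since the listing does not state which splits or configurations each dataset exposes.

```python
from typing import Iterator

# Repository IDs copied from the collection listing.
SUTRA_DATASETS = [
    "codelion/sutra-10B",
    "codelion/sutra-improved-100M",
    "codelion/sutra-1B",
    "codelion/sutra-100M",
    "codelion/sutra-10M",
    "codelion/sutra-30k-seeds",
    "codelion/sutra-magpie-sft",
]


def peek(repo_id: str, n: int = 3, split: str = "train") -> Iterator[dict]:
    """Yield the first n records of a dataset via streaming.

    Hypothetical helper. The default split "train" is an assumption;
    check the dataset card for the actual split names.
    """
    # Imported lazily so the module loads without the library installed.
    from datasets import load_dataset  # pip install datasets

    # streaming=True reads records on demand instead of downloading everything.
    stream = load_dataset(repo_id, split=split, streaming=True)
    yield from stream.take(n)


if __name__ == "__main__":
    # Start with the smallest pretraining set in the listing.
    for record in peek("codelion/sutra-10M"):
        print(record)
```

Streaming is used here so that a quick look at the larger entries does not require a full download first.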