SciLaD: A Large-Scale, Transparent, Reproducible Dataset for Natural Scientific Language Processing
Paper • 2512.11192 • Published
NLP, Digital Humanities
Gaperon: A Peppered English-French Generative Language Model Suite
LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens