Too Good to be Bad: On the Failure of LLMs to Role-Play Villains (Tencent; submitted by Zihao1)
VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks (Amazon Web Services; submitted by AnnieFeng)
Real-Time Reasoning Agents in Evolving Environments (Social And Language Technology Lab; submitted by ProKil)
Towards Mitigating Hallucinations in Large Vision-Language Models by Refining Textual Embeddings (8 authors; submitted by taesiri)
CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration? (10 authors; submitted by JiayuJeff)