New preprint on RL for mitigating hallucination in reasoning models!