arxiv:2504.16828
Muhammad Khalifa
mkhalifa
AI & ML interests
natural language generation, reinforcement learning
Recent Activity
updated a dataset 1 day ago
launch/thinkprm-1K-verification-cots
submitted a paper 3 months ago
Gaming the Judge: Unfaithful Chain-of-Thought Can Undermine Agent Evaluation
new activity 3 months ago
mkhalifa/flan-t5-large-gsm8k: Add model card for GRACE