Themis: Training Robust Multilingual Code Reward Models for Flexible Multi-Cr...
Themis-CodeRewardBench is a code-specific reward model evaluation benchmark comprising ~8.9k diverse code preference pairs across eight programming languages and five quality...
