- Themis: Training Robust Multilingual Code Reward Models for Flexible Multi-Cr...
Themis-CodeRewardBench is a code-specific reward model evaluation benchmark comprising ~8.9k diverse code preference pairs across eight programming languages and five quality...
- Reward Modeling for Scientific Writing Evaluation
The components of this dataset are used in the experiments of the paper "Reward Modeling for Scientific Writing Evaluation". Please see README.md for more information.
