WALS-Bench 1.0: A multilingual benchmark for evaluating metalinguistic knowledge
WALS-Bench 1.0 is a large-scale multilingual benchmark for evaluating metalinguistic knowledge (i.e., explicit knowledge about the structure of languages) in large language models using...
