This dataset comprises N = 5098 annotated tokens of English s-genitive (e.g. "the children's voices") and of-genitive (e.g. "the voices of the children") constructions extracted from 5 components of the Brown and Frown corpora of published written American English. The Brown corpus was compiled in the early 1960s, and the parallel Frown corpus in the early 1990s. The tokens are annotated for 18 syntactic, semantic, phonological, and contextual features. For each token, the dataset also includes the full sentence containing it, as well as metadata on the token's location within the corpus.
Stanford Parser, version 1.6.1.
The corpus data used was the version stored privately on Stanford University servers.