Replication Data for: A serial founder effect model of phonemic diversity based on phonemic loss in low-density populations

It has been observed that the number of phonemes in languages in use today tends to decrease with increasing distance from Africa. A previous formal model has recently reproduced the observed cline, but under two strong assumptions. Here we ask whether an alternative explanation for the worldwide phonemic cline is possible under different assumptions. The answer is affirmative. We show this by formalizing a proposal, following Atkinson, that this pattern may be due to a repeated bottleneck effect and phonemic loss. In our simulations, low-density populations lose phonemes during the Out-of-Africa dispersal of modern humans. Our results reproduce the observed global cline for the number of phonemes. In addition, we detect a cline of phonemic diversity and reproduce it using our simulation model. We suggest how future work could determine whether the previous model or the new one (or even a combination of them) is valid. Simulations also show that the clines can still be present even 300 kyr after the Out-of-Africa dispersal, contrary to some previous claims that were not supported by numerical simulations.

The zip file contains the following documents and files:
- S1 Text: Supplementary results in DOCX, with various simulation graphics that complement the results reported in the published article. The graphics were computed from the data collected in the "Language database".
- S1 Database in XLSX: the Language database, which contains the list of phonemes for 359 languages. For each language, the number of phonemes and the distance from the origin of the Out-of-Africa dispersal are provided. Across these 359 languages, 908 different phonemes were found. All languages in the dataset were first coded as strings of "1" and "0", which leads to a "full" matrix of 359 rows (languages) x 908 columns (phonemes). The presence of a phoneme is marked with a "1" in the corresponding position, and its absence with a "0". Data from this database are used to generate the observed phonetic cline and the simulated phonemic cline, explained in the published article.
- S1 Software: SFE (serial founder effect) with phonemic loss program in FORTRAN.
- S2 Software: Program to compute diversity tF of languages at given distance intervals in FORTRAN.
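To illustrate the presence/absence coding described for the Language database, here is a minimal Python sketch. It assumes each language is available as a string of "1" and "0" characters, one character per phoneme column. The language names, the five-column toy strings, and the helper `phoneme_count` are hypothetical stand-ins, not taken from the XLSX file or from the FORTRAN programs, and the sketch does not compute the diversity measure tF.

```python
# Toy stand-in for the 359 x 908 presence/absence matrix.
# Names and bit strings are hypothetical, for illustration only.
rows = {
    "lang_A": "10110",
    "lang_B": "10010",
    "lang_C": "00111",
}


def phoneme_count(bits: str) -> int:
    """Number of phonemes present in one language (count of '1' entries)."""
    return bits.count("1")


# Number of phonemes per language (one value per matrix row).
counts = {name: phoneme_count(bits) for name, bits in rows.items()}

# Number of distinct phonemes attested in at least one language
# (columns that contain at least one '1').
n_cols = len(next(iter(rows.values())))
distinct = sum(
    any(bits[j] == "1" for bits in rows.values()) for j in range(n_cols)
)

print(counts)
print(distinct)
```

Applied to the real matrix, the per-row counts would correspond to the number of phonemes reported for each language. The column test shows how a total such as the 908 distinct phonemes could be checked.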

Identifier
DOI https://doi.org/10.34810/data671
Metadata Access https://dataverse.csuc.cat/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.34810/data671
Provenance
Creator Pérez Losada, Joaquim; Fort, Joaquim
Publisher CORA.Repositori de Dades de Recerca
Publication Year 2023
Funding Reference Agència de Gestió d'Ajuts Universitaris i de Recerca 2017-SGR-243 ; Ministerio de Economía, Industria y Competitividad (MINECO) FIS2016-80200-P ; Institució Catalana de Recerca i Estudis Avançats (ICREA) Academia Humanities award 2014
Rights CC BY 4.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by/4.0
OpenAccess true
Representation
Resource Type Compiled data; Dataset
Format application/zip; text/plain
Size 3100979; 2272
Version 1.1
Discipline Agriculture, Forestry, Horticulture, Aquaculture; Agriculture, Forestry, Horticulture, Aquaculture and Veterinary Medicine; Life Sciences; Social Sciences; Social and Behavioural Sciences; Soil Sciences