Prosodically annotated TED talks

Audio files of the recordings are provided in WAV format in the partitioned archives. The "talk_proscripts" archive contains Proscript format annotations of complete talks. The "punkProse_dataset" archive contains the sampled dataset partitioning used in prosodic punctuation modelling experiments (see http://github.com/alpoktem/punkProse). The README.txt file contains information on the dataset and authors. The files and their corresponding talks are indexed in TED_talk_ids.txt.

Proscript format files contain the sequence of uttered words in a recording, their approximate timings and corresponding acoustic measurements (pitch, intensity, speech rate). For more information on the Proscript format, see http://github.com/alpoktem/proscript.

TED talks are a set of conference talks that have been held worldwide in more than 100 languages. They cover a wide variety of topics, from technology and design to science, culture and academia. This corpus consists of speech recordings and Proscript format annotations of 1046 talks by 877 English speakers, totalling 155174 sentences.

Identifier
DOI https://doi.org/10.34810/data501
Related Identifier IsCitedBy https://doi.org/10.34810/data484
Metadata Access https://dataverse.csuc.cat/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.34810/data501
Provenance
Creator Öktem, Alp; Farrús, Mireia; Lai, Catherine
Publisher CORA.Repositori de Dades de Recerca
Publication Year 2023
Funding Reference European Commission 645012
Rights Custom Dataset Terms; info:eu-repo/semantics/openAccess; https://dataverse.csuc.cat/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.34810/data501
OpenAccess true
Representation
Resource Type Other; Dataset
Format application/zip; text/plain
Size 96681280; 2258; 82647048; 1828580403; 1811773112; 1808257791; 1815815004; 1817286516; 1811049903; 1809044903; 1815564929; 1813919545; 1814560995; 1808943612; 1821152837; 1803577090; 1721625018; 73292
Version 1.0
Discipline Humanities
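
The Metadata Access URL listed in this record has the shape of an OAI-PMH GetRecord request: an endpoint followed by the `verb`, `metadataPrefix` and `identifier` parameters. As a minimal sketch, the snippet below rebuilds that URL from the DOI using only the Python standard library. The helper name `build_getrecord_url` is hypothetical and introduced only for illustration; the endpoint and parameter values are copied from the record, and no network request is made.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint copied from the "Metadata Access" field of this record.
OAI_ENDPOINT = "https://dataverse.csuc.cat/oai"


def build_getrecord_url(doi: str, metadata_prefix: str = "oai_datacite") -> str:
    """Hypothetical helper: compose a GetRecord URL for a given DOI."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": metadata_prefix,
            "identifier": f"doi:{doi}",
        },
        safe=":/",  # keep ':' and '/' unescaped, as they appear in the record
    )
    return f"{OAI_ENDPOINT}?{query}"


url = build_getrecord_url("10.34810/data501")
print(url)

# Parse the query string back into its parameters as a sanity check.
params = parse_qs(urlsplit(url).query)
print(params["identifier"][0])
```

Passing `safe=":/"` is a presentation choice so the result matches the URL as printed in the record; a percent-encoded identifier would be an equally valid query string.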