This collaborative project brought together the University of Oxford and two universities in Papua, Universitas Cenderawasih and Universitas Negeri Papua, to create an on-line database of 52 digital audio and video texts in Biak, an Austronesian language with about 50,000-70,000 speakers in Papua, together with linguistically annotated transcriptions and translations of 23 of those texts. These resources provide a snapshot of audio and textual data on the language. They are useful for language preservation, for ongoing efforts to produce teaching materials in the indigenous languages of Papua, and as a basis for dictionaries and glossaries of Biak. Because they are linguistically annotated, they are also useful to linguists conducting research on Biak and related Austronesian languages.
The purpose of the project was to create an on-line database of digital audio texts in Biak and their analysed and annotated transcriptions. The project benefits three groups: the academic linguistic community, by making digitised Biak audio and annotated transcriptions freely available for further linguistic analysis and theory development; the community of Biak speakers in West Papua, by creating a permanent on-line storehouse of a representative variety of Biak texts in both audio and written form; and the project partners at Universitas Negeri Papua and Universitas Cenderawasih, by providing training and experience in the tools and best-practice methods of language documentation, along with the practical skills to undertake future documentation of the hundreds of under-described languages of the region.
The recordings were the result of face-to-face and telephone interviews. In addition to the 52 digital audio and video files, we provide annotated transcriptions of 23 of the texts, prepared with Toolbox, a freely available data management and analysis tool for language documentation. Toolbox supports the creation of resources in several forms: transcribed texts with free translations into Indonesian and English, which are of most use to the Biak-speaking community and for pedagogical use in Papua, and linguistically annotated transcriptions in two forms. The first is a standard human-readable form like the paper-based corpora familiar to linguists; the second is a conversion of this form to XML via the utility tools for Toolbox, suitable for computer analysis and database search.
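To give a rough sense of what a conversion from the human-readable form to XML involves, the following is a minimal sketch, not the project's actual tooling. It assumes the annotated texts are stored in Toolbox's backslash-marker ("standard format") layout, with one field per line and each record opened by a `\ref` marker. The marker names `\tx`, `\ge` and `\ft`, the XML element names, and the helper functions `parse_sfm` and `records_to_xml` are all illustrative assumptions, and the sample text is placeholder material rather than Biak data.

```python
import xml.etree.ElementTree as ET


def parse_sfm(text, record_marker="ref"):
    """Split backslash-marker text into records.

    Each record is a list of (marker, value) pairs. A new record starts
    at every line whose marker equals `record_marker` (assumed: "ref").
    """
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("\\"):
            marker, _, value = line[1:].partition(" ")
            if marker == record_marker:
                current = []
                records.append(current)
            # Fields before the first record marker (file headers) are skipped.
            if current is not None:
                current.append((marker, value.strip()))
        elif current:
            # A line without a marker continues the previous field.
            marker, value = current[-1]
            current[-1] = (marker, (value + " " + line).strip())
    return records


def records_to_xml(records):
    """Build an XML tree with one <record> per record and one child per field.

    The element names are hypothetical; the real Toolbox utilities may use
    a different schema.
    """
    root = ET.Element("text")
    for fields in records:
        rec = ET.SubElement(root, "record")
        for marker, value in fields:
            ET.SubElement(rec, marker).text = value
    return root


# Placeholder sample, NOT real Biak data.
SAMPLE = r"""
\ref demo.001
\tx word-a word-b
\ge gloss-a gloss-b
\ft Placeholder free translation.

\ref demo.002
\tx word-c
\ge gloss-c
\ft Another placeholder.
"""

if __name__ == "__main__":
    tree = records_to_xml(parse_sfm(SAMPLE))
    print(ET.tostring(tree, encoding="unicode"))
```

Once the records are in XML, a search such as "all records whose free translation contains a given word" becomes a simple tree query, which is the kind of computer analysis and database search the XML form is meant to support.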