Joining Tools for SL Lexicography and Corpus Analysis
Over the past several years, we have carried out a number of empirically based lexicographic projects resulting in sign language dictionaries of technical terms (psychology, joinery, home economics). The lexical databases used in these projects not only became more sophisticated and powerful as technology evolved, but also shifted focus: with video fully integrated into the database, the form description of signs is less important than it was in the beginning, when access to the original data was time-consuming and tedious. Now, with immediate online access to dozens of hours of video data, phonetic transcription can be much broader. It serves only as a reminder of the form and as a basis for the formal aspects of sign classification: form description is phonological in focus and needs only to identify the type, since the tokens are easily accessible.
Instead, relations between signs, whether based on form, iconicity or semantics, appear to be most important for the lexicographer's selection decisions. As a consequence, much of our lexical analysis work is devoted to building up a network of sign classes that shows the conventional and productive use of signs as well as form variation and modification, all on the basis of the elicited data.
The use of such a lexical database in the context of corpus linguistics seems very promising. However, today's computerized tools for lexicography and for corpus analysis seem to be mutually incompatible. While the former are essentially sophisticated relational databases indexing original data in digital video format, the multimedia tools used in corpus analysis concentrate on multi-tier video annotation and, at best, feature relatively simple databases. In this poster, we suggest an open framework in which these two approaches can be unified, and we investigate the implications that using such a tool would have for work in both fields.
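To make the idea of unifying the two approaches more concrete, the following is a minimal sketch, not the framework proposed here. It assumes that tokens annotated on tiers of a video can be stored as rows that reference types in a relational lexical database. All table names, column names, glosses, file names and time values are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical schema: sign types (lexical database side) and
# time-aligned tokens on annotation tiers (corpus annotation side).
SCHEMA = """
CREATE TABLE sign_type (
    id    INTEGER PRIMARY KEY,
    gloss TEXT NOT NULL UNIQUE,
    form  TEXT                      -- broad form description, placeholder only
);
CREATE TABLE tier (
    id    INTEGER PRIMARY KEY,
    video TEXT NOT NULL,            -- video file the tier belongs to
    name  TEXT NOT NULL             -- tier name, e.g. one tier per hand
);
CREATE TABLE token (
    id       INTEGER PRIMARY KEY,
    tier_id  INTEGER NOT NULL REFERENCES tier(id),
    type_id  INTEGER NOT NULL REFERENCES sign_type(id),
    start_ms INTEGER NOT NULL,
    end_ms   INTEGER NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database filled with invented example data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO sign_type (id, gloss, form) VALUES (?, ?, ?)",
        [(1, "HAMMER", "form-placeholder"), (2, "SAW", "form-placeholder")],
    )
    conn.executemany(
        "INSERT INTO tier (id, video, name) VALUES (?, ?, ?)",
        [(1, "session01.mp4", "right_hand"), (2, "session02.mp4", "right_hand")],
    )
    conn.executemany(
        "INSERT INTO token (tier_id, type_id, start_ms, end_ms) VALUES (?, ?, ?, ?)",
        [(1, 1, 1200, 1850), (1, 2, 2000, 2600), (2, 1, 300, 900)],
    )
    return conn


def tokens_of_type(conn, gloss):
    """Return (video, tier, start_ms, end_ms) for every token of one type."""
    rows = conn.execute(
        """
        SELECT tier.video, tier.name, token.start_ms, token.end_ms
        FROM token
        JOIN tier ON tier.id = token.tier_id
        JOIN sign_type ON sign_type.id = token.type_id
        WHERE sign_type.gloss = ?
        ORDER BY tier.video, token.start_ms
        """,
        (gloss,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for row in tokens_of_type(demo, "HAMMER"):
        print(row)
```

In such a layout, the lexicographer's view (a type with all of its tokens) and the corpus analyst's view (time-aligned tokens on the tiers of one video) would be two queries over the same data rather than two separate tools.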