INTRODUCTION
Why computer learner corpora?
In this study, I analyse the ISLE corpus,¹ an already existing learner corpus, focusing mainly on the pronunciation errors made by its German and Italian speakers. Learner corpora are collections of authentic texts produced by foreign or second language learners, stored in an electronic format. Aside from their valuable role as a resource for second language acquisition research, they can be used to identify typical difficulties of learners at a particular proficiency level (e.g. intermediate or advanced learners) or of learners with a particular native language (e.g. German or Italian learners of English), and thus provide a basis for identifying frequently occurring mistakes in learner language.
Research on learner corpora immediately caught my attention, for several reasons. The first, quite obvious and instinctive, is that the possibility of somehow “controlling” large amounts of spoken or written text increased my curiosity to explore this mysterious and perplexing field, pushing me further towards corpus linguistics. The discovery that this kind of research has existed only since the late 1980s further increased my interest in this area of linguistics, as I feel that a contribution to such a recent and unsaturated field could add something new and serve many purposes in foreign language acquisition as well as in foreign language teaching.
These large amounts of text are as appealing as they are intimidating, but fortunately we now have at our disposal very useful computerised tools capable of analysing large amounts of linguistic data.
¹ The ISLE corpus is a collection of non-native speech data for second language learning, consisting of almost 18 hours of annotated speech signals spoken by Italian and German learners of English. http://catalog.elra.info/product_info.php?products_id=568