For this week's assignment, I reviewed three corpuses . . . corpusi . . . corpora: The American National Corpus, The Corpus of Contemporary American English, and The Cambridge English Corpus. Here are my thoughts:
The American National Corpus contains over 14 million words drawn from authentic texts donated by contributors. The goal of this corpus is to enable software designers to analyze typical American English so that their products and the web will “handle [actual] American usage.” This corpus is principled, authentic, and accessible, and it would be a good resource for business professionals in software and web design.
The Corpus of Contemporary American English contains 425 million words collected from more than 175,000 sources, including spoken language, fiction, popular magazines, newspapers, and academic journals. This corpus does not have any stated goals or explanatory information. The “see notes” link for verifying the authenticity of the texts doesn’t work. The corpus is downloadable, and no membership is required to access the data. This corpus might be great for an individual researcher. However, I would not recommend it because the authenticity of the corpus cannot be verified.
The Cambridge English Corpus would be the best site for educators and language learners to use. The goal of the corpus is “to help in writing books for learners of English.” This is a principled corpus. It contains 1.76 billion words taken from authentic sources: “newspapers, best-selling novels, non-fiction books on a wide range of topics, websites, magazines, junk mail, TV and radio programmes, recordings of people's everyday conversations and many other sources.” This corpus consists of 8 corpora specializing in Spoken English in the UK, Business language, Spoken English in North America, Business reports and documents in the UK and US, Legal English, Financial English from the UK and US, Academic English from the UK and US, and a corpus of student exam scripts from the Cambridge ESOL exams.
This corpus is ideal for educators and textbook developers. Only members have full access, but many features of this corpus are available to the public. The corpus provides learning materials, including interactive quizzes and games, free online. I will recommend this to my tutees.
Oh, Corpora!