There are many types of corpus, depending on their use. Below is a list of some of the main types.
diachronic – a corpus which captures how a language changes across a period of time.
learner – a corpus of L2 learner writing or speech.
monitor – a type of diachronic corpus which may continue to grow with new texts added over time.
monolingual – includes only one language.
multilingual – a corpus with two or more languages.
parallel – a corpus containing the same texts in both a first language (L1) and a target language (L2), typically as originals and their translations.
reference – a corpus against which other corpora are compared, usually through statistical analysis.
synchronic – a corpus that has been constructed at a certain time (like a snapshot) to represent a language.
raw – a corpus with no annotation.
tagged – a corpus with annotation (for example, part-of-speech tags).
target – a corpus that is compared to a reference corpus.
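
The target/reference comparison described above can be sketched in a few lines of Python. This is only an illustration: the entries above do not say which statistic to use, so the log-likelihood keyness score here is an assumption (it is one commonly used option), and the function names and the two tiny token lists are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Two-term log-likelihood keyness score for a single word.

    Assumption: this statistic is chosen for illustration only.
    """
    total = size_target + size_ref
    combined = freq_target + freq_ref
    # Expected frequencies if the word were equally common in both corpora.
    expected_target = size_target * combined / total
    expected_ref = size_ref * combined / total
    score = 0.0
    if freq_target > 0:
        score += freq_target * math.log(freq_target / expected_target)
    if freq_ref > 0:
        score += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * score


def keywords(target_tokens, reference_tokens, top_n=5):
    """Rank words that are relatively more frequent in the target corpus."""
    target_counts = Counter(target_tokens)
    ref_counts = Counter(reference_tokens)
    n_target = len(target_tokens)
    n_ref = len(reference_tokens)
    scored = []
    for word, freq in target_counts.items():
        ref_freq = ref_counts.get(word, 0)
        # Skip words that are not over-represented in the target corpus.
        if freq / n_target > ref_freq / n_ref:
            scored.append((word, log_likelihood(freq, n_target, ref_freq, n_ref)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical toy corpora; real corpora would be far larger.
target = "the corpus is tagged and the corpus is large".split()
reference = "the cat sat on the mat and the dog sat too".split()

for word, score in keywords(target, reference):
    print(f"{word}\t{score:.2f}")
```

A higher score means a word's frequency in the target corpus differs more strongly from what the reference corpus would predict. With corpora this small the scores are not meaningful; the point is only to show the direction of the comparison.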