Corpus Linguistics: Tools and Resources

IT Services Course
Hilary Term 2015
Tips
Please feel free to get in touch with Ylva Berglund Prytz (ylva.berglund@it.ox.ac.uk) or Martin Wynne
(martin.wynne@it.ox.ac.uk) with any questions.

Explore the Corpora mailing list http://www.hit.uib.no/corpora/. You can sign up and ask a question on the
list, or search the archive for questions and answers in the past.

AntConc (http://www.antlab.sci.waseda.ac.jp/) is a software application for doing corpus linguistics with
texts and corpora on your own computer.
It is free, and simple to find, download and install. It provides the main functions, such as concordance,
collocation and wordlists, and has built-in support for many languages and writing systems. There are
versions for Windows, Mac and Linux.
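
To give a feel for what the wordlist and concordance functions do, here is a minimal Python sketch. It is not how AntConc itself works. The function names (tokenize, word_list, concordance), the crude tokenisation rule and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately crude rule)."""
    return re.findall(r"\w+", text.lower())


def word_list(tokens, top=10):
    """Return the `top` most frequent tokens with their counts, most frequent first."""
    return Counter(tokens).most_common(top)


def concordance(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for every hit of `keyword`.

    `width` is the number of tokens of context kept on each side.
    """
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


# Hypothetical sample text, standing in for a real corpus file.
sample = "The cat sat on the mat. The dog saw the cat on the mat."
tokens = tokenize(sample)

print(word_list(tokens, top=3))
for left, key, right in concordance(tokens, "cat", width=2):
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>10}  [{key}]  {right}")
```

A real tool adds a good deal more on top of this, such as sorting concordance lines by context, handling punctuation and multiword search terms, and computing collocation statistics, so treat the sketch as a conceptual aid only.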

RESOURCES
For modern European languages in particular, the Virtual Language Observatory at
http://www.clarin.eu/vlo/ is increasingly becoming the one-stop shop, and is constantly added to and kept
up to date.
Here is a selection of corpora available online:

English
Brigham Young Corpora (BNC, American English, Time) http://corpora.byu.edu/
British National Corpus http://ota.oerc.ox.ac.uk/bncweb-cgi/BNCweb.pl/ (full access for Oxford users),
http://www.natcorp.ox.ac.uk/, http://bncweb.info/
The Compleat Lexical Tutor concordances http://www.lextutor.ca/conc/
ELISA (interviews on film + transcription) http://www.uni-tuebingen.de/elisa/
MICASE Michigan Corpus of Academic Spoken English http://www.lsa.umich.edu/eli/micase/
Oxford English Corpus (more than 2 billion words and counting)
http://dws-sketch.uk.oup.com/bonito/home.html (log-in required - ask Martin Wynne)
Phrases in English (multiword expressions in the BNC) http://phrasesinenglish.org/

Chinese
The Lancaster Corpus of Mandarin Chinese (download from OTA)
http://www.ota.ox.ac.uk/headers/2474.xml

Czech
Czech National Corpus http://ucnk.ff.cuni.cz/

Finnish
Korp – access to various corpora https://korp.csc.fi/

French
ABU: la Bibliothèque Universelle (Online texts) http://abu.cnam.fr/

Corpus français (Université de Leipzig) http://wortschatz.uni-leipzig.de/ws_fra/
Online Concordancers at The Compleat Lexical Tutor (French and English corpora with online concordancer)
http://www.lextutor.ca/concordancers/

German
Das digitale Wörterbuch der deutschen Sprache http://www.dwds.de/
Institut für Deutsche Sprache http://corpora.ids-mannheim.de/

Italian
MultiSemCor English and Italian parallel corpus http://multisemcor.itc.it/

Portuguese
Corpus do Português http://www.corpusdoportugues.org/
COMPARA – parallel Portuguese-English http://www.linguateca.pt/COMPARA/

Russian
Russian National Corpus (Национальный корпус русского языка) http://ruscorpora.ru/

Swedish
Språkbanken (Swedish corpora) http://spraakbanken.gu.se/

Spanish
Corpus del Español http://www.corpusdelespanol.org/
SOL – Spanish Online Concordancias españolas en la Web http://spraakbanken.gu.se/lb/konk/rom2/

Multi-Lingual
Corpuseye Danish project with resources in different languages http://corp.hum.sdu.dk/
Intellitext Online interface to corpora in English, Chinese, Arabic, French, German, Italian, Japanese
http://corpus.leeds.ac.uk/it/
KWICfinder makes concordances of webpages http://www.kwicfinder.com/
SACODEYL multi-media, teenagers http://www.um.es/sacodeyl/
WebCorp concordances from online texts http://www.webcorp.org.uk/

ARCHIVES: TEXT, CORPORA, MEDIA


American Rhetoric project Text, audio and (streaming) video. http://www.americanrhetoric.com
Internet Archive Text, audio, video http://www.archive.org
Oxford Text Archive http://ota.ox.ac.uk/ (see 'Catalogue' and 'Oxford' pages)
OxLip+ for electronic text collections http://oxlip-plus.bodleian.ox.ac.uk/

Ylva Berglund Prytz (ylva.berglund@it.ox.ac.uk) and Martin Wynne (martin.wynne@it.ox.ac.uk)
