A Collocation Database for German Verbs and Nouns

Article · March 2003


SABINE SCHULTE IM WALDE

Institut für Maschinelle Sprachverarbeitung
Universität Stuttgart
Azenbergstraße 12, 70174 Stuttgart, Germany
schulte@ims.uni-stuttgart.de

Abstract

The paper presents a database of collocations for German verbs and nouns. The colloca-
tions are induced from a statistical grammar model, whose parameters have been trained
on 35 million words of German newspaper corpora. Concerning verbs, the database con-
centrates on subcategorisation properties and verb-noun collocations with regard to their
specific subcategorisation relation (i.e. the representation of selectional preferences);
concerning nouns, the database contains adjectival and genitive noun phrase modifiers,
as well as their verbal subcategorisation. As a special case of noun-noun collocations, we
present a list of 23,227 German proper name tuples. All collocation types are combined
by a Perl script which can be queried by the lexicographic user in order to filter relevant
co-occurrence information on a specific lexical item. The database is ready to be used for
lexicographic research and exploitation.
1 Introduction
The term collocation refers to the habitual co-occurrence of two lexical items within a specific gram-
matical relationship. The usage of collocations represents a crucial part of the meaning of words,
cf. Harris (1968), and therefore constitutes an essential part of lexical dictionary entries. For exam-
ple, within the lexical entry for the verb essen ‘to eat’, one would expect to find collocational nouns
representing the transitive verb’s direct object choice for food, such as Brot ‘bread’, Fleisch ‘meat’,
Eis ‘ice-cream’, etc. The manual and computational work of lexicographers is supported by lexical
resources such as collocational databases, which provide coherent combinations of lexical items.
In some approaches to collocation extraction, the definition of collocations is restricted to the
non-compositional and idiosyncratic combination of lexical items. For example, Lin (1999) describes
a method for a general automatic identification of non-compositional phrases, and Krenn and Evert
(2001) extract German support verb constructions and figurative expressions. In contrast to the above
approaches, our notion of collocations refers to their habitual usage.
This paper provides a lexical database of German verb and noun collocations. Concerning verbs,
the database concentrates on subcategorisation properties and verb-noun collocations with regard to
their subcategorisation relation (i.e. the representation of selectional preferences); concerning nouns,
the database contains adjectival and genitive noun phrase modifiers, as well as their verbal subcate-
gorisation. As a special case of noun-noun collocations, we present German proper name tuples.
The collocations are induced from a statistical grammar model, whose parameters have been
trained on a German newspaper corpus: the collocation candidates refer to the empirical co-occurrence
of two lexical items within a specific grammatical relationship; the collocation strength is based on the
probabilistic co-occurrence counts and determined by the lexical association measure log-likelihood
(Dunning, 1993). All collocation types are combined by a Perl script which can be queried by the
lexicographic user in order to filter relevant co-occurrence information on a specific lexical item. The
database is ready to be used for lexicographic research and exploitation.
The work is closest to the word sketches for British English in (Kilgarriff and Tugwell, 2001b),
the core of the lexicographic workstation WASP (Kilgarriff and Tugwell, 2001a). Related work by
Lin (1998b; 1999) describes the automatic extraction of both habitual and non-compositional colloca-
tions for English and their usage in various NLP applications, such as the MUC tasks of named entity
recognition and coreference resolution (Lin, 1998c), and semantic clustering (Lin, 1998a). Krenn and
Evert (2001) and Evert and Krenn (2001) concentrate on the influence of lexical association measures
on collocation induction, with reference to the extraction of support verb constructions and figurative
expressions. Zinsmeister and Heid (2002) perform an extraction of German noun-verb collocations
to compare the collocational preferences of compound nouns with those of the respective base nouns,
Zinsmeister and Heid (2003) extract collocation triples of adjective-noun-verb combinations for lex-
icographic use, and Kermes and Heid (2003) use a chunker for the extraction of German verb-noun
and adjective-verb collocations as well as tuples and triples of idiomatic expressions.
The paper is organised as follows. Section 2 describes the induction of collocations, followed by
examples from the collocation database in Section 3. Section 4 refers to evaluation possibilities and
realisations, and Section 5 describes related work on collocations.

2 Collocation Induction
The collocations are induced from a statistical grammar model, which is based on the framework
of head-lexicalised probabilistic context-free grammars (Schulte im Walde et al., 2001). The core
of the grammar model is a context-free grammar for German, which incorporates the lexical heads
of each rule into the grammar. The statistical parser LoPar (Schmid, 2000) performs unsupervised
training on the lexicalised grammar, using 35 million words of a large German newspaper corpus
from the 1990s. The trained grammar provides frequencies for the lexicalised rules and lexical choice
parameters (relations between lexical heads with reference to a grammar rule).
The trained statistical grammar model serves as the source for the induction of collocations: the
model provides frequencies $f$ for any two lexical items $l_1$ and $l_2$ co-occurring within a
grammar-specific relationship $r$: $f(l_1, r, l_2)$. For any pair of lexical items within a specific
relationship $\langle l_1, r, l_2 \rangle$, the collocation strength of the pair with respect to their
relation is calculated by the lexical association measure log-likelihood. Dunning (1993) introduced
the likelihood ratio as a useful tool for measuring similarity in text analysis, especially with
respect to the behaviour of rare events. Among others, Evert and Krenn (2001) confirm the reliability
of the log-likelihood measure in collocation induction, next to lexical associations based on raw
frequencies and the t-score, and emphasise its usage for low-frequency data.
A mathematical re-formulation of Dunning’s log-likelihood ratio for the lexical association of the
lexical items $l_1$ and $l_2$ in the relationship $r$ (cf. www.collocations.de) is given in Equation (1):

\[
\text{log-likelihood}(l_1, r, l_2) = 2 \sum_{i,j} O_{ij} \log \frac{O_{ij}}{E_{ij}} \tag{1}
\]

$O_{ij}$ and $E_{ij}$ refer to entries in the contingency table for the lexical items $l_1$ and $l_2$,
cf. Table 1: the $O_{ij}$ represent the empirical frequencies in the statistical grammar model, as
observed for the lexical items $l_1$ and $l_2$ within the relationship $r$. The expected frequencies
$E_{ij}$ are calculated as the product of the respective $O_{ij}$ marginals, normalised by the total
frequency $N$ of all relationship $r$ tuples, with $N = O_{11} + O_{12} + O_{21} + O_{22}$.

        l2                                    ¬l2
l1      O11 = f(l1, r, l2)                    O12 = f(l1, r, ¬l2)
        E11 = (O11 + O12)(O11 + O21) / N      E12 = (O11 + O12)(O12 + O22) / N
¬l1     O21 = f(¬l1, r, l2)                   O22 = f(¬l1, r, ¬l2)
        E21 = (O21 + O22)(O11 + O21) / N      E22 = (O21 + O22)(O12 + O22) / N
Table 1: Contingency table for co-occurrence counts
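
The computation in Equation (1) and Table 1 can be sketched in Python as follows. This is a minimal illustration, not the implementation behind the database: the function names `contingency` and `log_likelihood`, the relation labels `"obj"` and `"adj"`, and the toy frequencies are assumptions made up for the example. Cells with an observed frequency of zero are taken to contribute nothing to the sum.

```python
import math


def contingency(freqs, l1, r, l2):
    """Return (O11, O12, O21, O22) for the pair (l1, l2) within relation r.

    freqs maps (item1, relation, item2) -> frequency. Only tuples of
    relation r enter the table, so N is the total frequency of relation r.
    """
    n = row = col = 0.0
    for (a, rel, b), f in freqs.items():
        if rel != r:
            continue
        n += f
        if a == l1:
            row += f          # f(l1, r, *)
        if b == l2:
            col += f          # f(*, r, l2)
    o11 = freqs.get((l1, r, l2), 0.0)
    o12 = row - o11
    o21 = col - o11
    o22 = n - row - col + o11
    return o11, o12, o21, o22


def log_likelihood(o11, o12, o21, o22):
    """Equation (1): 2 * sum over cells of O_ij * log(O_ij / E_ij)."""
    n = o11 + o12 + o21 + o22
    rows = (o11 + o12, o21 + o22)
    cols = (o11 + o21, o12 + o22)
    observed = ((o11, o12), (o21, o22))
    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:                      # an empty cell contributes 0
                e = rows[i] * cols[j] / n  # expected frequency E_ij
                total += o * math.log(o / e)
    return 2.0 * total


# Toy frequencies (invented); the "adj" tuple is ignored for relation "obj".
FREQS = {
    ("kaufen", "obj", "Haus"): 30.0,
    ("kaufen", "obj", "Buch"): 10.0,
    ("lesen", "obj", "Buch"): 40.0,
    ("lesen", "obj", "Haus"): 1.0,
    ("schlaflos", "adj", "Nacht"): 25.0,
}

table = contingency(FREQS, "kaufen", "obj", "Haus")
print(table, log_likelihood(*table))
```

A table whose observed counts equal the expected counts, for instance four identical cells, yields a score of zero; the score grows as the observed co-occurrence departs from what the marginals predict.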

3 Collocation Database
The collocation database contains various collocation types for German verbs and nouns, where the
types of collocations refer to different relationships with respect to the verbs and nouns. Concern-
ing verbs, the database concentrates on subcategorisation properties and verb-noun collocations with
regard to their specific subcategorisation relation (i.e. the representation of selectional preferences);
concerning nouns, the database contains adjectival and genitive noun phrase modifiers, as well as
their verbal subcategorisation. As a special case of noun-noun collocations, we present German
proper name tuples. In the following, we present example entries of the collocation database. Each entry is
accompanied by the respective log-likelihood value (LLH).
Subcategorisation properties of verbs represent an essential part of our linguistic knowledge, since
the verb is central to the meaning and the structure of a sentence. We therefore place emphasis on
subcategorisation-specific aspects of collocations: the lexical association between verbs and nouns
with regard to their specific subcategorisation relation.
The verb-noun collocations are regarded as a particular strength of the collocational database,
since the relationship between verb and noun refers to a fine-grained combination of subcategorisation
frame types and the respective frame roles. The German grammar contains 38 subcategorisation
frame types. Possible arguments in the frames are nominative (n), dative (d) and accusative (a) noun
phrases, reflexive pronouns (r), prepositional phrases (p), expletive es (x), non-finite clauses (i), finite
clauses (s-2 for verb second clauses, s-dass for dass-clauses, s-ob for ob-clauses, s-w for indirect
wh-questions), and copula constructions (k). For prepositional phrase arguments, the frame type
additionally specifies the preposition and its case, such as ‘mit_Dat’, ‘für_Akk’. The verb-noun
collocations are defined with respect to any nominal argument slot within the frame types:
Considering each role of a specific verb-frame combination, the collocations represent nominal
selectional preferences of the verb. Table 2 illustrates examples for the object nouns of the verbs
kaufen ‘to buy’ and reden with the preposition über_Akk ‘to talk about’. The verb-noun collocations
in the left half of the table contain things one can buy, as expected. The right half illustrates that
the range of things to talk about is diverse, with a focus on politics and the arts.
Considering the collocations with respect to a specific noun, they represent properties of the
noun. Table 3 illustrates an example for the noun Buch ‘book’, accompanied by the verbs
which most prominently subcategorise the noun as direct object. The verbs refer to different
properties of a book, e.g. to its content which is written and read, to the publication process,
and to the item which is borrowed and given back.
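
The two views described above, from a verb slot to its nouns and from a noun to the verbs that subcategorise it, can be illustrated with a small lookup sketch. The data layout, the frame label `na`, the slot label `a`, and the function names are hypothetical and do not describe the Perl script that accompanies the database; the LLH values are copied from Tables 2 and 3.

```python
# (verb, frame, slot, noun) -> LLH. Keying entries by a frame label "na"
# (nominative + accusative) and a slot label "a" is an assumption made
# for this sketch; the values are taken from Tables 2 and 3.
ENTRIES = {
    ("kaufen", "na", "a", "Grundstück"): 191.945,
    ("kaufen", "na", "a", "Haus"): 167.313,
    ("kaufen", "na", "a", "Aktie"): 143.489,
    ("schreiben", "na", "a", "Buch"): 1172.622,
    ("lesen", "na", "a", "Buch"): 573.643,
    ("veröffentlichen", "na", "a", "Buch"): 274.126,
}


def nouns_for(verb, frame, slot, top=20):
    """Selectional preferences: nouns filling one slot of a verb frame."""
    hits = [(noun, llh) for (v, f, s, noun), llh in ENTRIES.items()
            if (v, f, s) == (verb, frame, slot)]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)[:top]


def verbs_for(noun, slot, top=20):
    """Noun properties: verbs that take the noun in the given slot."""
    hits = [(v, llh) for (v, f, s, n), llh in ENTRIES.items()
            if n == noun and s == slot]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)[:top]


print(nouns_for("kaufen", "na", "a"))
print(verbs_for("Buch", "a"))
```

Both functions rank by descending LLH, which mirrors the ordering of the example tables.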
In addition to the verb-noun collocations, nouns as the content holder of utterances are described
by their collocational choices. The collocations describe the nouns in question by typical adjective
and genitive modifiers. Table 4 demonstrates an example of adjectival modifiers for the noun Nacht
‘night’. As for the subcategorisation by verbs, the range of adjectives refers to different properties of
the noun, such as the time aspect with respect to the last or the coming night, the appearance of the
night being dark, hot or cold, quiet or disturbed, and the manner of spending the night, e.g. drinking or
sleepless. An example of typical genitive modifiers is given in Table 5 for the noun Zeichen ‘symbol’.
In this case, most modifiers refer to different kinds of states, e.g. time states, abstract mind states such
as hope and confidence, but also to a specific kind of symbol, e.g. D as abbreviation for Deutschland
‘Germany’. As a special case of noun-noun collocations, we induce a list of 23,227 German proper
name tuples; the 20 most prominent combinations in the newspaper corpora are given in Table 6.

Objects of kaufen ‘to buy’           Objects of reden über_Akk ‘to talk about’
Noun                          LLH    Noun                           LLH

Grundstück ‘site’         191.945    Geld ‘money’                97.070
Haus ‘house’              167.313    Inhalt ‘content’            57.786
Aktie ‘share’             143.489    Problem ‘problem’           54.780
Zeug ‘stuff’              120.558    Politik ‘politics’          51.668
Wohnung ‘apartment’        63.638    Thema ‘topic’               38.516
Karte ‘map’                62.493    Ding ‘thing’                38.189
Produkt ‘product’          61.167    Koalition ‘coalition’       35.140
Gelände ‘site’             55.414    Freiheit ‘freedom’          33.847
Fleisch ‘meat’             54.858    Kunst ‘art’                 27.027
Katze ‘cat’                52.053    Perspektive ‘perspective’   22.387
Gemüse ‘vegetables’        51.447    Umfang ‘extent’             20.269
Auto ‘car’                 51.024    Möglichkeit ‘possibility’   19.327
Buch ‘book’                48.355    Konsequenz ‘consequence’    19.246
Panzer ‘tank’              47.814    Film ‘movie’                18.734
Ware ‘goods’               41.086    Sekte ‘sect’                18.032
Sache ‘thing’              39.127    Sex ‘sex’                   17.083
Immobilie ‘real estate’    38.464    Islam ‘Islam’               16.018
Gut ‘manor’                38.021    Besetzung ‘occupation’      15.418
Milch ‘milk’               36.630    Detail ‘detail’             14.819
Schuh ‘shoe’               35.729    Zölle ‘customs’             14.706
Table 2: Verb-noun collocations for objects of kaufen ‘to buy’ (left) and reden über_Akk ‘to talk about’ (right)
Verb LLH
schreiben ‘to write’ 1,172.622
lesen ‘to read’ 573.643
veröffentlichen ‘to publish’ 274.126
führen ‘to keep account of’ 107.207
herausbringen ‘to publish’ 88.072
verfassen ‘to write’ 77.820
publizieren ‘to publish’ 52.625
vorstellen ‘to present’ 50.766
kaufen ‘to buy’ 48.720
zuklappen ‘to close’ 46.816
herausgeben ‘to publish’ 35.326
füllen ‘to fill’ 33.704
mitbringen ‘to bring’ 31.214
verfilmen ‘to film’ 28.364
ausleihen ‘to borrow’ 27.513
zurückgeben ‘to give back’ 27.487
wälzen ‘to read (intensively)’ 22.865
übersetzen ‘to translate’ 18.813
zurückschicken ‘to send back’ 17.991
rezensieren ‘to review’ 17.825
Table 3: Noun-verb collocations for verbs subcategorising Buch ‘book’ as direct object

Adjective LLH
schlaflos ‘sleepless’ 664.577
ganz ‘whole’ 322.272
lang ‘long’ 194.687
durchzecht ‘to spend the night drinking’ 115.659
lau ‘tepid’ 115.366
dunkel ‘dark’ 98.603
still ‘quiet’ 96.963
heilig ‘holy’ 88.313
ruhig ‘quiet’ 76.759
durchwachen ‘to stay awake all night’ 72.451
letzt ‘last’ 69.724
durchzechen ‘to spend the night drinking’ 68.247
heiß ‘hot’ 66.826
darauffolgen ‘following’ 59.217
rauschen ‘great’ (idiomatic) 57.369
unruhig ‘disturbed’ 55.044
vorletzt ‘last but one’ 44.006
neu ‘new’ 43.896
vergehen ‘last’ 40.477
kalt ‘cold’ 38.652
Table 4: Adjectival modifiers to noun Nacht ‘night’
NounGen LLH
Zeit ‘time’ 166.272
Trauer ‘mourning’ 111.050
Solidarität ‘solidarity’ 110.368
Schwäche ‘weakness’ 107.896
Hoffnung ‘hope’ 101.726
Dank ‘thanks’ 54.810
Protest ‘protest’ 53.644
Verfall ‘decline’ 39.737
Stern ‘star’ 37.870
Ermutigung ‘encouragement’ 37.621
Wille ‘will’ 35.720
Jubiläum ‘anniversary’ 33.582
Bereitschaft ‘willingness’ 27.524
Versöhnung ‘conciliation’ 27.289
Zuversicht ‘confidence’ 27.029
D ‘D(eutschland)’ 26.329
Resignation ‘resignation’ 24.010
Unzufriedenheit ‘unhappiness’ 24.010
Wachstum ‘increase’ 22.740
Freundschaft ‘friendship’ 22.676
Wende ‘change’ 22.318
Ernsthaftigkeit ‘seriousness’ 21.163
Migration ‘migration’ 19.631
Würde ‘dignity’ 19.038
Table 5: Genitive modifiers to noun Zeichen ‘symbol’

Proper Name LLH Proper Name LLH

New York 8,955.388 Willy Brandt 2,694.888
Helmut Kohl 6,586.359 Bad Vilbel 2,444.396
Saddam Hussein 5,611.021 Rose Hausen 2,315.475
George Bush 3,976.309 Gregor Gysi 2,256.899
Bill Clinton 3,961.956 Erich Honecker 2,243.533
Bad Homburg 3,568.071 Nelson Mandela 2,175.772
Theo Waigel 3,145.698 Rita Süssmuth 2,151.375
Boris Jelzin 2,860.349 Tel Aviv 2,093.286
Oskar Lafontaine 2,825.231 Björn Engholm 1,908.901
Steffi Graf 2,778.741 Joschka Fischer 1,887.982
Table 6: (German) Proper name tuples
4 Evaluation
The evaluation of automatically produced semantic information is a difficult task. Introspection (es-
pecially by the lexicographer producing the lexical information) is unreliable, since it cannot prove
the value of the data in an objective way. An evaluation grounded on the usage of the data, cf. Kil-
garriff and Tugwell (2001b), is a proof of the usefulness of the data, but cannot judge the data in an
objective (numerical) way either. In a few cases, existing manual resources such as dictionaries and
thesauri are available. In most other cases, the only objective way to judge the semantic usefulness
of the data is to integrate the information into NLP applications and hope for an improvement.
For example, in some languages the framework of SENSEVAL provides an opportunity to utilise and
evaluate semantic information for improving word sense disambiguation.
Concerning this work, the collocational data is evaluated in part. The subcategorisation frame
descriptions underlying any verb-noun collocations are formally evaluated by comparing the
automatically generated verb frames of over 3,000 verbs against manual definitions in the German
dictionary Duden – Das Stilwörterbuch (Dudenredaktion, 2001). The F-score is 65.30% with and 72.05%
without prepositional phrase information: the automatically generated data is both easy to produce
in large quantities and reliable enough to serve as a proxy for human judgement (Schulte im Walde,
2002). However, the evaluation refers only to the structural verb frame types; so far, no semantic
information has been compared to dictionary entries.
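
As an illustration of this kind of comparison, the following sketch scores a set of induced frame types for one verb against a set of dictionary frames, once with and once without the prepositional phrase detail. It assumes the balanced F-score (harmonic mean of precision and recall); the exact procedure is described in Schulte im Walde (2002). The frame sets, the `:für.Akk` suffix notation, and the function names are invented for the example.

```python
def precision_recall_f(induced, gold):
    """Compare two sets of frame types; return (precision, recall, F-score)."""
    induced, gold = set(induced), set(gold)
    tp = len(induced & gold)  # frames found in both sets
    precision = tp / len(induced) if induced else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def strip_pp(frame):
    """Drop the preposition/case detail, e.g. 'np:für.Akk' -> 'np'."""
    return frame.split(":", 1)[0]


# Hypothetical frames for a single verb (not taken from the database).
induced = {"na", "nd", "np:für.Akk"}
gold = {"na", "nd", "np:mit.Dat"}

with_pp = precision_recall_f(induced, gold)
without_pp = precision_recall_f(map(strip_pp, induced), map(strip_pp, gold))
print(with_pp, without_pp)
```

In this toy case the two sets disagree only on the preposition, so the score rises once that detail is ignored, which is the direction of the difference between the two reported F-scores.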
The proper names are evaluated against their appearance in the training corpus: 200 proper names
are randomly chosen from the list of 23,227 German proper name tuples. The proper names are
looked up in the training corpus: in case they are correctly induced from the corpus data, they are
judged correct, otherwise they are false positives. The overall precision of the proper name database
is 65.33%.
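
The sampling procedure can be sketched as below. The function name, the `judge` callback standing in for the manual look-up in the training corpus, the fixed random seed, and the toy tuples are all assumptions for illustration; the sketch does not reproduce the reported figure.

```python
import random


def sample_precision(tuples, judge, sample_size=200, seed=0):
    """Estimate precision from a random sample of candidate tuples.

    judge(t) returns True if tuple t counts as a correctly induced
    proper name; everything else is treated as a false positive.
    """
    if not tuples:
        raise ValueError("no candidate tuples to sample from")
    rng = random.Random(seed)
    sample = rng.sample(tuples, min(sample_size, len(tuples)))
    correct = sum(1 for t in sample if judge(t))
    return correct / len(sample)


# Toy candidates: three genuine names and one spurious pair (invented).
candidates = [("Helmut", "Kohl"), ("New", "York"),
              ("Bad", "Homburg"), ("Der", "Kanzler")]
genuine = {("Helmut", "Kohl"), ("New", "York"), ("Bad", "Homburg")}

print(sample_precision(candidates, lambda t: t in genuine, sample_size=4))
```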
For the main part of the semantic collocation data we do not provide an evaluation yet, and
SENSEVAL does not include German and is therefore not available as an evaluation option. But the
data are ready to be used in lexicographic research and exploitation, so that their usefulness can
be demonstrated in practice.

5 Related Work
This work was inspired by and is therefore closest to the word sketches for British English as described
in (Kilgarriff and Tugwell, 2001b). Kilgarriff and Tugwell define a collocation database on the basis of
26 grammatical relations between two lexical items, as found in the British National Corpus. The
strength of their collocations is estimated by a salience measure combining mutual information and
the logarithm of the co-occurrence count. In addition to presenting the collocations and a measure
of strength, the co-occurrences are linked to corpus positions, to facilitate the recovery of the related
word pair. The word sketches have been used for years and proven valuable by lexicographers in a
dictionary project. Compared to (Kilgarriff and Tugwell, 2001b), the German collocation database is
less extensive with respect to the number of different relationships, and the linking to corpus positions
is not implemented. In contrast, the German grammar specialises in the subcategorisation behaviour
of the verbs, which results in a fine-grained lexical collocation resource of verb frames and selectional
preferences.
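
For comparison with the log-likelihood measure used in this work, the following sketch computes pointwise mutual information for a pair and combines it with the logarithm of the co-occurrence count. Treating the combination as a simple product, the choice of logarithm bases, the function names, and the toy counts are assumptions of this sketch; Kilgarriff and Tugwell (2001b) give the exact definition of their salience measure.

```python
import math


def pmi(f_pair, f_l1, f_l2, n):
    """Pointwise mutual information (base 2) of a pair within one relation.

    f_pair: co-occurrence count of the pair
    f_l1, f_l2: marginal counts of the two items in that relation
    n: total count of all tuples of that relation
    """
    return math.log2((f_pair * n) / (f_l1 * f_l2))


def salience(f_pair, f_l1, f_l2, n):
    """Assumed combination: MI multiplied by log of the pair count."""
    return pmi(f_pair, f_l1, f_l2, n) * math.log(f_pair)


# Two pairs with identical MI but different co-occurrence counts.
frequent = salience(8, 16, 32, 1024)
rare = salience(1, 2, 32, 1024)
print(frequent, rare)
```

Under this assumed product, a pair seen only once receives a salience of zero regardless of its mutual information, which shows how weighting by the count damps the tendency of mutual information to favour rare events.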
Lin (1998b; 1999) uses a dependency parser to extract collocations from corpora. In (Lin, 1998b),
he concentrates on the extraction of habitual collocations, in (Lin, 1999) on the extraction of non-
compositional collocations. In both cases, the same methodology is applied: the strength of the col-
locations is determined by mutual information. Lin (1998b) evaluates the collocation tuple extraction
by comparing all extracted collocations to those in a treebank for a different corpus, but he does not
evaluate the semantic content of the collocations. Lin (1999) compares the non-compositional collo-
cations to an English Idioms Dictionary, which results in precision and recall values of approx. 15%.
He justifies the low evaluation results by showing that manual dictionaries evaluated against each
other also show remarkably low precision and recall. In (Lin, 1998a), he compares thesaurus entries
based on the similarity of word collocations with entries in the manually constructed thesauri
WordNet and Roget and shows a significantly closer similarity to WordNet than to Roget. Lin (1998c)
successfully applies the collocation information to concrete NLP tasks, namely named entity
recognition and coreference resolution in MUC-7.
Evert and Krenn (Krenn and Evert, 2001; Evert and Krenn, 2001) study the extraction of collo-
cations from corpora from a specific point of view. They extract collocation candidates for adjective
pairs, support verb constructions and figurative expressions and compare the application of different
measures of lexical association in order to filter non-compositional collocations. For the evaluation,
they provide an extensive set of the collocation types, manually annotated with the collocation judge-
ment.
Zinsmeister and Heid (2002) perform an extraction of noun-verb collocations by full parsing,
whose results represent the basis for comparing the collocational preferences of compound nouns
with those of the respective base nouns. The insights are used to improve the lexicon of the statistical
parser. Zinsmeister and Heid (2003) present an approach for German collocations with collocation
triples: Five different formation types of adjectives, nouns and verbs are extracted from the most
probable parses of German newspaper sentences, using the same statistical grammar model as under-
lying this work. The collocation candidates are determined automatically and then manually filtered
for lexicographic use. Kermes and Heid (2003) utilise a recursive chunker to annotate German cor-
pus data with complex phrase structures. The chunks specify lemma information, morpho-syntactic
features and coarse semantic properties. Manually defined search routines extract verb-noun and
adjective-verb collocations as well as tuples and triples of idiomatic expressions.
The overview of related work on collocations shows that our approach to German lexical collocations
is not the first one, but unlike previous approaches our database contains more varied collocation
types and pays specific attention to the variety of verb subcategorisation aspects. The database is
in general more restricted than its English counterparts, but more detailed with respect to a
fine-grained lexical resource of verb frames and selectional preferences. Most approaches to
collocation extraction suffer from the difficulty of evaluating the collocation information.

6 Summary
This paper presented a database of collocations for German verbs and nouns. Specific attention is
paid to the variety of verbal subcategorisation aspects, ranging from selectional preferences of
verbs with respect to a particular subcategorisation environment, to nominal properties as given by
their diverse modifiers. As a special case of noun-noun collocations, we presented a list of 23,227
German proper name tuples with 65.33% precision.
All collocation types are combined by a Perl script which can be queried by the lexicographic
user in order to filter relevant co-occurrence information on a specific lexical item. The database is
ready to be used for lexicographic research and exploitation. So far, an evaluation is provided for the
underlying structural verb-frame definitions and the proper name database.

References
Dudenredaktion, editor. DUDEN – Das Stilwörterbuch. Number 2 in ‘Duden in zwölf Bänden’.
Dudenverlag, Mannheim, 8th edition, 2001.

Ted Dunning. Accurate Methods for the Statistics of Surprise and Coincidence. Computational
Linguistics, 19(1):61–74, 1993.
Stefan Evert and Brigitte Krenn. Methods for the Qualitative Evaluation of Lexical Association
Measures. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics,
Toulouse, France, 2001.
Zellig Harris. Distributional Structure. In Jerold J. Katz, editor, The Philosophy of Linguistics, Oxford
Readings in Philosophy, pages 26–47. Oxford University Press, 1968.
Hannah Kermes and Ulrich Heid. Using Chunked Corpora for the Acquisition of Collocations and
Idiomatic Expressions. In Proceedings of the 7th Conference on Computational Lexicography and
Text Research, Budapest, Hungary, 2003. This volume.
Adam Kilgarriff and David Tugwell. WASP-Bench: an MT Lexicographers’ Workstation Support-
ing State-of-the-art Lexical Disambiguation. In Proceedings of the MT Summit VII, Santiago de
Compostela, Spain, 2001a.
Adam Kilgarriff and David Tugwell. WORD SKETCH: Extraction and Display of Significant Collo-
cations for Lexicography. In Proceedings of the ACL Workshop on Collocations, Toulouse, France,
2001b.

Brigitte Krenn and Stefan Evert. Can we do better than Frequency? A Case Study on Extracting
PP-Verb Collocations. In Proceedings of the ACL Workshop on Collocations, Toulouse, France,
2001.
Dekang Lin. Automatic Retrieval and Clustering of Similar Words. In Proceedings of the 17th
International Conference on Computational Linguistics, Montreal, Canada, 1998a.
Dekang Lin. Extracting Collocations from Text Corpora. In Proceedings of the First Workshop on
Computational Terminology, Montreal, Canada, 1998b.
Dekang Lin. Using Collocation Statistics in Information Extraction. In Proceedings of the 7th Mes-
sage Understanding Conference, 1998c.

Dekang Lin. Automatic Identification of Non-compositional Phrases. In Proceedings of the 37th
Annual Meeting of the Association for Computational Linguistics, Maryland, MD, 1999.
Helmut Schmid. Lopar: Design and Implementation. Arbeitspapiere des Sonderforschungsbere-
ichs 340 Linguistic Theory and the Foundations of Computational Linguistics 149, Institut für
Maschinelle Sprachverarbeitung, Universität Stuttgart, 2000.
Sabine Schulte im Walde. Evaluating Verb Subcategorisation Frames learned by a German Statis-
tical Grammar against Manual Definitions in the Duden Dictionary. In Proceedings of the 10th
EURALEX International Congress, pages 187–197, Copenhagen, Denmark, 2002.

Sabine Schulte im Walde, Helmut Schmid, Mats Rooth, Stefan Riezler, and Detlef Prescher. Statistical
Grammar Models and Lexicon Acquisition. In Christian Rohrer, Antje Rossdeutscher, and Hans
Kamp, editors, Linguistic Form and its Computation. CSLI Publications, Stanford, CA, 2001.
Heike Zinsmeister and Ulrich Heid. Collocations of Complex Words: Implications for the Acqui-
sition with a Stochastic Grammar. In International Workshop on ‘Computational Approaches to
Collocations’, Vienna, Austria, 2002.
Heike Zinsmeister and Ulrich Heid. Significant Triples: Adjective+Noun+Verb Combinations. In
Proceedings of the 7th Conference on Computational Lexicography and Text Research, Budapest,
Hungary, 2003. This volume.

No ratings yet
Lexical Semantics: Fall - Winter 2009 - 2010
30 pages
Collocation, Anchoring, and The Mental Lexicon - An Ontogenic Perspective
No ratings yet
Collocation, Anchoring, and The Mental Lexicon - An Ontogenic Perspective
32 pages
Shin Nation Elt
No ratings yet
Shin Nation Elt
15 pages
Module 4 The Demands of Society From The Teacher As A Person
100% (2)
Module 4 The Demands of Society From The Teacher As A Person
9 pages
The Body Electronics Experience With Illia
No ratings yet
The Body Electronics Experience With Illia
92 pages
HTIC 2023 Abstract Miroslaw Czak Dynamic and Static Posture Infl Music Entrain
No ratings yet
HTIC 2023 Abstract Miroslaw Czak Dynamic and Static Posture Infl Music Entrain
1 page
Tips For Being An Effective Advocate
No ratings yet
Tips For Being An Effective Advocate
3 pages
Final Compiled Module - Bimbingan BI
No ratings yet
Final Compiled Module - Bimbingan BI
129 pages
Understanding Vague Language
No ratings yet
Understanding Vague Language
4 pages
Statement On Scientific Temper
No ratings yet
Statement On Scientific Temper
5 pages
Gestalt Therapy Explained - History, Definition and Examples
No ratings yet
Gestalt Therapy Explained - History, Definition and Examples
1 page
Field Thoery
No ratings yet
Field Thoery
44 pages
Nietzsche and Schopenhauer On The Self A
100% (1)
Nietzsche and Schopenhauer On The Self A
35 pages
DLL 16
No ratings yet
DLL 16
3 pages
English For Mechanical Engineering Student's Book 3: 1. Overall Objectives
No ratings yet
English For Mechanical Engineering Student's Book 3: 1. Overall Objectives
13 pages
Ics 2308 Schedule
No ratings yet
Ics 2308 Schedule
3 pages
Encyclopedia Gamification Education2023
No ratings yet
Encyclopedia Gamification Education2023
22 pages
RPL - Full Lesson Plan - Maymight - S4
No ratings yet
RPL - Full Lesson Plan - Maymight - S4
5 pages
Scope of Industrial Psychology
No ratings yet
Scope of Industrial Psychology
2 pages
What Are Some Examples of Metacognition
No ratings yet
What Are Some Examples of Metacognition
6 pages
Effective Communication Strategies
No ratings yet
Effective Communication Strategies
69 pages
Understanding Constructed Response in Math
No ratings yet
Understanding Constructed Response in Math
21 pages
Reading Az White Paper PDF
No ratings yet
Reading Az White Paper PDF
0 pages
Managing Groups and Teams Michael Grinder
100% (1)
Managing Groups and Teams Michael Grinder
36 pages
Håndbok I Grammatik Og Språkbruk - Partial Amateur Translation
No ratings yet
Håndbok I Grammatik Og Språkbruk - Partial Amateur Translation
13 pages
2019 Me 0174
No ratings yet
2019 Me 0174
52 pages
Capstoneproposal Emorris
No ratings yet
Capstoneproposal Emorris
26 pages
Business Communication Final Exam Questions (BC)
No ratings yet
Business Communication Final Exam Questions (BC)
5 pages
SCT - QB - Anwers - p1
No ratings yet
SCT - QB - Anwers - p1
53 pages
Vocabulary French
100% (1)
Vocabulary French
48 pages
Resume - Telschow
No ratings yet
Resume - Telschow
1 page
Career Myths for Young Adults
No ratings yet
Career Myths for Young Adults
7 pages
CTHEORY THEOR1 Virilio
100% (1)
CTHEORY THEOR1 Virilio
9 pages


$l_1$: $O_{11} = f(l_1, r, l_2)$, $O_{12} = f(l_1, r, \neg l_2)$

Noun LLH Noun LLH

Proper Name LLH Proper Name LLH

Dekang Lin. Automatic Identification of Non-compositional Phrases. In Proceedings of the 37th
