WRITTEN ACADEMIC DISCOURSE IN CORPUS LINGUISTICS 1
WRITTEN ACADEMIC DISCOURSE IN CORPUS LINGUISTICS
MEHWISH PARVEEN (181581)
TEHREEM TAHIR (181585)
AIR UNIVERSITY, Islamabad
Corpus linguistics is the study of language through large collections of machine-readable
text. As a methodology, it allows the researcher to identify the frequencies of words in a text. It
also centers on the interest that lies in the description and analysis of the English language. The
ultimate aim is to learn about the language and to see how it functions in a specific text.
Collocations in a text carry different meanings depending on the context in which they are used.
These collocations affect the meaning of the expressions and the lexico-grammatical structures of
the text. The meanings of words and collocations change over time, and the same collocations and
expressions are used in different contexts within the same text. A corpus helps the observer to
highlight these frequencies and to understand the use of expressions and collocations
systematically (Gledhill, 2000, pp. 130-131). English is taught as English for Specific Purposes in
various fields; consequently, different fields have their own registers and specific expressions.
Words and collocations used in a text establish certain semantic relationships within it, and over
time different meanings have been given to the same words (Nelson, 2006, p. 217). Nelson used both
spoken and written corpora to understand the semantic associations and relationships among words
in a particular course of Business English; the corpus used for the analysis was 56% written and
44% spoken. The study found that words such as "supervisor", "manager", and "head" were all given
similar meanings connected to status. These words also occurred frequently in the corpus (Nelson,
2006, p. 223). Moreover, the concept of concordance can be seen in the results, where the same word
is used in different contexts in the text.
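The frequency counts and concordance lines described above can be illustrated with a minimal
Python sketch. The sample sentences, the function names, and the three-word context window are
assumptions made for illustration only; they are not drawn from Nelson's (2006) corpus or from any
particular concordancing tool.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)


def concordance(tokens, node, window=3):
    """Return keyword-in-context triples: up to `window` tokens on each side of every hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


# Invented sample text, for illustration only.
sample = (
    "The manager met the head of sales. "
    "The manager asked the supervisor to head the new team."
)
tokens = tokenize(sample)
freqs = word_frequencies(tokens)
head_lines = concordance(tokens, "head")

print(freqs.most_common(3))
for left, node, right in head_lines:
    print(f"{left:>25} [{node}] {right}")
```

In this invented sample the two concordance lines for "head" show the word once as a noun and once
as a verb, which is the kind of contrast in context that a concordance is meant to expose.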
Semantic prosody is a term used to refer to the distinctive relationship between the
meanings of the same word. Sometimes it conveys the implied meaning of the word, and sometimes it
refers to its explicit meaning in different contexts. Through a corpus of lexical choices from a
text, it also explains how a word that once had a positive sense has acquired a negative one over
time. The corpus methodology helps a researcher to find the co-occurrences and the frequencies of
the words in a text, since there may be differences between the language use or lexical choices of
native and non-native speakers of the language. In addition, some words are used to convey a hidden
meaning, as in irony and symbolism, which also changes the meaning of the words in the specific
text (Hunston, 2007).
Collocations in academic writing have been used widely over the decades in educational
settings because they help non-native learners to learn more efficiently. To carry out such an
analysis, the researchers collected data from a corpus of abstracts on the web. The results show
that learners mostly use collocations once they have identified the context they are writing about,
because without the context most learners are unaware of how collocations are used (Wu, Chang,
Mitamura, & Chang, 2010).
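One common way to find candidate collocations in a corpus is to score adjacent word pairs by
pointwise mutual information (PMI). The following sketch is an illustration under stated
assumptions and is not the method of Wu et al. (2010): the sample tokens, the function name
bigram_pmi, and the min_count threshold are all hypothetical.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    PMI(w1, w2) = log2( p(w1, w2) / (p(w1) * p(w2)) ).
    Pairs seen fewer than `min_count` times are skipped, because PMI is
    unreliable for very rare pairs.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_unigrams
        p_w2 = unigrams[w2] / n_unigrams
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))

    # Highest-scoring pairs first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented sample tokens, for illustration only.
sample_tokens = (
    "we make a decision and they make a decision but we take a break".split()
)
ranked = bigram_pmi(sample_tokens)
for pair, score in ranked:
    print(pair, round(score, 2))
```

A positive score means that the two words occur together more often than their individual
frequencies would predict, which is the basic intuition behind treating a pair as a collocation.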
Another use of corpus linguistics is illustrated in Cacoullos and Walker’s (2009) article
“The Present of the English Future: Grammatical Variation and Collocations in Discourse”. The
study uses the variationist method to explain the use of the future tense in English with
regard to its different grammatical forms. The researchers argue that the usage patterns of these
tenses and verb forms show that “the choice of form is not determined by invariant semantic
readings such as proximity, certainty, willingness, or intention”. On the contrary, particular
instances of each general construction occupy lexical, syntactic, and pragmatic niches. The
research further states that, even though the supposed differences in meaning are largely
neutralized in discourse, particular constructions of different degrees of lexical specificity
reflect their grammaticalization paths. They also retain distinct meanings and firm patterns of
distribution from formerly meaningful associations. The study concludes that collocations
contribute to shaping grammatical variation.
Jiang and Conrath’s (1997) article “Semantic Similarity Based on Corpus Statistics and
Lexical Taxonomy” presents a distinct approach to measuring semantic similarity and difference
between words and concepts. The researchers combine a lexical taxonomy structure with corpus
statistical information in order to quantify more reliably the semantic distance between nodes in
the semantic space constructed by the taxonomy. The resulting measure builds on the edge counting
scheme and is enhanced by the node-based approach of information content calculation. When tested
on a data set of word-pair similarity ratings, the proposed approach outperformed other
computational models, giving the highest correlation with a benchmark based on human similarity
judgements. However, this correlation remains below the upper bound observed when human subjects
replicate the same task.
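The node-based part of this measure is commonly cited in the simplified form
dist(c1, c2) = IC(c1) + IC(c2) - 2 × IC(lcs(c1, c2)), where IC(c) = -log p(c) is the information
content of a concept and lcs is the lowest concept in the taxonomy that subsumes both. The sketch
below applies that simplified form to a toy taxonomy. The taxonomy, the counts, and all function
names are invented for illustration, and any additional edge-weighting factors are omitted.

```python
import math

# Hypothetical toy taxonomy (child -> parent) and invented corpus counts.
PARENT = {
    "entity": None,
    "vehicle": "entity",
    "animal": "entity",
    "car": "vehicle",
    "bicycle": "vehicle",
    "dog": "animal",
}
WORD_COUNTS = {
    "entity": 0, "vehicle": 10, "animal": 10,
    "car": 40, "bicycle": 20, "dog": 20,
}


def ancestors(concept):
    """Return the concept followed by its ancestors, up to the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def concept_counts():
    """Add each concept's own count to itself and to every ancestor."""
    totals = {concept: 0 for concept in PARENT}
    for concept, count in WORD_COUNTS.items():
        for node in ancestors(concept):
            totals[node] += count
    return totals


TOTALS = concept_counts()
ROOT_TOTAL = TOTALS["entity"]


def information_content(concept):
    """IC(c) = -log p(c), with p(c) estimated from the propagated counts."""
    return -math.log(TOTALS[concept] / ROOT_TOTAL)


def lowest_common_subsumer(c1, c2):
    """Return the deepest concept that is an ancestor of both c1 and c2."""
    seen = set(ancestors(c1))
    for node in ancestors(c2):
        if node in seen:
            return node
    raise ValueError("concepts do not share a root")


def jc_distance(c1, c2):
    """Simplified Jiang-Conrath distance: IC(c1) + IC(c2) - 2 * IC(lcs)."""
    lcs = lowest_common_subsumer(c1, c2)
    return (information_content(c1) + information_content(c2)
            - 2 * information_content(lcs))


print(jc_distance("car", "bicycle"))
print(jc_distance("car", "dog"))
```

With these invented counts, "car" and "bicycle" share the specific parent "vehicle" and therefore
come out closer to each other than "car" and "dog", whose only shared ancestor is the root.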
Walker (2011), in his article “A Corpus-Based Study of the Linguistic Features and Processes
Which Influence the Way Collocations Are Formed: Some Implications for the Learning of
Collocations”, examined the collocational behavior of various groups of semantically
related verbs and nouns from Business English. According to the results of this study, the
collocational behavior of lexical items can be explained by examining certain linguistic
features and the processes which influence the formation of collocations. These include the
semantics of the individual items, the use of metaphor, semantic prosody, and the tendency of the
selected items to become part of larger phraseological units. The research demonstrates that many
collocations can be explained by considering the linguistic features and processes that influence
their formation, and claims that learning becomes more effective when the learner looks for
explanations of collocations.
References
Cacoullos, R. T., & Walker, J. A. (2009). The present of the English future: Grammatical variation
and collocations in discourse. Language, 85(2), 321-354.
Gledhill, C. (2000). The discourse function of collocation in research article introductions. English
for Specific Purposes, 19(2), 115-135.
Hunston, S. (2007). Semantic prosody revisited. International Journal of Corpus Linguistics, 12(2),
249-268.
Jiang, J. J., & Conrath, D. W. (1997). Semantic similarity based on corpus statistics and lexical
taxonomy. arXiv preprint cmp-lg/9709008.
Nelson, M. (2006). Semantic associations in Business English: A corpus-based analysis. English
for Specific Purposes, 25(2), 217-234.
Walker, C. P. (2011). A corpus-based study of the linguistic features and processes which
influence the way collocations are formed: Some implications for the learning of
collocations. TESOL Quarterly, 45(2), 291-312.
Wu, J. C., Chang, Y. C., Mitamura, T., & Chang, J. S. (2010, July). Automatic collocation
suggestion in academic writing. In Proceedings of the ACL 2010 Conference Short Papers (pp.
115-119). Association for Computational Linguistics.