CORPUS LINGUISTICS RESEARCH

ColloGram : A Collocation Family Analysis Program

Dongkwang Shin, Yuah Chon, Shinwoong Lee, Myongsu Park

CORPUS LINGUISTICS RESEARCH :: Vol.2 No. pp.36-36

Abstract

Existing collocation programs have often based their analyses on repeated ‘N-gram’ patterns rather than on a specific collocation list. In contrast, ColloGram bases its analysis on a collocation list drawn from the Corpus of Contemporary American English (COCA) (1990-2015), a 500-million-word corpus. ColloGram was developed using the portion of the corpus compiled from 1990 to 2009 (a 400-million-word corpus), which became publicly available in 2014. The name ColloGram is a blend of Collocation and N-gram or Program. Its functions were modeled on those of RANGE, the vocabulary analysis program by Heatley and Nation (2002).
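The contrast drawn in this abstract, between counting repeated N-grams and matching against a fixed collocation list, can be illustrated with a minimal Python sketch. It is not ColloGram's implementation: the function names, the toy sentence, and the two-entry list are hypothetical and are not taken from COCA.

```python
from collections import Counter


def ngram_counts(tokens, n=2):
    """Count every contiguous n-token sequence (the N-gram repetition approach)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def list_based_counts(tokens, collocations):
    """Count only the sequences that appear in a predefined collocation list."""
    hits = Counter()
    for n in {len(c) for c in collocations}:
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            if gram in collocations:
                hits[gram] += 1
    return hits


# Hypothetical toy data, for illustration only.
tokens = "she made a decision and he made a decision to take a break".split()
toy_list = {("made", "a", "decision"), ("take", "a", "break")}

# N-gram approach: keep whatever bigram happens to repeat in this text.
repeated_bigrams = {g: c for g, c in ngram_counts(tokens).items() if c > 1}

# List-based approach: report listed collocations even if they occur once.
listed = list_based_counts(tokens, toy_list)

print(repeated_bigrams)
print(dict(listed))
```

In this toy case the N-gram approach surfaces only fragments that repeat, while the list-based approach also reports a listed collocation that occurs a single time.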

Corpus-based Study of -free Compounds

Hongwei Zhan, Sihong Huang

CORPUS LINGUISTICS RESEARCH :: Vol.2 No. pp.37-37

Abstract

Using data from the 400-million-word Corpus of Historical American English (COHA) and the 450-million-word Corpus of Contemporary American English (COCA), this study investigates both diachronically and synchronically the use of the -free compound and its counterparts, the free of/from phrases. A close examination of the frequency, distribution, and structural and semantic functions of the constructions yields the following key findings. First, frequency-wise, free of has exhibited steady, slow growth, and free from has declined dramatically; in contrast, -free has increased enormously, although its archaic use in the senses of ‘free to’ and ‘free with’ disappeared by the 1940s. Second, the -free compound boasts a high potential productivity index. Third, while both the compound and the phrasal constructions may be used as predicative and postnominal adjectives, objective complements, and adverbials, only the -free compound is used as a prenominal (attributive) adjective. Fourth, whereas the -free compound is used almost exclusively nonreferentially, the free of/from constructions are used significantly more referentially. Fifth, even in the contexts where the constructions may all be used, often only one is allowed or preferred due to certain internal structural and semantic factors. Finally, condensation of information along with changes in language and lifestyles appears to have driven the increased use of the compound over its phrasal counterparts, although the phrase free of charge has resisted the change: its high frequency has likely blocked its *charge-free compound counterpart.

A Corpus-Based Analysis of Kasum ‘Chest/Breast’ in Korean

Haeyeon Kim

CORPUS LINGUISTICS RESEARCH :: Vol.2 No. pp.38-39

Abstract

In recent years, much research in cognitive linguistics has examined how body-part terms express metonymic and metaphorical notions in terms of embodiment. Among these terms, heart-centered expressions have been widely explored in English and other languages to show the significance of our bodily experiences in expressing our emotions as well as interpersonal or social relationships. However, little research has been carried out on the metaphoric expressions of chest or breast (cf. Kövecses 2002, Lakoff & Johnson 1980). The purpose of this research is to explore the literal and metaphoric meanings of kasum ‘chest/breast’ in a Korean written corpus from the perspectives of corpus linguistics and conceptual metaphor (Perez 2008, Berendt, et al. n.d.). This research analyzes 2,373 tokens of kasum, 114 of ces-kasum ‘milk-breast’, and 355 of simcang ‘heart’ from the Sejong Project Corpus. Cognitive linguistic research has shown that in many languages the head is usually related to mind/reason and the heart to emotion/feelings. However, examination of the Korean data shows that simcang is used literally in most cases as a technical term in medical contexts, and that ces-kasum is used literally in most cases to refer to a woman’s breast. Unlike these two body-part terms, kasum is used as a cover term to express not only the upper body part literally but also, metaphorically, an entity that serves as a locus and perceiver of emotion/feelings. Examination of the corpus data yields the following major findings: (i) 819 tokens (34.5%) refer to the body part chest literally and 1,512 tokens (63.7%) are used metaphorically (Deignan 2005); (ii) 921 tokens (38.8%) show physiological responses in expressing emotion/feelings, using such terms as “the chest is aching, pounding, choking/being blocked, trembling, sinking, etc.”; and (iii) 412 tokens (17.4%) show kasum as a locus for emotion.
As these findings show, Korean folk knowledge, contrary to scientific knowledge about the role of the brain, views kasum as a locus or perceiver of emotional feelings in response to physical/physiological stimuli. The findings show that this folk knowledge forms the basis for the conceptual metaphors for kasum: (i) a locus for feelings, (ii) a container of emotions, (iii) an entity/a material, (iv) a storage/hiding place, etc. Finally, this research shows that in Korean, unlike in English and some other languages, kasum plays an important role in expressing emotional feelings, displaying conceptual metaphorical meanings derived from folk knowledge.
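The percentages reported in this abstract can be checked against the stated total of 2,373 kasum tokens with a short Python sketch. The dictionary keys are shorthand labels introduced here for illustration and are not the author's category names.

```python
# Token counts as reported in the abstract; 2,373 is the stated total for kasum.
total = 2373
counts = {
    "literal": 819,
    "metaphorical": 1512,
    "physiological response": 921,
    "locus for emotion": 412,
}

# Share of each category, as a percentage rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

# Tokens not covered by the literal and metaphorical figures.
uncovered = total - counts["literal"] - counts["metaphorical"]

print(shares)
print(uncovered)
```

The computed shares match the reported 34.5%, 63.7%, 38.8%, and 17.4%. The literal and metaphorical counts sum to 2,331, so 42 tokens are not accounted for by either figure in the abstract.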

A Corpus-based Analysis of ‘Safety’ and ‘Security’ in Maritime English

Wenyu Lu

CORPUS LINGUISTICS RESEARCH :: Vol.2 No. pp.40-40

Abstract

This study examines the collocates of two near-synonyms, safety and security, drawing data from two general corpora (BNC Written and COCA Written) and one self-built specialized corpus (the IMO Corpus). Collocates are extracted with WordSmith 6.0, and a network analysis is then carried out on selected collocates. First, diachronic analysis shows stable usage of safety but two sharp rises in the use of security, in 1920-1930 and in 2000. Second, some generalizations made in the industrial and information fields turn out not to apply to maritime-related words in a specialized corpus. Specifically, non-maritime-related general collocates tend to show very clear preferences between safety and security in the IMO Corpus, whereas maritime-related words behave in varied ways when collocating with the two words. Third, network analysis with NetMiner 4.0 is applied to some shared collocates of interest, such as maritime and ship, providing a brokerage role analysis of the differences between safety and security. Finally, a semantic domain network analysis is presented to explain how the near-synonyms differ and overlap in terms of their common and exclusive semantic domains.

Corpus-based Analysis on Gendered Items in Hip-hop and Country Song Lyrics

Jihye Shin

CORPUS LINGUISTICS RESEARCH :: Vol.2 No. pp.41-41

Abstract

Song lyrics reflect society linguistically and ideologically and, in turn, have the social and linguistic power to influence it. Lyrics can therefore be used to gain insights into social beliefs, namely gender representations and stereotypes. This study investigates the representations of males and females in lyrics using a corpus-based approach. Focusing on the lemmas GIRL, WOMAN, BOY, and MAN, it examines what these gender-marked items refer to and how males and females are portrayed in hip-hop and country music. The results show that the lemmas refer to a number of different things that are not limited to their literal meanings; some of these referents may not even be easy to find in dictionary definitions. Although the lemmas frequently refer to adult males and females in the lyrics of both types of music, some uses were unique to a particular genre. Additionally, stereotypical representations of gender seem to prevail in song lyrics: although some differences can be found across genres, females are often sexually objectified and associated with beauty and emotional intemperance, whereas males are portrayed as active, aggressive, and confident. This suggests that, as a type of text that reflects social beliefs, lyrics can be useful for raising awareness of aspects of society such as prevalent gender stereotypes. Lyrics may therefore be used as authentic material containing rich sociolinguistic information on gender representations for linguistic and cultural lessons.

A Corpus-driven Research on the Relationship between Subjectival Position and Syntactic Complexity in English Sentences

Yang Yu

CORPUS LINGUISTICS RESEARCH :: Vol.2 No. pp.42-42

Abstract

This article examines the relationship between sentential subjectival position and sentential syntactic complexity, using the written section of the ICE-GB as the data source. The information the subject bears is generally regarded as given, with new information provided by other functional classes after it. This means that the part of the sentence after the verb is generally more elaborate, and hence longer, than the part before it, manifesting the principle of end weight. Since English is basically an SVO language, this suggests that the position of the subject in the sentence has a certain relationship with sentence length, and hence with sentence complexity. The results show that the relationship between sentential syntactic complexity and sentential structural variation in the number of sentences with different structures is bell-shaped, which can be described with Nemcová and Serdelová’s synonyms and word length model. The sentential subjects appear in 46 different positions in the sentence, but the predominant position is sentence-initial. Generally, the sentential subjectival position is an indicator of sentential syntactic complexity: the later the sentential subjectival position, the more syntactically complex the sentence. This phenomenon, apart from rhetorical and stylistic reasons, is due to the principle of end weight and communicative dynamism in the sentence.

A Multifactorial Analysis of Can and May in Three East Asian EFL Learners' Writings

Yong-hun Lee, Tae-Jin Yoon, Yong-cheol Lee, Yeonkyung Park

CORPUS LINGUISTICS RESEARCH :: Vol.2 No. pp.43-43

Abstract

This paper investigates the English modal auxiliary verbs can and may in the writings of three groups of East Asian EFL learners (Chinese, Japanese, and Korean). Two different types of corpora were used: the ICE-USA corpus, which contains writings by English as a Native Language (ENL) speakers, and the TOEFL11 corpus, which contains writings by English as a Foreign Language (EFL) learners. Of the eleven sections of the TOEFL11 corpus, three components were selected for the analysis. The study is theoretically based on Bates and MacWhinney’s Competition Model (1982, 1989), and four language models were statistically constructed as follows. From the four corpora, all sentences with can and may were extracted, and twenty linguistic factors were manually encoded. Statistical models were then constructed from the encoded corpus data, and the similarities and differences among the models were analyzed. Two types of statistical analysis were used: logistic regression and Behavioural Profiles analysis. The analysis showed that (i) six linguistic factors were involved in the choice between can and may, (ii) eight linguistic factors interacted with L1, which made the three East Asian EFL learners’ writings non-native, and (iii) the uses of can and may by the three East Asian EFL learners differed from those of the ENL speakers.

A Corpus Stylistic Analysis of the Characteristics of ESL Film Scripts Written by Chinese Students in an English-medium University in Mainland China

Yan Zhao

CORPUS LINGUISTICS RESEARCH :: Vol.2 No. pp.44-44

Abstract

Recent years have witnessed increasing attention to L2 creative writing, not only in language classrooms but also in various disciplinary settings; in this case, an English scriptwriting module for Chinese university students of Communication Studies. This text-based L2 writing study investigates the stylistic characteristics emerging from 54 film scripts written in English and submitted by the students upon completion of the module. The student scripts are compared with 16 selected professional English film scripts representing different genres. The particular pedagogical approaches for this scriptwriting module are briefly explained. Nevertheless, the study is corpus-driven. It utilises Keyword analysis (Scott, 2015) and Key Semantic Domain analysis (Rayson, 2008) to locate noticeable ideological and discoursal features of the student scripts, particularly concerning attempts at dramatic tension, efforts at visual detail, and deviations from the conventions of scriptwriting in English. Two focal students were selected in order to trace changes in individual writers’ textual features from the workshops through to the final film script. This is facilitated by Cluster analysis, which reveals ‘local textual functions’ (Mahlberg, 2007). In addition, the focal students’ writing and their reflective comments were interpreted qualitatively. The results hold pedagogical implications, particularly regarding the modelling of certain stylistic aspects of English scriptwriting and the design of more targeted future ESL/EFL scriptwriting courses in China.

The Corpus Analysis of the Use of Connectors in English Writing of Korean Undergraduates

Miryeong Ryu, Mae-Ran Park

CORPUS LINGUISTICS RESEARCH :: Vol.2 No. pp.45-45

Abstract

The purpose of this presentation is to examine the use of connectors in the English writing of Korean undergraduate students. Compared with intermediate or advanced English writers in EFL contexts, little is known about what beginning EFL writers tend to do in their writing. In this study, the authors aim to find out which types of connectors (i.e., additive, causal/resultative, and sequential) are most widely used by beginning Korean writers of English and whether their use is influenced by the writers’ L1. The participants will be 40 sophomores taking a pre-intermediate English writing course during the spring semester of 2016. The instruments will be a questionnaire survey and pre- and post-tests on the students’ use of connectives. Over the 15-week semester, the participants will write five different kinds of paragraphs (i.e., definition, process, description, opinion, and narration), each approximately 120 words in length. Because the beginning participants will produce paragraphs rather than essays, the use of connectors is expected to be somewhat limited to certain kinds. The students’ work will be analyzed for frequency of connective use using AntConc. The results will shed light on pedagogical implications and suggestions for how to teach beginning writers the use of connectors more effectively.

Lexical Bundles in Spoken and Written Russian

Daehyeon Nam, Sungmin Lee

CORPUS LINGUISTICS RESEARCH :: Vol.2 No. pp.46-46

Abstract

The current study explores the characteristics of frequently used multi-word expressions (i.e., lexical bundles) in spoken and written Russian. Lexical bundles are retrieved from a one-million-word sample of the Russian National Corpus (RNC). The lexical bundles in the spoken and written sub-corpora of the RNC are analyzed quantitatively in terms of three discourse functions: referential expressions, stance bundles, and discourse organizers. The analysis confirms that the spoken and written Russian corpora exhibit significantly different lexical bundle distribution patterns: there are more referential expressions in written Russian, while there are more stance bundles in spoken Russian. The study also calls for more in-depth future investigation to develop language-specific discourse functions.
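As a rough illustration of how frequently used multi-word sequences might be retrieved from a corpus, the Python sketch below keeps n-word sequences that meet a frequency threshold and occur in more than one text. The abstract does not state the authors' retrieval settings, so the bundle length, both thresholds, the function name, and the three toy sentences are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def lexical_bundles(texts, n=4, min_freq=3, min_texts=2):
    """Return n-word sequences occurring at least min_freq times in at least min_texts texts.

    All default values are illustrative assumptions, not the study's settings.
    """
    freq = Counter()
    spread = defaultdict(set)
    for text_id, text in enumerate(texts):
        tokens = text.lower().split()
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            freq[gram] += 1
            spread[gram].add(text_id)
    return {
        gram: count
        for gram, count in freq.items()
        if count >= min_freq and len(spread[gram]) >= min_texts
    }


# Hypothetical toy texts; not drawn from the Russian National Corpus.
toy_texts = [
    "в то же время он ушел",
    "но в то же время она осталась",
    "в то же время мы ждали",
]

bundles = lexical_bundles(toy_texts)
print(bundles)
```

With these toy texts, the only four-word sequence that passes both thresholds is "в то же время" ("at the same time"). Classifying retrieved bundles by discourse function is a separate, manual analytic step that this sketch does not attempt.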
