CORPUS LINGUISTICS RESEARCH

An Analysis of Narrative Characteristics through Corpus Analysis: A Comparison of Byeong-mo Gu's ‘A Spoonful of Time’ and ‘To the Ivory Gate’

한송,류병래

CORPUS LINGUISTICS RESEARCH :: Vol.8 No.2 pp.1-14

Abstract

This paper analyzes the narrative characteristics of a writer by comparing two literary works by the same author. Two works by Byeong-mo Gu were chosen for this purpose: ‘A Spoonful of Time’ and ‘To the Ivory Gate’. A corpus was compiled from these two texts, and all words were POS-tagged. AntConc was then used to analyze the corpus data. Three types of linguistic features were examined: high-frequency words, n-grams, and pronouns. The analysis revealed that (i) the high-frequency words reflected the subject matter of each work, (ii) n-gram analysis foregrounded the atmosphere of the work intended by the author, and (iii) pronouns were rarely used to refer to characters. Although corpus analysis proved valid for some aspects of literary analysis, the paper recommends that a database of literary works be constructed by period and that further studies be conducted on the basis of that database.
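
The n-gram step named in the abstract can be illustrated with a minimal sketch. This is not AntConc's implementation: the function name `ngram_counts` and the toy token list are hypothetical, and real Korean text would first need the POS tagging the abstract mentions rather than whitespace splitting.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count contiguous n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Toy data, invented for illustration only (not taken from the novels).
tokens = "the time of the spoon and the time of the gate".split()
bigrams = ngram_counts(tokens, 2)

# Show the most frequent bigrams with their counts.
for gram, count in bigrams.most_common(3):
    print(" ".join(gram), count)
```

Sorting n-grams by frequency in this way is what lets recurring phrases, and through them the tone of a text, stand out.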

A Comparative Analysis of Syntactic Complexity between Scholars and AI-based Machine Translation Systems

Xinyue Wang,Se-Eun Jhang

CORPUS LINGUISTICS RESEARCH :: Vol.8 No.2 pp.15-37

Abstract

This study investigates the syntactic complexity in English prose between the writings of Chinese scholars and the corresponding translations generated by AI-based machine translation systems. A corpus of 100 English abstracts written by Chinese scholars and 300 English abstracts translated by ChatGPT 4.0, Google Bard and Microsoft Bing was constructed. These texts were analysed using 14 measures of syntactic complexity as defined by the L2 Syntactic Complexity Analyzer (Lu, 2010). The analysis revealed that when comparing the original Chinese-English texts with the outputs of machine translation systems, significant differences were found in 13 of the 14 syntactic measures. Conversely, when comparing the translations from ChatGPT 4.0, Bard and Bing, significant differences were found in 10 of the 14 measures. This research advances the understanding of machine translation systems and has relevant implications for pedagogy and assessment in the field.

Extraction of Neologisms and Analysis of Their Usage Based on SNS Data

이도영

CORPUS LINGUISTICS RESEARCH :: Vol.8 No.2 pp.39-55

Abstract

Recently, a vast number of newly coined words have been generated in Korean, and their frequency of use is gradually increasing in formal language media such as the press, broadcasting, and books as well as in everyday spoken language. As time spent online increases, language for communication is created in various forms, or its meaning changes, to convey new information or values to members of society. In this study, an SNS corpus reflecting this rapidly changing language use was analyzed. New-word candidates were selected by constructing a pipeline for extracting noun-type new words from the SNS corpus, and their characteristics and usage were then analyzed. To extract meaningful tokens, the natural language processing pipeline combined rule-based learning using Mecab, unsupervised learning using Soynlp, and user dictionary addition using a correct morpheme analyzer. After the candidate selection step, 255 new words were collected. The proportion of sentences in the SNS data that included the new-word candidates was 4.799%. Among them, the proportion of sentences in which words belonging to the top 10 appeared was 12.345%. When the top 30 new words were classified by word formation method, the method with the highest share was compound word-synthetic abbreviations (33.3%). The type/token ratio of the sentences including new words was 0.324, and that of the SNS data was 0.254. Since the type/token ratio of the SNS data is lower, its readability can be said to be higher than that of the sentences containing new words.
When the collocations and usage of new words such as the initial-consonant word 'ㄹㅇ', the borrowed word '-특', and the meaning-extended word '코인' were examined, various forms and syntactic uses were found, and many collocations reflected the social climate at the time of data collection. This suggests that a corpus in which initial-consonant forms, borrowings, meaning extensions, and special characters appear among newly coined words is captured only incompletely by a dictionary or word list alone, and that a natural language processing dataset containing more diverse social meanings can be constructed by using usage data.
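
The type/token ratio reported in the abstract (0.324 versus 0.254) is the number of distinct word forms divided by the total number of tokens. The following is a minimal sketch of that calculation. The function name `type_token_ratio` and the sample tokens are hypothetical, and the study's own tokenization (via Mecab and Soynlp) is not reproduced here.

```python
def type_token_ratio(tokens):
    """Return distinct tokens (types) divided by total tokens."""
    if not tokens:
        raise ValueError("type/token ratio is undefined for an empty token list")
    return len(set(tokens)) / len(tokens)


# Toy data, invented for illustration: 3 types over 4 tokens.
sample = ["coin", "price", "coin", "rise"]
print(type_token_ratio(sample))
```

A lower ratio means the same word forms recur more often. The ratio is sensitive to corpus size, so comparisons are most meaningful between samples of similar length.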

Bylaws of the Korean Association for Corpus Linguistics and Other Documents

한국코퍼스언어학회

CORPUS LINGUISTICS RESEARCH :: Vol.8 No.2 pp.56-72

A Corpus-Based Study of the Near-Synonymous English Verbs extend and expand

이소영,장세은

CORPUS LINGUISTICS RESEARCH :: Vol.8 No.1 pp.1-27

Abstract

The study analyzes the semantic differences between the near-synonymous English verbs extend and expand through corpus linguistics and a survey. Using COCA, COHA, and the Merriam-Webster dictionary, the study compares authentic language data with dictionary definitions. The results show differences in the frequency and usage trends of extend and expand in the corpora. Students had difficulty distinguishing the two meanings, which was reflected in low comprehension and accuracy rates in the survey. The analysis of noun collocations also reveals differences, with some collocations appearing in unexpected ways. This suggests limitations in how Korean college students learning English as a foreign language, whose exposure to natural English environments is limited, learn and understand these words. The study highlights the importance of providing students with authentic language experiences, diverse contexts, and dictionary use to improve their understanding and use of extend and expand.

Theory and Practice of Corpus Construction for English Pronunciation and Speaking Assessment, Part 2

황영,김지은,소영순,이석재,윤태진

CORPUS LINGUISTICS RESEARCH :: Vol.8 No.1 pp.29-48

Abstract

The purpose of this study is to describe the implementation of a project to build an AI-based English speaking evaluation system. The project was carried out over approximately four months, from September to December 2022, with support from the Korea Intelligence and Information Society Promotion Agency. Approximately 1,000 hours of English speaking evaluation data were collected, refined, processed, and used for artificial intelligence modeling. This paper introduces the organizations formed to build the data and explains the data collection in detail. It also describes how the collected data were refined for use in speaking evaluation and artificial intelligence modeling, and the evaluation method used to construct the speaking evaluation data.

A Study of Prompt Engineering for Mitigating Bias in Language Models: Focusing on a Sentiment Analysis Corpus of Politicians Built with ChatGPT

김유진,강조은,김한샘

CORPUS LINGUISTICS RESEARCH :: Vol.8 No.1 pp.49-66

Abstract

This study aims to identify and address the inherent political bias in ChatGPT by means of a sentiment analysis task. We selected representative figures from each political faction, asked ChatGPT to write about these politicians using various prompts, and then ran sentiment analysis on the outputs to determine ChatGPT's bias. We found that ChatGPT is more positively biased toward liberal politicians in South Korea. We also found that this bias can be reduced by adding general instructions that encourage neutral writing or by refining the prompts with variables such as tone and writing style. This study provides important insights into the responsible use of AI and into how its bias can be mitigated.

Bylaws of the Korean Association for Corpus Linguistics and Other Documents

한국코퍼스언어학회

CORPUS LINGUISTICS RESEARCH :: Vol.8 No.1 pp.67-83

Korean Translation Strategies for the English Particles UP and OUT in Film Script Translation

박정인

CORPUS LINGUISTICS RESEARCH :: Vol.7 No.2 pp.1-20

Abstract

The purpose of this study is to analyze how English particles are translated into Korean and to discuss whether the results are consistent with the cognitive semantic approach to English particles. First, UP and OUT were selected as the subjects of the study. Then, to obtain the most appropriate data for this purpose, eight films were selected from the OTT platform Netflix, and the text data were collected by extracting their subtitles. We examined the collected translation data, investigated which translation strategies were used for the English particles and phrasal verbs, and compiled statistics. Finally, we confirmed that the results support the cognitive semantic view that particles in phrasal verbs are not arbitrary, and we made a brief generalization of the results.

Keyword Analysis of Maritime Legal Texts : Text-dispersion Approach

Guandong Zhang,Charmhun Jo,Se-Eun Jhang

CORPUS LINGUISTICS RESEARCH :: Vol.7 No.2 pp.21-41

Abstract

The present study uses a self-built Maritime English Law Corpus, with BNC Baby as the reference corpus, to explore homogeneous features of four different maritime legal genres by comparing two keyword analyses: corpus frequency-based keyword analysis and text dispersion-based keyword analysis. The keyword lists of the four legal genres are also compared using cross-validation to explore the unique characteristics of each genre. The results show that the two keyword methods generated both shared and unshared words. Judged against the two criteria for keywords, we conclude that text dispersion-based keyword analysis is much better than traditional corpus frequency-based keyword analysis, because it meets both the content-distinctiveness of maritime-related keywords and the content-generalisability of law content keywords, and it shows more homogeneous maritime legal features.
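
The contrast between the two keyword methods can be sketched in a few lines: text dispersion-based keyness counts the number of texts in which a word occurs, instead of its total frequency, and then compares target and reference corpora. The sketch below assumes a log-likelihood statistic over those text counts. That choice, the function names, and the toy texts are all assumptions made for illustration, not the study's actual procedure or data.

```python
import math


def text_dispersion(texts):
    """Map each word to the number of texts in which it occurs at least once."""
    counts = {}
    for tokens in texts:
        for word in set(tokens):
            counts[word] = counts.get(word, 0) + 1
    return counts


def log_likelihood(a, b, n1, n2):
    """Two-cell log-likelihood for a texts out of n1 versus b texts out of n2."""
    e1 = n1 * (a + b) / (n1 + n2)  # expected target count
    e2 = n2 * (a + b) / (n1 + n2)  # expected reference count
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def dispersion_keywords(target_texts, reference_texts):
    """Rank words that occur in a larger share of target texts than reference texts."""
    t = text_dispersion(target_texts)
    r = text_dispersion(reference_texts)
    n1, n2 = len(target_texts), len(reference_texts)
    scores = {}
    for word, a in t.items():
        b = r.get(word, 0)
        if a / n1 > b / n2:  # keep only positive keywords
            scores[word] = log_likelihood(a, b, n1, n2)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy texts, invented for illustration only.
target = [["vessel", "shall", "carrier"], ["vessel", "cargo", "shall"], ["vessel", "the"]]
reference = [["the", "dog", "shall"], ["the", "cat"], ["the", "house"]]
keywords = dispersion_keywords(target, reference)
print(keywords[0][0])
```

Because each text contributes at most one count per word, a word repeated many times in a single document cannot dominate the list, which is the property that favors keywords generalisable across texts.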
