Publication
|
| Research Title |
Semi-Automatic Word-Aligned Tool for Thai-Vietnamese Parallel Corpus Construction |
| Date of Distribution |
14 October 2019 |
| Conference |
| Title of the Conference |
16th International Joint Conference on Computer Science and Software Engineering (JCSSE 2019) |
| Organiser |
Faculty of Informatics, Burapha University, Chonburi, Thailand |
| Conference Place |
Amari Pattaya |
| Province/State |
Chonburi |
| Conference Date |
10 July 2019 |
| To |
12 July 2019 |
| Proceeding Paper |
| Volume |
2019 |
| Issue |
1 |
| Page |
121-125 |
| Editors/edition/publisher |
|
| Abstract |
A corpus, especially a parallel corpus containing both source and target languages, is an important resource in Natural Language Processing (NLP) research, particularly in machine translation. A high-quality corpus can significantly improve the accuracy of translation results; however, corpus construction is very time-consuming and requires linguistic expertise. In this paper, we present the process of building a Thai-Vietnamese parallel corpus. This work focuses on the construction of a semi-automatic word-alignment tool capable of assisting researchers in the construction of a parallel corpus. The collection and validation in this study were carried out using the tool we developed. In the first stage, a Vietnamese-Thai parallel corpus containing 14,771 sentence pairs was collected, aligned at the word level, and validated by linguistic experts. This parallel corpus can be used as a reliable resource for statistical machine translation and other applications. |
| Author |
|
| Peer Review Status |
Independently peer-reviewed |
| Level of Conference |
International |
| Type of Proceeding |
Full paper |
| Type of Presentation |
Oral |
| Part of thesis |
true |
| Used for graduation |
No |
| Presentation awarding |
false |
| Attach file |
|
| Citation |
0
|