GSMIS

Publication information

Publication as an academic journal article

Article title

Multi‐Stroke Thai Finger‐Spelling Sign Language Recognition System with Deep Learning

Date of acceptance (day/month/year)

31 January 2021 (B.E. 2564)

Journal

Journal name

Symmetry

Journal standard

ISI

Organization that owns the journal

MDPI

ISBN/ISSN

2073-8994

Volume

Issue

Month

February

Year of publication (B.E.)

2564

Pages

Abstract

Sign language is a type of language for the hearing impaired that people in the general public commonly do not understand. A sign language recognition system, therefore, represents an intermediary between the two sides. As a communication tool, a multi‐stroke Thai finger‐spelling sign language (TFSL) recognition system featuring deep learning was developed in this study. This research uses a vision‐based technique on a complex background with semantic segmentation performed with dilated convolution for hand segmentation, hand strokes separated using optical flow, and learning feature and classification done with convolution neural network (CNN). We then compared the five CNN structures that define the formats. The first format was used to set the number of filters to 64 and the size of the filter to 3 × 3 with 7 layers; the second format used 128 filters, each filter 3 × 3 in size with 7 layers; the third format used the number of filters in ascending order with 7 layers, all of which had an equal 3 × 3 filter size; the fourth format determined the number of filters in ascending order and the size of the filter based on a small size with 7 layers; the final format was a structure based on AlexNet. As a result, the average accuracy was 88.83%, 87.97%, 89.91%, 90.43%, and 92.03%, respectively. We implemented the CNN structure based on AlexNet to create models for multi‐stroke TFSL recognition systems. The experiment was performed using an isolated video of 42 Thai alphabets, which are divided into three categories consisting of one stroke, two strokes, and three strokes. The results presented an 88.00% average accuracy for one stroke, 85.42% for two strokes, and 75.00% for three strokes.

Keywords

TFSL recognition system; deep learning; semantic segmentation; optical flow; complex background

Authors

587020014-9	Mr. Thongpan Pariwat (ทองปาน ปริวัตร) [lead author]
	Faculty of Science, doctoral degree, regular program

Article review

Reviewed by independent reviewers

Publication status

Published

Journal distribution level

International

Citation

Yes

Part of a thesis

Yes

Used to fulfil graduation requirements

No

Attached file

Citation