             Publication
Journal Publication
Title of Article Multi-Class Text Classification on Khmer News Using Ensemble Method in Machine Learning Algorithms 
Date of Acceptance 26 February 2023 
Journal
     Title of Journal Acta Informatica Pragensia 
     Standard SCOPUS 
     Institute of Journal Prague University of Economics and Business 
     ISBN/ISSN 1805-4951 
     Volume 2023 
     Issue 12 
     Month March
     Year of Publication 2023 
     Page 17 
     Abstract This research applies text classification to categorize Khmer news articles. News articles were collected from three websites through web scraping and grouped into nine categories. After text preprocessing, the dataset was split into training and testing sets. We then evaluated the performance of the ensemble learning method via machine learning classifiers with k-fold validation. Various machine learning classifiers were employed, namely logistic regression, Complement Naive Bayes, Bernoulli Naive Bayes, k-nearest neighbours, perceptron, support vector machines, stochastic gradient descent, AdaBoost, decision tree, and random forest. To improve accuracy in categorizing Khmer news articles, Grid Search CV was used to find the optimal hyperparameters for each machine learning classifier with TF-IDF and Delta TF-IDF feature extraction. The results show that the highest accuracy was achieved by the ensemble learning method with the support vector machine using the optimal hyperparameters (C = 10, kernel = rbf): 83.47% with TF-IDF and 83.40% with Delta TF-IDF feature extraction. The model demonstrates that Khmer news articles can be accurately categorized.
     Keyword Text classification; Khmer news; Machine learning; Feature extraction; Optimal hyperparameters; News categorization; Ensemble learning method 
Author
645020084-0 Mr. RAKSMEY PHANN [Main Author]
College of Computing Master's Degree

Reviewing Status Independently peer-reviewed
Status Published
Level of Publication International
citation true 
Part of thesis true 
Citation 0
