|
Publication information
|
Article title |
Efficient algorithms based on the k-means and Chaotic League Championship Algorithm for numeric, categorical, and mixed-type data clustering |
Date accepted (day/month/year) |
30 December 2017 (B.E. 2560) |
Journal |
Journal name |
Expert Systems with Applications |
Journal standard |
ISI |
Journal owner organization |
Expert Systems with Applications |
ISBN/ISSN |
0957-4174 |
Year |
2017 |
Issue no. |
90 |
Month |
December |
Year of publication (B.E.) |
2560 |
Pages |
146-167 |
Abstract |
The success rates of expert or intelligent systems depend on the selection of correct data clusters.
The k-means algorithm is a well-known method for solving data clustering problems. It suffers not only
from a high dependency on its initial solution but also from the distance function used. A
number of algorithms have been proposed to address the centroid initialization problem, but the solutions they produce do not yield optimum clusters. This paper proposes three algorithms: (i) the search
algorithm C-LCA, an improved League Championship Algorithm (LCA); (ii) a search clustering algorithm using
C-LCA (SC-LCA); and (iii) a hybrid clustering algorithm, the hybrid of k-means and Chaotic League
Championship Algorithm (KSC-LCA), which has two computation stages. The C-LCA employs chaotic adaptation for the retreat and approach parameters, rather than constants, which can enhance the search capability. Furthermore, to overcome the limitation of the original k-means algorithm
using the Euclidean distance, which cannot handle categorical attributes properly, we adopt the
Gower distance and a mechanism for handling the discrete-value requirement of categorical attributes. The proposed algorithms can handle not only pure numeric data but also mixed-type data
and can find the best centroids containing categorical values. Experiments were conducted on 14 datasets
from the UCI repository. The SC-LCA and KSC-LCA competed with 16 established algorithms including the
k-means, k-means++, global k-means algorithms, four search clustering algorithms and nine hybrids of
k-means algorithm with several state-of-the-art evolutionary algorithms. The experimental results show
that the SC-LCA produces the clusters with the highest F-Measure on the pure categorical dataset and the
KSC-LCA produces the clusters with the highest F-Measure on the pure numeric and mixed-type
datasets tested. On 13 of the 14 datasets, the centroids produced by the SC-LCA had better F-Measures
than those of the k-means algorithm. On the Tic-Tac-Toe dataset, which contains only categorical attributes,
the SC-LCA achieves an F-Measure of 66.61, which is 21.74 points higher than that of the k-means algorithm
(44.87). The KSC-LCA produced better centroids than the k-means algorithm on all 14 datasets; the maximum
F-Measure improvement was 11.59 points. However, in terms of computational time, the SC-LCA and
KSC-LCA required more function evaluations (NFEs) than the k-means and its variants, but the KSC-LCA ranks first and the SC-LCA ranks
fourth among the hybrid clustering and search clustering algorithms that we tested. Therefore, the
SC-LCA and KSC-LCA are general and effective clustering algorithms that could be used when an expert
or intelligent system requires accurate, high-speed cluster selection. |
Keywords |
Data clustering, Search clustering algorithm, Hybrid clustering algorithm, League Championship Algorithm (LCA), Chaos optimization algorithms (COA), Mixed-type data |
Authors |
|
Article review |
Reviewed by independent referees
Publication status |
Published
Journal distribution level |
International
citation |
None
Part of a thesis |
Yes
Attached file |
|
Citation |
0
|