林泓毅; 張怡雅
2025-08-11; 2025-08-11; 2010-07-23
U0061-2108201019191600
https://www.airitilibrary.com/Publication/alDetailedMesh1?DocID=U0061-2108201019191600
https://nutcir-lib.nutc.edu.tw/handle/123456789/1090

With advances in technology and the spread of the Internet, the global volume of information grows by multiples every year. To cope with this rapid growth, many application domains maintain very large database management systems and rely on efficient data mining techniques to extract important knowledge. Among these techniques, the decision tree is an important classification tool that can identify possible causal relationships among data attributes. Traditional decision trees classify data step by step with a single (univariate) attribute per test, which tends to produce a very large classification model. This design ignores the correlation between feature attributes, so the tree applies similar classification rules repeatedly, which lowers the efficiency of inductive learning. To improve classification efficiency, this study proposes a different strategy that applies the data-simplifying property of principal component analysis (PCA) to the construction of the decision tree. The communality and explanatory power obtained from PCA are used to select an appropriate set of feature attributes, which yields a multivariate classifier. The resulting multivariate hybrid attribute serves as the root of the constructed decision tree. Finally, the UCI database is used to validate the proposed method.
The classification performance of the proposed multivariate hybrid attribute is also compared with that of traditional C4.5, which uses univariate attributes.

zh
entropy; decision tree; PCA; classifier
Construction of High-Efficiency Decision Tree with Principal Component Analysis (以主成份分析建構高效率決策樹)
master thesis
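
The attribute-selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not give the selection rule, so the sketch assumes a single retained principal component and a communality cutoff of 0.5. All function names are hypothetical, and power iteration stands in for a full eigendecomposition to keep the example dependency-free.

```python
import math

def standardize(rows):
    """Return column-wise z-scores (population standard deviation)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / n) for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardized rows."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / n for j in range(p)] for i in range(p)]

def leading_eigenpair(a, iters=500):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    p = len(a)
    v = [float(i + 1) for i in range(p)]  # arbitrary non-zero start vector
    for _ in range(iters):
        w = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * a[i][j] * v[j] for i in range(p) for j in range(p))
    return lam, v

def hybrid_attribute(rows, min_communality=0.5):
    """Select correlated attributes and combine them into one composite score.

    min_communality is an assumed cutoff, not a value taken from the thesis.
    """
    z = standardize(rows)
    lam, v = leading_eigenpair(correlation_matrix(z))
    # With one retained component, communality = squared loading = lam * v_j^2.
    communalities = [lam * x * x for x in v]
    # The trace of a correlation matrix equals the number of attributes.
    explained = lam / len(v)
    selected = [j for j, h in enumerate(communalities) if h >= min_communality]
    # Composite (multivariate) attribute: weighted sum of the selected z-scores.
    scores = [sum(v[j] * r[j] for j in selected) for r in z]
    return selected, communalities, explained, scores

# Toy data: attributes 0 and 1 are perfectly correlated, attribute 2 is not.
rows = [[1, 2, 1], [2, 4, -1], [3, 6, -1], [4, 8, 1]]
selected, communalities, explained, scores = hybrid_attribute(rows)
print(selected)  # [0, 1]
```

A tree builder could then treat `scores` as a single numeric attribute and pick a split threshold on it at the root, for example with a gain-ratio criterion as in C4.5; that step is not shown here.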