Key Gene Selection in Microarray Using Sequential Forward Selection Strategy

Original Title
利用循序前進選擇策略於微陣列關鍵基因選取問題
Date Issued
2013-07-30
Author(s)
陳昱超
Department of Distribution Management
Advisor
林泓毅
URI
https://www.airitilibrary.com/Publication/alDetailedMesh1?DocID=U0061-0508201313252300
https://nutcir-lib.nutc.edu.tw/handle/123456789/1133
Abstract
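Microarray gene expression data have high-dimensional attributes and are wide data, and only a small fraction of the genes are relevant to classifying the data, so methods that select key genes both efficiently and effectively are needed. However, combining a group of gene attributes that each have good discrimination power does not necessarily give the best classification result, because several gene attributes may repeat similar classification work. To ensure that the characterizing attributes taking part in classification each play their own role and achieve the maximum classification effect, this study adopts sequential forward selection (SFS) as the attribute selection strategy; SFS is the approach best able to greatly reduce the time spent on selecting feature variables. In addition, this study applies fuzzy cluster analysis to the microarray data, which simplifies the data and quickly highlights the attributes that have discrimination power. Using an entropy-based attribute evaluation criterion, we apply both the traditional "individual single-attribute evaluation" and "attribute-subset evaluation" methods to evaluate the discrimination capability of all original attributes, in the hope of assessing attribute discrimination power more accurately and quickly. Finally, we use six microarray datasets to validate the proposed design and method, and use the selected attributes to build six common classifiers in order to observe how clustering, the evaluation method, and sequential forward selection affect classification accuracy and discrimination ability (ROC area).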
Microarray analysis has three characteristics: a high-dimensional feature space, a small number of instances, and only a limited number of key genes that are critical for bioinformatics classification problems. On the one hand, selecting discriminative genes is important. On the other hand, a collection of discriminative genes does not necessarily lead to good classification quality, because some attributes are likely to have similar classification effects and therefore produce redundant classification results. To generate subsets of genes whose discrimination power is both sufficient and necessary for such classification problems, this study proposes a novel selection strategy that integrates fuzzy cluster analysis and information gain (IG) into the traditional sequential forward selection (SFS) algorithm. In terms of classification accuracy and discrimination power, experimental results on six microarray datasets show that the strategy efficiently selects compact subsets of characterizing genes and that the selected genes are suitable for various conventional classifiers.
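The abstract does not spell out the selection procedure, so the following is only a minimal sketch of how sequential forward selection driven by an entropy-based subset criterion might look. It assumes the expression values have already been reduced to discrete levels (for example by the clustering step), and it uses the information gain of the class given the joint values of a candidate subset as the score. The function names, the `min_gain` stopping threshold, and the toy data are all hypothetical and are not taken from the thesis.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, subset):
    """Information gain of the class given the joint values of `subset`."""
    groups = {}
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        groups.setdefault(key, []).append(label)
    conditional = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def sequential_forward_selection(rows, labels, max_features, min_gain=1e-9):
    """Greedily add the attribute that most increases the subset's gain."""
    selected, best_score = [], 0.0
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < max_features:
        scored = [(information_gain(rows, labels, selected + [i]), i)
                  for i in remaining]
        # Ties are broken in favour of the lowest attribute index.
        score, best = max(scored, key=lambda s: (s[0], -s[1]))
        if score - best_score < min_gain:
            break  # no remaining attribute adds non-redundant information
        selected.append(best)
        remaining.remove(best)
        best_score = score
    return selected, best_score


# Hypothetical toy data: gene 1 duplicates gene 0, gene 3 is noise.
rows = [
    (0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0), (0, 0, 1, 1),
    (1, 1, 0, 0), (1, 1, 0, 1), (1, 1, 1, 0), (1, 1, 1, 1),
]
labels = [0, 0, 1, 1, 1, 1, 1, 1]

selected, score = sequential_forward_selection(rows, labels, max_features=3)
print(selected, round(score, 4))  # [0, 2] 0.8113
```

In this toy run, gene 1 is as informative as gene 0 when scored alone, yet it is never selected because it adds nothing once gene 0 is in the subset. This is the redundancy effect the abstract describes when it distinguishes single-attribute evaluation from subset evaluation.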
Subjects
cluster analysis
classification problems
sequential forward selection (SFS) algorithm
classification accuracy
discrimination power
attribute evaluation
Type
master thesis
