Character state methods:
Statistical test for phylogenetic tree
Thursday, May 13, 2010
Distance matrix methods:
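1. UPGMA --> usable only if the evolutionary rate is assumed to be constant across lineages
2. Neighbor-joining method --> the most widely used, but sequence character information is lost because everything is converted to distance values before the tree is drawn
A. The distance matrix must be corrected
Correction methods: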
- Jukes-Cantor (JC or JC69): one rate of substitution, equal base frequencies
- Kimura 2-parameter (K2P or K80): two types of substitution, equal base frequencies
- Felsenstein 1981 (F81): one rate of substitution, unequal base frequencies
- Hasegawa-Kishino-Yano or Felsenstein (HKY85 or F84): two types of substitution, unequal base frequencies
- Tamura and Nei (TN93): three types of substitution, unequal base frequencies
- General time reversible (GTR): six types of substitution, unequal base frequencies
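To make the idea of "correcting" a distance concrete, here is a minimal Python sketch of the two simplest models in the list, JC69 and K2P, applied to one pair of aligned sequences. The function names are my own and not part of PHYLIP; dnadist does this (and more) internally. Sites with gaps or ambiguity codes are simply skipped, which is an assumption of this sketch.

```python
import math

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def count_differences(seq1, seq2):
    """Return (sites compared, transitions, transversions) for two aligned sequences."""
    n = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes (assumption of this sketch)
        n += 1
        if a != b:
            if (a, b) in TRANSITIONS:
                ts += 1
            else:
                tv += 1
    return n, ts, tv


def jc69_distance(seq1, seq2):
    """JC69: d = -3/4 * ln(1 - 4p/3), where p is the proportion of differing sites."""
    n, ts, tv = count_differences(seq1, seq2)
    p = (ts + tv) / n
    return -0.75 * math.log(1 - 4 * p / 3)


def k2p_distance(seq1, seq2):
    """K2P: d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), P = transitions, Q = transversions."""
    n, ts, tv = count_differences(seq1, seq2)
    P, Q = ts / n, tv / n
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))
```

Both corrected distances come out larger than the raw proportion of differences, because they account for multiple substitutions at the same site. For very divergent sequences the logarithm's argument can become non-positive and `math.log` raises `ValueError`; the sketch does not handle that saturated case, nor the case where no site is comparable.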
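--> If precise estimates are needed (evolution, epidemiology, molecular clock, etc.), first run an LRT (likelihood ratio test) to decide which model to use
--> HKY85 is the usual choice; K2P is sufficient when base frequencies are equal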
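As a rough illustration of the LRT mentioned above: for two nested models, the statistic is twice the difference in log-likelihood, compared against a chi-square distribution whose degrees of freedom equal the number of extra free parameters (for example 1 for JC69 vs. K2P). The sketch below uses only the standard library; the function names and the log-likelihood values are hypothetical, and in practice the log-likelihoods would come from a likelihood program.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting points: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        q, a = math.exp(-half), 1.0
    else:
        q, a = math.erfc(math.sqrt(half)), 0.5
    # Recurrence Q(a + 1, x) = Q(a, x) + x^a * e^(-x) / Gamma(a + 1).
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(lnl_simple, lnl_complex, extra_params):
    """Return (LRT statistic, p-value) for two nested models."""
    stat = 2.0 * (lnl_complex - lnl_simple)
    return stat, chi2_sf(stat, extra_params)


# Hypothetical log-likelihoods for a simpler and a richer model (1 extra parameter).
stat, p_value = likelihood_ratio_test(-2345.6, -2339.1, 1)
print(f"LRT statistic = {stat:.2f}, p = {p_value:.5f}")
```

A small p-value means the richer model fits significantly better and is worth its extra parameters; otherwise the simpler model is kept.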
B. PHYLIP procedure
i. Use [ClustalW method] to run the multiple sequence alignment --> output format: phylip --> rename the output file as “infile”
ii. Run [dnadist.exe] to compute the distance matrix table and get “outfile” --> rename “outfile” as “infile”
iii. Run [neighbor.exe] to generate “outtree”
iv. View the tree with [treeview]
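For reference, the “infile” that PHYLIP expects in step i is a plain-text file: a header line with the number of sequences and the number of sites, then one record per sequence whose name occupies exactly 10 characters. The sketch below writes that layout from already-aligned sequences. The function name `to_phylip` is my own, and putting each whole sequence on a single line is an assumption that keeps the file valid whether it is read as interleaved or sequential.

```python
def to_phylip(records):
    """Format a list of (name, aligned_sequence) pairs as PHYLIP text."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    # Header: number of sequences, then number of sites.
    lines = [f"{len(records):5d} {lengths.pop():5d}"]
    for name, seq in records:
        # Names are truncated or space-padded to exactly 10 characters.
        lines.append(f"{name[:10]:<10}{seq}")
    return "\n".join(lines) + "\n"


example = to_phylip([("Human", "ACGT"), ("Chimpanzee_long", "ACGA")])
print(example)

# To use it with PHYLIP, the text would be saved under the name "infile":
# with open("infile", "w") as handle:
#     handle.write(example)
```

Note that long names are cut to 10 characters, so names that differ only after the tenth character would collide; the sketch does not check for that.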
Character state methods:
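1. Maximum parsimony method --> builds the tree that requires the fewest evolutionary steps, working directly from the sequence characters, but only informative sites are counted (sites where at least two character states are each shared by at least two sequences). (Advantage: several candidate trees are obtained to choose from. Disadvantage: not suitable for data that do not follow minimal evolution, e.g. HIV evolution.)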
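A small sketch of how informative sites can be picked out of an alignment, assuming the usual definition of a parsimony-informative site: at least two different bases, each present in at least two sequences. The function names are illustrative only, and gaps or ambiguity codes are ignored here.

```python
from collections import Counter


def is_informative(column):
    """True if at least two bases each occur in at least two sequences."""
    counts = Counter(base for base in column.upper() if base in "ACGT")
    return sum(1 for c in counts.values() if c >= 2) >= 2


def informative_sites(alignment):
    """Return 0-based indices of parsimony-informative columns."""
    return [
        i
        for i, column in enumerate(zip(*alignment))
        if is_informative("".join(column))
    ]


alignment = [
    "AAGA",
    "AGGA",
    "ACTC",
    "ATTC",
]
print(informative_sites(alignment))  # → [2, 3]
```

Column 0 is constant and column 1 has four singleton states, so neither favors one tree over another; only columns 2 and 3 split the sequences into two groups of two.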
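A. Search algorithms:
i. Exhaustive search: guaranteed to find the shortest tree
ii. Branch and bound: prunes the search so that not every tree has to be evaluated, yet is still guaranteed to find the shortest tree; only heuristic searches may miss it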
iii. ….
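Parsimony searches may rely on hill-climbing algorithms, whose result can depend on the starting point, so the input order of the sequences must be randomized; the NJ method does not have this problem.
Several equally good trees may be obtained, and a consensus tree can then be drawn from them.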
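To see why an exhaustive search is only practical for a handful of sequences, it helps to count the candidate trees. For n sequences there are (2n-5)!! = 1 × 3 × 5 × … × (2n-5) distinct unrooted bifurcating trees. A minimal sketch (the function name is my own):

```python
def num_unrooted_trees(n_taxa):
    """Number of unrooted bifurcating trees for n_taxa >= 3 labeled sequences."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, num_unrooted_trees(n))
```

Ten sequences already give 2,027,025 trees, and the count keeps growing faster than exponentially, which is why heuristic (hill-climbing) searches with randomized input order are used for larger data sets.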
B. PHYLIP procedure
i. Run [dnapars.exe]
- S --> choose “rearrange on one best tree”
- J --> Randomize …. --> enter an odd number --> number of times to jumble: 1
- Y --> get “outtree” --> rename it as “intree”
ii. Run [consensus.exe] to generate “outtree”
iii. View the tree with [treeview]
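2. Maximum likelihood method --> does not assume that evolution is parsimonious; the maximum likelihood is computed from all the sequence data, so it is the best choice for short sequences (<500 bp). The computation is statistical and yields a P value, so a separate bootstrap is not required. (Disadvantage: slow to compute.)
A. PHYLIP procedure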
i. Run [dnaml.exe] to generate “outfile” and “outtree”
ii. View “outtree” with [treeview], “outfile” with [notepad]
Statistical test for phylogenetic tree
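- Bootstrap analysis --> the percentage of resampled data sets in which a given topology reappears. A value of at least 75% is needed for a reliable monophyletic group; 70% is marginally acceptable. If the value is below 70%, obtaining the same result with different tree-building methods can serve as supporting evidence.
1. Run [seqboot.exe] to generate a file of 1000 resampled data sets --> rename “outfile” as “infile”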
2. In the N-J method
- Run [dnadist.exe] to get the distance matrix file --> rename “outfile” as “infile”
※ M: yes, 1000 data sets
- Run [neighbor.exe] to generate 1000 trees in “outtree” --> rename “outtree” as “intree”
※ M: yes, 1000 data sets
- Run [consensus.exe] to generate the consensus “outtree”
- View “outfile” with [notepad]
※ For publication, the bootstrap values must be placed on the “original tree”; do not use the tree produced by the bootstrap directly
3. In parsimony methods
- Run [dnapars.exe]
※ S: choose “rearrange on one best tree”
※ J: Randomize …. --> enter an odd number --> number of times to jumble: 1
※ M: D, 1000 data sets
※ Y: get “outtree” --> rename it as “intree”
- Run [consensus.exe] to generate “outtree”
- View “outfile” with [notepad]
※ For publication, the bootstrap values must be placed on the “original tree”; do not use the tree produced by the bootstrap directly
4. In likelihood methods
- Run [dnaml.exe]
- Y: get “outtree” --> rename it as “intree”
- View “outtree” with [treeview], “outfile” with [notepad]
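The bootstrap steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not what seqboot or the consensus program actually run: alignment columns are resampled with replacement, and the support for a group is the percentage of replicate trees that contain it. All function names are my own, replicate trees are represented simply as sets of groups, and the 75% / 70% cutoffs are the ones quoted in these notes.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    n_sites = len(alignment[0])
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[i] for i in columns) for seq in alignment]


def clade_support(replicate_clades, clade):
    """Percentage of replicate trees (each a set of frozensets) containing the clade."""
    clade = frozenset(clade)
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return 100.0 * hits / len(replicate_clades)


def interpret(support):
    """Apply the cutoffs from the notes: >= 75 reliable, >= 70 marginal."""
    if support >= 75:
        return "reliable"
    if support >= 70:
        return "marginal"
    return "weak"


rng = random.Random(1)  # fixed seed so the replicate is reproducible
replicate = bootstrap_alignment(["ACGTAC", "ACGTTC", "AGGTTC"], rng)
print(replicate)

# Hypothetical groups found in four replicate trees.
replicates = [
    {frozenset({"A", "B"})},
    {frozenset({"A", "B"})},
    {frozenset({"A", "C"})},
    {frozenset({"A", "B"})},
]
support = clade_support(replicates, {"A", "B"})
print(support, interpret(support))  # → 75.0 reliable
```

Each replicate keeps the number of sequences and sites of the original alignment; only which columns are present, and how often, changes from one replicate to the next.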
Labels: Bioinfo