無風的日子 (Windless Days): Phylogenetic analysis

Thursday, May 13, 2010

Phylogenetic analysis

Posted by Windlessday at 12:46 PM

Distance matrix methods:

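1. UPGMA --> usable only when the evolutionary rate can be assumed constant across lineages

2. Neighbor-joining method --> the most widely used, but sequence-level character information is lost because everything is converted to distance values before the tree is drawn

A. The distance matrix must be corrected

Correction methods: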

• Jukes-Cantor (JC or JC69): one rate of substitution, equal base frequencies

• Kimura 2-Parameter (K2P or K80): two types of substitution, equal base frequencies

• Felsenstein 1981 (F81): one rate of substitution, unequal base frequencies

• Hasegawa, Kishino and Yano, or Felsenstein (HKY85 or F84): two types of substitution, unequal base frequencies

• Tamura and Nei (TN93): three types of substitution, unequal base frequencies

• General Time Reversible (GTR): six types of substitution, unequal base frequencies
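As an illustration of what "correcting" a distance means, the sketch below computes the JC69 and K2P distances for one pair of aligned sequences from the proportions of transitions (P) and transversions (Q). The function names are made up for this example and are not part of PHYLIP; gaps and ambiguity codes are simply skipped, which is an assumption of the sketch.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def site_proportions(seq1, seq2):
    """Return (P, Q): proportions of transitions and transversions."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    return transitions / compared, transversions / compared


def jc69_distance(seq1, seq2):
    """JC69: d = -3/4 * ln(1 - 4p/3), with p the observed difference."""
    P, Q = site_proportions(seq1, seq2)
    return -0.75 * math.log(1 - 4 * (P + Q) / 3)


def k2p_distance(seq1, seq2):
    """K2P: d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q))."""
    P, Q = site_proportions(seq1, seq2)
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))


if __name__ == "__main__":
    s1, s2 = "ACGTACGTAC", "ACGTACGTAT"  # one transition in ten sites
    print(jc69_distance(s1, s2), k2p_distance(s1, s2))
```

Both corrected distances come out larger than the raw difference of 0.1, because the models account for substitutions hidden by multiple hits. For very divergent pairs the logarithm's argument can become non-positive and `math.log` raises `ValueError`, which corresponds to a saturated, undefined distance.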

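--> When precise estimates are needed (evolutionary studies, epidemiology, molecular clocks, etc.), first run an LRT (likelihood ratio test) to decide which model to use

--> HKY85 is generally used; when the base frequencies are equal, K2P is sufficient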
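To show how the LRT decides between two nested models, here is a small sketch. The statistic is twice the difference in log-likelihood, compared against a chi-square distribution whose degrees of freedom equal the number of extra free parameters (3 when going from K2P to HKY85, since HKY85 adds three free base frequencies). The log-likelihood values in the example are invented for illustration, and the function names are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution (integer df >= 1)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        q, nu = math.exp(-x / 2), 2  # closed form for 2 degrees of freedom
    else:
        q, nu = math.erfc(math.sqrt(x / 2)), 1  # closed form for 1 df
    # Recurrence: Q(nu + 2) = Q(nu) + (x/2)^(nu/2) * exp(-x/2) / Gamma(nu/2 + 1)
    while nu < df:
        q += (x / 2) ** (nu / 2) * math.exp(-x / 2) / math.gamma(nu / 2 + 1)
        nu += 2
    return q


def likelihood_ratio_test(lnl_simple, lnl_complex, df):
    """Return (statistic, p_value) for a simple model nested in a complex one."""
    statistic = 2.0 * (lnl_complex - lnl_simple)
    return statistic, chi2_sf(statistic, df)


if __name__ == "__main__":
    # Hypothetical log-likelihoods of the same alignment under K2P and HKY85.
    stat, p = likelihood_ratio_test(-1000.0, -995.0, df=3)
    verdict = "use HKY85" if p < 0.05 else "K2P is sufficient"
    print(f"LRT statistic = {stat:.2f}, p = {p:.4f}: {verdict}")
```

A small p value means the extra parameters improve the fit significantly, so the richer model is justified; otherwise the simpler model is kept.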

B. PHYLIP procedure

i. Run the multiple sequence alignment with the [ClustalW method] --> output format: phylip --> rename the output file as “infile”

ii. Run [dnadist.exe] to compute the distance matrix table and get “outfile” --> rename “outfile” as “infile”

iii. Run [neighbor.exe] to generate “outtree”

iv. View the tree with [treeview]
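For readers who want to see what the neighbor-joining step does with the distance matrix, the following is a minimal sketch of the standard algorithm. It is a teaching example under simplifying assumptions (a full symmetric matrix as nested lists, ties broken by input order, no handling of negative branch lengths), not a reimplementation of neighbor.exe; the function name is illustrative.

```python
from itertools import combinations


def neighbor_joining(names, matrix):
    """Return an unrooted Newick tree built from a symmetric distance matrix."""
    if len(names) < 3:
        raise ValueError("need at least three taxa")
    size = len(names)
    newick = {i: name for i, name in enumerate(names)}
    d = {i: {j: float(matrix[i][j]) for j in range(size)} for i in range(size)}
    next_id = size

    while len(d) > 3:
        n = len(d)
        r = {i: sum(row.values()) for i, row in d.items()}  # row totals
        # Join the pair that minimises Q(i, j) = (n - 2) * d_ij - r_i - r_j.
        i, j = min(combinations(d, 2),
                   key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        li = d[i][j] / 2 + (r[i] - r[j]) / (2 * (n - 2))
        lj = d[i][j] - li
        # Distances from the new internal node to every remaining node.
        new_row = {k: (d[i][k] + d[j][k] - d[i][j]) / 2
                   for k in d if k not in (i, j)}
        for k in (i, j):
            del d[k]
        for k, row in d.items():
            del row[i], row[j]
            row[next_id] = new_row[k]
        new_row[next_id] = 0.0
        d[next_id] = new_row
        newick[next_id] = f"({newick[i]}:{li:.4f},{newick[j]}:{lj:.4f})"
        next_id += 1

    # Three nodes remain: connect them at a central trifurcation.
    x, y, z = d
    lx = (d[x][y] + d[x][z] - d[y][z]) / 2
    ly = (d[x][y] + d[y][z] - d[x][z]) / 2
    lz = (d[x][z] + d[y][z] - d[x][y]) / 2
    return (f"({newick[x]}:{lx:.4f},{newick[y]}:{ly:.4f},"
            f"{newick[z]}:{lz:.4f});")


if __name__ == "__main__":
    taxa = ["A", "B", "C", "D"]
    dist = [[0, 3, 8, 9],
            [3, 0, 9, 10],
            [8, 9, 0, 9],
            [9, 10, 9, 0]]
    print(neighbor_joining(taxa, dist))
    # (C:4.0000,D:5.0000,(A:1.0000,B:2.0000):3.0000);
```

The example matrix is additive (it was derived from a tree with branch lengths 1, 2, 3, 4 and 5), so the algorithm recovers those branch lengths exactly. Note that the tree is built from the numbers alone, which is the loss of sequence information mentioned above.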

Character state methods:

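1. Maximum parsimony method --> builds the tree that requires the fewest evolutionary steps, working directly from the sequence characters, but only informative sites are counted (a site is considered only when at least two different states each occur in at least two sequences). (Advantage: several candidate trees are returned to choose from. Disadvantage: not suitable for data that do not follow minimum evolution, e.g. HIV evolution.)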

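A. Search algorithms:

i. Exhaustive search: guaranteed to find the shortest tree

ii. Branch and bound: skips groups of trees that cannot beat the best tree found so far, so it still finds the shortest tree without scoring every topology (heuristic searches, by contrast, may miss the shortest tree)

iii. ….

Parsimony programs may use hill-climbing algorithms, so the result can depend on the order of the input sequences; choose a random input order. The NJ method does not have this problem.

Several trees may be returned; a consensus tree can then be drawn from them.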
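The two ideas behind parsimony scoring, informative sites and counting the minimum number of changes on a given tree, can be sketched as follows. The step counting uses Fitch's algorithm on a rooted binary tree written as nested 2-tuples; that tree representation, the toy alignment, and the function names are assumptions made for this example.

```python
def is_informative(column):
    """True when at least two states each occur in at least two sequences."""
    counts = {}
    for state in column:
        counts[state] = counts.get(state, 0) + 1
    return sum(1 for c in counts.values() if c >= 2) >= 2


def fitch_steps(tree, states):
    """Minimum number of changes at one site (Fitch's algorithm)."""
    def visit(node):
        if isinstance(node, str):  # leaf: its observed base, no cost
            return {states[node]}, 0
        (left_set, left_cost), (right_set, right_cost) = (
            visit(node[0]), visit(node[1]))
        common = left_set & right_set
        if common:  # children agree: no extra change needed
            return common, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1
    return visit(tree)[1]


def parsimony_length(tree, alignment):
    """Total number of changes over all sites of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_steps(tree, {name: seq[s] for name, seq in alignment.items()})
        for s in range(n_sites))


if __name__ == "__main__":
    aln = {"A": "AAGT", "B": "AAGC", "C": "ACTC", "D": "ACGA"}
    columns = ["".join(seq[s] for seq in aln.values()) for s in range(4)]
    print([is_informative(col) for col in columns])
    # [False, True, False, False]
    print(parsimony_length((("A", "B"), ("C", "D")), aln))  # 4
    print(parsimony_length((("A", "C"), ("B", "D")), aln))  # 5
```

In the toy alignment only the second site is informative, and it is the only site whose step count differs between the two trees. This is why uninformative sites can be ignored when comparing topologies.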

B. PHYLIP procedure

i. Run [dnapars.exe]

• S --> choose “rearrange on one best tree”

• J --> Randomize …. --> enter an odd number --> number of times to jumble: 1

• Y --> get “outtree” --> rename it as “intree”

ii. Run [consense.exe] to generate “outtree”

iii. View the tree with [treeview]
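The consensus step can be pictured as counting how often each group of taxa appears across the candidate trees and keeping the groups found in more than half of them (majority rule). The sketch below treats trees as rooted nested tuples for simplicity; this representation and the function names are assumptions of the example, and it does not claim to reproduce the exact behavior or options of the PHYLIP program.

```python
from collections import Counter


def clades(tree):
    """Return the set of clades (frozensets of leaf names) in a nested tuple."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def majority_rule_clades(trees, threshold=0.5):
    """Map each clade present in more than `threshold` of trees to its frequency."""
    counts = Counter(clade for tree in trees for clade in clades(tree))
    n = len(trees)
    return {clade: k / n for clade, k in counts.items() if k / n > threshold}


if __name__ == "__main__":
    candidate_trees = [
        ((("A", "B"), "C"), "D"),
        ((("A", "B"), "D"), "C"),
        ((("A", "C"), "B"), "D"),
    ]
    kept = majority_rule_clades(candidate_trees)
    for clade, freq in sorted(kept.items(), key=lambda kv: sorted(kv[0])):
        print(sorted(clade), f"{freq:.2f}")
```

Here the groups (A, B) and (A, B, C) each occur in two of the three trees and are kept, while (A, B, D) and (A, C) occur only once and are dropped.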

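2. Maximum likelihood method --> does not assume that evolution is parsimonious; all sites of the sequences are used to compute the maximum likelihood, so it is the best choice for short sequences (<500 bp). The calculation is statistical and yields a P value, so a further bootstrap is not required. (Disadvantage: slow.)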

A. PHYLIP procedure

i. Run [dnaml.exe] to generate “outfile” and “outtree”

ii. View “outtree” with [treeview] and “outfile” with [notepad]
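The principle of maximum likelihood can be shown on the smallest possible case: estimating the distance between two sequences under the JC69 model by trying distances and keeping the one with the highest likelihood. This is only a sketch of the idea (dnaml optimizes a whole tree under a richer model); the function names and the ternary-search settings are choices made for this example.

```python
import math


def jc69_log_likelihood(d, n_same, n_diff):
    """Log-likelihood of distance d given counts of identical/different sites."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e  # probability a site shows the same base
    p_diff = 0.25 - 0.25 * e  # probability of one specific different base
    # Each site also carries the 1/4 frequency of its starting base.
    return (n_same * math.log(0.25 * p_same)
            + n_diff * math.log(0.25 * p_diff))


def ml_distance(n_same, n_diff, lo=1e-6, hi=5.0, iterations=200):
    """Find the distance maximising the likelihood by ternary search."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if (jc69_log_likelihood(m1, n_same, n_diff)
                < jc69_log_likelihood(m2, n_same, n_diff)):
            lo = m1  # the maximum lies to the right of m1
        else:
            hi = m2  # the maximum lies to the left of m2
    return (lo + hi) / 2


if __name__ == "__main__":
    n_same, n_diff = 90, 10  # 10 differences in 100 aligned sites
    numeric = ml_distance(n_same, n_diff)
    analytic = -0.75 * math.log(1 - 4 * 0.1 / 3)  # JC69 correction formula
    print(numeric, analytic)
```

The numerical optimum agrees with the JC69 correction formula, which is the maximum likelihood estimate for this two-sequence case. Repeating this kind of optimization over every branch and many topologies is what makes the method slow.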

Statistical test for phylogenetic tree

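• Bootstrap analysis --> the proportion of resampled data sets in which a given topology is recovered again. A value of at least 75% indicates a reliable monophyletic group; 70% is marginally acceptable; for values below 70%, obtaining the same result with different tree-building methods can serve as supporting evidence.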
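Resampling here means drawing alignment columns with replacement until a new alignment of the same length is built, then repeating that many times. A minimal sketch of the idea, with an illustrative alignment stored as a dict of strings and hypothetical function names:

```python
import random


def bootstrap_alignment(alignment, rng):
    """Draw columns with replacement to build one pseudo-replicate."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    # The same column picks are applied to every sequence, so sites stay aligned.
    return {name: "".join(seq[c] for c in picks)
            for name, seq in alignment.items()}


def bootstrap_replicates(alignment, n_replicates=1000, seed=1):
    """Return a list of resampled alignments; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


if __name__ == "__main__":
    aln = {"A": "AAGTCA", "B": "AAGCCA", "C": "ACTCGA", "D": "ACGAGT"}
    replicates = bootstrap_replicates(aln, n_replicates=1000)
    print(len(replicates), replicates[0])
```

Each replicate is then analyzed with the same tree-building method as the original data, and the resulting trees are compared to see how often each group recurs.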

1. Run [seqboot.exe] to generate a file of 1000 resampled data sets --> rename “outfile” as “infile”

2. For the N-J method

◆ Run [dnadist.exe] to get the distance matrix file --> rename “outfile” as “infile”

※ M: yes, 1000 data sets

◆ Run [neighbor.exe] to generate the 1000 trees in “outtree” --> rename “outtree” as “intree”

※ M: yes, 1000 data sets

◆ Run [consense.exe] to generate the consensus “outtree”

◆ View “outfile” with [notepad]

※ For publication, use the “original tree” annotated with the bootstrap values; do not publish the tree obtained from the bootstrap directly

3. For parsimony methods

◆ Run [dnapars.exe]

※ S: choose “rearrange on one best tree”

※ J: Randomize …. --> enter an odd number --> number of times to jumble: 1

※ M: D, 1000 data sets

※ Y: get “outtree” --> rename it as “intree”

◆ Run [consense.exe] to generate “outtree”

◆ View “outfile” with [notepad]

※ For publication, use the “original tree” annotated with the bootstrap values; do not publish the tree obtained from the bootstrap directly

4. For likelihood methods

◆ Run [dnaml.exe]

◆ Y: get “outtree” --> rename it as “intree”

◆ View “outtree” with [treeview] and “outfile” with [notepad]

※ For publication, use the “original tree” annotated with the bootstrap values; do not publish the tree obtained from the bootstrap directly
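The publication rule above amounts to the following: take each group in the original tree, count the percentage of bootstrap trees that contain the same group, and write that percentage on the original tree. The sketch below does this for rooted trees written as nested tuples and applies the 75% and 70% cut-offs given earlier; the tree representation, the toy replicate set, and the function names are assumptions of the example.

```python
def clades(tree):
    """Return the set of clades (frozensets of leaf names) in a nested tuple."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def support_on_original(original, replicate_trees):
    """Map each clade of the original tree to (percent support, verdict)."""
    replicate_clades = [clades(tree) for tree in replicate_trees]
    all_taxa = max(clades(original), key=len)  # the root clade is trivial
    report = {}
    for clade in clades(original) - {all_taxa}:
        hits = sum(clade in found for found in replicate_clades)
        percent = 100.0 * hits / len(replicate_trees)
        if percent >= 75:
            verdict = "reliable"
        elif percent >= 70:
            verdict = "marginal"
        else:
            verdict = "weak: check with other methods"
        report[clade] = (percent, verdict)
    return report


if __name__ == "__main__":
    original = (("A", "B"), ("C", "D"))
    # Toy stand-in for the bootstrap trees: 10 replicates instead of 1000.
    replicates = ([(("A", "B"), ("C", "D"))] * 7
                  + [((("A", "B"), "C"), "D")]
                  + [(("A", "C"), ("B", "D"))] * 2)
    for clade, (pct, verdict) in support_on_original(original, replicates).items():
        print(sorted(clade), pct, verdict)
```

The topology that gets reported is always that of the original tree; the bootstrap trees contribute only the numbers attached to its branches.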

Guideline
