Molecular Phylogeny- Introduction

Subject: Bioinformatics

Lesson: Molecular Phylogeny- Introduction

Lesson Developer: Shailendra Goel

College/ Department: Department of Botany, University of Delhi


Table of Contents

Chapter: Molecular Phylogeny

Introduction

How to generate trees

Positive and negative selection
Understanding Trees
Cladograms vs Phylograms
Rooted vs Unrooted trees
Tree Terminology

Methods of Phylogenetic reconstruction

Distance Method
UPGMA distance based method
Neighbor joining method

Statistical methods of phylogeny

Summary
Exercise/ Practice
Glossary
References/ Bibliography/ Further Reading


Introduction
Mutation generates the variation on which selection acts, and together they drive evolution.
In principle, all life forms are part of a single tree of life that explains their origin and
evolution. In practice, this tree is difficult to reconstruct fully because many species are
extinct and because organisms can acquire genes in ways other than vertical descent (e.g.
lateral transfer of genes). Phylogenetics uses available comparative information to generate
trees that explain evolutionary relationships. Traditionally, morphological features were
compared to generate trees. More recently, molecular sequences have been used for comparisons
among species, helping to define species, families and other taxa; this approach is called
“Molecular Phylogeny”.

How to generate trees

Trees are generated by comparing traits among organisms. In classical phylogeny these traits
are morphological, whereas molecular phylogeny uses DNA, RNA or protein sequence data. As a
general rule, DNA carries more phylogenetic information than protein. Proteins are translated
through the triplet code, which is degenerate: many changes at the third codon position (the
“wobble” position) do not alter the amino acid, so they are invisible at the protein level
and that information is lost. DNA sequences comprise coding and non-coding regions, which
evolve at different rates. The rate of evolution also depends on the type of organism.
Sequences can be compared only after they are aligned. Without an alignment it is very
difficult to decide which nucleotide or amino acid in one sequence corresponds to which
position in another (homology). Nucleotide changes in protein-coding sequences are of two
types: synonymous and non-synonymous. A synonymous change does not alter the encoded amino
acid, whereas a non-synonymous change does.
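
The distinction between the two kinds of change can be checked mechanically. The sketch below is only an illustration: it assumes the standard genetic code and DNA codons written with T rather than U, and the function name `classify_change` is hypothetical.

```python
# Standard genetic code, with codons ordered by the bases T, C, A, G
# at the first, second and third positions ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_change(codon_before, codon_after):
    """Label a codon change as 'synonymous' or 'non-synonymous'."""
    aa_before = CODON_TABLE[codon_before.upper()]
    aa_after = CODON_TABLE[codon_after.upper()]
    return "synonymous" if aa_before == aa_after else "non-synonymous"


# GAA and GAG both encode glutamate; GAT encodes aspartate.
print(classify_change("GAA", "GAG"))
print(classify_change("GAA", "GAT"))
```

Both example changes affect only the third base of the codon, yet only the first is silent at the protein level.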

Positive and negative selection

Traditionally, selection that favors a change is called positive selection. The change is
favored because it helps the organism survive. Conversely, selection that eliminates a trait
which is not favored is called negative selection.


A similar kind of selection operates on molecular sequences. Gene duplication is common. A
duplicated copy of a gene is free to accumulate mutations and create variation. This
variation is subject to positive or negative selection and can lead to neofunctionalization,
giving rise to new genes with new functions.

Understanding Trees
Cladograms vs Phylograms
Trees fall into two categories: cladograms and phylograms. A cladogram shows only the
relationships among organisms, whereas a phylogram also shows the amount of evolutionary
change, represented by its branch lengths. Branch lengths therefore have no meaning in a
cladogram but are meaningful in a phylogram.
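
As a small illustration of the difference, the sketch below turns a phylogram into a cladogram by discarding its branch lengths. It assumes the tree is written in Newick notation (a text format not discussed in this lesson); the taxon names, the lengths and the function name `to_cladogram` are made up for the example.

```python
import re


def to_cladogram(newick):
    """Drop ':<length>' annotations so that only the branching order remains."""
    return re.sub(r":\d+(?:\.\d+)?(?:[eE][+-]?\d+)?", "", newick)


# Phylogram: every branch carries a length after the colon.
phylogram = "((A:0.10,B:0.25):0.05,(C:0.30,D:0.12):0.08);"

# Cladogram: same grouping of taxa, no branch lengths.
print(to_cladogram(phylogram))  # ((A,B),(C,D));
```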

Figure: Phylogram
Figure: Cladogram
Source: Author

Rooted vs Unrooted trees

The root of a tree denotes the common ancestor of all the taxa in it and gives the tree a
direction in time. This information is not always available, so there are two types of
algorithms: those that assume a common ancestor and produce rooted trees, and those that do
not and produce unrooted trees. A common way to root a tree is to use an outgroup. An
outgroup is a taxon that lies outside the ingroup (the taxa under study) but is closely
related to it.
Another way is midpoint rooting, which places the root at the midpoint of the longest path
between two OTUs in the tree.

Figure: Midpoint rooted tree
Figure: Outgroup rooted tree
Source: Author

Tree Terminology
Trees are described in terms of branches and nodes. The tips of terminal branches represent
Operational Taxonomic Units (OTUs). The point where branches join is an internal node. Two
terminal branches that connect directly to the same internal node are called sister
branches.


Figure: Defining Trees

Source: Author
If two lineages (branches) originate from one internal node, the split is called a
bifurcation or dichotomy. If more than two branches arise from one internal node, it is
called a polytomy and the tree is said to be multifurcating.

Methods of Phylogenetic reconstruction

Various methods have been proposed to build a phylogenetic tree. We will consider only
three here: distance-based methods (UPGMA and NJ), maximum parsimony (MP) and
maximum likelihood (ML).

Distance Method

Distance-based methods start by calculating pairwise distances between sequences from
pairwise comparisons of aligned sequences. These distances form a distance matrix, which is
used to generate the tree. Two commonly used methods for building a tree from this matrix
are the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) and Neighbor Joining (NJ).
Distance-based methods are fast but discard a substantial amount of the information in a
multiple sequence alignment, because each pair of sequences is reduced to a single number.
Distance is calculated as the dissimilarity between the sequences of each pair of taxa.
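
A minimal sketch of this step is given below. It assumes the simplest dissimilarity measure, the proportion of aligned sites that differ (often called the p-distance), and skips positions where either sequence has a gap. The sequences, taxon names and function names are invented for the illustration.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned, ungapped sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # ignore columns with a gap in either sequence
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(alignment):
    """Full (rectangular) matrix as a nested dict: matrix[x][y] == matrix[y][x]."""
    names = list(alignment)
    return {x: {y: p_distance(alignment[x], alignment[y]) for y in names}
            for x in names}


alignment = {
    "taxon1": "ACGTACGTAC",
    "taxon2": "ACGTACGTTC",
    "taxon3": "ACGAACGTTA",
}
matrix = distance_matrix(alignment)
for name, row in matrix.items():
    print(name, [round(value, 2) for value in row.values()])
```

Because the matrix is symmetric and its diagonal is zero, only the values below (or above) the diagonal need to be stored.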

Figure: Triangular and rectangular matrices. Notice that the upper part of the rectangular
matrix is identical to the lower part and is therefore redundant.
Source: Author

UPGMA distance based method

UPGMA is no longer popular; distance-based trees are now usually built with NJ. UPGMA is a
progressive clustering method. A distance matrix is first calculated for all the sequences,
and the two closest taxa are joined into a group. The matrix is then recalculated with this
group treated as a single node, and the pair with the minimum distance is again joined. This
is repeated until only two groups remain, which are then joined as well. UPGMA assumes that
the rate of nucleotide or amino acid substitution is constant, so that branch lengths
reflect actual dates of divergence. This assumption often does not hold, in which case UPGMA
can produce an inaccurate tree. The resulting tree is rooted, with the root placed midway
between the last two groups joined.
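
The clustering procedure described above can be sketched as follows. This is a bare-bones illustration, not a replacement for a phylogenetics package: the input format (a dict keyed by `frozenset` pairs), the bracketed labels and the example distances are all assumptions made for the sketch.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by UPGMA.

    dist maps frozenset({a, b}) to the distance between a and b.
    Returns the tree as a bracketed label and the height of the root.
    """
    sizes = {name: 1 for name in names}  # number of taxa in each cluster
    d = dict(dist)                        # working copy, extended as we merge
    height = 0.0
    while len(sizes) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        height = d[frozenset((a, b))] / 2  # new node sits midway between them
        merged = f"({a},{b})"
        # Distance from the new cluster to every other cluster is the
        # average over all member taxa (size-weighted mean).
        for c in sizes:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes)), height


pairs = {("A", "B"): 2, ("A", "C"): 4, ("B", "C"): 4,
         ("A", "D"): 6, ("B", "D"): 6, ("C", "D"): 6}
dist = {frozenset(pair): value for pair, value in pairs.items()}
tree, root_height = upgma(["A", "B", "C", "D"], dist)
print(tree, root_height)
```

In this example A and B are joined first, then C, then D; the root height is half the distance between the last two groups joined.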

Neighbor joining method

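Neighbor joining allows different rates of evolution in different branches of the tree. At
each step it joins a pair of OTUs, and the node thus created replaces the pair in subsequent
calculations. The pair is chosen using distances corrected for each OTU's overall divergence
from all the others, so it is not always the pair with the smallest raw distance. Because
the method does not assume a constant rate of evolution, the resulting tree is not rooted,
but it can be rooted using an outgroup.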
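
The sketch below shows only the pair-selection step of neighbor joining. It uses the criterion commonly written as Q(i, j) = (n - 2) d(i, j) - r(i) - r(j), where r(i) is the sum of the distances from OTU i to all the others; this formula is not given in the lesson and is stated here as background. The distance values and function name are made up for the illustration.

```python
from itertools import combinations


def nj_pick_pair(names, dist):
    """Return the pair of OTUs that neighbor joining would join first."""
    n = len(names)
    # r(i): total distance from OTU i to every other OTU
    total = {x: sum(dist[frozenset((x, y))] for y in names if y != x)
             for x in names}

    def q(pair):
        i, j = pair
        return (n - 2) * dist[frozenset(pair)] - total[i] - total[j]

    return min(combinations(names, 2), key=q)


names = ["a", "b", "c", "d", "e"]
pairs = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
         ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
         ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
dist = {frozenset(pair): value for pair, value in pairs.items()}

closest_raw = min(combinations(names, 2), key=lambda pair: dist[frozenset(pair)])
print(closest_raw)                 # ('d', 'e'): smallest raw distance
print(nj_pick_pair(names, dist))   # ('a', 'b'): smallest Q value
```

Here d and e have the smallest raw distance, but a and b are joined first because both are far from everything else. A full implementation would go on to compute branch lengths and update the matrix after each join.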


Figure: How an NJ tree is made. The OTUs with the lowest distance are connected first (shown
in orange). This pair then works as a node, and the next OTU with the lowest distance is
connected to it (shown in blue).
Source: Author

Corrections: Observed distances are not always a good measure of evolutionary distance
because they do not take into account changes hidden by multiple hits (repeated
substitutions at the same site). Converting an observed distance into an evolutionary
distance therefore requires a correction. Two common corrections are the Jukes-Cantor and
Kimura two-parameter models.
The Jukes-Cantor one-parameter model assumes that each nucleotide changes to any of the
other three at the same rate, so transitions and transversions occur at equal rates. It also
assumes that the four bases are present in equal frequencies.
In practice, the transition rate is usually higher than the transversion rate. The Kimura
two-parameter model adjusts pairwise distances by taking the transition/transversion ratio
into account. Various more sophisticated models have also been developed.
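
The two corrections have well-known closed forms, sketched below. The formulas are standard results that the lesson itself does not state: for Jukes-Cantor, d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites; for Kimura two-parameter, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the observed proportions of transitional and transversional differences. The function names and example values are hypothetical.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance from the observed proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_2p(transitions, transversions):
    """Kimura two-parameter distance from the observed proportions of
    transitional (P) and transversional (Q) differences."""
    a = 1 - 2 * transitions - transversions
    b = 1 - 2 * transversions
    if a <= 0 or b <= 0:
        raise ValueError("differences too large for the correction to be defined")
    return -0.5 * math.log(a * math.sqrt(b))


observed = 0.30  # 30% of sites differ: 20% transitions, 10% transversions
print(round(jukes_cantor(observed), 4))
print(round(kimura_2p(0.20, 0.10), 4))
```

Both corrected distances are larger than the observed value of 0.30, which reflects the hidden multiple hits that the raw count misses.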

Figure: Jukes-Cantor model (rate of transition = rate of transversion = x)
Figure: Kimura two-parameter model (rate of transition (y) ≠ rate of transversion (x))
Source: Author

Maximum Parsimony
Parsimony-based methods work on the principle of choosing the most parsimonious tree, that
is, the tree that requires the fewest evolutionary changes to explain the data. The method
works as follows:
• Identify informative sites in the dataset. A site is informative if it represents
alternative possibilities for the OTUs and so favors some trees over others. In practice it
must have at least two different character states, each present in at least two OTUs.
• Construct trees. All possible trees are constructed and evaluated. Each tree is scored by
the number of evolutionary changes required to generate it.
• Retain the trees with the minimum score. More than one tree may be retained if several
share the same minimum score.
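
The steps above can be sketched for a single alignment column. The scoring function uses Fitch's algorithm, one standard way of counting the minimum number of changes on a given tree; the lesson does not name a specific algorithm, so this choice is an assumption. Trees are written as nested tuples with an arbitrary root, which does not affect the score. All names and data are invented for the illustration.

```python
from collections import Counter


def is_informative(column):
    """A site is informative if at least two states each occur in >= 2 OTUs."""
    counts = Counter(column)
    return sum(1 for count in counts.values() if count >= 2) >= 2


def fitch_score(tree, states):
    """Minimum number of changes for one site on a binary tree.

    tree: nested 2-tuples with OTU names (strings) as leaves.
    states: dict mapping OTU name -> character at this site.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            return common
        changes += 1  # the two subtrees disagree: one change is required
        return left | right

    visit(tree)
    return changes


site = {"A": "G", "B": "G", "C": "A", "D": "A"}
print(is_informative("".join(site.values())))  # True

# The three possible groupings of four OTUs.
candidates = [(("A", "B"), ("C", "D")),
              (("A", "C"), ("B", "D")),
              (("A", "D"), ("B", "C"))]
scores = {tree: fitch_score(tree, site) for tree in candidates}
best = min(scores.values())
for tree, score in scores.items():
    print(tree, score, "<- retained" if score == best else "")
```

For this site the tree grouping A with B needs one change, while the other two need two changes each, so only the first is retained.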

Figure: For the column shown in color, there are three possible unrooted trees.
Source: Author

Figure: Numbers of changes (steps) are counted and trees with minimum score are
selected (changes marked with bullet). For the example given here, there are 15 possible
rooted trees for one column in sequence alignment, but we are showing only four as
example.
Source: Author
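
The tree counts quoted in the two captions above (three unrooted and 15 rooted trees for four OTUs) follow from standard counting formulas for bifurcating trees: (2n - 5)!! unrooted and (2n - 3)!! rooted trees for n OTUs, where !! is the product of every second integer down to 1. These formulas are not given in the lesson; the sketch below simply evaluates them.

```python
def odd_double_factorial(k):
    """k * (k - 2) * ... * 3 * 1 for odd k (returns 1 for k <= 1)."""
    product = 1
    while k > 1:
        product *= k
        k -= 2
    return product


def count_unrooted_trees(n):
    """Number of unrooted bifurcating trees for n >= 3 OTUs."""
    if n < 3:
        raise ValueError("need at least 3 OTUs")
    return odd_double_factorial(2 * n - 5)


def count_rooted_trees(n):
    """Number of rooted bifurcating trees for n >= 2 OTUs."""
    if n < 2:
        raise ValueError("need at least 2 OTUs")
    return odd_double_factorial(2 * n - 3)


for n in (4, 5, 10):
    print(n, count_unrooted_trees(n), count_rooted_trees(n))
```

The numbers grow very quickly with n, which is why evaluating all possible trees is feasible only for small datasets.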

Statistical methods of phylogeny


Distance and maximum parsimony methods are often criticized for lacking a statistical
approach. Both have criteria for selecting trees but cannot calculate the probability that
one tree, rather than another, is the true tree. Various methods have been proposed to
overcome this drawback. Two of them are the likelihood and Bayesian approaches.
In simple terms, likelihood is the probability of the observed dataset (characters such as
nucleotides) given a particular hypothesis (a tree and a model of evolution). This is
similar to maximum parsimony in that each tree is assigned a score, but here the score is a
likelihood based on statistical analysis. The best tree is the one under which the observed
data have the highest probability for a particular model of how changes occur. Both maximum
parsimony and maximum likelihood are computationally intensive and hence slow. A detailed
discussion of likelihood can be found in the textbooks listed under Suggested Readings.
Another statistical method for phylogeny is the Bayesian method. In maximum likelihood we
calculate the probability of observing the data given a hypothesis; in the Bayesian method
we calculate the probability of the hypothesis given the data.
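
The contrast between the two approaches can be written compactly. The symbols below are notation introduced here for illustration only: D is the observed data (the alignment), T a tree and \theta the parameters of the model of evolution.

```latex
% Maximum likelihood: score each hypothesis by the probability of the data
% under it, and choose the hypothesis with the highest score.
L(T, \theta) = P(D \mid T, \theta),
\qquad
(\hat{T}, \hat{\theta}) = \arg\max_{T,\,\theta} \; P(D \mid T, \theta)

% Bayesian method: use Bayes' theorem to obtain the probability of the
% hypothesis given the data, which requires a prior P(T, \theta).
P(T, \theta \mid D) = \frac{P(D \mid T, \theta)\, P(T, \theta)}{P(D)}
```

The likelihood term P(D | T, \theta) appears in both expressions; the Bayesian method additionally weights it by the prior and normalizes by P(D).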

Summary
Molecular phylogeny is the study of evolutionary relationships based on molecular sequence
data. Different methods have been proposed for studying phylogeny. The earlier methods were
distance based, and some assumed constant evolutionary rates. These were followed by more
computationally intensive methods such as maximum parsimony. Both are now being supplemented
or replaced by more sophisticated statistical methods such as maximum likelihood and the
Bayesian method. The benefits and pitfalls of these methods are still debated, and their
applicability may depend on the situation. A basic understanding of these methods is
essential for using them effectively to reconstruct phylogeny.

Exercises
1. Define phylogenetics.
2. How will you define an alignment?
3. Name three methods used to draw a phylogenetic tree.
4. Define bootstrap.
5. How will you differentiate a dendrogram from a cladogram?
6. What is the difference between a distance-based method (NJ) and the maximum parsimony
(MP) method?
7. What is the difference between the UPGMA and NJ methods?
8. Differentiate between the maximum parsimony (MP) and maximum likelihood (ML) methods.
9. What is the difference between positive and negative selection?
10. Explain the principle of parsimony.
11. How will you explain a monophyletic group?
12. How will you explain a paraphyletic group?
13. How will you explain a polyphyletic group?
14. Differentiate between a cladogram and a phylogram.
15. How will you define the root of a phylogenetic tree?
16. What is an outgroup?
17. Explain the Jukes-Cantor model for calculating distance.
18. Explain the Kimura two-parameter model for calculating distance.
19. Differentiate between triangular and rectangular matrices.
20. How will you define sister clades?
21. Explain polytomy.
22. Explain how the neighbor joining method of phylogenetic tree construction works.

Glossary
Monophyly: When a group includes an ancestor and all of its descendants.
Polyphyly: When different species in a taxon evolve from different ancestors.
Polytomy: An internal node from which more than two branches arise, indicating that the
phylogeny is not resolved at that point.
Bootstrap: A statistical method for assessing confidence in groupings through random
resampling of the data.
Homoplasy: Similarity that arises by coincidence and not through common lineage.
Outgroup: A taxon that lies outside the group under study but is closely related to it.
Informative sites: Sites that show alternative forms among the different OTUs.

References


Suggested Readings
1. Jonathan Pevsner. Bioinformatics and Functional Genomics. YEAR. Publisher: Wiley-Blackwell.
2. Nicholas H. Barton, Derek E.G. Briggs, Jonathan A. Eisen, David B. Goldstein, Nipam H.
Patel. Evolution. 2007-2010. Publisher: CSHL Press.

Web Links
http://evolution.genetics.washington.edu
http://evolution.genetics.washington.edu/phylip.html
