A Practical Guide to Phylogenetics for Nonexperts

Damien O'Halloran

doi:10.3791/50975

Method Article

February 5th, 2014

Damien O'Halloran¹

¹Department of Biological Sciences and Institute for Neuroscience, The George Washington University

Summary

Here we describe a step-by-step pipeline for generating reliable phylogenies from nucleotide or amino acid sequence datasets. This guide aims to serve researchers or students new to phylogenetic analysis.

Abstract

Researchers across a wide range of fields now apply phylogenetics to their research questions, yet many are new to the topic, which presents inherent difficulties. Here we compile a practical introduction to phylogenetics for nonexperts. We outline, step by step, a pipeline for generating reliable phylogenies from gene sequence datasets. We begin with a user guide for similarity search tools, covering both online interfaces and local executables. Next, we explore programs for generating multiple sequence alignments, followed by protocols for using software to determine best-fit models of evolution. We then outline protocols for reconstructing phylogenetic relationships via maximum likelihood and Bayesian criteria, and finally describe tools for visualizing phylogenetic trees. While this is by no means an exhaustive description of phylogenetic approaches, it provides the reader with practical starting information on key software applications commonly used by phylogeneticists. We envision this article as a practical training tool for researchers embarking on phylogenetic studies and as an educational resource that could be incorporated into a classroom or teaching lab.

Introduction

To understand how two (or more) species evolved, it is first necessary to obtain sequence or morphological data from each sample; these data represent quantities that we can use to measure their relationship through evolutionary space. As with measuring linear distance, having more data available (e.g. miles, inches, microns) equates to a more accurate measurement. Consequently, the accuracy with which a researcher can deduce evolutionary distance is heavily influenced by the volume of informative data available to measure relationships. Furthermore, because different samples evolve at different rates and by different mechanisms, the method that w....

Protocol

1. Basic Local Alignment Search Tool (BLAST): Online Interface

Visit the BLAST¹ web server at the National Center for Biotechnology Information (NCBI): http://blast.ncbi.nlm.nih.gov/Blast.cgi (Figure 1).
Paste a FASTA-formatted sequence (see Figure 2 for an example) into the query box.
Select the appropriate BLAST program and the relevant database or species of interest for the search, then click “BLAST”.
Note: FASTA formatted sequence begins with a descripti....

Results

Finding sequences similar to a query allows researchers to ascribe a potential identity to new sequences and to infer relationships between sequences. BLAST¹ accepts input as FASTA-formatted sequence text or as a GenBank accession number. A FASTA-formatted sequence begins with a description line indicated by a “>” sign (Figure 2). The description must follow immediately after the “>” sign; the sequence (i.e. nucleotides or amino acids) follows the description, starting on the next line. Wh.......
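The FASTA layout described above can be illustrated with a minimal Python sketch that reads FASTA text and checks the two rules stated here: each record starts with a “>” description line, and the description follows the “>” sign immediately. The function name `parse_fasta` and the example records are hypothetical and are not part of the original protocol; this is one possible way to sanity-check a file before pasting it into the query box, not the method used by BLAST itself.

```python
from typing import Dict, Iterable, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {description: sequence} for FASTA-formatted text.

    Hypothetical helper for illustration only.
    """
    records: Dict[str, str] = {}
    current: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # The description must follow the ">" sign immediately.
            if len(line) == 1 or line[1].isspace():
                raise ValueError("description must follow '>' immediately")
            current = line[1:]
            records[current] = ""
        elif current is None:
            # Sequence data with no preceding description line.
            raise ValueError("sequence found before a '>' description line")
        else:
            # Sequence may span several lines; join them together.
            records[current] += line
    return records


# Made-up example records (not real database entries).
example = """>example_seq_1 hypothetical nucleotide record
ATGGCGTACC
GGTTAACC
>example_seq_2 hypothetical protein record
MKTAYIAKQR
"""

records = parse_fasta(example.splitlines())
for description, sequence in records.items():
    print(description, len(sequence))
```

Sequences that wrap over several lines are concatenated, so the same helper works for short and long records alike.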

Discussion

Our hope is that this article will serve as a starting point for researchers or students who are new to phylogenetics. Genome sequencing has become less expensive over the last few years; as a consequence, demand for the technology is increasing, and the production of large sequence datasets is now commonplace in small labs. These datasets often provide researchers with sets of genes that require a phylogenetic framework to begin to understand their function. Furthermore, because phylog.......

Disclosures

We have nothing to disclose.

Acknowledgements

We thank members of the O’Halloran lab for comments on the manuscript. We thank The George Washington University Department of Biological Sciences and Columbian College of Arts and Sciences for funding to D. O’Halloran.

Materials

List of materials used in this article
Name	Company	Catalog Number	Comments
BLAST webpage			http://blast.ncbi.nlm.nih.gov/Blast.cgi
BLAST executables			ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
Preformatted BLAST databases			ftp://ftp.ncbi.nlm.nih.gov/blast/db/
Clustal			http://www.clustal.org/
Kalign			http://msa.sbc.su.se/cgi-bin/msa.cgi
MAFFT			http://mafft.cbrc.jp/alignment/software/
MUSCLE			http://www.drive5.com/muscle/
T-Coffee			http://www.tcoffee.org/Projects/tcoffee/
PROBCONS			http://toolkit.tuebingen.mpg.de/probcons
Se-Al			http://tree.bio.ed.ac.uk/software/seal/
BSEdit			http://www.bsedit.org/
JalView			http://www.jalview.org/
SeaView			http://pbil.univ-lyon1.fr/software/seaview.html
ProtTest			https://code.google.com/p/prottest3/
Java Runtime			http://www.java.com/en/download/chrome.jsp
Readseq			http://iubio.bio.indiana.edu/cgi-bin/readseq.cgi
jModelTest			https://code.google.com/p/jmodeltest2/
PhyML			https://code.google.com/p/phyml/
MrBayes			http://mrbayes.sourceforge.net/download.php
TreeView			http://taxonomy.zoology.gla.ac.uk/rod/treeview.html
TreeDyn			http://www.treedyn.org/

References

Altschul, S. F., Carroll, R. J., Lipman, D. J. Weights for data related by a tree. J. Mol. Biol. 207 (4), 647-653 (1989).
Akaike, H. A new look at the statistical model identification. IEEE Trans. Automat. Contr. 19 (6), 706-723 (1974).
Schwarz, G.
