Seevolution User's Guide

Seevolution User's Guide and Tutorial

Seevolution provides the ability to interactively browse evolutionary histories of chromosomes. This guide describes how Seevolution works, including an overview of the display and an introduction to constructing custom datasets for Seevolution. The guide takes the form of a tutorial, with concepts introduced by example.

An illustrated example

Let's get started. Click here to launch Seevolution with a simple mutations file. Java will begin downloading Seevolution, and if you are running Firefox on Windows you may have to interact with a series of screens that look like the following:

Seevolution Web Start download dialog
Check the "Do this automatically..." box and click "OK"

When you see this...
Seevolution download progress bar
simply wait for Seevolution to download

When you see this...
Setting security for Seevolution
Important! Check the "Always trust content..." box and click "Run"

If all went well, Seevolution should now be loaded and will look like this:
Seevolution screenshot
On the right side is the main Seevolution window. The example data presently loaded consists of a single circular chromosome. The chromosome is colored with four randomly selected colors, representing contiguous segments of the chromosome that are free from internal rearrangement. Five buttons sit at the top of the window. From left to right they are: "Open a mutations file," "Configure Seevolution Preferences," "Play the mutation animation," "Load the online help," and finally "Load the example page."
The chromosome viewer is an interactive 3D entity. Clicking and dragging in the chromosome window will rotate the chromosome, right-clicking and dragging will shift the display, and operating the mouse scroll-wheel in the chromosome window causes the display to zoom in and out.
On the left side is the Seevolution TreeViewer window. For this simple example the tree consists of an ancestor and a single child. The nodes are colored yellow and green; yellow indicates the start node for animation, green indicates the end node. All mutation events between the start node and the end node will be animated when the play button is clicked. A blue tracker indicates progress along the branch as mutations occur.

Let's click play and watch the Seevolution animation. The short video below shows what will happen:

As you can see, Seevolution displays two inversion events during the course of evolution from the ancestor to the child node. Of course, Seevolution doesn't just magically know what mutations to display. Instead, it reads an input file that specifies the mutations. The input file for the simple example shown above looks like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<genomemutations id="1">
	<organism id="Organism">
		<chromosome id="main" circular="true" length="30"/>
		<speciation name="child">
			<chromosome id="main" circular="true" length="30"/>
			<inversion left="0" right="10" chromosome="main"/>
			<inversion left="8" right="22" chromosome="main"/>
		</speciation>
	</organism>
</genomemutations>

The first line specifies that the file is an XML format file. The second line contains a genomemutations tag, indicating that the next portion of the file will contain a listing of zero or more genome mutations. The third line defines an organism, with the name of "Organism." The fourth line defines a circular chromosome of 30 nucleotides in length. The fifth and sixth lines indicate the speciation of a child organism, and that the child has a chromosome that is also circular and 30 nucleotides long. The following two lines specify that inversion events take place with particular left-end and right-end nucleotide breakpoints. The remaining lines close the previously opened speciation, organism, and genomemutations XML elements.
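To make the structure described above concrete, here is a minimal Python sketch that reads the same example and lists the mutation events on each branch. It is not part of Seevolution. It assumes that every child of a speciation tag other than chromosome and speciation is a mutation event, which holds for the examples in this guide but is not a documented rule.

```python
import xml.etree.ElementTree as ET

# The simple example from this guide, embedded so the sketch is self-contained.
MUTATIONS_XML = """<?xml version="1.0" encoding="ISO-8859-1"?>
<genomemutations id="1">
	<organism id="Organism">
		<chromosome id="main" circular="true" length="30"/>
		<speciation name="child">
			<chromosome id="main" circular="true" length="30"/>
			<inversion left="0" right="10" chromosome="main"/>
			<inversion left="8" right="22" chromosome="main"/>
		</speciation>
	</organism>
</genomemutations>
"""


def list_mutations(xml_text):
    """Return (branch name, mutation type, attributes) for every mutation."""
    root = ET.fromstring(xml_text.encode("ISO-8859-1"))
    events = []
    for speciation in root.iter("speciation"):
        for child in speciation:
            # Assumption: chromosome tags describe state and nested speciation
            # tags describe subtrees; anything else is a mutation event.
            if child.tag not in ("chromosome", "speciation"):
                events.append((speciation.get("name"), child.tag, dict(child.attrib)))
    return events


for branch, kind, attrs in list_mutations(MUTATIONS_XML):
    print(branch, kind, attrs)
```

Running the sketch prints one line per inversion, in the order they appear in the file.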

One way to create data to view in Seevolution is by creating the appropriate mutations XML in a text editor and saving it to a file with the .mutations file name extension. Of course, such an approach would be tedious or impossible for large and complex mutation histories, so further below we will discuss automated methods to infer mutation histories and create mutation XML files.
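As an alternative to typing the XML by hand, a short script can generate it. The following Python sketch rebuilds the simple example above with the standard library. The function name build_simple_history and the output file name simple.mutations are made up for illustration, and the sketch only covers a single circular chromosome with inversions on one branch.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def build_simple_history(length, inversions, child_name="child"):
    """Build a one-branch mutation history for a circular chromosome 'main'."""
    root = ET.Element("genomemutations", id="1")
    organism = ET.SubElement(root, "organism", id="Organism")
    ET.SubElement(organism, "chromosome", id="main", circular="true", length=str(length))
    speciation = ET.SubElement(organism, "speciation", name=child_name)
    ET.SubElement(speciation, "chromosome", id="main", circular="true", length=str(length))
    for left, right in inversions:
        ET.SubElement(speciation, "inversion", left=str(left), right=str(right),
                      chromosome="main")
    return ET.ElementTree(root)


# Write the file to the temporary directory; any writable location works.
path = os.path.join(tempfile.gettempdir(), "simple.mutations")
tree = build_simple_history(30, [(0, 10), (8, 22)])
tree.write(path, encoding="ISO-8859-1", xml_declaration=True)
print("wrote", path)
```

The resulting file has no indentation, which is still valid XML, and can be opened with the "Open a mutations file" button.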

Phylogenetic trees in Seevolution

The simple example above demonstrated Seevolution with only a single ancestor and single child node. Seevolution can also display phylogenetic relationships among multiple organisms and can thus be a fun and intuitive way to explore phylogenetic histories. Here we illustrate a simplistic phylogeny of three organisms. Click here to load Seevolution with the phylogeny example.

A simple phylogeny displayed in Seevolution
In this example, there are three modern organisms named "George Bush," "Chimpanzee," and "Human." The ancestor of the Chimpanzee and the Human is the "Proto-chimp." These mythical organisms each have one large circular chromosome and one smaller linear chromosome (drawn inside the circular chromosome). As above, the chromosome colors delineate chromosomal segments that are internally free from rearrangement.

The yellow dot on the tree indicates the node where animation will begin when the user clicks the "play" button. The green dot indicates the target node. As shown above, the source is currently set to the Chimpanzee, and the target is set to the Human. If instead we want to visualize the mutations separating the Proto-chimp from George Bush, we can do so by clicking on the Proto-chimp node and right-clicking on the George Bush node. Clicking on a node sets it as the start node, and right-clicking on a node sets it as the end node for animation. Note to Mac users: hold down the Control key when clicking to emulate a right-click. Setting the start and end node is shown in the video below.

Now let's discover what mutations occurred in the George Bush lineage:

The first mutation involves a large inversion covering about half of the main chromosome. The second mutation is a transposition, again in the main chromosome. The third mutation is an inversion in the "aux" chromosome. The final mutation is a deletion of 5 nucleotides from the main chromosome in the George Bush lineage.

The mutations XML code for this phylogenetic tree example is somewhat more complex and is shown below:

<?xml version="1.0" encoding="ISO-8859-1"?>
<genomemutations id="1">
	<organism id="Ancestor">
		<speciation name="root">
			<chromosome id="main" circular="true" length="1000"/>
			<chromosome id="aux" circular="false" length="400"/>
			<speciation name="George Bush">
				<chromosome id="main" circular="true" length="1000"/>
				<chromosome id="aux" circular="false" length="395"/>
				<transposition left="329" right="429" insertionsite="200" chromosome="main"/>
				<inversion left="100" right="250" chromosome="aux"/>
				<deletion left="950" right="955" chromosome="main"/>
			</speciation>
			<speciation name="Proto-chimp">
				<chromosome id="main" circular="true" length="1000"/>
				<chromosome id="aux" circular="false" length="400"/>
				<inversion left="222" right="679" chromosome="main"/>
				<speciation name="Chimpanzee">
					<chromosome id="main" circular="true" length="1000"/>
					<chromosome id="aux" circular="false" length="400"/>
					<inversion left="170" right="400" chromosome="main"/>
					<inversion left="804" right="830" chromosome="main"/>
				</speciation>
				<speciation name="Human">
					<chromosome id="main" circular="true" length="1000"/>
					<chromosome id="aux" circular="false" length="400"/>
					<inversion left="10" right="30" chromosome="main"/>
				</speciation>
			</speciation>
		</speciation>
	</organism>
</genomemutations>

Most importantly, the phylogenetic tree topology is encoded by the nesting of speciation tags. Also note that the chromosomes are defined after each speciation event, followed by the mutations occurring on the branch. Seevolution draws branch lengths in proportion to the number of mutations on a given branch.
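The relationship between nesting and topology can be checked with a small script. The Python sketch below walks the nested speciation tags of the phylogeny example and prints the tree in Newick format, using the number of mutations on each branch as the branch length. This is an illustration, not Seevolution's own code. Treating every tag other than chromosome and speciation as a mutation, and replacing spaces in names with underscores, are assumptions made here.

```python
import xml.etree.ElementTree as ET

PHYLOGENY_XML = """<?xml version="1.0" encoding="ISO-8859-1"?>
<genomemutations id="1">
	<organism id="Ancestor">
		<speciation name="root">
			<chromosome id="main" circular="true" length="1000"/>
			<chromosome id="aux" circular="false" length="400"/>
			<speciation name="George Bush">
				<chromosome id="main" circular="true" length="1000"/>
				<chromosome id="aux" circular="false" length="395"/>
				<transposition left="329" right="429" insertionsite="200" chromosome="main"/>
				<inversion left="100" right="250" chromosome="aux"/>
				<deletion left="950" right="955" chromosome="main"/>
			</speciation>
			<speciation name="Proto-chimp">
				<chromosome id="main" circular="true" length="1000"/>
				<chromosome id="aux" circular="false" length="400"/>
				<inversion left="222" right="679" chromosome="main"/>
				<speciation name="Chimpanzee">
					<chromosome id="main" circular="true" length="1000"/>
					<chromosome id="aux" circular="false" length="400"/>
					<inversion left="170" right="400" chromosome="main"/>
					<inversion left="804" right="830" chromosome="main"/>
				</speciation>
				<speciation name="Human">
					<chromosome id="main" circular="true" length="1000"/>
					<chromosome id="aux" circular="false" length="400"/>
					<inversion left="10" right="30" chromosome="main"/>
				</speciation>
			</speciation>
		</speciation>
	</organism>
</genomemutations>
"""

# Assumption: these two tags are structural; all other children are mutations.
NON_MUTATION_TAGS = ("chromosome", "speciation")


def count_mutations(speciation):
    """Count the mutation events listed directly on this branch."""
    return sum(1 for child in speciation if child.tag not in NON_MUTATION_TAGS)


def to_newick(speciation):
    """Render a speciation element and its nested children as Newick text."""
    label = speciation.get("name").replace(" ", "_")
    label = f"{label}:{count_mutations(speciation)}"
    children = speciation.findall("speciation")
    if not children:
        return label
    return "(" + ",".join(to_newick(child) for child in children) + ")" + label


root = ET.fromstring(PHYLOGENY_XML.encode("ISO-8859-1")).find("organism/speciation")
print(to_newick(root) + ";")
# prints (George_Bush:3,(Chimpanzee:2,Human:1)Proto-chimp:1)root:0;
```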

Changing display settings in Seevolution

Seevolution has many customizable settings that control how it displays chromosomes. To investigate, click the preferences button in Seevolution:

Upon loading the preferences panel, you will see four configuration tabs. We'll discuss each one in turn. First up is the "Appearance" tab:
Seevolution appearance preferences

Let's discuss coloring first. Seevolution has three main options for coloring: Solid, Gradient, and Random.

Random Coloring: By default, the random color scheme is selected in Seevolution. In random coloring, each segment of the genome that has not been disrupted by a genome rearrangement event is assigned a unique color. That is, colors change at chromosome rearrangement breakpoints. Every time Seevolution runs, the random colors are chosen differently, so if you don't like a particular random coloring, simply restart the program. Variety is the spice of life!
Gradient Coloring: In the gradient coloring scheme, Seevolution blends a fixed number of colors around the chromosome, spaced uniformly. To use gradient coloring, one must first set the number of colors to at least 2. Then each color should be selected by clicking on the numbered black tiles to open a color chooser. For a two-color gradient, for example, Color 0 and Color 1 must be assigned in the preference panel. One context in which gradient coloring can be useful is investigating the overall movement of chromosome segments relative to a reference point such as the origin of replication. If the origin of chromosome replication is at position 0 in a circular chromosome, and a two-color gradient is applied, the origin will be colored with the most intense value of Color 0, with segments at greater distance from the origin having colors that more closely match Color 1.
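The two-color case described above can be illustrated numerically. The following Python sketch assumes a simple linear blend of RGB values based on the distance from position 0, measured the short way around the circle. Seevolution's actual blending formula is not documented in this guide, so treat the function as an illustration of the idea only.

```python
def gradient_color(position, length, color0, color1):
    """Blend two RGB colors by circular distance from position 0."""
    offset = position % length
    # Distance from the origin, measured the short way around the circle.
    distance = min(offset, length - offset)
    # 0.0 at the origin, 1.0 at the point directly opposite the origin.
    t = distance / (length / 2)
    return tuple(round((1 - t) * a + t * b) for a, b in zip(color0, color1))


YELLOW = (255, 255, 0)  # Color 0
BLUE = (0, 0, 255)      # Color 1

for pos in (0, 250, 500, 750):
    print(pos, gradient_color(pos, 1000, YELLOW, BLUE))
```

In this sketch, position 0 of a 1000-nucleotide circular chromosome is pure yellow, position 500 is pure blue, and positions 250 and 750 receive the same intermediate color.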
An example of gradient coloring with yellow and blue is shown below:

Solid Coloring: Solid coloring is like gradient coloring, except that each contiguous segment will be drawn in a single solid color. In a two-color solid scheme, the gradient is "discretized" into chunks that correspond to the rearrangement-free segments (synteny blocks) of the chromosome. The above dataset, when displayed with a 2-color solid scheme instead of a gradient scheme, looks like:

Also in the Appearance tab is a selection of rendering modes. Seevolution can apply lighting and optionally draw the chromosomes as wireframes. These options are a matter of personal preference, and may incur additional CPU overhead. The "Textures" option can be enabled to display a custom texture map on the chromosome.

Finally, the "Appearance values" group contains a number of settings that affect the shape of chromosomes. The Linear ratio and Circular Ratio control the thickness of the chromosome. Values in the range of 0.01 to 0.2 make reasonable choices. The Number of surfaces setting controls how round chromosomes appear, and the Number of sections controls how circular a circular chromosome will appear. For example, setting the number of sections to four would cause circular chromosomes to appear as a square, five would make a pentagon, six a hexagon, and so on.

Let's move on to the next tab, Animation:
Seevolution animation preferences
This tab allows one to specify colors for different types of mutation events. Clicking on the color for a given mutation type will bring up a color chooser. The user also has the option to rotate the chromosomes and the genome.

The next tab specifies settings for animation timings:
Seevolution timing preferences
The value at top, Time delay between frames, specifies the overall speed at which Seevolution will attempt to animate the chromosomes. With an adequately powerful CPU and video card, a setting of 10ms would yield 100 frames of video per second. To the average human eye, video appears smooth at speeds above 30 to 60 frames per second. The remaining settings dictate the number of frames used to draw each mutation type and hence, how long in real time they take to animate.
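The arithmetic behind these settings is simple enough to sketch in Python. The frame count of 150 below is a made-up value for illustration, not a Seevolution default, and the results are upper bounds because Seevolution only attempts to keep to the requested delay.

```python
def frames_per_second(delay_ms):
    """Best-case frame rate for a given delay between frames."""
    return 1000.0 / delay_ms


def animation_seconds(frames, delay_ms):
    """Best-case real time needed to draw one mutation."""
    return frames * delay_ms / 1000.0


print(frames_per_second(10))       # 100.0 frames per second
print(animation_seconds(150, 10))  # 1.5 seconds for a 150-frame mutation
```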

Finally, we have the Miscellaneous tab which contains Language settings. This tab is self-explanatory and not shown here.

Capturing the Seevolution video

It is possible to record videos made with Seevolution for use in presentations and web sites. For Windows users we recommend a piece of free, open-source software called CamStudio to perform this task. Once CamStudio has been installed, prepare Seevolution for recording, and then use CamStudio to select the region of the screen to record. To do so, open the "Region" menu in CamStudio and select the "Fixed Region..." option. In the window that pops up, click the "Select" button, then click and drag the mouse to draw a box around the Seevolution window for recording. Finally, click the record button in CamStudio or hit F8 (keyboard shortcut).

Users of Mac OS X might try one of these other software options.

Capturing a screenshot of Seevolution

On Macintosh, a screenshot can be captured by pressing the keys Apple-Shift-3 at once. On Windows, screenshots can be captured by pressing Control-Shift-PrintScreen simultaneously. Often the Print Screen key is also the F15 key, and on some laptops it is mapped to a special function key; please consult your hardware documentation. Linux users have many screen capture software options available; please inquire with Professor Google.

Converting output from other programs

Complex chromosome histories can be difficult to construct manually. It is often desirable to infer the evolutionary history using an automated software tool and then convert the output to the mutation XML format used by Seevolution. Because a large number of inference programs exist, we cannot create file format converters for all of them, but we have created a converter for one such program, BADGER/barphlye, which reconstructs inversion phylogenies. The C++ source code for the format conversion program is given below:

makeSeevolutionXml.cpp

To use this program on Linux and Mac OS X, simply download it to a folder, then launch a terminal window and run:

g++ -o makeSeevolutionXml makeSeevolutionXml.cpp

to compile the program. Mac users will need Xcode installed. Now the program can be run with the following arguments:

makeSeevolutionXml <output file> <LCB size filename> <Matrix file> <Outgroup genome ID> <Reference genome ID> 
                   <Ref genome length> <circular|linear> <[permutation file 1] ... [permutation file N]>

Where output file is the desired XML file name, LCB size filename is the file containing LCB lengths generated by the Mauve aligner, Matrix file is the LCB permutation matrix generated by the Mauve aligner, Outgroup genome ID is the numerical index of the outgroup organism in the permutation file (counting from 0), Reference genome ID is the numerical index of the organism to use as a reference for chromosome size and block lengths, Ref genome length is the total length of the reference genome, and circular|linear specifies the type of chromosome. The permutation files are the .prm files created by running BADGER/barphlye.
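Because the argument order matters, it can help to assemble the command in a script. The Python sketch below builds the argument list in the order given above. Every file name, genome ID, and genome length in the example is a placeholder, and the helper function is hypothetical rather than part of the Seevolution distribution.

```python
import subprocess


def make_seevolution_xml_command(output_file, lcb_size_file, matrix_file,
                                 outgroup_id, reference_id, ref_genome_length,
                                 topology, permutation_files,
                                 program="./makeSeevolutionXml"):
    """Return the argument list in the order makeSeevolutionXml expects."""
    if topology not in ("circular", "linear"):
        raise ValueError("topology must be 'circular' or 'linear'")
    return [program, output_file, lcb_size_file, matrix_file,
            str(outgroup_id), str(reference_id), str(ref_genome_length),
            topology, *permutation_files]


# Placeholder inputs; substitute the files produced by Mauve and BADGER/barphlye.
command = make_seevolution_xml_command(
    "history.mutations", "lcb_sizes.txt", "permutation_matrix.txt",
    0, 1, 4600000, "circular", ["run1.prm", "run2.prm"])
print(" ".join(command))

# Uncomment once the program is compiled and the input files exist:
# subprocess.run(command, check=True)
```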

Windows users will need to obtain either Cygwin or the free Visual C++ Express Edition compiler to build this program.

Please see the BADGER/barphlye web site for instructions on how to reconstruct chromosome inversion phylogenies and generate the files needed for input to the makeSeevolutionXml program.

Converting ClonalFrame output to Mutation XML

The ClonalFrame software provides a method to infer the history of nucleotide substitutions and gene conversion (recombination) events among a group of closely related bacteria. A Perl script is available to convert output from ClonalFrame to the Mutation XML format. The Perl script requires the ClonalFrame posterior output file and the source alignment XMFA. From these two files, the script extracts the inferred consensus tree topology and a set of recombination events that occurred on each branch of the tree. By default, only recombination events longer than 500 nt with at least 0.5 Bayesian posterior support over their entire length are displayed.
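The default filter described above can be illustrated with a short Python sketch. It assumes the support values for one branch are available as a list with one posterior probability per alignment position. That representation is an assumption made for this example and does not describe how the script stores or reads ClonalFrame output.

```python
def recombination_events(posteriors, min_support=0.5, min_length=500):
    """Return (start, end) half-open intervals of well-supported runs.

    A run is kept only if every position in it has at least min_support
    and the run is longer than min_length positions.
    """
    events = []
    start = None
    for i, p in enumerate(posteriors):
        if p >= min_support:
            if start is None:
                start = i  # a new supported run begins here
        elif start is not None:
            if i - start > min_length:
                events.append((start, i))
            start = None
    # Handle a run that extends to the end of the alignment.
    if start is not None and len(posteriors) - start > min_length:
        events.append((start, len(posteriors)))
    return events


# Made-up support values: a 600 nt supported run and a 300 nt supported run.
support = [0.1] * 100 + [0.9] * 600 + [0.2] * 50 + [0.8] * 300 + [0.0] * 10
print(recombination_events(support))  # [(100, 700)]
```

Only the 600 nt run passes both thresholds; the 300 nt run has enough support but is too short.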

Download clonalframe2seevolution

To use the script, first make it executable:

chmod 755 clonalframe2seevolution

then run it as:

clonalframe2seevolution <ClonalFrame posterior file> <XMFA alignment file> <Seevolution XML output file>

substituting your own values for the portions in <>.

An example Seevolution file for the gene conversions (recombinations) inferred by ClonalFrame on a group of six Staphylococcus aureus genomes is available on the example input page.

Seevolution Mutation XML tag reference

Seevolution 1.0 currently supports the following mutation XML tags. User-specified numeric variables are given by ###, alphanumeric variables by @@@. DNA sequence of arbitrary length is indicated by "DNA" and a single nucleotide is just "D".

Speciation tags encode a tree structure through their nesting. Since Seevolution assumes all trees are rooted trees, a rooted tree represented in Newick parenthesis format by (A,B) can be expressed in Seevolution as:

<speciation name="root">
	<speciation name="A">
	</speciation>
	<speciation name="B">
	</speciation>
</speciation>
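The same conversion can be automated. The Python sketch below turns a rooted Newick string into nested speciation tags. It handles plain labels only (no branch lengths, quoted labels, or comments), gives an unlabeled root the name "root", and leaves other unlabeled internal nodes with an empty name. The function name is hypothetical, and chromosome and mutation tags still have to be added to each branch afterwards.

```python
import xml.etree.ElementTree as ET


def newick_to_speciation(newick, root_name="root"):
    """Convert a label-only rooted Newick string to nested speciation tags."""
    text = newick.strip().rstrip(";")
    pos = 0

    def read_node():
        nonlocal pos
        node = ET.Element("speciation")
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            node.append(read_node())
            while pos < len(text) and text[pos] == ",":
                pos += 1  # consume ","
                node.append(read_node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("unbalanced parentheses in Newick string")
            pos += 1  # consume ")"
        # Whatever follows, up to the next delimiter, is this node's label.
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        node.set("name", text[start:pos].strip())
        return node

    root = read_node()
    if not root.get("name"):
        root.set("name", root_name)
    return root


root = newick_to_speciation("(A,B);")
print(ET.tostring(root, encoding="unicode"))
```

The printed XML has the same nesting as the example above, written on one line with the empty A and B elements in self-closing form.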

The examples in the User's Guide above demonstrate how chromosomes are defined in Seevolution XML and also how mutations on each tree branch can be expressed.

Example Mutation XML files

We have provided a set of example files online.

All material on this page Copyright 2008 Aaron Darling.
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.