CACTUS 1.13:

Comparative analysis of continuous traits using statistics

Copyright Dylan W. Schwilk, Prickly Software 1999 - 2001

Available from: http://www.pricklysoft.org/.

Permission is granted to use or distribute but not to sell.

Comments and bug reports to "schwilk" at pricklysoft.org.

When referring to CACTUS please cite:

Schwilk D. W., Ackerly D. D. 2001. Flammability and serotiny as strategies: correlated evolution in pines. - Oikos 94: 326-336.

Documentation as of 11/29/01. Information here supersedes that in the help files available through the CACTUS application. I plan to discontinue the Windows help file in CACTUS in favor of this easier-to-maintain HTML document. Please contact D. W. Schwilk if you have any questions that are not yet answered here.

Contents

Overview

Although recent advances in comparative biology have made it possible to conduct explicit tests of correlated evolution, these tests are often computationally intensive and require specific software. CACTUS is a computer program for the automation of comparative methods. It was designed with the following goals in mind:

  1. Efficient implementation of parsimony statistics and independent contrasts.
  2. Simple user interface.
  3. Compatibility with existing phylogenetic software through the use of the NEXUS file format.
  4. Efficiency in analyzing data over alternative phylogenies.

CACTUS calculates descriptive statistics and correlation coefficients based on independent contrasts for continuous characters measured on a set of taxa, given one or more phylogenetic trees describing the relationships among these taxa. Similar programs are available from Emilia Martins (Univ. of Oregon), Andrew Purvis (Oxford University), Ted Garland (University of Wisconsin), and David Ackerly (Stanford University), among others (see links to many programs at the Berkeley phylogenetic software page).

Platforms

Currently, CACTUS runs only on Windows computers (Windows 9x, NT, and 2000). The back end, including NEXUS file writing and reading, is written in ANSI C++, but porting the program to another platform would require quite a lot of front-end work. I do not plan to add many more features or analyses to the main CACTUS program; instead, I am writing small stand-alone console modules for new analyses. These require the user to have CACTUS to make NEXUS files, prune trees, and select trees and characters, but they run as simple command-line programs. This keeps CACTUS itself from becoming too bloated and avoids adding much user interface to the stand-alone modules, since CACTUS will act as the main method of manipulating files and the stand-alone modules will require NEXUS files with CACTUS blocks.

Release History

CACTUS evolved from Dr. David Ackerly's PASCAL program ACAP (Another Comparative Analysis Program) and is indebted to his work.

Date                  Action
December 1998         First test release: CACTUS 0.8
May 25, 1999          2nd test release: CACTUS 0.9
October 30, 1999      Released v.1.0
May 17, 2000          Released v.1.1
December 15, 2000     Released v.1.11
March 1, 2001         Released v.1.12
September 15, 2001    Released v.1.13
November 5, 2001      I was alerted to a huge set of bugs that had emerged from a library conflict created in the last (Sept 15th) build. I took CACTUS off the web and found and corrected the errors. If you have a copy of CACTUS 1.13 please delete it and download a new one. The corrected build is build 4 or higher.
November 27, 2001     Further testing uncovered a few more ramifications of the library conflict problem. Current build is build 9 -- Results are testing correctly against other software (ACAP) now.




Revision Notes

Bug fixes in version 1.13:

  1. Fixed problem with CACTUS being unable to read negative exponents in branch lengths in tree descriptions.
  2. Changed resampling randomizations to conform better to standards.

Changes in version 1.12:

  1. Added the ability to set the file output numerical precision under the Files..Properties dialog, and the ability to set the matrix view numerical precision through the Tools..Options dialog.
  2. The file output precision setting is now stored in the CACTUS PROPERTIES statement.

Changes in version 1.11:

  1. This is primarily a rebuild of v1.1 using SGI's iostream library rather than Microsoft's. The rebuild also fixed a few numerical problems that occurred only under WIN95.
  2. The Tools..Options dialog now provides a way to change the precision in which floating point numbers are written to the output files.

Changes in version 1.1:

  1. Saving contrasts to a text file no longer saves duplicate contrasts with the sign reversed in the second entry. This was intended behavior and a carryover from ACAP, but not well documented and probably unexpected for most users.
  2. When the file property "overwrite CONTINUOUS block" is enabled (File...Properties), CACTUS will now save a continuous block even if one did not previously exist in the file and will place it directly after the first DATA block.
  3. The Data...Sort Matrix command was added. This allows the user to sort the data matrix by any column. This does not affect how the matrix is written upon saving, since saving is always done in Taxon Id order to eliminate problems with programs that don't fully implement the NEXUS standard. Note that this means that switching from Data View to Tree View and then back to Data View will re-sort the matrix by ascending Taxon Id.
  4. An additional CACTUS block property, WRITE_SETS, has been added to the properties statement. This allows set writing to be turned on and off since it turns out that MacClade will balk at unrecognized sets. To make NEXUS files most compatible with MacClade, this option should be turned off via the File..Properties dialog. NEXUS files created with previous CACTUS versions should have this statement added (example: WRITE_SETS=TRUE) in the properties statement so that properties are properly interpreted. See the revised example.nx.

Bug fixes in v.1.1:

  1. The application will no longer crash if no tree is selected during the Draw Tree Dialog.
  2. Number of trees appended to document via the Trees..Import Trees command is now reported correctly in the log file.


Getting started

After downloading, unzip the file (something like cact0113.zip) to a temporary directory, then run setup.exe and follow the instructions on the screen.

Characters and trees

CACTUS works on sets of trees and sets of characters, and saves these sets in the SETS block of the NEXUS file. Any operation you perform in CACTUS uses only those characters and trees that you have selected through Trees...Select Trees or Characters...Select Characters. You can change the names of the set objects in which the tree and character sets are stored through the Options dialog.

The log window

A new session log is created every time you start CACTUS. This window records the actions of a CACTUS session. You can save the contents of this window to a text file through File...Log or print through File...Print.

The Tree View window

This view of the NEXUS document shows a graphical representation of the trees in the TREE block. The Draw Tree dialog allows the user to select which currently loaded tree is displayed in the tree view window. The tree view window displays one tree at a time and fits the tree to the window. The tree view shows correctly proportioned branch lengths and can be useful for seeing the effects of pruning on branch lengths.

The Data View Window

This view of the NEXUS document shows the CACTUS character data matrix and allows simple editing of character values and the renaming of taxa and characters. The Data...Sort Matrix command allows the matrix to be sorted according to the data in any column.

Data output

CACTUS writes results to tab-delimited text files, which can be easily pasted into a spreadsheet program. Both the descriptive statistics and independent contrasts dialog boxes offer a "summarize over trees" option, which summarizes descriptive or correlation statistics over all selected trees rather than reporting values for each tree.

Creating a NEXUS file

You will likely use a program such as MacClade or PAUP to create a NEXUS format file with one or more trees. CACTUS can read such a file even if it lacks any character data, and you can use Tree menu commands to manipulate trees, but to do any analysis you will need data on characters. There are three easy ways to get character data into your CACTUS-readable NEXUS file. The first is to create a CONTINUOUS block in the MacClade continuous character data editor -- CACTUS will import from the CONTINUOUS block automatically if there is no CACTUS block in the file. The second is to edit the character data manually using CACTUS's Data View window. The third is to save a spreadsheet of the character data in a simple text format and then import this data through the Characters..Import character matrix command. See "Importing a Character Matrix" below.



Nexus file format

CACTUS uses the open and extensible NEXUS format that is used by several computer programs, including MacClade 3 and PAUP. CACTUS stores some of its data in a CACTUS block, which it creates upon saving a NEXUS file from within CACTUS. Comparative methods require two main types of data: taxon character data (stored in a continuous matrix nested within the CACTUS block) and phylogenetic information contained in the TREE block of a NEXUS file.

Open and save NEXUS (*.nx) files through the File menu.  Any NEXUS file you open should contain a TREE block and either a CONTINUOUS block or a CACTUS block. If the file has no CACTUS block yet (you've opened it in CACTUS for the first time) then the program will look for character data in a CONTINUOUS block.
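The requirement above can be checked before opening a file. The following Python sketch is not part of CACTUS; it simply scans NEXUS text for "BEGIN name;" statements. The function names are hypothetical, nested comments are not handled, and because the NEXUS standard names the tree block TREES while this manual writes TREE, the check accepts either spelling.

```python
import re

def nexus_block_names(text):
    """Return the upper-cased names of all 'BEGIN <name>;' blocks in NEXUS text."""
    # Strip [bracketed comments] first (nested comments are not handled here).
    text = re.sub(r"\[[^\]]*\]", "", text)
    found = re.findall(r"\bBEGIN\s+(\w+)\s*;", text, flags=re.IGNORECASE)
    return [name.upper() for name in found]

def has_trees_and_data(text):
    """True if a tree block and a CONTINUOUS or CACTUS block are both present."""
    names = set(nexus_block_names(text))
    has_trees = bool(names & {"TREES", "TREE"})
    has_data = bool(names & {"CONTINUOUS", "CACTUS"})
    return has_trees and has_data

# Illustrative file contents only; block bodies are omitted.
sample = """#NEXUS
BEGIN TAXA; [contents omitted] END;
BEGIN CONTINUOUS; [contents omitted] END;
BEGIN TREES; TREE t1 = ((A,B),C); END;
"""
print(nexus_block_names(sample))
print(has_trees_and_data(sample))
```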

Under the File..Properties menu item, you can choose several NEXUS file format options.

  • Write branch lengths: enabling this option results in branch lengths being written to the tree descriptions according to the Newick tree standard.
  • Overwrite continuous block: when this option is enabled, CACTUS will replace the CONTINUOUS block in the NEXUS file with a CONTINUOUS block containing data for the currently selected characters.
  • Write sets: this option determines whether CACTUS will write the selected tree and character sets and the prune character set to the NEXUS file.
  • Data matrix missing values indicated by... : this option determines the character or token to use as a placeholder for missing data when writing the character matrix to file.
  • Output numerical precision: here you can change the number of digits written for floating point numbers in output files.
  • Set names: here you can rename the sets CACTUS saves to the SETS block.

Importing a character matrix

If you wish to import character data into the currently open NEXUS file, use the Characters…Import Character Matrix command. This reads a simple spreadsheet-style text file of taxon character data. The format is as follows: the first line should contain the character names separated by white space. Each subsequent line must start with a taxon name, followed by the character values for that taxon separated by white space. Missing values should be indicated by a question mark ('?'). You can easily create such a file by saving a spreadsheet as tab-delimited text in a program such as Microsoft Excel, but remove any header over the taxon name column, as this will be interpreted as a character name, and be sure to replace missing data with question marks.

This command replaces the current character data with the newly imported data.
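To check a file before importing it, the format described above can be read with a few lines of Python. This is only a sketch of the format as documented here, not CACTUS's own reader; the function name, taxon names, and character names are made up, and it assumes that names contain no white space.

```python
def parse_character_matrix(text):
    """Parse the whitespace-delimited import format described above.

    Returns (character_names, {taxon: [values]}), with '?' mapped to None.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    names = lines[0].split()          # first line: character names only
    data = {}
    for ln in lines[1:]:
        taxon, *fields = ln.split()   # taxon name, then one value per character
        if len(fields) != len(names):
            raise ValueError(
                f"{taxon}: expected {len(names)} values, got {len(fields)}")
        data[taxon] = [None if f == "?" else float(f) for f in fields]
    return names, data

# Hypothetical two-character matrix with one missing value.
sample = "height\tseed_mass\nPinus_a\t12.5\t0.02\nPinus_b\t?\t1.4e-2\n"
names, data = parse_character_matrix(sample)
print(names)
print(data)
```

A row with too few or too many values raises an error, which is a convenient way to catch a stray header left over the taxon name column.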

Loading tree files

If you use a program such as MacClade or PAUP to create trees, you may save NEXUS files which have only tree blocks ('tree files'). You can open these files as regular NEXUS files through the File…Open menu command. If the file has no character data, a default CACTUS block will be created the first time the file is saved.

If you wish to import trees from a tree file to an open NEXUS file, use the Trees…Import Trees command.

Transferring files between Macs and Windows machines

Both Macintosh and Windows machines store ASCII text, but they mark the end of a line differently: Macs use a single carriage return ('\r'), while Windows uses a carriage return followed by a line feed ('\r\n'). When you open a file in a text editor after transferring it from a Macintosh to a Windows computer, you may see only one LONG line. If your file needs no editing by hand, you need do nothing, since CACTUS reads both line-ending conventions equivalently. If viewing the text file is a problem, however, you can 1) set MacClade to save NEXUS files in PC format or 2) paste the text into a word processor (such as MS Word), which usually translates from one format to the other automatically, and then paste the text back into your ASCII text file.
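If neither option is convenient, line endings can also be converted with a short script. The Python sketch below is independent of CACTUS; the function name is hypothetical. It works on raw bytes and handles all three common conventions: a lone carriage return, a lone line feed, and the carriage return plus line feed pair.

```python
def normalize_newlines(data: bytes, newline: bytes = b"\r\n") -> bytes:
    """Convert CR, LF, or CRLF line endings to a single convention."""
    # Collapse CRLF first so the lone-CR step does not convert it twice.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", newline)

# A file whose lines end in lone carriage returns, rewritten for Windows.
cr_text = b"#NEXUS\rBEGIN TREES;\rEND;\r"
print(normalize_newlines(cr_text))
```

To convert a file on disk, read it with open(path, "rb"), pass the bytes through the function, and write the result back with open(path, "wb").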



Manipulating trees

Branch lengths

The Newick tree standard endorsed by the NEXUS standard (Maddison et al. 1997) allows trees to contain branch length information. CACTUS can write trees with branch lengths and this option can be toggled on/off via the Tools...Options dialog.

Through the Trees..Branch lengths dialog, CACTUS will also calculate arbitrary branch lengths from the tree topology according to three different methods: all branch lengths equal, GRAFEN, or MINIMUM EXTENSION. Both the GRAFEN and MINIMUM EXTENSION methods assign a relative height to each node (with tips at zero and the root at 1.00) and then calculate each branch length as the difference between the heights of the two nodes it connects. These methods ensure that the total branch length from the root to any tip is constant. Grafen's (1989) method sets the height of a node above the tips proportional to its number of descendant terminal nodes (taxa) minus one. The minimum extension method is similar, but makes node height proportional to the daughter clade with the most daughter nodes plus one rather than to the total number of terminal taxa in both daughter clades. Use the Draw tree command to see how these branch length methods work.
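As an illustration of the GRAFEN description, the following Python sketch assigns each node a height of (descendant tips - 1) / (total tips - 1) and takes branch lengths as height differences. It follows the wording of this manual only; it is not CACTUS's code, the function names and the nested-tuple tree representation are assumptions, and Grafen's optional power transformation of heights is left out.

```python
def count_tips(node):
    """Number of terminal taxa at or below a node; tips are plain strings."""
    if isinstance(node, str):
        return 1
    return sum(count_tips(child) for child in node)

def grafen_branch_lengths(tree):
    """Map each non-root node to the length of the branch above it."""
    total = count_tips(tree)
    lengths = {}

    def walk(node, parent_height):
        # Tips get height 0, the root gets height 1.
        height = (count_tips(node) - 1) / (total - 1)
        if parent_height is not None:
            lengths[node] = parent_height - height
        if not isinstance(node, str):
            for child in node:
                walk(child, height)

    walk(tree, None)
    return lengths

# Hypothetical four-taxon tree written as nested tuples: (((A,B),C),D)
tree = ((("A", "B"), "C"), "D")
lengths = grafen_branch_lengths(tree)
for node, length in lengths.items():
    print(node, round(length, 3))
```

Summing the lengths along any path from the root to a tip gives 1, which is the constant root-to-tip distance the text describes.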

NOTE: In the absence of any branch length information, the arbitrary branch length method which produces the lowest type I error rates is likely to be equal branch lengths (Purvis et al. 1994).

Pruning trees

CACTUS allows users to prune trees down to taxa of interest (this is done automatically, but temporarily, when CACTUS encounters missing character data during analysis). The Trees...Prune Trees dialog allows the user to select taxa to remove and either save the resulting trees or replace the currently selected trees.
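The effect of pruning on branch lengths can be sketched in Python. This is one plausible reading, not CACTUS's implementation: when a pruned taxon leaves its parent with a single remaining daughter, the parent is removed and its branch length is added to the daughter's branch. The tree representation (pairs of content and branch length) and the function name are assumptions made for the example.

```python
def prune(node, drop):
    """Remove tips named in `drop`; return the pruned node, or None if empty.

    A node is (content, branch_length): a tip carries a taxon name,
    an internal node carries a list of child nodes.
    """
    content, length = node
    if isinstance(content, str):
        return None if content in drop else node
    kept = [p for p in (prune(child, drop) for child in content) if p is not None]
    if not kept:
        return None
    if len(kept) == 1:
        # One surviving daughter: collapse this node and extend her branch.
        child_content, child_length = kept[0]
        return (child_content, child_length + length)
    return (kept, length)

# Hypothetical tree ((A:1,B:1):1,C:2) with a zero-length root branch.
tree = ([([("A", 1.0), ("B", 1.0)], 1.0), ("C", 2.0)], 0.0)
print(prune(tree, {"B"}))
```

After B is removed, A hangs directly from the root on a branch of length 2, so its distance from the root is unchanged.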

Condensing trees

To eliminate duplicate trees, use the Trees...Condense Trees dialog. This function is analogous to PAUP's Condense Trees procedure, but has the advantage of not adding previously pruned taxa to every tree as a basal polytomy. The Condense Trees function is very useful if you have pruned a large set of trees down to the taxa of interest and now wish to eliminate topologically equivalent trees. Note that this procedure does not compare branch lengths when comparing tree topologies. The user can choose to have duplicate trees removed or to save the resulting set of distinct trees to a separate tree file.
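One simple way to detect topologically equivalent trees is to reduce each tree to a form that does not depend on the order in which daughters are listed, then compare those forms. The Python sketch below does this for rooted trees written as nested tuples without branch lengths. It illustrates the idea only; how CACTUS compares topologies internally is not documented here, and the function names are hypothetical.

```python
def canonical(node):
    """Order-independent form of a rooted topology.

    Tips stay strings; each clade becomes a tuple of its daughters in sorted order.
    """
    if isinstance(node, str):
        return node
    return tuple(sorted((canonical(child) for child in node), key=repr))

def condense(trees):
    """Keep the first tree of each distinct topology, preserving input order."""
    seen, distinct = set(), []
    for tree in trees:
        key = canonical(tree)
        if key not in seen:
            seen.add(key)
            distinct.append(tree)
    return distinct

# The first two trees differ only in the order daughters are written.
trees = [(("A", "B"), "C"), ("C", ("B", "A")), (("A", "C"), "B")]
print(condense(trees))
```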



Analyses

Descriptive statistics

CACTUS will calculate the Quantitative conVergence Index or QVI (Ackerly and Donoghue 1998). By choosing the Do significance tests option in the descriptive statistics dialog, you can test whether the calculated QVI is significantly different from what it would be under a null model of no relationship between character values and the phylogeny, that is, if character values in terminal taxa were shuffled randomly across the tips of the phylogeny. This randomization method tests for levels of homoplasy greater than or less than that expected by chance, as in Ackerly and Reich (1999).

Output files

The non-summary output file for QVI analyses has the following columns: TreeID, TreeName, Character, QVI, ExpQVI and PVal. The first three are self-explanatory. QVI is the calculated Quantitative Convergence Index for that tree and character, ExpQVI is the mean QVI calculated for that tree with character values randomized, and PVal is the proportion of random QVI values less than or equal to the observed value. If this proportion is greater than 0.5, however, 1-PVal is reported instead. Note that this is a one-tailed test; if you require two-tailed values, the reported values should be doubled. Strictly, the test is exactly appropriate only for the one-tailed hypothesis that the observed QVI is less than that expected by chance.
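The way ExpQVI and PVal are derived can be sketched in Python as a generic tip-shuffling test. The QVI formula itself is not given in this manual, so the example uses a deliberately simple stand-in statistic (total change between neighbouring tips) that is NOT the QVI; the function names, the number of randomizations, and the data are likewise assumptions.

```python
import random

def randomization_p(observed, values, statistic, n_perm=999, seed=0):
    """Return (mean randomized statistic, reported p-value).

    The p-value is the proportion of randomized statistics less than or equal
    to the observed one, replaced by 1 - p when it exceeds 0.5.
    """
    rng = random.Random(seed)
    shuffled = list(values)
    random_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)                 # reassign values to tips at random
        random_stats.append(statistic(shuffled))
    p = sum(s <= observed for s in random_stats) / n_perm
    expected = sum(random_stats) / n_perm
    return expected, (1 - p if p > 0.5 else p)

def neighbour_change(tip_values):
    """Stand-in statistic (not the QVI): summed change between adjacent tips."""
    return sum(abs(a - b) for a, b in zip(tip_values, tip_values[1:]))

# Made-up tip values in which similar values sit next to each other.
tips = [1.0, 1.2, 1.1, 5.0, 5.3, 5.1]
observed = neighbour_change(tips)
expected, p = randomization_p(observed, tips, neighbour_change)
print(observed, expected, p)
```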

The summary QVI output file will report the mean, median, min and max QVI values and P-values over all selected trees. Additionally, the output file can report for how many trees the p-value was below a set value. Note again that this is only exactly appropriate for the one-tailed test for an observed QVI less than expected by chance.
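The same kind of summary can be reproduced from a per-tree output file after loading it into a spreadsheet or script. The Python sketch below summarizes one column at a time and counts the trees whose p-value falls below a chosen threshold; the numbers, the function name, and the 0.05 threshold are illustrative only.

```python
from statistics import mean, median

def summarize(values, p_values, alpha=0.05):
    """Summarize per-tree values and count trees with a p-value below alpha."""
    return {
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
        "n_below_alpha": sum(p < alpha for p in p_values),
    }

# Made-up QVI and PVal columns for four trees.
qvi = [0.40, 0.50, 0.60, 0.50]
pvals = [0.01, 0.20, 0.04, 0.30]
summary = summarize(qvi, pvals)
print(summary)
```

Calling the function a second time with the p-values as the first argument gives the corresponding summary of the PVal column.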

Independent contrasts and coefficients of correspondence

CACTUS will calculate independent contrasts and the coefficient of correspondence to test for patterns of correlated evolutionary change among particular pairs of traits. To meet the assumptions of parametric statistics, independent contrasts are standardized by dividing them by the standard deviation of the expected amount of change along each branch. The user can choose to have CACTUS either use the current branch lengths for each tree (extending lengths appropriately when taxa must be pruned due to missing character data) or recalculate branch lengths according to one of three methods after any pruning occurs. Even if no taxa are pruned from the tree prior to analysis, choosing to recalculate branch lengths will change branch lengths just prior to calculation of independent contrasts. These branch length calculations during independent contrast analyses do not affect the permanent trees stored in the NEXUS file, unlike branch lengths changed through the Trees...Branch lengths dialog.
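For readers who want to see the standardization spelled out, here is a minimal Python sketch of standardized contrasts following Felsenstein (1985) for a fully bifurcating tree. It is not CACTUS's code: the tree representation (pairs of content and branch length), the function name, and the data are assumptions, and polytomies and missing data are not handled.

```python
import math

def contrasts(node, values, out):
    """Compute standardized contrasts, appending one per internal node to `out`.

    `node` is (taxon_name, length) for a tip or ([left, right], length) for an
    internal node. Returns the node's estimated value and adjusted branch length.
    """
    content, length = node
    if isinstance(content, str):
        return values[content], length
    (x1, v1), (x2, v2) = (contrasts(child, values, out) for child in content)
    # Divide the raw difference by the square root of the summed branch
    # lengths, i.e. the standard deviation of the expected change.
    out.append((x1 - x2) / math.sqrt(v1 + v2))
    # Ancestral value weighted by inverse branch lengths; the branch above the
    # node is lengthened to account for uncertainty in that estimate.
    x = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    return x, length + v1 * v2 / (v1 + v2)

# Hypothetical tree ((A:1,B:1):1,C:2) and made-up trait values.
tree = ([([("A", 1.0), ("B", 1.0)], 1.0), ("C", 2.0)], 0.0)
trait = {"A": 1.0, "B": 3.0, "C": 6.0}
result = []
contrasts(tree, trait, result)
print(result)
```

A tree with n tips yields n - 1 contrasts. The sign of each contrast depends only on which daughter is subtracted from which, so when two traits are compared the same direction must be used for both.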

The output results show pairwise correlations (coefficients of correspondence, CC) among selected characters. You can also choose to output the standardized contrast values themselves to a text file for further analysis, such as principal components analysis. If you choose this option, the output file will contain a list of contrasts for each character for each tree.

By default, CACTUS tests CC values for significant departure from zero using parametric methods. The Analysis...Independent Contrasts dialog also allows a test via randomization option.



Files included in this package

File name                    Notes
cactus.exe                   Main program executable
cactus.hlp, cactus.cnt       Windows help files for CACTUS -- Note: My current plan is to eliminate Windows help in future releases and keep all information in the HTML manual.
manual.html                  HTML manual
example.nx, char_data.txt    Demo NEXUS file and demo character data file



References

Ackerly, D. D. 1998. Another Comparative Analysis Program. http://www.stanford.edu/~dackerly/ACAP.html.

Ackerly, D. D., and M. J. Donoghue. 1998. Leaf size, sapling allometry, and Corner's Rules: phylogeny and correlated trait evolution in maples (Acer). American Naturalist 152: 767-791.

Ackerly, D. D., and P. B. Reich. 1999. Convergence and correlations among leaf size and function in seed plants: a comparative test using independent contrasts. American Journal of Botany In Press.

Felsenstein, J. 1985. Phylogenies and the comparative method. American Naturalist 125: 1-15.

Felsenstein, J. 1993. PHYLIP: Phylogeny inference package. Department of Genetics, University of Washington, Seattle.

Garland, T., Jr., P. H. Harvey, and A. R. Ives. 1992. Procedures for the analysis of comparative data using phylogenetically independent contrasts. Systematic Biology 41: 18-32.

Grafen, A. 1989. The phylogenetic regression. Philosophical Transactions of the Royal Society of London 326: 119-157.

Harvey, P. H., and M. Pagel. 1991. The comparative method in evolutionary biology. Oxford University Press, Oxford.

Maddison, D. R., D. L. Swofford, and W. P. Maddison. 1997. NEXUS: an extensible file format for systematic information. Systematic Biology 46: 590-621.

Maddison, W. P., and D. R. Maddison. 1992. MacClade: Analysis of phylogeny and character evolution. Sinauer, Sunderland, Massachusetts.

Purvis, A., J. L. Gittleman, et al. 1994. Truth or consequences: effects of phylogenetic accuracy on 2 comparative methods. Journal of Theoretical Biology 167: 293-300.

Schwilk D. W., Ackerly D. D. 2001. Flammability and serotiny as strategies: correlated evolution in pines. - Oikos 94: 326-336.

Swofford, D. 1993. PAUP: Phylogenetic analysis using parsimony. Smithsonian Institution, Washington, D.C.

