The New York Times has an article about constructing, and especially visualizing, the tree of life, called “Crunching the Data for the Tree of Life”. It's interesting, especially since I think it touches on many issues concerning tree size that even phylogenetic biologists haven’t really considered. There is lots of talk of “big” trees, sometimes of only a few thousand OTUs, and of a new tree of plants containing 13,533 species[1]. Carl Zimmer over on the Loom writes that this is the biggest tree he knows of. It might be the biggest published tree I know of too, but Morgan Price on the FastTree site has a 16S rDNA tree to download containing “186,743 distinct sequences”. It's 48MB when compressed. It will be interesting to hear of strategies for visualizing a tree of this size while still maintaining the associated information. The temptation, I’m sure, will be just to make it pretty but ultimately not very useful. ARB can display trees this size (I think), although I still haven’t got to grips with automated collapsing and labelling of groups.
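To make the idea of automated collapsing a bit more concrete, here is a minimal sketch in Python. It is not how ARB or any other tool does it; it is just one naive approach, assuming a plain Newick string with no quoted labels or bracketed comments. The function names (`parse_newick`, `count_leaves`, `collapse`), the `max_leaves` threshold and the `_group_of_N` label format are all made up for illustration, and branch lengths are simply thrown away.

```python
import re

def parse_newick(text):
    """Parse a simple Newick string (no quoted labels, no comments) into nested dicts."""
    root = {"name": "", "children": [], "parent": None}
    node = root
    for tok in re.findall(r"[(),;]|[^(),;]+", text.strip()):
        if tok == "(":
            # Open a clade: descend into a new first child.
            child = {"name": "", "children": [], "parent": node}
            node["children"].append(child)
            node = child
        elif tok == ",":
            # Next sibling: add a new child to the current node's parent.
            sibling = {"name": "", "children": [], "parent": node["parent"]}
            node["parent"]["children"].append(sibling)
            node = sibling
        elif tok == ")":
            # Close the clade: move back up to its node.
            node = node["parent"]
        elif tok == ";":
            break
        else:
            # A label, possibly followed by ":branch_length" (length is dropped).
            node["name"] = tok.split(":")[0].strip()
    return root

def count_leaves(node):
    """Store the number of leaves under each node in node["n"] and return it."""
    if not node["children"]:
        node["n"] = 1
    else:
        node["n"] = sum(count_leaves(c) for c in node["children"])
    return node["n"]

def first_leaf(node):
    """Name of the first leaf under a node, used as a fallback group label."""
    while node["children"]:
        node = node["children"][0]
    return node["name"]

def collapse(node, max_leaves):
    """Return Newick text with every clade of at most max_leaves leaves collapsed."""
    if not node["children"]:
        return node["name"]
    if node["n"] <= max_leaves:
        label = node["name"] or first_leaf(node)
        return f"{label}_group_of_{node['n']}"
    inner = ",".join(collapse(c, max_leaves) for c in node["children"])
    return "(" + inner + ")" + node["name"]

if __name__ == "__main__":
    tree = parse_newick("((A:0.1,B:0.2),(C,D,E));")
    count_leaves(tree)
    print(collapse(tree, 3) + ";")  # prints (A_group_of_2,C_group_of_3);
```

The recursion in `count_leaves` and `collapse` is fine for a toy tree, but a deep, ladder-like tree with hundreds of thousands of leaves could hit Python's recursion limit, so a real version would probably need to walk the tree iteratively.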
The Smith paper looks really interesting, but I’ve only had a chance to skim it so far.
—
[1] Stephen A Smith, Jeremy M Beaulieu and Michael J Donoghue
Mega-phylogeny approach for comparative biology: an alternative to supertree and supermatrix approaches
BMC Evolutionary Biology 2009, 9:37 doi:10.1186/1471-2148-9-37
I agree, visualization of large trees is very hard, especially when information content and not beauty is the goal. For scanning trees to spot mistakes, I have found Dendroscope to be the most useful tool. It can handle very large trees, but I haven’t tried associating data with it; I find that is best done at the command line. I would be interested in what you think of my paper.