FastTree 1: Compiling and testing

This is how I downloaded, compiled and got FastTree working. Its a bit obvious in places but I think detailed instructions are a good thing to have out there and Google findable. I am using a multicore MacPro 2.8GHz with 4GB RAM and OSX 10.5.4 (I’m not sure the 8 cores make any difference whatsoever if the code isn’t written to take account of them).

  • I downloaded FastTree from www.microbesonline.org/fasttree.
  • I had to use Safari to do this as Firefox wouldn’t let me right click and download.
  • FastTree 1.0.0 is available as binaries for Windows and Linux. Unfortunately there are no Mac binaries
  • I downloaded the C code file (156kb)
  • I installed the developer tools “XcodeTools.mpkg” from system software install DVD number 2. It took about 15 minutes. This allowed me to use the gcc compiler to actually make the application.
  • I opened terminal, moved to the location of the c file and issued the command: gcc -lm -O2 -Wall -o FastTree FastTree.c
  • It didn’t like that much and gave the error: FastTree.c:212:20: error: malloc.h: No such file or directory

After some Google searching I came across a couple of indications that malloc.h (I had no idea what this was) was outdated.

malloc.h not supported, use stdlib.h (http://developer.apple.com/technotes/tn2002/tn2071.html)

Most helpful was this:

Mac OS X for unix geeks
http://unix.compufutura.com/mac/ch05_01.htm
5.1.2. malloc.h

make may fail in compiling some types of Unix software if it cannot find malloc.h. Software designed for older Unix systems may expect to find this header file in /usr/include; however, malloc.h is not present in this directory. The set of malloc( ) function prototypes is actually found in stdlib.h. For portability, your programs should include stdlib.h instead of malloc.h. (This is the norm; systems that require you to use malloc.h are the rare exception these days.) GNU autoconf will detect systems that require malloc.h and define the HAVE_MALLOC_H macro. If you do not use GNU autoconf, you will need to detect this case on your own and set the macro accordingly. You can handle such cases with this code:

#include
#ifdef HAVE_MALLOC_H
#include
#endif

  • So I opened the C file in a text editor and searched for malloc.h, I found it on line 212. I then deleted that line (#include {malloc.h}) and inserted the four lines from above.
  • I repeated the gcc command to build the application from above and it worked. No errors and it produced the application in 1 second.
  • I tried to launch it and display the help file using the terminal command: ./FastTree -h
  • It didn’t work at all, so for no logical reason I just dumped the application and rebuilt it with the same gcc command. This time ./FastTree -h did launch the application and it displayed the help file. Success!
  • Using the simple instructions from the FastTree page I tried to run it on some lizard DNA sequences I had lying around. These were fasta sequences, although it claims to work on phylip also. The command was: ./FastTree -nt lizards.fasta > lizards.tre
  • It produced a tree with only the first taxon name and nothing else. The input file had mac line endings though and when I corrected that to unix it was fine.
  • I noticed that some of the names were truncated and wondered if they had been chopped at spaces. I replaced with underscores and got better results. [In the spirit of full disclosure these last two points were with the previous version of FastTree and I didn’t try to replicate these errors with version 1.0.0]
  • Another problem I had with the previous version was when I accidentally had a repeated taxon in the matrix. It complained about a “non unique name X in the alignment” and wrote an empty treefile.
  • Having got around these teething problems it ran perfectly. Almost instantly writing a treefile (input fasta N=112, 923bp). The tree looked quite sensible and it had support values (local bootstraps). These are on a 0 to 1 scale, so 0.95 is 95%

Fairly straight forward (except for the malloc error) on the whole. Next I’m going to report on some bigger runs and start timing them properly. My immediate goal is to get a tree of all ~20k metazoan 18s rDNA sequences. Of course a tree of that size will bring its own problems, how to visualize it.

3 Comments Add yours

  1. Mike says:

    Hi Dave,Thanks for these instructions, I used them a while ago to compile FastTree on my old Leopard (pre unibody) Macbook Pro and ran into the malloc problem you describe. I've just reinstalled on my new, shiny Snow Leopard Macbook Pro and I thought I'd let you know that the 2.1.3 version of FastTree on this platform compiles fine without editing.I'm not terribly savvy with compiling from source code, and I noticed that both your method of compiling and that on the FastTree site worked, but there are differences. They use the option -O3 instead of -O2 and include -finline-functions and -funroll-loops. Do you know what effect this has?ThanksMike Cox

    Like

  2. Dave Lunt says:

    Hi Mike, thanks for the comment. No FastTree2 doesn't have the malloc issue, glad to hear it compiles fine for you. Regarding compilation options, my guess is that either would be fine, I'm no expert though so I would go with their suggestions regarding these options not mine. What a great program FastTree is, I'm still really pleased with it! I keep meaning to write a more modern post on FastTree2.

    Like

  3. Mike says:

    Thanks for the reply, I completely forgot to respond! I now have a nice small tree that belies the amount of data in it :)Mike

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s