Pride before a (data) fall

I’m pretty proud of some parts of my workflow: electronic lab notebook, reproducibility, open data files, (semi-obsessive) automated data backups, etc. But pride often comes before a fall. I had a bad experience this week where I thought I had lost some important phylogenetic data files (I found them eventually), and I’m writing this…

Reproducible research in phylogenetics

I’ve been reading a lot recently about reproducible research (RR) in bioinformatics on several blogs, Google+ and Twitter. The idea is that someone should easily be able to reproduce* your results (and even figures) from your publication using the code and data you provide. I’ve been thinking that this is a movement…

Calculating intron density

I have a project going at the moment to examine changes in intron diversity, size and location in animal genomes. I am always a bit frustrated with the way introns are treated in many genome characterisation papers: “the genome contained Y introns with mean intron size X bp” is usually all we get. This sort of…
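
As a rough illustration of reporting more than a single mean, here is a minimal Python sketch that derives intron sizes from exon coordinates and summarises them. Everything in it is an assumption made for the example, not the method used in the post: the function names, the input layout (a dict of gene name to a list of 1-based inclusive exon `(start, end)` pairs), and the choice of "introns per kb of exon" as the density measure are all hypothetical.

```python
from statistics import mean, median


def intron_lengths(exons):
    """Return intron sizes (bp) for one gene.

    exons: list of (start, end) pairs, 1-based inclusive, non-overlapping.
    This input layout is an assumption made for the example.
    """
    ordered = sorted(exons)
    # An intron is the gap between the end of one exon and the start of the next.
    return [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


def summarise_introns(genes):
    """Summarise intron count, size and density across a set of genes.

    genes: dict mapping a gene name to its list of exon coordinates.
    """
    sizes = [n for exons in genes.values() for n in intron_lengths(exons)]
    exonic_bp = sum(
        end - start + 1 for exons in genes.values() for start, end in exons
    )
    return {
        "introns": len(sizes),
        "mean_bp": mean(sizes) if sizes else None,
        "median_bp": median(sizes) if sizes else None,
        "min_bp": min(sizes) if sizes else None,
        "max_bp": max(sizes) if sizes else None,
        # Hypothetical density measure: introns per kb of exonic sequence.
        "introns_per_kb_exon": 1000 * len(sizes) / exonic_bp if exonic_bp else None,
    }
```

Reporting the median and range next to the mean shows at a glance when a few very long introns are pulling the mean upwards.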

FastTree 3: Timing some runs

I downloaded some datasets from the SILVA96 database. These are structurally aligned SSU rDNA sequences. I browsed through the taxonomic groups and chose annelids (N=1050) and nematodes (N=5048) as smallish tests. I downloaded these as fasta files. I started with the annelids file. The file contains a LOT of gaps, because it comes from an…

FastTree 2.5: Update

The prediction I made before about a long silence once this year’s students turned up was sadly accurate. Anyway, students dealt with, grant proposal submitted, lectures (mostly) given, bureaucracy reduced (a bit), time to get on with some phylogenetics. I was playing before with FastTree. Although it looks to have been quite well tested by…

FastTree 2: Timing runs

To see how quickly FastTree runs for me I need some automated method of timing it. While some programs like phyML return a runtime at the end, FastTree doesn’t seem to. So I searched the web and found bits of Perl code to put a script timer together. I have uploaded the…
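
The timer in the post is a Perl script. The same idea, wrapping an external program and measuring wall-clock time around it, can be sketched in a few lines of Python. This is a minimal sketch and not the uploaded script; `time_command` is a hypothetical helper name, and the command to be timed is whatever argument list you pass in.

```python
import subprocess
import time


def time_command(args):
    """Run an external command and return (exit_code, elapsed_seconds).

    args: the command and its arguments as a list of strings.
    Elapsed time is wall-clock time, measured with a monotonic timer.
    """
    start = time.perf_counter()
    completed = subprocess.run(args, check=False)
    elapsed = time.perf_counter() - start
    return completed.returncode, elapsed
```

Calling `time_command` with the program name and its arguments as a list returns the exit code together with the elapsed seconds, which can then be printed or logged after each run.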

FastTree 1: Compiling and testing

This is how I downloaded, compiled and got FastTree working. It’s a bit obvious in places but I think detailed instructions are a good thing to have out there and Google-findable. I am using a multicore Mac Pro 2.8GHz with 4GB RAM and OS X 10.5.4 (I’m not sure the 8 cores make any difference whatsoever…

Actual Science…

So when I started writing this blog I thought I would use it to outline some of the things I was working on as I went along. Not real projects, which I will write up and publish, but side projects and how I got them to work (or otherwise). Unfortunately there hasn’t been much of that,…

Genetic tests of ancient asexuality

I’ve just had a manuscript published at BMC Evolutionary Biology so thought I’d share a synopsis. I’ve been interested in root-knot nematodes for a while as they are a powerful system for evolutionary genetics and amazingly successful parasites of plants, especially crop plants. Trudgill and Blok (2001) estimate that they “have host ranges that encompass…

Perl scripts for phylogenetics

The script I referred to in my last post is actually seqConverter.pl, written by Olaf Bininda-Emonds, with a few minor modifications to send the output directly to phyml. I thought I would flag up his site, which has a large and very useful collection of Perl scripts for phylogenetic data wrangling. These are open-source scripts…