Tuesday, January 13, 2009

Aggregate in R

For data that needs to be collapsed, the aggregate() function works great. For example, take a table with several rows per GLEAN identifier and numeric Female and Male columns.

If you need to collapse these rows so that there is only one row per GLEAN, with the Female and Male columns holding the average of the collapsed rows, aggregate() can do this in one line:

collapsed_data <- aggregate(x = original_data, by = file_with_GLEAN_names_only, FUN = mean)
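To make the call above concrete, here is a minimal, self-contained sketch. The GLEAN IDs and expression values are made up for illustration, and it assumes the identifier lives in a column of the same data frame rather than in a separate object, which may not match the original setup. Only the numeric columns are passed as x, and the grouping column goes in by as a named list.

```r
# Hypothetical exon-level table: one row per exon, several exons per gene.
# IDs and values are invented for this example.
original_data <- data.frame(
  GLEAN  = c("GLEAN_00001", "GLEAN_00001", "GLEAN_00002",
             "GLEAN_00003", "GLEAN_00003", "GLEAN_00003"),
  Female = c(2, 4, 5, 1, 2, 3),
  Male   = c(1, 3, 6, 3, 3, 6)
)

# Average the numeric columns within each GLEAN.
# `by` must be a list (a data frame also counts as one).
collapsed_data <- aggregate(
  x   = original_data[, c("Female", "Male")],
  by  = list(GLEAN = original_data$GLEAN),
  FUN = mean
)

print(collapsed_data)
# One row per GLEAN:
#   GLEAN_00001: Female 3, Male 2
#   GLEAN_00002: Female 5, Male 6
#   GLEAN_00003: Female 2, Male 4
```

The formula interface should give the same grouping if you prefer it: aggregate(cbind(Female, Male) ~ GLEAN, data = original_data, FUN = mean).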

I used this to find the average expression level of each gene from a table that listed the expression level of each exon. Since some genes had one exon while others had many, aggregate() was perfect for this situation. It did take a few minutes to compute, but I had over 57,000 rows and 8 columns, so that's understandable.
