Thursday, November 19, 2009

It always comes back to Jurassic Park, doesn't it?

I just got back from a seminar by George Weinstock of Washington University in St. Louis about the Human Microbiome Project. I was planning to write a blog entry on the Human Microbiome Project, and I still will, but during the talk I was struck by the amount of data the Wash. U. genome sequencing center is producing, and by how much of the talk centered on the challenges of storing and processing all that data.

For example, the center needs an additional 4 terabytes of storage each day. They've built an entire storage facility, much of which consists of the air conditioners and electrical equipment needed to keep the data storage running. It uses the same amount of electrical power it takes to light a New York skyscraper.

It made me think back to all those Cray supercomputers used to process the ancient dinosaur DNA sequences in that hugely influential novel "Jurassic Park" by the late Michael Crichton. The speed of current DNA sequencing technology is blinding in comparison to what those ol' Cray computers would have been capable of. But though sequencing has gotten much faster, all that data requires a huge space to house the technology to store it.

1 comment:

  1. I recently attended a presentation at WVSTA (West Virginia Science Teachers Association) about using distributed computing, via a screensaver app, to analyze data for SETI research. It is interesting that the SETI@home project is crunching through data from the Arecibo radio telescope using idle computers like yours and mine when they go into screensaver mode. This technique is being used to analyze various kinds of data; perhaps piles of DNA sequencing data could be sifted through this way as well.

    Check out