November 19th, 2010

Over at BioStar, Keith asked:

In five years time, how would the bioinformatics landscape be and what will probably be the main focus(es) in bioinformatics i.e the hottest areas in bioinformatics?

Perhaps you’re looking for daring predictions, but I see lots of incremental progress, especially on the following fronts:

If by “hottest” you mean “number of employees”, I think there will be a large number of openings for Masters-level (or lower) bioinformatics staff. These are the folks who will handle routine munging of huge data sets at most sequencing centers. At present, a lot of that work is still handled by either PhDs or grad students. As tools and standards get entrenched, though, you’ll see more and more of it offloaded to technical staff.

There’s bound to be a lot of movement in the health informatics field, particularly in building tools that can take in your personal genome sequence and spit out useful medical advice (in a format that works for both patients and clinicians). This involves not only genomics skills, but also mining the medical literature and building searchable databases.

Though systems biology has been muted a little as the hype wears off, it’s poised to undergo a huge leap forward. With high-throughput data from tens or hundreds of thousands of cells, our models of how the cell works at a network or pathway level are only going to improve.

Other things that will be in demand:

Database and other “big data” skills – how are you going to store and access data from millions of genomes? We’re talking petabytes of information here.

Visualization – the larger the data gets, the less we’re able to really wrap our heads around it. A few good pictures can often tell us more than a million lines of data.

Truly interdisciplinary scientists. Not CS people who picked up a little bit of biology, or Bio majors who hack a little Perl. We’re going to see the first generation of scientists who have really been trained to straddle the boundary between the two. They’re going to be well-poised not only to do solid research on their own, but also to be the lynchpins of successful collaborations.

Now, if you asked me where I saw the state of genomics in 5 years, or the state of cancer research, I think I’d have much bigger, bolder predictions. I just don’t see the basic computational and statistical skillsets that bioinformaticians use today changing tremendously. They’ll just get applied to bigger data, become more parallel, and be more in demand.

