SEMINARS
Spring 2009
STATISTICS COLLOQUIUM
Friday, March 6, 2009
2:30-3:00—Refreshments
3:00-4:00—Talk
Yost Hall, Room 101
Craig Zirbel, Ph.D.
Associate Professor,
Department of Mathematics and Statistics, Bowling Green State University
RNA multiple sequence alignment using stochastic context-free grammars and 3D structural data
RNA molecules perform a variety of functions in the cells of all living organisms. For example, proteins are assembled by ribosomes, which are primarily made of RNA. Different organisms have slightly different RNA molecules as a result of their different evolutionary histories. The differences appear primarily in the sequence of nucleotides (A, C, G, U) of which the RNA molecule is composed.
At the same time, these RNA molecules have many common structural features because they play the same functional role in their respective organisms. This leads to a problem of probabilistic modeling: what variability is allowed in the RNA sequence and what variability is not? Stochastic Context-Free Grammars (SCFGs) have been used to model variability in RNA sequences for just over a decade, and have been successful in aligning RNA sequences from different organisms in order to infer their structural similarities.
In the last few years, 3D crystal structures of entire RNA molecules have become available. We illustrate how to use the wealth of information they give about non-Watson-Crick basepairs and RNA motifs to create highly accurate sequence alignments. Accurate alignments have many uses, such as inferring the evolutionary tree and the place of each organism in it. 3D structure-based alignments can help us infer 3D structure from sequence data alone, and can help in the search for new RNA molecules in genomic data.