Time: Tues/Thurs 9:30 - 10:50 -- Ruffner Hall Rm 174
Fri: 9:00 - 10:00 Chemistry Rm 411
Professor: William Pearson
Office: Jordan Hall 6-057
The Bioinformatics and Functional Genomics course will introduce the computational, statistical, evolutionary, and genetic concepts at the foundation of modern genome analysis, and address research problems in gene and genome structure using popular computer programs and biological and genome databases. The goal of the course is to introduce students to the computer algorithms, statistical approaches, and biology that interact to allow biologists to make inferences from large genomics datasets. Students should come away with a clear understanding of the foundations of computational approaches, and will build on that understanding to gain practical experience addressing biological questions using genome datasets.
Students will become familiar with the Linux command-line environment and learn simple programming/scripting skills, which will allow them to perform medium-scale analysis of biological sequence data. The first part of the course will focus on similarity searching, homology, and phylogenetic reconstruction, combining programming and algorithms with evolutionary biology and protein structure. This material is covered by Part I of the Pevsner textbook.
The second part of the course will focus on functional/expression analysis at the genomic level. Strategies for quantifying differential gene expression will be explored, leading to microarray and RNA-seq expression analysis. Coordinately expressed gene sets will then be used to explore methods for biological pathway analysis and identification of regulatory signals.
The course will focus on fundamental concepts in biological sequence alignment, statistics, evolution and phylogenetics, motif finding, gene structure and expression, and biological pathway analysis, with emphasis on understanding the strengths and weaknesses of different analysis strategies.
This course will present a brief introduction to programming (Python) to facilitate programmed, reproducible, medium-scale analysis of protein families and genome features. It is open to students with interests in computing, statistics, chemistry, and the life sciences. It does not assume knowledge of the Linux command line or programming -- those topics will be taught -- but prior Linux/Unix command-line experience will be helpful. It does assume some knowledge of basic molecular biology, the central dogma, the building blocks of DNA and proteins, and the structure of genes in prokaryotes and eukaryotes. Students applying for permission to enroll will be asked to fill out a brief form outlining their programming experience (if any) and their biology (possibly high school) course work. Mathematical and statistical concepts will not require calculus, but will require comfort with advanced algebra.
After taking this class students will be able to:
The course will be a hybrid lecture/lab course, with two 1.5 hr lectures on programming and a 1 hr "lab/discussion" each week.
First/third quarter exams: 33.3%
Weekly problem sets: 33.3%
Final projects (2): 33.3%
You are expected to attend all of the lectures for the course and to participate in the lab/discussion sections. If you have to miss a class, please notify me before the planned absence.