Hui-Hsien Chou

Gen/ComS/BCB 596X: Genomic Data Processing

Credits: 3. Prereq: Com S 208 or 228, and Com S 311.

Modern biological research depends on computers to solve many computational problems. This is especially true for molecular biology, where, thanks to new automated DNA sequencing machines, vast amounts of sequence data are accumulating far faster than humans can manage manually. A new course is therefore needed to systematically introduce the theoretical and practical aspects of modern large-scale genomic data processing.

Emphasis is placed on projects that carry out all major genomic data processing steps. Many bioinformatics tools licensed from various genome research centers will be used in the class projects. Students may also be asked to develop additional bioinformatics programs. Topics covered in the class include genome database construction, search, and update; sequence alignment and comparison methods; sequence quality assurance, vector trimming, and contaminant removal; shotgun assembly procedures and algorithms; specialized bioinformatics hardware; genome closure techniques; annotation tools and methods for microbial and higher-order organisms; protein structure recognition and prediction algorithms; data collection and dissemination through the Internet; and scripting languages for linking tools together into an automated biological data processing pipeline. Some topics in post-genomic research will also be discussed toward the end of the semester.

This course was offered in Spring 2000 and will be offered again in Fall 2001. For more information please go to the class home page.