DCI 2004

The DIMACS Bio-Math Connect Institute

Rutgers University
July 18 - July 30, 2004


Outline of Topics to be Covered
Description of Helpful Mathematical Background Material




Outline of Topics to be Covered


Week 1

Monday - 2 Hours

Introduction to Hox genes

BLAST demo on Hox genes

Global alignment problem
   - statement of problem

Introduction to dynamic programming
   - revisit Dijkstra's algorithm
   - rephrase it in terms of dyn. prog.

Global alignment
   - really a "shortest path" problem
   - use matrices

Variation
   - modify so that end gaps "don't count"
Tuesday - 2.5 hours

General gap penalty functions
   - matrices with 3 entries
   - affine gap penalties, for example

Local alignments
   - read off of our dyn. prog. matrices

How BLAST works
   - description of PAM matrices
   - algorithm for protein sequences
Wednesday - 1.5 hours

How BLAST works
   - algorithm for nucleotide sequences

FAST algorithm
   - FASTP
   - FASTA

PAM matrices
   - derivation
Thursday - 1.5 hours

Introduction to parsimony
   - general idea in various contexts
   - specifically for trees

Given a tree and one base at each leaf
   - find all optimal reconstructions
   - count all optimal reconstructions

Generalize to longer (aligned) sequences

Find the "average" optimal parsimony tree
Friday - 1.5 hours

Reconstructing phylogenies
   - maximum parsimony method
       - quick summary
       - pictures
   - definition of distance
   - neighbor joining method


Week 2

Monday - 1.5 hours

Suffix Trees
   - use
   - construction (intuitive method)

Genome rearrangement
   - some examples
   - sorting by reversals

Algorithm for oriented case
   - breakpoints
   - reality and desire diagram
Tuesday - 1.5 hours

Genome rearrangement (cont'd)
   - cycles in the R&D diagram
   - components
   - hurdles
   - summary of algorithm

Unoriented case
   - mention of NP-completeness
   - non-optimal approximate algorithm








Description of Helpful Mathematical Background Material

Combinatorics and Probability
Graph Theory
Mathematics
Other

Most Essential Material

Counting Principles
   - multiplication (product) rule

Basics of probability
   - outcomes and events
   - equally likely outcomes
   - multiplication (product) rule

Conditional Probability


What is a graph
   - vertices (nodes)
   - edges (arcs)

Cycles and Paths

Regular graphs
   - in particular, 2-regular

Edge-weighted graphs

Trees
   - leaves
   - binary trees
   - rooted vs. unrooted

Components
   - connected graphs
Logarithms
   - logs base 2 and 10
   - facility with computations
   - log rules
       - log(ab) = log a + log b
       - log(ar) = r log a

Summation (Sigma) notation
 

Strings over an alphabet
   - substrings
   - superstrings
   - enumeration of strings

Less Essential Material
Likelihood Trees
   - depth-first search

Shortest Paths
   - Dijkstra's algorithm
Matrices
   - matrix multiplication
Computational complexity
   - analysis of running time
   - polynomial-time algorithms
   - NP-completeness