DCI 2005

The DIMACS Bio-Math Connect Institute

Rutgers University
July 17 - July 29, 2005


Outline of Topics to be Covered
Description of Helpful Mathematical Background Material
Math Pre-Program, Biology Pre-Program
Topics Recommended for Dropping in 2006, in order
  1. Neighborliness method for phylogeny reconstruction
  2. Synonymous substitutions, Kimura's model
  3. Details of the Rearrangement Algorithm
  4. PAM matrices
  5. Neighborliness method of Saitou and Nei
Suggested additional (advanced) Comp. Bio. topics
  1. RNA secondary structure prediction
  2. Gene duplication and subfunctionalization
 


Outline of Topics to be Covered


Week 1

Monday


Parallel Sessions

Math Background - 1.5 hours
Intro. to vertex-edge graphs
   - Directed and undirected graphs
   - Directed acyclic graphs

Dijkstra's algorithm on DAGs

Biology Background - 1.5 hours
Genes as sequences of nucleotides
Theory of evolution
   - mutation and selection
   - the neutral theory of evolution
   - similar genes appear across species
   - orthology and homology


Plenary Session #1

Computational BioMath - 1 hour
Rephrase Dijkstra's algorithm in terms of dynamic programming
   - what smaller solutions would help?
   - keeping track of smaller solutions

Global alignment
   - really a "shortest path" problem
   - use matrices to memoize solutions
   - read off of our dyn. prog. matrices


Parallel Sessions

Math Background - 1.25 hours
A 2-regular graph is the union of disjoint cycles

Counting
   - multiplication rule
   - number of ACGT strings of length n

Elementary probability
   - equally-likely outcomes
   - multiplication rule
   - complement probability

Biology Background - 1.25 hours
DNA, Genomes, genes, introns and exons
Purines and pyrimidines
Amino acids and proteins
The genetic code
Transcription and Translation

Tuesday


Parallel Sessions

Math Background - 1 hour
Trees
   - (connected, n-1 edges, no cycles)
   - binary, rooted and unrooted

Matrix Multiplication

Biology Background - 1 hour
Phylogenies
   - some examples and pictures
   - species vs. gene phylogeny

Various types of mutations
   - point substitutions
   - deletions and insertions
   - replication
   - rearrangement

Synonymous versus non-synonymous substitutions


Plenary Session #2

Computational BioMath - 1 hour
Variation
   - modify so that end gaps "don't count"

Local alignments
   - cutting our losses


Plenary Session #3

Computational BioMath - 1.25 hours
How BLAST works
   - searching for a 'seed'
   - algorithm for protein sequences
   - description of PAM matrices

PAM matrices
   - derivation of 'M' matrices
   - PAM matrices from 'M' matrices
Wednesday


Plenary Session #4

Computational BioMath - 1.5 hours
Introduction to parsimony
   - general idea in various contexts
   - specifically for trees

Given a tree and one base at each leaf
   - Fitch's algorithm for reconstruction

Generalize to longer (aligned) sequences

Reconstructing phylogenies
   - maximum parsimony method
       - quick summary with pictures

Dynamic Programming and Parsimony
   - if time permits, which it probably won't
   - richer data structure

Week 1 Continued

Thursday


Plenary Session #5

Computational BioMath - 1.5 hours
Distance-based Phylogeny Reconstruction
   - definition(s) of distance
   - UPGMA

Neighborliness method of Sattath and Tversky

Neighbor joining method of Saitou and Nei

Friday


Plenary Session #6

Computational BioMath - 1.5 hours
Introduction to Phylogenetic Footprinting
    - footprinting by global alignment
    - Blanchette's FootPrinter algorithm

Synonymous substitutions
   - Kimura's parameters
   - Li's parameters


Week 2

Monday


Plenary Session #7

Computational BioMath - 1.5 hours

Suffix Trees
   - use
   - construction (intuitive method)

Genome rearrangement
   - some examples
   - sorting by reversals

Algorithm for oriented case
   - breakpoints
   - reality and desire diagram
Tuesday


Plenary Session #8

Computational BioMath - 1.5 hours

Genome rearrangement (cont'd)
   - cycles in the R&D diagram
   - components
   - hurdles
   - superhurdles and fortresses
   - summary of algorithm

Unoriented case
   - mention of NP-completeness
   - non-optimal approximate algorithm








Description of Helpful Mathematical Background Material

Combinatorics and Probability
Graph Theory
Mathematics
Other
Counting Principles
   - multiplication (product) rule

Basics of probability
   - outcomes and events
   - equally likely outcomes
   - multiplication (product) rule
What is a graph
   - vertices (nodes)
   - edges (arcs)

Regular graphs
   - in particular, 2-regular

Edge-weighted graphs

Trees
   - leaves
   - binary trees
   - rooted vs. unrooted
Logarithms
   - logs base 2 and 10
   - facility with computations
   - log rules
       - log(ab) = log a + log b
       - log(ar) = r log a

Matrices
   - matrix multiplication
 

Strings over an alphabet
   - substrings
   - superstrings
   - enumeration of strings