Reconstructing DNA Copy Number by Penalized Estimation and Imputation

Start: 04/20/2011 - 4:15pm
End: 04/20/2011 - 5:00pm


Chiara Sabatti (UCLA)


Recent advances in genomics have underscored the surprising ubiquity of DNA copy number variation (CNV). Fortunately, modern genotyping platforms also detect CNVs with fairly high reliability. Hidden Markov models and their associated algorithms have played a dominant role in the interpretation of CNV data. Here we explore CNV reconstruction via estimation with a fused-lasso penalty, as suggested by Tibshirani and Wang (2008). We mount a fresh attack on this difficult optimization problem by: (a) modifying the penalty terms slightly, substituting a smooth approximation for the absolute value function, (b) designing and implementing a new MM (majorization-minimization) algorithm, and (c) applying a fast version of Newton’s method to jointly update all model parameters. Together these changes enable us to minimize the fused-lasso criterion in a highly effective way.
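
For readers unfamiliar with the approach, the following is a minimal sketch of steps (a) and (b), not the speaker's implementation. It assumes the usual one-dimensional fused-lasso criterion, 0.5 * sum((y_i - b_i)^2) + lam1 * sum(|b_i|) + lam2 * sum(|b_{i+1} - b_i|), approximates |x| by sqrt(x^2 + eps), and majorizes each square-root term by a quadratic so that every MM step reduces to a linear solve. The function names, the parameters lam1, lam2 and eps, and the dense solver are all illustrative assumptions; the fast Newton update of step (c) is not reproduced here.

```python
import numpy as np

def smooth_abs(x, eps=1e-6):
    """Smooth approximation to |x| (assumed form: sqrt(x^2 + eps))."""
    return np.sqrt(x * x + eps)

def objective(beta, y, lam1, lam2, eps=1e-6):
    """Smoothed fused-lasso criterion for a 1-D signal."""
    fit = 0.5 * np.sum((y - beta) ** 2)
    sparsity = lam1 * np.sum(smooth_abs(beta, eps))
    jumps = lam2 * np.sum(smooth_abs(np.diff(beta), eps))
    return fit + sparsity + jumps

def mm_fused_lasso(y, lam1, lam2, eps=1e-6, n_iter=100):
    """Illustrative MM iteration: each step minimizes a quadratic majorizer."""
    y = np.asarray(y, dtype=float)
    n = y.size
    D = np.diff(np.eye(n), axis=0)  # (n-1) x n first-difference matrix
    beta = y.copy()
    history = [objective(beta, y, lam1, lam2, eps)]
    for _ in range(n_iter):
        # sqrt(u + eps) is concave in u = x^2, so its tangent line at the
        # current iterate gives a quadratic upper bound with these weights.
        w = 1.0 / smooth_abs(beta, eps)
        v = 1.0 / smooth_abs(D @ beta, eps)
        # Setting the surrogate's gradient to zero yields this linear system.
        A = np.eye(n) + lam1 * np.diag(w) + lam2 * (D.T * v) @ D
        beta = np.linalg.solve(A, y)
        history.append(objective(beta, y, lam1, lam2, eps))
    return beta, history
```

Calling mm_fused_lasso(y, lam1, lam2) on a noisy piecewise-constant intensity vector returns the estimate together with the sequence of criterion values, which should never increase from one iteration to the next because each surrogate lies above the criterion and touches it at the current iterate. The system matrix is tridiagonal, so a banded solver would be the natural replacement for the dense solve on long genomic signals.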

To make the best use of the available information to infer copy number states, we tackle the problem of joint analysis of multiple signals (which can include re-sequencing together with genotyping data). This requires coordinating measurements at different genomic locations. Fortunately, the MM framework is flexible enough to work around this difficulty.

Roberts North 15, Claremont McKenna College