ECCB Satellite Meeting on
Probabilistic Modelling in Computational Biology
22 Sept 2008, Cagliari, Sardinia

Probabilistic Methods for
Dynamic System Models and Data Integration in Computational Biology

Affiliated with ECCB 2008

Motivation and Outline of Sessions

Probabilistic methods are a natural choice for integrative inference because the posterior information about a biological system obtained from one source of information can serve as the prior when incorporating information from another source. This approach can be applied iteratively and so lends itself naturally to data integration. Bayesian theory provides the axiomatic framework for a unified approach to important challenges of computational biology: besides allowing sequential belief updates, it supports optimal decision making under uncertainty and provides means for learning models from heterogeneous data sources.
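The idea of reusing a posterior as the next prior can be illustrated with a minimal sketch. It assumes a Beta prior with binomial data, chosen here only because the update is a one-line conjugate rule; the function name `update_beta` and all counts are hypothetical and not taken from any study.

```python
from fractions import Fraction


def update_beta(prior, successes, failures):
    """Conjugate Beta-Binomial update: returns the posterior (a, b) parameters."""
    a, b = prior
    return (a + successes, b + failures)


# Hypothetical uniform prior, Beta(1, 1).
prior = (1, 1)

# Source A is incorporated first; its posterior becomes the prior for source B.
posterior_a = update_beta(prior, successes=7, failures=3)
posterior_ab = update_beta(posterior_a, successes=2, failures=8)

# Pooling both sources in a single step gives the same posterior.
posterior_batch = update_beta(prior, successes=7 + 2, failures=3 + 8)

# Posterior mean of a Beta(a, b) distribution is a / (a + b).
posterior_mean = Fraction(posterior_ab[0], sum(posterior_ab))
```

Under these assumptions the sequential and pooled updates coincide, which is the property that makes iterative integration of sources coherent.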

One example of probabilistic data integration is the joint analysis of microarray data from different biological systems for common molecular mechanisms as witnessed by gene activity. In addition, the very same type of model can be exploited to increase the statistical power, and hence reduce the cost, of experiments by combining different microarray studies of the same biological question. In contrast to approaches where data from multiple laboratories or technologies have to be normalized before a joint analysis is possible, this meta-level method directly combines information extracted from the data.

Similar models can also be used to integrate heterogeneous data sources. A typical application is the combination of database information, such as functional gene annotation or protein interaction measurements, with microarray gene expression data. This increases the statistical power of model inference and thus helps overcome the limitations due to the small numbers of microarray samples available in most studies.

A key feature that distinguishes a modern approach to Systems Biology is the aim of linking models of the interactions of system components with the huge volume and diversity of contemporary cellular and molecular data, such as those coming from high-throughput, genome-wide, and imaging technologies. One is therefore interested in the development of probabilistic methods for the analysis and modelling of such data. Traditionally, the dynamics of complex biochemical processes, for example transcriptional control and cell signalling, have been modelled by systems of coupled nonlinear ordinary differential equations. However, when these reactions are considered at the level of a single biological cell, stochastic effects can no longer be ignored. There are also important stochastic aspects of such modelling arising from incomplete knowledge of the system parameters, missing data, and competing scientific hypotheses. Contributed papers are sought on parameter inference in a probabilistic framework, on dealing with missing data and missing chemical species, and on inference over the structure of the underlying networks.
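To illustrate why single-cell stochastic effects matter, the following is a minimal sketch of a stochastic simulation (Gillespie's direct method) for a hypothetical birth-death process: molecules are produced at a constant rate and degraded at a rate proportional to their number. The rate constants, the function name, and the seed are assumptions made for illustration only.

```python
import random


def gillespie_birth_death(k_prod, k_deg, n0, t_end, rng):
    """Simulate production (rate k_prod) and degradation (rate k_deg * n)."""
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod          # propensity of the production reaction
        a_deg = k_deg * n        # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break                # no reaction can fire any more
        t += rng.expovariate(a_total)  # exponential waiting time to next event
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


rng = random.Random(0)  # fixed seed so the run is reproducible
times, counts = gillespie_birth_death(
    k_prod=10.0, k_deg=1.0, n0=0, t_end=20.0, rng=rng
)

# The corresponding ODE, dn/dt = k_prod - k_deg * n, settles at k_prod / k_deg.
ode_steady_state = 10.0 / 1.0
```

In this sketch the simulated copy number fluctuates around the deterministic steady state rather than converging to it, which is the kind of variability an ordinary differential equation model does not capture.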

The above application areas in computational biology suggest themselves as focal points for individual sessions. We will therefore devote one session to dynamic system models, and a second session will explore the probabilistic integration of heterogeneous data sources.

Vienna Science and Technology Fund (WWTF), Boku Bioinformatics