Most biological circuitry that regulates cellular stress responses involves many molecules acting at several levels. Computational modeling of regulatory networks therefore has to integrate varied, large-scale data sets.
Cytoscape is the worldwide standard open-source platform for network visualization and data integration. This project, which we started at the Institute for Systems Biology in Seattle, provides a universal, customizable tool for the network-based visualization and analysis of large-scale datasets. The Cytoscape core software is now co-developed in the United States, Canada, and Europe, and is regarded as an innovative model for scientific collaboration. As part of the NIH-funded National Resource for Network Biology, we are currently extending Cytoscape towards the learning of networks from omics data. Our Cyni network inference toolbox for Cytoscape 3 provides an infrastructure for inferring networks from global measurements, such as transcriptomics data.
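As a rough illustration of what inferring a network from transcriptomics measurements can mean, the following minimal sketch connects two genes whenever their expression profiles are strongly correlated. It is not the Cyni implementation and does not use the Cytoscape API; the function names, the threshold of 0.9, and the toy expression values are assumptions made for this example only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sd_x * sd_y)


def infer_edges(expression, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold.

    `expression` maps a gene name to its measurements across samples.
    The threshold is an arbitrary, illustrative choice.
    """
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Toy data (hypothetical): three genes measured in four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 1.0, 3.5, 2.0],
}
edges = infer_edges(expression)
for g1, g2, r in edges:
    print(f"{g1} -- {g2} (r = {r:.3f})")
```

Correlation thresholding is only one of many inference strategies. It is used here because it is simple to read, not because it is the method of choice for real data.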
Mining patterns in genomic and epidemiological data of infectious disease
Observations in multidimensional datasets from long-term studies about the modalities of malaria/dengue infection can point to potentially subtle, but significant factors that control the occurrence or severity of these infections. Genotyping data can point to important genetic determinants and molecular mechanisms. In collaboration with Anavaj Sakuntabhai (Institut Pasteur, Paris), we use existing and newly developed data mining approaches to make observations in these large-scale datasets that may lead to insight and, ultimately, to novel prevention and therapeutic approaches.
Immune regulation in response to infection
The immune reaction of human epithelial cells to microbial challenges has to be finely controlled. In collaboration with Philippe Sansonetti (Institut Pasteur, Paris), we explore what quantitative models based on qRT-PCR and RNA-seq transcriptome data may add to our understanding of this control.
Bacterial pathogens enter host cells through tightly regulated, conserved molecular mechanisms. During the invasion process they localize to different subcellular niches, such as pathogen-containing vacuoles or the host cytosol, for replication and spread. The individual subcellular localization changes of the bacteria are instantly sensed by the host cellular immune system. In collaboration with Jost Enninga (Institut Pasteur, Paris) we employ cell sorting-based single-cell transcriptomic assays to obtain a precise understanding of the determinants and relationships of cellular events and immune response signaling pathways triggered during the early stages of infection of host cells by pathogens.
Characterization of the healthy immune response network
The immune reaction of humans in the general population is controlled by a complex network of genetic and environmental factors. Knowledge about this ‘immune response network’ represents a potentially important substrate for personalized diagnosis and treatment. In the context of the larger 10-year Milieu Intérieur project (coordinated by Matthew Albert and Lluis Quintana-Murci at Institut Pasteur), our group applies modern statistical models to find unexpected patterns in flow cytometry and proteomic profiling data, and helps to define the immune response network and model its dynamic behavior.
Stress regulation in Leishmania infection
Virulent and avirulent strains of the Leishmania parasite differ only slightly on the genomic and transcriptomic levels. We are helping our collaborators in the lab of Gerald Späth to explore and interpret these differences using integrative computational analysis approaches.
Stress regulation in Arabidopsis thaliana
Based principally on a compendium of transcriptome time series, we attempt to identify key players and their interactions in a panel of abiotic and biotic stress responses in Arabidopsis thaliana. Our experimental partner in this project is Heribert Hirt (Évry, France).
Stress regulation in Bacillus subtilis
In the context of the BaSysBio EU project, we have studied the responses of Bacillus subtilis to a large panel of environmental changes. Using statistical network-based and multi-scale modeling approaches, and in collaboration with the laboratory of Jan Maarten van Dijl (Groningen, Netherlands), we aim to identify players and mechanisms in a surprising induction of competence under a mild nutrient change.
Bioinformatics for novel large-scale measurement technologies
We help evolve Systems Biology by developing computational approaches to extract a maximum of meaningful information from large-scale measurement technologies as the basis for descriptive and predictive models. Most current projects revolve around mass spectrometry-based proteomics, the current technology of choice for the comprehensive characterization of biological systems at the level of proteins.
Large-scale identification of glycoproteins
Glycosylated proteins are key players in processes such as infection and in many diseases, but their complex structure makes them particularly difficult to identify on a large scale. In the GlycoHIT project, we collaborate with Zohar Yakhini (Technion/Agilent Labs, Haifa) and Janne Lehtiö (Karolinska Institutet, Stockholm) to push the limits of in-depth study of the glycoproteome.
Exploiting increasing accuracy and precision of mass spectrometry
Mass spectrometry is able to determine the mass of peptides more and more accurately, yet the information in peptide (MS) peaks is usually not exploited. In collaboration with the labs of David Goodlett and Tina Guina (U. Washington, Seattle), we have shown that the information in these peaks is rich enough to allow identification even without further peptide fragmentation.
Computational identification of additional proteins through networks
Despite technological advances, the detection of low-abundance proteins and of their abundance changes remains challenging. Together with the laboratory of Florence Pinet (Institut Pasteur, Lille), we developed a computational approach based on protein-protein interaction (PPI) networks to identify a list of proteins that might have remained undetected in differential proteomic profiling experiments, and demonstrated proof of concept.
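The general idea of using a PPI network to propose proteins that an experiment may have missed can be sketched as follows. This is a minimal illustration under stated assumptions, not the published method: it simply ranks undetected proteins by how many of their interaction partners were detected. The function name, the `min_neighbors` cutoff, and the protein identifiers are hypothetical.

```python
def candidate_proteins(ppi_edges, detected, min_neighbors=2):
    """Rank undetected proteins by their number of detected PPI partners.

    ppi_edges     -- iterable of (protein_a, protein_b) interaction pairs
    detected      -- set of proteins found in the profiling experiment
    min_neighbors -- illustrative cutoff; candidates need at least this
                     many detected interaction partners
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for a, b in ppi_edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    scores = {}
    for protein, partners in neighbors.items():
        if protein in detected:
            continue  # only proteins absent from the experiment are candidates
        hits = len(partners & detected)
        if hits >= min_neighbors:
            scores[protein] = hits

    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy network (hypothetical identifiers).
ppi_edges = [
    ("P1", "X"), ("P2", "X"), ("P3", "X"),
    ("P1", "Y"), ("P2", "Y"),
    ("P3", "Z"),
    ("P1", "P2"),
]
detected = {"P1", "P2", "P3"}
candidates = candidate_proteins(ppi_edges, detected)
print(candidates)
```

A real analysis would also need to account for network bias, since highly connected proteins collect detected neighbors by chance, for example through a permutation-based significance estimate.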
Computational optimization of mass spectrum search engine parameters
Correct adjustment of spectrum search engine parameters is key to successful proteomic data analysis. However, few guidelines are available, and parameters are often set intuitively. We have developed a computational approach to optimize the parameters of standard spectrum search engines, and demonstrated that this approach can lead to the identification of twice as many proteins as standard approaches. Our partner in this project was Fabio Cerqueira (Viçosa, Brazil).
Tools for analyzing data from combined fragmentation modes
It is the fragmentation of small peptides in the mass spectrometer that ultimately leads to protein identification. Modern mass spectrometers offer different fragmentation modes (such as collision-induced fragmentation and multistage activation) that can be used alone or in combination. In collaboration with Delphine Pflieger (Évry, France), we evaluate the efficacy of different fragmentation modes, as well as approaches and software to interpret experiments that use combined fragmentation modes.