This is a selection of current and previous research projects, in addition to those listed on the funding page and the individual members’ homepages. Code and more information are provided via the GitHub repository.
Main contacts from our group: Mercè Garí, Ronan Le Gleut, Christiane Fuchs
Under the lead of the Division of Infectious Diseases and Tropical Medicine, we participate in the prospective COVID-19 Cohort Munich (KoCo19) study. We contribute through statistical analyses and stochastic modelling, especially using stochastic differential equation models which account for disease transmission across regions and within households.
More information: Project webpage
Selected references
Main contacts from our group: Rui Maia, Kainat Khowaja, Annette Möller, Houda Yaqine, Jonas Bauer, Julian Wäsche, Christiane Fuchs
How will the climate develop, how secure is our energy supply, and what opportunities does molecular medicine offer? The rapidly increasing amount of data offers radically new opportunities to address today’s most pressing questions of society, science, and economy. Data, outcomes and predictions are, however, subject to uncertainties. The goal of the project Uncertainty Quantification is to understand these uncertainties through methods of probability theory, and to incorporate them into research and outreach. The project connects applied researchers from the four research fields Earth & Environment, Energy, Health, and Information with each other, with Helmholtz data science experts, and with external university partners from mathematics and econometrics.
More information: Project webpage
Main contacts: Houda Yaqine, Julian Wäsche, Christiane Fuchs
Diffusion processes are a promising instrument for realistically modelling the time-continuous evolution of phenomena in biology, as they combine the advantages of probabilistic models and differential equation models. However, both the correct approximation of such dynamics in terms of diffusion processes and the statistical inference for diffusions prove to be challenging in practice.
We work actively on diffusion modelling and on the development of Bayesian estimation techniques for diffusions. The application of diffusion processes to fluorescence microscopy data and single-cell data yields promising results and shows the potential of this approach.
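As a minimal illustration of what a diffusion model looks like in practice, the following sketch simulates one path of a stochastic differential equation dX = drift(X) dt + diffusion(X) dW with the Euler-Maruyama scheme. The function name `euler_maruyama`, the mean-reverting example dynamics and all parameter values are assumptions chosen for illustration; they are not taken from the project's own code.

```python
import math
import random


def euler_maruyama(drift, diffusion, x0, t_end, n_steps, rng):
    """Simulate one path of dX = drift(X) dt + diffusion(X) dW on [0, t_end]."""
    dt = t_end / n_steps
    x = x0
    path = [x0]
    for _ in range(n_steps):
        # Brownian increment over one step: normal with mean 0 and variance dt
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + drift(x) * dt + diffusion(x) * dw
        path.append(x)
    return path


# Hypothetical mean-reverting dynamics (illustrative values only)
theta, mean_level, noise = 1.5, 2.0, 0.3

rng = random.Random(42)
path = euler_maruyama(
    drift=lambda x: theta * (mean_level - x),
    diffusion=lambda x: noise,
    x0=0.0, t_end=5.0, n_steps=500, rng=rng,
)

# Setting the diffusion term to zero recovers the deterministic ODE solution
deterministic = euler_maruyama(
    drift=lambda x: theta * (mean_level - x),
    diffusion=lambda x: 0.0,
    x0=0.0, t_end=5.0, n_steps=500, rng=rng,
)
```

The second call shows the link to differential equation models mentioned above: without noise, the path simply relaxes towards `mean_level`.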
References
Main contacts: Hannah Marchi, Christiane Fuchs
It seems evident that spatial proximity between researchers may lead to more frequent or more intense collaboration than occurs between scientists who work at a large distance from each other. We hypothesize that the spatial organization within a research campus or even within a building influences interdisciplinary work. In a collaboration network study, we investigate which distance matters, how much researchers are influenced by people working around them and how scientific publishing changes depending on the heterogeneity among authors.
Outcomes of the study could provide valuable information about which spatial organization can foster (interdisciplinary) research and could inform the planning of future buildings.
References:
Main contacts: Turid Frahnow, Christiane Fuchs, Johannes Voit
The digital revolution has brought us not only a multitude of technical innovations, but also a flood of data that exceeds human capacity. The algorithms used to analyze this data are ubiquitous, whether in navigation devices or in weather forecasting. But algorithms are by no means just abstract procedures; they often have a very practical meaning for everyday actions. And beyond their practical use, they can even hold artistic potential.
As part of this project, we held an interdisciplinary seminar in which the aesthetic potential of algorithms and large numbers was explored. Outcomes were presented at the jubilee festival in September 2019. We have also contributed an exhibit piece to the university's showroom: a sonification of a sorting algorithm using sounds from the university building.
Some student projects are presented here (in German).
Main contacts: Lisa Amrhein, Mercè Garí, Christiane Fuchs
Even when appearing perfectly homogeneous on a morphological basis, tissues can be substantially heterogeneous in single-cell molecular expression. As such heterogeneities might govern the regulation of cell fate, one is interested in quantifying the heterogeneities in a given tissue.
Gene expression measurements of single cells would be most suitable to detect and further parameterize a heterogeneous population if the dataset were large and error-free. Unfortunately, such measurements are often expensive and subject to substantial technical noise. Instead of considering single-cell data, we randomly select small numbers of cells and measure the subpopulation average expression levels.
We investigate how heterogeneities can be detected from such data by application of statistical methods, and how the proportions, mean values and standard deviations of the groups of differently expressed cells can be estimated.
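The idea of recovering group proportions from pooled measurements can be sketched with a deliberately simplified model. Assume (for illustration only) that each cell belongs to group 1 with probability p and that expression is normally distributed within each group. The average of n pooled cells is then a binomial mixture of normal distributions, and its likelihood can be evaluated directly. The normal-normal assumption, the function names and all parameter values below are hypothetical and are not meant to reproduce the StochasticProfiling software.

```python
import math
import random


def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def pool_average_density(y, n, p, mu1, s1, mu2, s2):
    """Density of the average of n cells, mixing over the number k of group-1 cells."""
    total = 0.0
    for k in range(n + 1):
        weight = math.comb(n, k) * p**k * (1 - p) ** (n - k)  # P(k group-1 cells)
        mean = (k * mu1 + (n - k) * mu2) / n
        var = (k * s1**2 + (n - k) * s2**2) / n**2
        total += weight * normal_pdf(y, mean, var)
    return total


def log_likelihood(data, n, p, mu1, s1, mu2, s2):
    return sum(math.log(pool_average_density(y, n, p, mu1, s1, mu2, s2)) for y in data)


def simulate_pool_average(n, p, mu1, s1, mu2, s2, rng):
    cells = [
        rng.gauss(mu1, s1) if rng.random() < p else rng.gauss(mu2, s2)
        for _ in range(n)
    ]
    return sum(cells) / n


# Hypothetical setting: 200 pools of 10 cells, 20% of cells in the high-expression group
rng = random.Random(0)
n_cells, true_p = 10, 0.2
params = dict(mu1=5.0, s1=0.5, mu2=1.0, s2=0.5)
data = [simulate_pool_average(n_cells, true_p, rng=rng, **params) for _ in range(200)]

# Grid search over p with the other parameters held at their (assumed known) values
grid = [i / 20 for i in range(1, 20)]
p_hat = max(grid, key=lambda p: log_likelihood(data, n_cells, p, **params))
```

In a real analysis the means and standard deviations would be estimated jointly with the proportion; they are fixed here only to keep the sketch short.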
Application to measurements from human breast epithelial cells reveals the functional relevance of the heterogeneous expression of a particular gene.
Source code, an R package and a webtool are provided on the StochasticProfiling project website.
Collaboration partner:
Prof. Dr. Kevin Janes, University of Virginia
References
Main contact from our group: Annette Möller
Weather prediction today is conducted via so-called numerical weather prediction (NWP) models. They consist of a system of differential equations describing the state of the atmosphere as accurately as possible, which are integrated in time to obtain predictions of future atmospheric states. Typically, the NWP models are run multiple times, each time with different initial conditions and/or model formulations to represent the uncertainty in these quantities. This results in an ensemble of forecasts, as each model run yields a single deterministic forecast.
However, ensemble forecasts are often uncalibrated and require so-called statistical postprocessing. Here, statistical models are applied to the ensemble forecasts in conjunction with observations to improve the quality of the forecasts. Furthermore, many postprocessing models yield a probabilistic forecast, e.g. in terms of a full predictive probability distribution. These probabilistic forecasts make it possible to assess and quantify forecast uncertainty explicitly.
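The effect of postprocessing can be illustrated with a small synthetic experiment. The sketch below generates a biased, underdispersive ensemble, fits a Gaussian predictive distribution whose mean is a linear function of the ensemble mean, and compares the raw and corrected forecasts with the continuous ranked probability score (CRPS, lower is better). This is a simplified variant with constant predictive spread, fitted by least squares; the data-generating process, the function names and all numbers are assumptions made for illustration and do not reproduce a specific model from the references.

```python
import math
import random
import statistics


def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a normal predictive distribution N(mu, sigma^2) at observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return sigma * (z * (2 * cdf - 1) + 2 * pdf - 1 / math.sqrt(math.pi))


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


# Synthetic data: 5-member ensembles with a shared bias and too little spread
rng = random.Random(1)
truths, ens_means, ens_sds = [], [], []
for _ in range(500):
    truth = rng.gauss(10.0, 3.0)
    shared_error = rng.gauss(1.5, 1.0)
    members = [truth + shared_error + rng.gauss(0.0, 0.3) for _ in range(5)]
    truths.append(truth)
    ens_means.append(statistics.fmean(members))
    ens_sds.append(statistics.stdev(members))

# Fit on the first 300 cases, evaluate on the remaining 200
a, b = fit_line(ens_means[:300], truths[:300])
residuals = [y - (a + b * m) for m, y in zip(ens_means[:300], truths[:300])]
sigma = statistics.pstdev(residuals)

test = list(zip(ens_means[300:], ens_sds[300:], truths[300:]))
raw_score = statistics.fmean([crps_gaussian(m, s, y) for m, s, y in test])
calibrated_score = statistics.fmean([crps_gaussian(a + b * m, sigma, y) for m, _, y in test])
```

Letting the predictive variance depend on the ensemble variance, and estimating all parameters by minimizing the CRPS, would be natural extensions of this sketch.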
Several challenges arise when dealing with weather variables such as wind speed or precipitation, which exhibit many zero observations and/or heavy-tailed behaviour. Current research activities include the modification or extension of existing models for normally distributed weather variables, such as temperature, to other weather variables, such as wind speed, which follows a skewed distribution, or precipitation, which is often modelled by a mixture distribution.
Another important research area is concerned with incorporating dependencies in space and time or between different variables into the models. Current research activities in the area of postprocessing are concerned with developing different types of multivariate postprocessing models.
References
Baran, S., Möller, A. (2020): Various Approaches to Statistical Calibration of Ensemble Weather Forecasts. ERCIM News Issue 121, 30-31.
Lerch, S., Baran, S., Möller, A., Groß, J., Schefzik, R., Hemri, S., and Grater, M. (2020): Simulation-based comparison of multivariate ensemble postprocessing methods. Nonlinear Processes in Geophysics, Volume 27, 349–371, https://doi.org/10.5194/npg-27-349-2020.
Möller, A., Groß, J. (2020): Probabilistic temperature forecasting with a heteroscedastic ensemble postprocessing model. Quarterly Journal of the Royal Meteorological Society, Volume 146, Issue 726, 211–224.
Main contacts: Norbert Krautenbacher, Christiane Fuchs
Childhood asthma is a widespread disease. Many studies have revealed that its onset is influenced by genetic and environmental factors such as certain single nucleotide polymorphism (SNP) variants, family history or a farming environment.
Our objective is to develop an asthma risk score, especially for children between one and three years of age, with which one can assess a child’s personal risk of developing the disease. The score should be based on a few SNPs and the environmental variables. This should allow cost-efficient, targeted treatment for exposed children.
Statistical aspects of this project are regularization and variable selection, gene-environment interactions, big data, inclusion of prior knowledge, stratification of the data, missing values, SNP imputation and validation.
Collaboration partners:
Prof. Dr. Erika von Mutius, Dr. Markus Ege, Prof. Dr. Bianca Schaub
Dr. von Hauner Children’s Hospital
References
Main contacts at ICB: Ivan Kondofersky, Norbert Krautenbacher, Hagen Scherb, Christiane Fuchs
The Prostate Cancer DREAM Challenge attempted to improve survival prediction of prostate cancer patients. Participants were asked to build risk scores from a large collection of snapshot and longitudinal data tables within four months.
As "A Bavarian Dream" we participated in this challenge and finished among the winning teams in both Subchallenges 1 and 2. Our work involved data and result management, data cleaning and preprocessing in close collaboration with a clinician, and model building ranging from classical Cox regression to machine learning, ensemble methods and model averaging. Final predictions were evaluated on an independent test set that had been withheld by the challenge organizers.
References
More information:
Main contacts: Norbert Krautenbacher, Christiane Fuchs
In many epidemiological applications, particular interest lies in the investigation of rare combinations of an exposure and a target variable. Representative samples from a population hence may not contain sufficiently many cases for a reliable analysis. For that reason, stratified samples are taken from the population to enrich the rare combinations. Well-known examples are case-control studies and two-phase studies. The enrichment comes at the cost of biased samples, which distort estimates. We address issues arising in prediction on such biased samples, both for training and evaluation of a statistical model.
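The distortion caused by enrichment, and one standard way to correct it, can be seen in a toy example. The sketch below draws a case-control-like sample in which all cases but only 5% of non-cases are included, and then compares the naive prevalence estimate with an inverse-probability-weighted one. It assumes that the sampling probabilities are known; the population size, the probabilities and the variable names are hypothetical, and the weighting shown is a generic correction rather than the specific methods developed in this project.

```python
import random

rng = random.Random(7)

# Hypothetical population: 200 cases (1) among 10,000 individuals, prevalence 2%
population = [1] * 200 + [0] * 9800

# Stratified sampling: keep every case, but only 5% of the non-cases
sampling_prob = {1: 1.0, 0: 0.05}
sample = [y for y in population if rng.random() < sampling_prob[y]]

# The naive estimate ignores the sampling design and is strongly biased upwards
naive = sum(sample) / len(sample)

# Inverse probability weighting: each sampled individual represents
# 1 / sampling_prob individuals of the population
weights = [1.0 / sampling_prob[y] for y in sample]
weighted = sum(w * y for w, y in zip(weights, sample)) / sum(weights)
```

The same weights can in principle be passed to model fitting and to performance measures, so that both training and evaluation refer to the target population rather than to the enriched sample.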
References
Main contacts at Biostatistics: Lisa Amrhein, Christiane Fuchs
Acute myeloid leukemia (AML) often results from the myelodysplastic syndrome (MDS). Here, the differentiation hierarchy from hematopoietic stem cells to mature, functional cells is disturbed. AML patients, in turn, frequently carry a mixture of different cancer cell types, so-called subclones. This is reflected by a mixture of genomic signatures and heterogeneous transcriptome profiles. Applying statistical and dynamical models to data from our clinical and biological collaborators, we want to identify altered differentiation hierarchies of MDS subclones and characterize the development of related heterogeneities in AML while the tumor undergoes evolution.
This project is funded as Subproject A17 of the Collaborative Research Centre (CRC) 1243 "Genetic and Epigenetic Evolution of Hematopoietic Neoplasms".
Further reading: