Improved analysis of metagenome data

10/04/2017

BRICS researchers define new standards of metagenome analysis

Microbes may be invisible to the eye, but they are ubiquitous. They drive key processes in our environment, such as the carbon cycle. Many microorganisms are not yet known or cannot be maintained in culture. Modern molecular genetic methods provide more information about the diversity of species: they now allow the entirety of the genomes in a habitat – the so-called metagenome – to be investigated. However, analysing this vast amount of data with a multitude of computer programs still poses major problems for researchers. For this reason, scientists of the Helmholtz Centre for Infection Research (HZI) in Braunschweig, the University of Vienna and the University of Bielefeld have launched an initiative titled "CAMI – Critical Assessment of Metagenome Interpretation", which tests the tools of metagenome analysis and defines new standards and application options. The researchers published their results in the scientific journal Nature Methods.

To study bacteria from a specific habitat, researchers used to have to culture them in the laboratory. The cultures were then analysed and identified. Progress in molecular genetics has led to the development of metagenomic methods, which enable the study of the great diversity of microbes for which cultivation conditions are unknown.

In microbiota research in particular, understanding the interactions between the microbiota, the immune system and pathogens is essential for developing new therapeutic approaches. However, it is not easy to study the microbiota accurately: sequencing, for example, provides a huge quantity of data on all bacteria that are present, known as the metagenome, from which information on the individual types of bacteria must then be laboriously extracted. While there may be many methods of doing so, it is often unclear which one is best for the task at hand.

"Since classical sequencing methods work only with microorganisms that can be grown in a pure culture in the laboratory, there is a stark contrast between the investigation of the metagenome and classical genome sequences of selected organisms," says Prof Alice McHardy, who is the head of the HZI department "Computational Biology of Infection Research" at the BRICS, the Braunschweig Integrated Centre of Systems Biology. "Metagenomics takes a very new look at the genetic information of the world of microbes." However, scientists usually face major problems when analysing metagenomes. "There are a great many different methods one can use. But it is difficult for researchers to find out which programs are best suited to their specific data sets and analyses," says McHardy. "The suitability of tools for tackling different questions varies drastically. At the same time, software developers invest a lot of their time comparing the properties of newly developed software with software described earlier."

To address these problems, an international team of scientists directed by Alice McHardy and Alexander Sczyrba, who is the head of the "Computational Genomics" group at the University of Bielefeld, founded the initiative called "CAMI – Critical Assessment of Metagenome Interpretation". They organised a competition in which scientists tested methods of computational biology on various metagenome data sets and then jointly evaluated their results.

"CAMI aims to develop standards for the evaluation of the performance of software for metagenomic analysis, followed by the uniform assessment of this software with biologically relevant data, to finally be able to make recommendations as to which programs should best be used for which kinds of questions," says Alice McHardy.

The CAMI competition ran for three months in 2015. To test the computer tools, the organisers developed three metagenome data sets based on approximately 700 genomes of bacterial and archaeal isolates sequenced by the US Department of Energy's Joint Genome Institute (DOE JGI) and other institutes. A total of 19 teams entered the competition; 16 of them agreed to the publication of their results, which covered 25 programs from around the world. The results of the competition – defined standards and detailed information on program performance for different applications – were recently published in Nature Methods. The results are freely accessible at https://data.cami-challenge.org/results.
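Because the benchmark data sets are built from genomes whose identity is known, a tool's output can be compared directly against the true origin of each sequence. The following is a minimal illustrative sketch of that idea in Python, not the scoring procedure used by CAMI: the function name `precision_recall`, the read identifiers and the genome labels are all hypothetical, and the choice of precision and recall as measures is an assumption made here for illustration.

```python
def precision_recall(gold, predicted):
    """Compare a tool's assignments against the known origin of each sequence.

    gold:      dict mapping sequence ID -> true source genome
    predicted: dict mapping sequence ID -> genome assigned by the tool
               (None if the tool left the sequence unassigned)
    Returns (precision, recall).
    """
    # Only sequences the tool actually assigned count towards precision.
    assigned = {seq: label for seq, label in predicted.items() if label is not None}
    correct = sum(1 for seq, label in assigned.items() if gold.get(seq) == label)

    precision = correct / len(assigned) if assigned else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data: four reads from two known genomes.
gold = {
    "read1": "genome_A",
    "read2": "genome_A",
    "read3": "genome_B",
    "read4": "genome_B",
}
# Hypothetical tool output: one wrong assignment, one read left unassigned.
tool_output = {
    "read1": "genome_A",
    "read2": "genome_B",
    "read3": "genome_B",
    "read4": None,
}

precision, recall = precision_recall(gold, tool_output)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the tool assigns three reads and gets two of them right, while one of the four reads stays unassigned. Reporting both numbers separates a cautious tool that assigns little but accurately from one that assigns everything with more errors.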

The organisers of CAMI are planning a sequel to the very successful competition to continue the testing of metagenomics software and the development of standards. "CAMI is an ongoing initiative," McHardy says. "We invite all microbiota researchers working on the generation or evaluation of Omics data to make a contribution to CAMI."

Original publication:

A. Sczyrba, P. Hofmann, P. Belmann, D. Koslicki, S. Janssen, J. Dröge, I. Gregor, S. Majda, J. Fiedler, E. Dahms, A. Bremges, A. Fritz, R. Garrido-Oter, T. Sparholt Jørgensen, N. Shapiro, P.D. Blood, A. Gurevich, Y. Bai, D. Turaev, M.Z. DeMaere, R. Chikhi, N. Nagarajan, C. Quince, F. Meyer, M. Balvočiūtė, L. Hestbjerg Hansen, S.J. Sørensen, B.K.H. Chia, B. Denis, J.L. Froula, Z. Wang, R. Egan, D. Don Kang, J.J. Cook, C. Deltel, M. Beckstette, C. Lemaitre, P. Peterlongo, G. Rizk, D. Lavenier, Y.W. Wu, S.W. Singer, C. Jain, M. Strous, H. Klingenberg, P. Meinicke, M.D. Barton, T. Lingner, H.-H. Lin, Y.-C. Liao, G.G.Z. Silva, D.A. Cuevas, R.A. Edwards, S. Saha, V.C. Piro, B.Y. Renard, M. Pop, H.-P. Klenk, M. Göker, N.C. Kyrpides, T. Woyke, J.A. Vorholt, P. Schulze-Lefert, E.M. Rubin, A.E. Darling, T. Rattei, and A.C. McHardy: Critical Assessment of Metagenome Interpretation – a benchmark of metagenomics software. Nature Methods, 2017, DOI: 10.1038/nmeth.4458