Solutions in Science 2025
SinS 2025: A roadmap for more confident metabolite identification in untargeted metabolomics
Jul 15 2025
Opening the proceedings from the SinS 2025 scientific meeting in Brighton, Professor Warwick Dunn of the University of Liverpool delivered his assessment of the ongoing barriers that limit the effectiveness of untargeted metabolomics, despite technical advances in the field over recent years. Speaking to the SinS 2025 audience of analytical chemists, biologists and scientists from a range of disciplines, Professor Dunn focused on what he described as the discipline’s most enduring challenge: the reliable identification of small molecules detected in high-resolution LC-MS experiments.
Untargeted metabolomics allows researchers to explore biological systems without bias by detecting all measurable metabolites within a sample. It is widely used in fields such as personalised medicine, environmental health, and nutritional science. However, while the approach is powerful, it is also problematic. As Professor Dunn explained, only around 10 to 15% of the features detected in a typical untargeted experiment can currently be identified with high confidence. The rest – often referred to as the metabolic ‘dark matter’ – remain unannotated and uncharacterised, posing a major obstacle to biological interpretation and clinical translation.
The task of matching signals to known metabolites is complicated by a range of factors. A single metabolite may generate multiple signals due to adduct formation, in-source fragmentation, isotopic variation, neutral losses or dimerisation. This redundancy inflates the number of features in the dataset and makes it difficult to determine how many unique molecules are present. Further complications arise from the variability in retention time caused by differences in chromatographic conditions such as mobile phase composition, column chemistry and temperature, which often differ between laboratories. This makes it challenging to share or reuse retention time libraries across different experimental setups.
Even when exact mass is known, multiple candidate structures may exist, particularly among isomeric or structurally similar compounds. Without MS² data or complementary information, annotations based on MS¹ alone are vulnerable to misassignment. In addition, the coverage of public spectral libraries, such as MetaboLights and GNPS, remains limited, and many compounds have no reference spectra or verified retention time data available. Public metabolomics repositories are also underutilised, as more than 80% of deposited features remain unannotated. As Professor Dunn observed, these limitations collectively restrict the reproducibility and utility of the libraries as a research tool in metabolomics.
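The isomer problem can be shown with a small sketch. This is an illustration only, not a method presented in the talk: the three-compound candidate table, the function name `match_ms1` and the 5 ppm tolerance are all assumptions chosen for the example. The monoisotopic masses are standard values.

```python
PROTON = 1.007276  # mass shift (Da) for an [M+H]+ ion

# Hypothetical mini-database of neutral monoisotopic masses (Da).
CANDIDATES = {
    "leucine": 131.094629,     # C6H13NO2
    "isoleucine": 131.094629,  # C6H13NO2, isomer of leucine
    "creatine": 131.069477,    # C4H9N3O2, same nominal mass only
}


def match_ms1(observed_mz, ppm_tol=5.0, adduct_shift=PROTON):
    """Return every candidate whose theoretical m/z lies within ppm_tol."""
    hits = []
    for name, mono_mass in CANDIDATES.items():
        theoretical = mono_mass + adduct_shift
        ppm_error = (observed_mz - theoretical) / theoretical * 1e6
        if abs(ppm_error) <= ppm_tol:
            hits.append(name)
    return sorted(hits)


print(match_ms1(132.1019))  # ['isoleucine', 'leucine']
```

Accurate mass rules out creatine, but no mass tolerance, however tight, can separate leucine from isoleucine. That distinction needs retention time or MS² evidence.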
To address these issues, Professor Dunn’s group at Liverpool has developed a suite of computational and experimental strategies. A key advance is the Liverpool Annotation and Matching Pipeline (LAMP), a tool that groups multiple signals arising from the same metabolite by using shared retention time, correlated peak intensities and known mass differences. This helps reduce data redundancy and avoids misidentification caused by in-source artefacts being mistaken for unique metabolites. The pipeline is available as open-source software and can be integrated into widely used platforms such as XCMS and MS-DIAL.
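The grouping idea can be sketched in a few lines. This is not the LAMP implementation, whose internals are not described here. It is a minimal sketch that uses only two of the three criteria mentioned (shared retention time and known mass differences) and omits intensity correlation. The function name, the tolerances and the example feature list are assumptions made for illustration.

```python
from itertools import combinations

# Monoisotopic mass differences (Da) between ion forms of one molecule.
# The selection is illustrative, not exhaustive.
KNOWN_DELTAS = {
    "13C isotope": 1.003355,
    "[M+Na]+ vs [M+H]+": 21.981942,
    "[M+K]+ vs [M+H]+": 37.955882,
}


def group_features(features, rt_tol=2.0, mz_tol=0.005):
    """Group (mz, rt_seconds) features that co-elute and differ by a known delta."""
    parent = list(range(len(features)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(features)), 2):
        (mz_i, rt_i), (mz_j, rt_j) = features[i], features[j]
        if abs(rt_i - rt_j) > rt_tol:
            continue  # not co-eluting, so cannot be the same metabolite
        diff = abs(mz_i - mz_j)
        if any(abs(diff - delta) <= mz_tol for delta in KNOWN_DELTAS.values()):
            parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(len(features)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical feature list: four ion forms of glucose plus two unrelated signals.
example_features = [
    (181.0707, 120.0),  # [M+H]+
    (182.0740, 120.3),  # 13C isotope of [M+H]+
    (203.0526, 119.8),  # [M+Na]+
    (219.0265, 120.1),  # [M+K]+
    (203.0526, 300.0),  # same m/z as the sodium adduct, different retention time
    (166.0863, 120.2),  # co-eluting, but no known mass relationship
]

print(group_features(example_features))  # [[0, 1, 2, 3], [4], [5]]
```

Six features collapse to three putative metabolites, which is the kind of redundancy reduction described above. A fuller version would also require correlated peak intensities across samples before merging.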
Retention time – previously underused in metabolomics – offers opportunities for improving annotation. While retention time is known to vary depending on the system, Professor Dunn’s group has shown that it can be harmonised across laboratories using calibration models based on reference compounds. Their results indicate that retention times can be transferred between platforms with tolerances of between 9 and 14 seconds. This significantly increases the potential for building shared retention time libraries, and it compares favourably with currently available computational prediction models, whose errors are typically a minute or more.
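One simple way to picture such a calibration is sketched below. The form of the group's model is not given in this report, so the piecewise-linear interpolation used here is an assumption, as are the function names and the reference retention times. The 14-second tolerance is taken from the upper end of the range quoted above.

```python
from bisect import bisect_right


def make_rt_mapper(ref_source, ref_target):
    """Build a function mapping retention times (s) from one system to another.

    ref_source and ref_target hold the retention times of the same reference
    compounds on each system. Source times are assumed to be distinct.
    """
    pairs = sorted(zip(ref_source, ref_target))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    if len(xs) < 2:
        raise ValueError("at least two reference compounds are needed")

    def mapper(rt):
        # Find the segment containing rt; outside the calibrated range,
        # extrapolate along the first or last segment.
        k = min(max(bisect_right(xs, rt) - 1, 0), len(xs) - 2)
        slope = (ys[k + 1] - ys[k]) / (xs[k + 1] - xs[k])
        return ys[k] + slope * (rt - xs[k])

    return mapper


def within_tolerance(predicted_rt, observed_rt, tol_s=14.0):
    """Check whether an observed retention time supports a library match."""
    return abs(predicted_rt - observed_rt) <= tol_s


# Hypothetical reference compounds measured in two laboratories (seconds).
lab_a = [60.0, 180.0, 300.0, 420.0]
lab_b = [66.0, 192.0, 330.0, 450.0]
to_lab_b = make_rt_mapper(lab_a, lab_b)

predicted = to_lab_b(240.0)  # library entry recorded at 240 s in lab A
print(round(predicted, 1), within_tolerance(predicted, 255.0))
```

A library value recorded on one system is mapped onto the second system before matching, so the comparison is made on a common scale rather than against raw, system-specific retention times.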
Another area of innovation concerns MS/MS libraries. Most currently available spectra represent only the protonated form of each compound, yet in real biological samples many metabolites ionise more readily as sodium or potassium adducts. Professor Dunn’s team is now constructing expanded libraries that include fragmentation data for multiple adduct forms – including neutral examples such as sodium formate that are not present in solution as ions.
These data are acquired using commercially available metabolite kits and consistent acquisition protocols to ensure coverage and reproducibility. By including a broader set of ion forms, these expanded libraries greatly increase the chance of correct metabolite identification under real-world conditions.
In parallel with these analytical advances, Professor Dunn’s group has also undertaken a large-scale project to integrate and standardise data from existing metabolomics repositories. The Reported Metabolite Project – also known as MARIANA/MARIANA2 – collates and curates previously reported metabolite identifications from more than 100 LC-MS studies in MetaboLights and GNPS.
This effort has generated a list of approximately 6,000 unique metabolites that have been consistently observed across multiple laboratories and sample types. Lipids, particularly those found in blood and tissue, are among the most frequently reported compounds. The project provides a more reliable basis for designing targeted assays and validating findings across studies. However, Professor Dunn emphasised that the reuse of public data remains limited by inadequate metadata and inconsistent adherence to FAIR (Findable, Accessible, Interoperable, Reusable) data principles.
Having focused on the challenge of identification, Professor Dunn turned in closing to what he believes will be the next major concern for the field: quantification. While many commercial metabolomics kits are now available, they often rely on shared calibration curves for large sets of chemically diverse compounds. His group has demonstrated that such approaches introduce serious quantification errors due to differences in matrix effects, ion suppression and instrument response. Even when the same kit is used, variations in instrument tuning or solvent conditions can lead to inconsistent results. For clinical and regulatory applications, these inaccuracies are unacceptable. More rigorous validation of quantitative workflows is essential if metabolomics is to meet the standards expected in translational medicine.
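The arithmetic behind the shared-calibration problem is straightforward. The sketch below uses invented response factors and is not data from Professor Dunn's group. It assumes a linear detector response and shows what happens when one compound's calibration slope is applied to another compound that ionises less efficiently.

```python
def quantify(peak_area, slope, intercept=0.0):
    """Convert a peak area to a concentration using a linear calibration curve."""
    return (peak_area - intercept) / slope


def relative_error(true_conc, analyte_slope, shared_slope):
    """Relative error when an analyte is quantified with a shared curve.

    analyte_slope is the analyte's real response (area per uM);
    shared_slope is the slope of the curve actually applied.
    """
    peak_area = analyte_slope * true_conc  # signal the analyte really produces
    estimated = quantify(peak_area, shared_slope)
    return (estimated - true_conc) / true_conc


# Hypothetical values: the calibrant responds at 1000 area units per uM,
# while the analyte responds at only 400 because of ion suppression.
error = relative_error(true_conc=10.0, analyte_slope=400.0, shared_slope=1000.0)
print(f"{error:.0%}")  # -60%
```

In this invented case a true concentration of 10 µM is reported as 4 µM. The error scales with the ratio of the two response factors, so it changes whenever matrix effects or instrument tuning shift either one.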
Professor Dunn concluded by urging the community to take a more systematic and collaborative approach to these persistent problems. Researchers should make better use of orthogonal data such as retention time and adduct diversity, contribute annotated data to public repositories, and adopt harmonised experimental protocols. He also called for greater investment in infrastructure and training to support the development of robust, quantitative assays.
Metabolomics, he noted, has achieved considerable gains in sensitivity and coverage, but its full promise will only be realised if identification and quantification are addressed with the same scientific rigour.