Abstract
In the near future, NMR-based metabolomics will become a valid means of support for pulmonary personalised and precision medicine http://ow.ly/CXOT30lpGUr
Metabolomics investigates the chemical end products of biological processes in living systems. It combines high-throughput analytical techniques with bioinformatics to investigate and quantify, in a global (untargeted) and/or specific (targeted) manner, the metabolites present in living systems in response to any exposure (including therapeutics), lifestyle, environmental and genetic stress [1]. The unique profile obtained can provide direct information on a physiological status or on a sequence of events in an organism. Since the metabolome (i.e. all the endogenous metabolites of molecular weight <1500 Da) lies downstream of the proteome and transcriptome, changes at those levels are amplified in it, and it therefore reflects several aspects of a pathology [2]. Consequently, selection of an appropriate biomatrix (exhaled breath condensate (EBC), urine, serum, plasma, bronchoalveolar lavage fluid (BALF), saliva or cell and tissue extracts) can shed light on several fields of medicine.
Metabolomics relies on a variety of analytical instruments, but only a few of these are routinely applied. The methods of choice are mass spectrometry (MS, in its several variants) and nuclear magnetic resonance (NMR) spectroscopy, although Fourier transform-infrared (FT-IR) and Raman spectroscopies have also been used. Each technique has advantages and limitations. One of the main advantages of NMR spectroscopy is its ability to determine a rapid metabolic picture of the sample with minimal pretreatment [3]. It is also non-destructive and samples can be investigated several times. By contrast, MS-based methods present higher intrinsic sensitivity and specificity, but may require extensive sample pretreatment and different experimental conditions for different chemical classes. FT-IR and Raman spectroscopy lack specificity and collect only general information because they study the properties of chemical functional groups, which may be present in several metabolites at the same time.
Here we describe NMR-based metabolomics and its application to respiratory medicine. The characteristic workflow of an NMR-based metabolomics study is depicted in figure 1.
How does it work?
NMR investigates a sample at the atomic scale. The sample is placed in a static magnetic field, and the atomic nuclei of the metabolites are excited by pulsed electromagnetic radiation at a given frequency (the “resonance frequency”). When the radiation is turned off, the nuclei return (“relax”) to their equilibrium state by emitting a characteristic electromagnetic wave, which is recorded. The Fourier transformation of this signal produces the classical frequency-domain trace (the “spectrum”, figure 1a). The position (the “chemical shift”) of each line (the “resonance”) in a spectrum depends upon the environment of the corresponding nucleus, and the parameters characterising that line (frequency, splitting, linewidth and amplitude) can be used to determine the molecular structure, the conformation and the dynamics of the molecule.
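The step from the recorded time-domain signal to the spectrum can be illustrated with a minimal sketch. It assumes an idealised signal made of two exponentially decaying oscillations; the frequencies, amplitudes, decay constant and sampling settings are invented for illustration, and the naive discrete Fourier transform stands in for the fast Fourier transform that spectrometer software would normally use.

```python
import cmath
import math

def simulate_fid(freqs_hz, amplitudes, t2_s, n_points, dwell_s):
    """Idealised time-domain signal: a sum of decaying complex oscillations."""
    fid = []
    for k in range(n_points):
        t = k * dwell_s
        decay = math.exp(-t / t2_s)
        fid.append(sum(a * decay * cmath.exp(2j * math.pi * f * t)
                       for f, a in zip(freqs_hz, amplitudes)))
    return fid

def dft(signal):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(signal)
    return [sum(signal[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]

# Hypothetical acquisition: 256 points sampled every 1/256 s,
# so bin m of the transform corresponds to a frequency of m Hz.
n_points, dwell_s = 256, 1.0 / 256
fid = simulate_fid([40.0, 90.0], [1.0, 0.5], t2_s=0.2,
                   n_points=n_points, dwell_s=dwell_s)

spectrum = [abs(x) for x in dft(fid)]  # magnitude spectrum
peak_bin = max(range(n_points), key=lambda m: spectrum[m])
print("strongest resonance at", peak_bin, "Hz")
```

In this toy case the two lines appear at the frequencies put into the simulation, and their heights roughly follow the 2:1 amplitude ratio, which is the behaviour that makes the spectrum interpretable line by line.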
Most metabolomic applications use spectrometers operating at 600 MHz (i.e. 14.1 T), equipped with CryoProbe technology, which increases sensitivity by reducing the electronic thermal noise [4]. The signal amplitude recorded by an NMR spectrometer is linearly dependent on the abundance of the corresponding nuclei in the sample, which allows concentrations to be quantified.
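Both statements can be checked with a few lines of arithmetic. The sketch below uses the proton gyromagnetic ratio divided by 2π (about 42.577 MHz/T) to relate field strength and 1H resonance frequency. It then applies the linear response to estimate a concentration from the ratio of integrals against an internal reference. The function names, the reference compound and all integral values are hypothetical and chosen only for illustration.

```python
# 1H gyromagnetic ratio divided by 2*pi, in MHz per tesla
PROTON_GAMMA_OVER_2PI_MHZ_PER_T = 42.577

def larmor_mhz(b0_tesla):
    """1H resonance frequency (MHz) at a given static field (T)."""
    return PROTON_GAMMA_OVER_2PI_MHZ_PER_T * b0_tesla

def field_tesla(freq_mhz):
    """Static field (T) needed for a given 1H resonance frequency (MHz)."""
    return freq_mhz / PROTON_GAMMA_OVER_2PI_MHZ_PER_T

def concentration_mm(integral, n_protons, ref_integral, ref_n_protons, ref_conc_mm):
    """Concentration from integrals, assuming a linear response:
    each integral is divided by the number of protons that produce it."""
    return ref_conc_mm * (integral / n_protons) / (ref_integral / ref_n_protons)

print(round(larmor_mhz(14.1), 1))    # frequency at 14.1 T, in MHz
print(round(field_tesla(600.0), 2))  # field for a 600 MHz instrument, in T

# Hypothetical example: a 3-proton methyl signal with integral 6.0,
# measured against a 9-proton reference signal with integral 9.0 at 1.0 mM.
print(concentration_mm(6.0, 3, 9.0, 9, 1.0))
```

The per-proton normalisation is what allows signals from different metabolites to be compared on the same concentration scale, provided acquisition conditions allow full relaxation between scans.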
Data acquisition should be carried out with the same experimental parameters if the study goal is to compare and quantitate biomarkers. Optimal experimental settings for acquisition (excitation pulse width, receiver gain, acquisition intervals, relaxation times and solvent suppression parameters) and processing (apodization, phasing, baseline correction, selection of integral regions, peak alignment, normalisation, centring, scaling and transformation) require strict control [5].
Since protons have a high natural abundance and inherent sensitivity, 1H-NMR is usually used for metabolic profiling. The identification (the “assignment”) of resonances to specific metabolites is obtained via two-dimensional spectra, which spread the signals in two dimensions, therefore avoiding ambiguous assignments due to signal overlap. For isolated resonances, it is possible to compare the observed chemical shifts with published reference data [6] or online data libraries (the Human Metabolome Database; www.hmdb.ca/).
Samples contain thousands of compounds. Therefore, the data collected from an NMR spectrum are numerically large and biologically complex, comprising different sources of statistical variance and including both information of interest and redundant or unnecessary variance. In this context, metabolomics analysis tries to decrease data dimensionality and derive statistically significant information on the biological system under investigation. Multivariate analytical methods (namely, principal component analysis (PCA) and partial least squares projection to latent structures (PLS) [7]), together with the “filtered versions” orthogonal projection to latent structures (OPLS, O2PLS) [8], generate new factors called latent variables or principal components (PCs). Projecting the acquired dataset into the corresponding latent space then reduces dimensionality and provides an intuitive data visualisation.
The NMR profile dataset can be transformed into a matrix through a binning or bucketing process that defines the spectral variables as chemical shift bins and their integrated intensities [9]. After NMR spectral processing (spectral deconvolution, reference calibration, phase correction and peak identification) and transformation of raw spectral data into clean spectra for the X matrix, the typical data processing pipeline usually includes alignment, normalisation and scaling [10]. These steps are performed using dedicated software (e.g. AMIX, Bruker Biospin, Rheinstetten, Germany; or SIMCA, Umetrics, Umeå, Sweden), or free online platforms (e.g. www.metaboanalyst.ca/).
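The bucketing, normalisation and scaling steps can be sketched in a few lines. The example below is a toy illustration and not the procedure of any of the packages named above. It assumes equal-width buckets, total-area normalisation and Pareto scaling, which are common choices but are not specified in the text, and the four-point spectra and the 0.04 ppm bucket width are invented.

```python
import math

def bin_spectrum(ppm, intensity, width_ppm, lo, hi):
    """Sum intensities into equal-width chemical-shift buckets over [lo, hi)."""
    n_bins = int(round((hi - lo) / width_ppm))
    bins = [0.0] * n_bins
    for x, y in zip(ppm, intensity):
        if lo <= x < hi:
            idx = min(int((x - lo) / width_ppm), n_bins - 1)
            bins[idx] += y
    return bins

def normalise_total_area(row):
    """Divide each bucket by the total intensity of the spectrum."""
    total = sum(row)
    return [v / total for v in row]

def pareto_scale(matrix):
    """Mean-centre each column and divide it by the square root of its SD."""
    n = len(matrix)
    scaled_cols = []
    for col in zip(*matrix):
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        denom = math.sqrt(sd) if sd > 0 else 1.0
        scaled_cols.append([(v - mean) / denom for v in col])
    return [list(row) for row in zip(*scaled_cols)]

# Three hypothetical spectra sampled at the same four chemical shifts
ppm = [0.01, 0.03, 0.05, 0.07]
raw = [[1.0, 1.0, 1.0, 1.0],
       [2.0, 1.0, 0.5, 0.5],
       [0.5, 0.5, 2.0, 1.0]]

# X matrix: one row per sample, one column per bucket
X = [normalise_total_area(bin_spectrum(ppm, s, 0.04, 0.0, 0.08)) for s in raw]
X_scaled = pareto_scale(X)
print(X)
print(X_scaled)
```

Normalisation acts on rows (samples) to remove differences in overall dilution, whereas scaling acts on columns (variables) so that intense signals do not dominate the multivariate model.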
After pre-processing, the data matrix is ready to undergo unsupervised (without a priori knowledge of sample categories or related metadata) or supervised (using extra sample information like metadata or class membership) procedures and discriminant methods to explore trends and similarities/differences, according to the biological variations generating the metabolic profile.
The resulting smaller number of orthogonal factors (the PCs) is visualised through “scores” and “loadings” plots, which are complementary views of the dataset transformation and are therefore examined in parallel. Scores represent the new coordinates associated with each sample, visualised as a point in the scores plot. Observations (i.e. patients) close to each other in the plot (e.g. the class identified by the blue circles in figure 1b) are characterised by the same variation in their metabolic profiles, while points far from each other represent different metabolite concentrations (red circles in figure 1b). The metabolites responsible for the distribution of the EBC samples in the PC space are highlighted in the loadings plot, which indicates the influence of the original variables (metabolites) in the new reference system.
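The relationship between scores and loadings can be made concrete with a small sketch. It extracts the first principal component with the NIPALS iteration, a standard chemometric algorithm used here as an assumption rather than as the method of the cited studies. The four samples and three variables are invented: two samples are high in the first variable, two are high in the third, and the second variable barely changes.

```python
import math

def centre_columns(X):
    """Subtract the column mean from every variable."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[v - m for v, m in zip(row, means)] for row in X]

def nipals_pc1(X, n_iter=100, tol=1e-12):
    """First principal component of X: returns scores, loadings and the
    fraction of total variance explained. Assumes X is not constant."""
    Xc = centre_columns(X)
    n_vars = len(Xc[0])
    # Start from the column with the largest sum of squares
    t = list(max(zip(*Xc), key=lambda c: sum(v * v for v in c)))
    for _ in range(n_iter):
        tt = sum(v * v for v in t)
        # Loadings: regression of each variable on the current scores
        p = [sum(row[j] * ti for row, ti in zip(Xc, t)) / tt for j in range(n_vars)]
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]
        # Scores: projection of each sample onto the loading vector
        t_new = [sum(a * b for a, b in zip(row, p)) for row in Xc]
        converged = sum((a - b) ** 2 for a, b in zip(t_new, t)) < tol
        t = t_new
        if converged:
            break
    total_ss = sum(v * v for row in Xc for v in row)
    explained = sum(v * v for v in t) / total_ss
    return t, p, explained

# Hypothetical data: rows are samples, columns are bucketed variables
X = [[5.0, 2.0, 1.0],
     [4.8, 2.1, 1.2],
     [1.0, 2.0, 5.1],
     [1.2, 1.9, 4.9]]

scores, loadings, explained = nipals_pc1(X)
print("scores:", [round(s, 2) for s in scores])
print("loadings:", [round(p, 2) for p in loadings])
print("variance explained by PC1:", round(explained, 3))
```

Samples of the same group receive scores of the same sign, and the loadings show that the first and third variables drive the separation while the second contributes almost nothing. The overall sign of a component is arbitrary, so only relative positions should be interpreted.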
Model performance can be assessed with several dedicated methods [11], but testing an external dataset, not included in the calculations of the primary model, is considered the most valid way. To this end, it is possible to divide the whole dataset into a training set to build the statistical model, and a test set to evaluate its predictive ability, when a sufficient number of samples is available.
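The training/test division can be sketched as follows. The nearest-centroid classifier is a deliberately simple stand-in for the supervised models discussed in this article (it is not PLS or OPLS), and the two-variable data, class names, split fraction and random seed are all hypothetical.

```python
import random

def stratified_split(labels, test_fraction, seed):
    """Split sample indices into training and test sets, class by class,
    so that every class is represented in both sets."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, int(round(len(idx) * test_fraction)))
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return train_idx, test_idx

def fit_centroids(X, y):
    """Mean profile of each class, computed on training samples only."""
    centroids = {}
    for cls in set(y):
        rows = [x for x, lab in zip(X, y) if lab == cls]
        centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Assign a sample to the class with the closest centroid."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

# Hypothetical, well-separated two-variable profiles: 8 samples per class
X = [[1.0 + 0.1 * i, 5.0 - 0.1 * i] for i in range(8)] + \
    [[5.0 - 0.1 * i, 1.0 + 0.1 * i] for i in range(8)]
y = ["control"] * 8 + ["disease"] * 8

train_idx, test_idx = stratified_split(y, test_fraction=0.25, seed=0)
model = fit_centroids([X[i] for i in train_idx], [y[i] for i in train_idx])

# Test samples played no part in building the model
correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
accuracy = correct / len(test_idx)
print("test-set accuracy:", accuracy)
```

The essential point is that the test samples are set aside before any model parameter is estimated. In a real study, any data-dependent pre-processing (for example scaling) would also be fitted on the training set only.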
Finally, from the identified between-class discriminating metabolites, it is possible to identify the impact of the most relevant metabolic pathways that characterise the pathophysiological states (figure 1c).
What is the current state of the art in respiratory medicine using this technology/method?
Two levels of application have been described in respiratory medicine. The first compares pathological metabolic profiles with healthy profiles, used as the reference. This is an important point because possible biomarkers of each disease can be obtained and quantified, defining the specificity of the pathological phenotype. Applications cover several respiratory diseases such as asthma [12], chronic obstructive pulmonary disease (COPD) [13], cystic fibrosis (CF) [14], primary ciliary dyskinesia (PCD) [15], cancer [16], obstructive sleep apnoea syndrome (OSAS) [17] and inflammatory states [18]. The reported results indicate that a pathological state is characterised by a panel of metabolites and not by a single metabolite. Furthermore, although the metabolites present in the biomatrices may be recurrent (short-chain fatty acids, amino acids, sugars, lipids, etc.), it is the difference in concentration that discriminates the pathological states.
The second level relates to phenotypic definition and its evolution via profiling. In this context, urine profiling finds phenotypic differences between stable and unstable asthma [12], and EBC profiling finds differences between asthma and COPD [13]. The influence on a phenotype (e.g. asthma) of a comorbidity (e.g. obesity) can easily be detected, identifying without bias an obese–asthmatic phenotype that is completely different from the obese and the asthmatic phenotypes [19]. The phenotype evolution in different but related pathologies (such as COPD and pulmonary Langerhans cell histiocytosis) that have smoking habit as a causative factor can also be investigated [20]. COPD staging is attainable from serum [16], which is also useful to investigate its link with lung cancer [16]. A combination of serum, urine and EBC can help to discriminate COPD from OSAS [17].
Application of 1H-NMR spectroscopy can differentiate malignant from benign pleural effusions [21], and stable versus unstable CF using EBC [14]. Serum can be used to investigate ventilated patients who develop ventilator-associated pneumonia [22]. BALF has been used to successfully study bronchiolitis obliterans syndrome with different degrees of severity [23], and to monitor air pollution exposure [24]. Pharmacometabolomics information can be fruitfully obtained from EBC [25].
How is it likely to be used in future?
The study of small molecular compounds and pathways involved in respiratory medicine represents an area of active clinical investigation. The discovery of individualised metabolomic profiles in patients with respiratory disorders could reveal novel pathways of disease pathogenesis, enable better recognition of patients with a biochemical susceptibility for respiratory diseases, help to assess the response to therapy and “predict” possible non-responder patients, and support the identification of new therapeutic targets. Metabolomics could also help to understand the complex gene–environment interactions involved in the onset and progression of respiratory disorders. In this regard, an interesting point could be investigation of the exposome (“the integrated load of xenobiotics that an individual accumulated in her/his lifetime”) [26] and its effects on respiratory diseases.
Although NMR-based metabolomics studies involve minimally invasive biomatrices, to make a larger impact in respiratory medicine these studies should become part of an investigative strategy involving several countries and laboratories. This would help solve issues like the standardisation of the sampling across laboratories, the presence of confounding factors (e.g. comorbidity, demographic factors and sex differences), and finally, the contribution of microbiota and its alterations to respiratory disorders.
Footnotes
Conflict of interest: None declared.
- Received June 13, 2018.
- Accepted August 5, 2018.
- Copyright ©ERS 2018