Ion Mobility for Metabolite Identification in Metabolomics
Modern mass spectrometry (MS) technologies provide a wealth of strategies for interrogating perturbations of the metabolome and gaining further understanding of biological systems. However, structural identification of ion features, that is, relating MS signals to specific metabolites, remains a primary bottleneck in MS-based metabolomics studies. Identification begins by comparing MS2 spectra of unknown features against large databases of known molecules, but this approach often falls short because the databases are incomplete. In silico prediction of MS2 spectra using tools such as CSI:FingerID (1) has enabled annotation of unknown metabolites not captured in databases, but it may return hundreds of erroneous candidate structures for a given MS2 spectrum. We aim to leverage collision cross section (CCS) values measured by ion mobility (IM) to reduce the number of possible candidate structures and provide more accurate annotation.
Ion Mobility and Collision Cross Section
Ion mobility is an ultra-fast technique that separates ions based on their CCS, the rotationally averaged two-dimensional projection of molecular size and shape. This intrinsic property of every compound can be measured by IM and predicted using machine learning (ML) algorithms. If we predict the CCS of every candidate structure, we can therefore rule out those whose predicted value does not match the experimentally measured value of the unknown feature. The number of candidate structures that can be eliminated depends strongly on the accuracy of both the prediction and the measurement.
Histogram for a known molecule (N-acetyl aspartate) showing the error distribution of ML-predicted CCS for MS2-generated candidate structures relative to the experimentally measured value. Dashed red lines represent the ±3% error band; the solid green line represents the error of the ML-predicted value for the correct structure relative to the measured value. The hierarchy shows the total number of PubChem results with the same elemental formula (616), the MS2-predicted candidate structures (376), and the candidate structures within the CCS error tolerance (57).
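The CCS filtering step described above can be sketched in a few lines of Python. This is a minimal illustration and not our actual pipeline: the `Candidate` class, the `filter_by_ccs` function, the candidate names, and all CCS values are hypothetical placeholders. Only the ±3% tolerance comes from the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate structure with its ML-predicted CCS (hypothetical container)."""
    name: str
    predicted_ccs: float  # in square angstroms


def percent_error(predicted: float, measured: float) -> float:
    """Signed error of the predicted CCS relative to the measured CCS, in percent."""
    return 100.0 * (predicted - measured) / measured


def filter_by_ccs(candidates, measured_ccs, tolerance_pct=3.0):
    """Keep only candidates whose predicted CCS lies within the error band."""
    return [
        c for c in candidates
        if abs(percent_error(c.predicted_ccs, measured_ccs)) <= tolerance_pct
    ]


# Illustrative numbers only; these are not measured or predicted values.
measured = 130.0
candidates = [
    Candidate("candidate_A", 131.2),
    Candidate("candidate_B", 125.0),
    Candidate("candidate_C", 134.5),
    Candidate("candidate_D", 128.9),
]

kept = filter_by_ccs(candidates, measured)
for c in kept:
    print(f"{c.name}: {percent_error(c.predicted_ccs, measured):+.2f}%")
```

For N-acetyl aspartate, the same kind of filter reduces the 376 MS2-predicted candidates to 57, which eliminates roughly 85% of the list while retaining the correct structure.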
Challenges and Future Directions
While the results for N-acetyl aspartate demonstrate the promise of applying this strategy to unknown features, several challenges still prevent universal success. For many features, the predicted CCS values of all candidate structures fall within the ±3% error band, so none can be eliminated; narrowing the band requires more accurate measurement and prediction of CCS. Our lab is exploring new calibration strategies for IM measurement of CCS, as well as refining ML prediction strategies to better align with inter-lab measurement variability.
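To illustrate why the width of the error band matters, the sketch below computes the fraction of candidates that survive filtering at several tolerances. The function name `surviving_fraction` and every CCS value are hypothetical and chosen only to show the trend; they do not represent real data from our lab.

```python
def surviving_fraction(predicted_ccs, measured_ccs, tolerance_pct):
    """Fraction of candidates whose predicted CCS is within the tolerance."""
    kept = sum(
        1 for p in predicted_ccs
        if abs(p - measured_ccs) / measured_ccs * 100.0 <= tolerance_pct
    )
    return kept / len(predicted_ccs)


# Illustrative values only: every prediction lies within 3% of the measurement,
# so the 3% band cannot rule out any candidate.
predicted = [128.0, 129.0, 130.5, 131.0, 132.5, 133.0]
measured = 130.0

for tol in (3.0, 2.0, 1.0):
    frac = surviving_fraction(predicted, measured, tol)
    print(f"tolerance ±{tol:.0f}%: {frac:.0%} of candidates remain")
```

In this toy case the 3% band keeps every candidate, while a 1% band would halve the list. A tighter band is only justified if the combined measurement and prediction error is actually that small; otherwise the correct structure may be discarded.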
(1) Dührkop, Kai, Huibin Shen, Marvin Meusel, Juho Rousu, and Sebastian Böcker. “Searching Molecular Structure Databases with Tandem Mass Spectra Using CSI:FingerID.” Proceedings of the National Academy of Sciences 112, no. 41 (October 13, 2015): 12580–85. https://doi.org/10.1073/pnas.1509788112.