Overview

For many years, CFM-EI was the “gold standard” for EI-MS spectral prediction. In 2019, however, the NEIMS program was published, and its authors claimed performance comparable to CFM-EI with much faster calculation times. Earlier in 2023, a new program called RASSP was released that appeared to offer substantially better performance than both CFM-EI and NEIMS.

All models were evaluated using the holdout set of 2,041 EI-MS spectra from the NIST 20 set and a NIST 23 test set consisting of 2,008 compounds. The dot product similarity coefficient was used to measure the spectral match quality between each predicted and observed EI-MS spectrum. Additionally, the average dot product score for each class of molecules in the holdout set was calculated for each model and is reported in the table below.
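
The dot product score mentioned above can be sketched as follows. This is a minimal illustration, assuming a plain cosine similarity over unit-mass intensity vectors; the exact formulation used for the evaluation (for example, any m/z or intensity weighting) is not stated here, and the function name and the dictionary representation of a spectrum are hypothetical.

```python
import math


def dot_product_score(predicted, observed):
    """Cosine (normalized dot product) similarity between two spectra.

    Each spectrum is a dict mapping an integer m/z value to an intensity.
    This representation and the unweighted formula are assumptions made
    for illustration, not the evaluation code behind the reported scores.
    """
    # Peaks missing from one spectrum contribute zero intensity.
    mz_values = set(predicted) | set(observed)
    dot = sum(predicted.get(mz, 0.0) * observed.get(mz, 0.0) for mz in mz_values)

    norm_predicted = math.sqrt(sum(v * v for v in predicted.values()))
    norm_observed = math.sqrt(sum(v * v for v in observed.values()))

    # An empty spectrum cannot match anything; avoid dividing by zero.
    if norm_predicted == 0.0 or norm_observed == 0.0:
        return 0.0
    return dot / (norm_predicted * norm_observed)
```

Under this definition, two identical spectra score 1.0 and two spectra with no peaks in common score 0.0, so higher values indicate a closer match between prediction and observation.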

As the table below shows, NEIMS outperforms all other predictors, including the recently published RASSP predictor, in terms of dot product score for all chemical classes or categories, with the exception of the aldehyde and low-molecular-weight (<150 Da) compound sets, where RASSP achieves higher scores.

None of the “peak-only” predictors we developed was able to compete with NEIMS in terms of EI-MS spectra prediction. However, these predictors provided several advantages, which were incorporated into the original NEIMS program to achieve better performance along with subformula annotation for each predicted peak. We assessed the performance of this finalized predictor, the Adjusted NEIMS with MIIP and PA (called EI-MSpred), on a set containing six common compounds (methanol, hexane, 3-methylpentane, benzene, toluene, and nitrobenzene) and on the NIST 23 compounds.

We observed that making such an adjustment to the NEIMS program can improve on the performance of the original NEIMS. On the six-compound set, the adjusted predictor achieved an average dot product score of 0.555 (vs. 0.517 for the original NEIMS on the same set). On the NIST 23 test set, we observed a slight but statistically insignificant improvement over the original NEIMS program (dot product score of 0.62 for the original NEIMS vs. 0.621 for the Adjusted NEIMS with PA (Peak Annotator) and MIIP (Molecular Ion Intensity Predictor), as shown in the table below). This finalized predictor (EI-MSpred) is used on this webserver to make EI-MS predictions.
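
| Molecule Groups | Test set size | CFM-EI | NEIMS | RASSP | Adjusted NEIMS with PA and MIIP |
|---|---|---|---|---|---|
| Alkene | 362 | 0.316 | 0.735 | 0.657 | |
| Alkyne | 350 | 0.33 | 0.74 | 0.61 | |
| Aldehyde | 100 | 0.401 | 0.77 | 0.821 | |
| Ketone | 354 | 0.302 | 0.73 | 0.64 | |
| Light weight molecule | 345 | 0.372 | 0.76 | 0.84 | |
| Silylated set | 178 | 0.36 | 0.4249 | 0.41 | N/A |
| Esters | 352 | 0.215 | 0.66 | <0.1 | |
| Weighted Average | 2041 | 0.3104 | 0.7 | 0.55 | |
| NIST23 | 2008 | 0.32 | 0.62 | 0.54 | 0.621 |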
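
The "Weighted Average" row can be illustrated with a short sketch. It assumes each class score is weighted by its test set size, which is our reading of the table rather than a documented procedure; the variable and function names are hypothetical. The NEIMS column is used because every entry in it is an exact number (the RASSP column contains a "<0.1" entry).

```python
# (test set size, NEIMS dot product score) per class, copied from the table.
neims_by_class = {
    "Alkene": (362, 0.735),
    "Alkyne": (350, 0.74),
    "Aldehyde": (100, 0.77),
    "Ketone": (354, 0.73),
    "Light weight molecule": (345, 0.76),
    "Silylated set": (178, 0.4249),
    "Esters": (352, 0.66),
}


def weighted_average(rows):
    """Average of class scores, weighted by the number of spectra per class.

    Assumption: the table's summary row uses test set size as the weight.
    """
    total_size = sum(size for size, _ in rows.values())
    weighted_sum = sum(size * score for size, score in rows.values())
    return weighted_sum / total_size


total_spectra = sum(size for size, _ in neims_by_class.values())
neims_average = weighted_average(neims_by_class)
```

Here the class sizes add up to the 2,041 holdout spectra, and the size-weighted NEIMS average comes out at approximately 0.70, in line with the value in the table.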