Posted by: tvasailor | May 1, 2011

Unknown Identification, “Known Unknowns,” with Accurate Mass Mass Spectrometry and CAS Registry


A compound that is unknown to an investigator is, in many cases, already known in the chemical literature.  We refer to these types of non-targeted species as “known unknowns.”  The term originated in a much different context, in a quote by Donald Rumsfeld.

The CAS Registry/CAplus is the largest collection of known substances and associated references.  It is easily accessed through SciFinder, a subscription web-based service offered by the American Chemical Society.  The CAS Registry contains >150 million compounds (Sept 2020), and >15,000 are added daily!  The compounds are associated with more than 36 million references and patents in CAplus.

We refer to this type of database as “spectra-less” because it contains no computer-searchable EI or CID mass spectra.  However, very useful results can be obtained by searching it with either a molecular formula (MF) or an average molecular weight.  Here is a distribution of the molecular weights of species in the database:


The resulting candidates are sorted by the number of associated references and by keywords to find the most likely identifications.  More recently, Adam Howard at Eastman has had a lot of success filtering by substructure, as shown in the following link:

New 2022:  Further Refine Data Set by Substructure

The resulting candidates are further evaluated by their mass spectral fragmentation and other ancillary information to arrive at the identification of the “known unknown.”
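To make the search-and-sort idea above more concrete, here is a minimal Python sketch.  It is not how SciFinder or the CAS Registry works internally, and it does not call any real service.  The candidate records, their names and their reference counts are made up purely for illustration, the helper names (`parse_formula`, `average_mw`, `rank_candidates`) are hypothetical, and the atomic-weight table is abridged to a few common elements.  It only shows the arithmetic behind an average molecular weight and the idea of ranking same-formula candidates by how often they are cited.

```python
import re

# Abridged table of standard average atomic weights (assumption: only
# these few elements are needed for the illustration).
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007,
                "O": 15.999, "S": 32.06, "Cl": 35.45}


def parse_formula(formula):
    """Turn a simple formula such as 'C8H10N4O2' into {'C': 8, 'H': 10, ...}.

    Parentheses, charges and isotope labels are not handled.
    """
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def average_mw(formula):
    """Average molecular weight from the abridged atomic-weight table."""
    return sum(AVERAGE_MASS[el] * n for el, n in parse_formula(formula).items())


def rank_candidates(candidates, query_formula):
    """Keep candidates whose composition matches the query formula and
    sort them by reference count, most-cited first."""
    target = parse_formula(query_formula)
    hits = [c for c in candidates if parse_formula(c["formula"]) == target]
    return sorted(hits, key=lambda c: c["references"], reverse=True)


# Hypothetical candidate records: names and reference counts are invented.
candidates = [
    {"name": "candidate B", "formula": "C8H10N4O2", "references": 3},
    {"name": "candidate C", "formula": "C9H14N2O3", "references": 50},
    {"name": "candidate A", "formula": "C8H10N4O2", "references": 1200},
]

ranked = rank_candidates(candidates, "C8H10N4O2")
for c in ranked:
    print(c["name"], c["references"], round(average_mw(c["formula"]), 2))
```

In this toy data set the heavily cited candidate comes out on top, which mirrors the reasoning above: the most-referenced substance is the most likely identification, but it still has to be checked against the mass spectral fragmentation and other ancillary information.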

We presented our initial results at the ASMS meeting in Salt Lake City, Utah in 2010 and published additional information in the Journal of the American Society for Mass Spectrometry (JASMS) in 2011.  The article was featured on the cover of the Feb 2011 JASMS:

Prepress Version of ASMS Article with Screenshots Illustrating Examples:


Journal of the American Society for Mass Spectrometry (2011), DOI: 10.1007/s13361-010-0034-3.

New June 2012:  Search SciFinder by Average MW with Screenshots:

The capability to search the web-based version of SciFinder by average molecular weight has been added; the following screenshots illustrate an example:


Poster Session ASMS 2010:


Additional Related Information:

A similar approach based on a free web-based product, ChemSpider, was published in the Journal of the American Society for Mass Spectrometry in 2012.

ChemSpider prepress article

NIST offers a similar approach as part of their NIST MS Search Software, with a somewhat smaller collection of compounds.


Roger Schenck from CAS mentions our approach in a brief internet article.


