One of the major new features in Progenesis QI (the successor to Progenesis CoMet) is the ability to create fragmentation databases from your experimental data, which can subsequently be used to assist identification. This blog post will show you how to start building your own.
The first step in creating a fragmentation database is to analyse an experiment in which you have measured MS/MS. This might be an experiment where you have spiked in known compounds with the sole intention of gathering fragmentation data for those compounds.
When you reach the Identify Compounds stage, you can search for identifications using a number of search parameters:
- Neutral mass
- Retention time
- Collision cross-section (CCS)
Hopefully these search parameters and theoretical fragmentation tools will give you fairly high confidence in the identifications of your most important compounds (especially if you have known spiked compounds).
If you are confident enough in a given identification, you can accept it as the true ID by clicking the gold star in the possible identifications table:
The next step is to take your observed fragment spectra for compounds with accepted IDs, and export them to your fragment database.
So now you have a set of accepted identifications for your important compounds, which you are confident are correct.
It’s a simple step to export your observed fragment spectra for those compounds to a fragment database (MSP file).
To do this, simply choose the Export fragment database… option from the File menu at the Identify Compounds screen:
Here, I’m building a database of pain relieving drugs, and I’ve identified Phenacetin and Paracetamol. So, when I click the menu item I’m shown my two accepted IDs:
I’ve only identified the M+H adducts, but if you’ve identified more than one adducted form, it will be shown in the Adducts column.
So once I’ve clicked Export and chosen a name for my database file, it is exported to an MSP file, which is a simple text-based format as defined by NIST. Here’s what mine shows:
Name: 46506142 (Paracetamol)
PUBCHEM_SID: 46506142
Precursor_type: [M+H]+
Comment: 5.17_152.0704m/z
Formula: C8H9NO2
Num Peaks: 5
92.05 83
93.034 163
110.0606 999
134.0606 30
152.0712 400

Name: 49854487 (Phenacetin)
PUBCHEM_SID: 49854487
Precursor_type: [M+H]+
Comment: 5.16_180.1034m/z
Formula: C10H13NO2
Num Peaks: 7
92.05 79
93.034 135
110.0606 999
138.0919 503
152.0712 67
162.0919 31
180.1025 502
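If you want to inspect an exported file outside Progenesis QI, the records are easy to read with a few lines of script. The following is only a minimal sketch, assuming each field sits on its own line as "Key: value" and each peak is an m/z and intensity pair on its own line. The function name parse_msp and the dictionary layout are my own choices for illustration, not part of Progenesis QI or the NIST tooling.

```python
from typing import Dict, List

# Sample text copied from the export shown above.
SAMPLE_MSP = """\
Name: 46506142 (Paracetamol)
PUBCHEM_SID: 46506142
Precursor_type: [M+H]+
Comment: 5.17_152.0704m/z
Formula: C8H9NO2
Num Peaks: 5
92.05 83
93.034 163
110.0606 999
134.0606 30
152.0712 400

Name: 49854487 (Phenacetin)
PUBCHEM_SID: 49854487
Precursor_type: [M+H]+
Comment: 5.16_180.1034m/z
Formula: C10H13NO2
Num Peaks: 7
92.05 79
93.034 135
110.0606 999
138.0919 503
152.0712 67
162.0919 31
180.1025 502
"""


def parse_msp(text: str) -> List[Dict]:
    """Parse MSP-style text into a list of {"fields": ..., "peaks": ...} dicts.

    Assumption: a record starts at a "Name:" line, field lines look like
    "Key: value", and peak lines start with a digit ("m/z intensity").
    """
    records: List[Dict] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None  # a blank line ends the current record
            continue
        if line.startswith("Name:"):
            current = {"fields": {}, "peaks": []}
            records.append(current)
        if current is None:
            continue  # ignore anything before the first "Name:" line
        if ":" in line and not line[0].isdigit():
            key, _, value = line.partition(":")
            current["fields"][key.strip()] = value.strip()
        else:
            mz, intensity = line.split()[:2]
            current["peaks"].append((float(mz), float(intensity)))
    return records


records = parse_msp(SAMPLE_MSP)
for rec in records:
    print(rec["fields"]["Name"], len(rec["peaks"]), "peaks")
```

Comparing the number of parsed peaks against the Num Peaks field is a quick sanity check that a file has not been truncated or hand-edited incorrectly.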
When I did this search, the SDF file I used was from PubChem, so each compound has a unique PUBCHEM_SID. Crucially, when I use this MSP file for searching in future experiments, the fragment information listed here will be associated with any compound in the SDF file that has the same PUBCHEM_SID. For example, if a search uses an SDF file containing a compound with a PUBCHEM_SID of 46506142, that compound will be associated with the Paracetamol fragments.
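To make the ID matching concrete, here is a small sketch of the idea. It is not how Progenesis QI implements the lookup internally. The compound dictionaries stand in for SDF entries, and the function name attach_fragments and the record layout are hypothetical names I have made up for illustration.

```python
from typing import Dict, List


def attach_fragments(compounds: List[Dict], library: List[Dict],
                     id_field: str = "PUBCHEM_SID") -> Dict[str, List[Dict]]:
    """Map each compound name to the library records sharing its ID.

    Assumption: compounds are simple dicts with a "name" key and an ID key,
    and library records are dicts with a "fields" mapping.
    """
    by_id: Dict[str, List[Dict]] = {}
    for rec in library:
        cid = rec["fields"].get(id_field)
        if cid is not None:
            by_id.setdefault(cid, []).append(rec)
    return {c["name"]: by_id.get(c.get(id_field), []) for c in compounds}


# Library entry taken from the export above (peaks abbreviated to two).
library = [
    {"fields": {"Name": "46506142 (Paracetamol)", "PUBCHEM_SID": "46506142"},
     "peaks": [(110.0606, 999.0), (152.0712, 400.0)]},
]

# Hypothetical compounds standing in for entries read from an SDF file.
compounds = [
    {"name": "Paracetamol", "PUBCHEM_SID": "46506142"},
    {"name": "Some other compound", "PUBCHEM_SID": "00000000"},
]

matches = attach_fragments(compounds, library)
for name, recs in matches.items():
    print(name, "->", len(recs), "fragment record(s)")
```

The practical consequence is that the ID in the MSP file and the ID in the SDF file must come from the same source, otherwise no fragments are attached.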
You may run multiple experiments, and wish to collect the MS/MS data for all of these into one MSP database.
For example, here I’ve run a second experiment, where I’ve identified Misoprostol with high confidence. Again, I choose the Export fragment database… option from the File menu:
Note that here I’ve identified 3 different adducted forms, which will appear in the fragment database if they have associated fragment data. When I click Export, I choose the MSP file I created before, and I’m asked if I want to overwrite the database or append to it:
I choose append in this case as I’m gradually building up my drugs database. After the export is complete, my MSP database looks like this:
Name: 46506142 (Paracetamol)
PUBCHEM_SID: 46506142
Precursor_type: [M+H]+
Comment: 5.17_152.0704m/z
Formula: C8H9NO2
Num Peaks: 5
92.05 83
93.034 163
110.0606 999
134.0606 30
152.0712 400

Name: 49854487 (Phenacetin)
PUBCHEM_SID: 49854487
Precursor_type: [M+H]+
Comment: 5.16_180.1034m/z
Formula: C10H13NO2
Num Peaks: 7
92.05 79
93.034 135
110.0606 999
138.0919 503
152.0712 67
162.0919 31
180.1025 502

Name: HMDB15064 (Misoprostol)
HMDB_ID: HMDB15064
Precursor_type: [M+Na]+
Comment: 8.55_382.2734n
Formula: C22H38O5
Num Peaks: 2
199.0733 745.9995
299.1615 581.0762

Name: HMDB15064 (Misoprostol)
HMDB_ID: HMDB15064
Precursor_type: [M+H]+
Comment: 8.55_382.2734n
Formula: C22H38O5
Num Peaks: 3
199.0733 745.9995
299.1615 581.0762
361.2362 544.4017
Note that for Misoprostol my ID has come from HMDB, so the HMDB_ID field contains the unique ID of the new compound. Also note that there are two entries, one for each adduct that had fragment data (the M+NH4 adduct had no associated fragment data).
You can continue to append to a fragment database as much as you like, until you have a complete set of fragment data for your particular needs. The next step is to use that database in future searches.
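The difference between appending and overwriting can be sketched with ordinary file handling. This is only an illustration of the two behaviours, assuming records are separated by a blank line. The helper names export_records and count_records are hypothetical, and this is not the code Progenesis QI runs when you click Export.

```python
import os
import tempfile

FIRST = """Name: 46506142 (Paracetamol)
PUBCHEM_SID: 46506142
Precursor_type: [M+H]+
Num Peaks: 1
110.0606 999
"""

SECOND = """Name: HMDB15064 (Misoprostol)
HMDB_ID: HMDB15064
Precursor_type: [M+Na]+
Num Peaks: 2
199.0733 745.9995
299.1615 581.0762
"""


def export_records(path: str, records_text: str, append: bool = True) -> None:
    """Write MSP text to path, either appending to or replacing the file."""
    mode = "a" if append and os.path.exists(path) else "w"
    with open(path, mode, encoding="utf-8") as fh:
        if mode == "a" and os.path.getsize(path) > 0:
            fh.write("\n")  # blank line between existing and new records
        fh.write(records_text.rstrip("\n") + "\n")


def count_records(path: str) -> int:
    """Count records by counting lines that start with 'Name:'."""
    with open(path, encoding="utf-8") as fh:
        return sum(1 for line in fh if line.startswith("Name:"))


with tempfile.TemporaryDirectory() as folder:
    db = os.path.join(folder, "drugs.msp")
    export_records(db, FIRST)                 # creates the database
    export_records(db, SECOND, append=True)   # keeps the earlier record
    print("after append:", count_records(db))
    export_records(db, SECOND, append=False)  # discards the earlier record
    print("after overwrite:", count_records(db))
```

Appending is the safer default while you are building a library up over several experiments, since overwriting discards everything exported earlier.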
Suppose I’m now running a new discovery experiment and I’m just trying to figure out what’s in my sample; I can use my MSP fragment database by choosing it in the Fragment search method of the MetaScope search profile:
When I run the search, my fragment database is used:
Above you can see I’ve done a search and found a possible ID for Misoprostol. Not only is the mass error within the threshold, but my previously measured fragmentation data also matches what I’ve observed in this experiment very well. In fact, this ID has been given a fragmentation score of 95.1, giving me further confidence that what I have identified in this experiment is actually Misoprostol.
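For a rough feel of the two checks involved, the sketch below computes a mass error in ppm and the fraction of library peaks that have a counterpart in an observed spectrum. To be clear, this is not the Progenesis QI fragmentation score, whose calculation is not described here. It is a deliberately simple stand-in, and the reference m/z, the observed peak list and the 0.005 Da tolerance are all assumed values chosen for illustration.

```python
from typing import List, Sequence, Tuple


def ppm_error(observed_mz: float, reference_mz: float) -> float:
    """Mass error of the observed m/z relative to the reference, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def matched_fraction(library_peaks: Sequence[Tuple[float, float]],
                     observed_mzs: Sequence[float],
                     tolerance_da: float = 0.005) -> float:
    """Fraction of library peaks with an observed peak within the tolerance.

    A simple peak-counting measure, NOT the Progenesis QI score.
    """
    if not library_peaks:
        return 0.0
    matched = sum(
        1 for mz, _intensity in library_peaks
        if any(abs(mz - obs) <= tolerance_da for obs in observed_mzs)
    )
    return matched / len(library_peaks)


# Library peaks for the Misoprostol [M+H]+ entry in the database above.
LIBRARY: List[Tuple[float, float]] = [
    (199.0733, 745.9995),
    (299.1615, 581.0762),
    (361.2362, 544.4017),
]

# Hypothetical fragment m/z values from a new experiment.
OBSERVED = [199.0731, 299.1620, 361.2359]

# 152.0704 is the observed m/z from the Paracetamol Comment field above;
# the reference value 152.0706 is an assumed figure for illustration.
error = ppm_error(152.0704, 152.0706)
fraction = matched_fraction(LIBRARY, OBSERVED)
print(f"mass error: {error:.2f} ppm, matched fraction: {fraction:.2f}")
```

A real scoring scheme would normally take peak intensities into account as well, which is one reason the value reported by the software should not be expected to agree with a simple count like this.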
I hope you can see that Progenesis QI now offers a very powerful way to create and augment your own in-house fragment databases, based on the compounds you are interested in.
It then allows you to make use of these databases you have built up, to give you more confidence in your identifications in further discovery work.
For more information, see the FAQ page on fragment databases, and if you have a question about this or indeed any other feature, feel free to ask below or get in touch and one of the team here will get back to you.