Gas phase fractionation and Progenesis LC-MS: perfect partners

This post refers to Progenesis LC-MS, which has since been superseded by Progenesis QI for proteomics. All of the features described, however, remain relevant. A fuller description of the new product can be found here.

Update: since this blog post was first published, a new version of the software has been released with direct support for gas phase fractionation. A new window, available from the File menu immediately after peak picking, performs the calculation of m/z ranges for you. Download the software today to try it out for yourself.

At this year’s annual meeting of the Proteomics Methods Forum, Dr Duncan Smith of the Paterson Institute for Cancer Research gave a very impressive presentation. By making some simple changes to his analysis techniques, he has massively increased his proteome coverage, as well as sequence coverage, when compared to traditional methods. And Progenesis LC‑MS is a key to unlocking some of the benefits.

Gas phase fractionation

While Duncan’s presentation included a range of measures to optimise coverage, it’s his use of gas phase fractionation (GPF) on which I want to concentrate here.

Gas phase fractionation: a quick definition
Gas phase fractionation uses your mass spectrometer to fractionate peptide ions based on their m/z value. That is, you run the same sample N times, each time capturing data in a different m/z window. For example, if your full m/z range is 400-1500, the fractions may be set to capture 400‑600, 600‑900 and 900‑1500.

These restrictions can be applied to either the MS1 scans (and therefore also MS2), or only the MS2 scans. In the latter case, the MS1 data continues to be captured over the full m/z range for each run. This allows us to correct retention time drift across runs and to generate statistically robust measurements from the runs’ MS1 data.

Duncan’s experiments make use of technical replicates to provide robust quantitation. However, using the familiar DDA (data-dependent acquisition) mode of MS instruments gives little benefit for peptide identification in this situation; it tends to result in the same set of peptides being targeted for MS2 scans in each replicate run. Consequently, identifications are limited to those peptides with strong signals that the DDA is picking up.

To illustrate this, consider the following three (simplified) MS1 spectra, each collected at the same retention time in a different technical replicate:

Replicate MS1 traces showing the same peaks triggering MS2 collection in each run

As we can see, 5 peaks in each replicate have been selected for capture of MS2 data. However, because the DDA for each run is looking across the same m/z range each time, we pick up the same 5 peaks in each replicate. The lack of MS2 information for any other peaks means we’re quite limited in how many peptides we can identify; the use of replicates has added nothing to our ability to identify more peptides.

This is where Progenesis LC-MS and GPF come together to help. Remember that:

  • GPF allows you to limit the capture of MS2 data to a specific range of m/z values
  • In Progenesis, the alignment of peptide ions allows identifications from one run to be applied to the corresponding peptide ion in all runs

Traditionally, GPF has been used to limit both MS1 and MS2 capture. However, by limiting only the MS2 capture and capturing all MS1 data, we can retain all of the quantitative benefit of MS1 and still get identifications across the entire m/z range. Not only that, but by collecting the same number of MS2 traces in each fraction, we will be able to identify more of the low-intensity peptides.

In our example, we’ll create 3 fractions, collecting MS2 traces over different m/z ranges in each of our 3 replicates. The effect on overall coverage becomes clear:

Replicate MS1 traces showing different peaks triggering MS2 collection in each run

Clearly, using GPF gives us much greater coverage than from DDA alone. To give an idea of how much more coverage, by increasing the number of fractions to 5, Duncan Smith was able to quote a 3- to 4-fold increase in the number of identified peptides. As I said at the start of this article, very impressive! And remember that this is being done without sacrificing any of the quantitative MS1 data and without increasing instrument time. Again, very impressive.
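The effect can be sketched with a toy simulation. The Python below is only an illustration under simplifying assumptions, not a model of any real instrument method: the peak list is randomly generated, every replicate is assumed to see an identical MS1 spectrum, and precursor selection is reduced to "take the N most intense eligible peaks". The function name `top_n_precursors` and all values are hypothetical.

```python
import random


def top_n_precursors(peaks, n, mz_window=None):
    """Return the m/z values of the n most intense peaks.

    peaks is a list of (mz, intensity) pairs. If mz_window is given as
    (low, high), only peaks with low <= mz < high are eligible.
    """
    low, high = mz_window if mz_window else (float("-inf"), float("inf"))
    eligible = [p for p in peaks if low <= p[0] < high]
    eligible.sort(key=lambda p: p[1], reverse=True)
    return {mz for mz, _ in eligible[:n]}


# Hypothetical MS1 spectrum: 30 peaks with distinct m/z values in 400-1500.
rng = random.Random(0)
mz_values = rng.sample(range(400, 1500), 30)
peaks = [(float(mz), rng.uniform(1e3, 1e6)) for mz in mz_values]

# Plain DDA: each of the 3 replicates searches the full m/z range,
# so the same 5 peaks are selected every time.
dda_selected = set()
for _ in range(3):
    dda_selected |= top_n_precursors(peaks, 5)

# GPF on MS2 only: each replicate is restricted to a different m/z window.
windows = [(400, 600), (600, 900), (900, 1500)]
gpf_selected = set()
for window in windows:
    gpf_selected |= top_n_precursors(peaks, 5, window)

print("Peaks with MS2 data, DDA only:", len(dda_selected))
print("Peaks with MS2 data, with GPF:", len(gpf_selected))
```

Because any peak in the overall top 5 is also in the top 5 of its own window, the GPF selection always contains the DDA selection; whatever else it picks up is extra coverage.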

Choosing the m/z ranges for your fractions

Now that we’ve seen the benefits, how do you optimise your gas phase fractionation? That is, how do you decide on the m/z ranges for your fractions?

You could simply divide the normal range evenly (as seen above), but it’s better to have the same number of peptides in each fraction. For that, you’ll need to run a pilot sample. In his presentation, Duncan presented an example of how he did exactly that, once again with the help of Progenesis LC-MS.

He exported the feature data, sorted it by m/z value, split it into 5 fractions and noted the boundary values. This allowed him to create the following visualisation of those boundaries on the run’s ion map:

m/z ranges shown overlaid on the ion map

As you can see, the greatest density of peptides is in the low-m/z end of the spectrum. By concentrating the fractions at that end, we’re not wasting MS2 scans on less-reliable, noisy peaks at the high-m/z end of the spectrum.
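The boundary calculation described above is straightforward to reproduce by hand. Here is a minimal sketch in Python, assuming you already have the pilot run’s feature m/z values as a list (the layout of the exported file is not described here, so loading it is left out). The function name `equal_count_boundaries` and the sample values are invented for illustration.

```python
def equal_count_boundaries(mz_values, n_fractions):
    """Split features into n_fractions m/z windows of roughly equal count.

    Returns a list of (low, high) tuples. Each boundary is the m/z of the
    first feature in the next fraction, so adjacent windows share an edge.
    """
    if n_fractions < 1:
        raise ValueError("n_fractions must be at least 1")
    ordered = sorted(mz_values)
    if len(ordered) < n_fractions:
        raise ValueError("need at least one feature per fraction")
    # Index of the first feature belonging to fractions 2..n.
    cuts = [(i * len(ordered)) // n_fractions for i in range(1, n_fractions)]
    edges = [ordered[0]] + [ordered[c] for c in cuts] + [ordered[-1]]
    return list(zip(edges[:-1], edges[1:]))


# Made-up pilot data: 12 feature m/z values, denser at the low end.
pilot_mz = [410, 420, 430, 450, 470, 500, 520, 560, 610, 700, 850, 1200]
windows = equal_count_boundaries(pilot_mz, 3)
print(windows)  # [(410, 470), (470, 610), (610, 1200)]
```

Note how the windows come out narrow where features are dense and wide where they are sparse, which is the pattern visible on the ion map. Duplicate m/z values at a cut point would produce a zero-width window, so real data may need rounding or de-duplication first.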

And some more good news: to make GPF even easier, we’re planning to add direct support for calculating these m/z ranges in the next release of Progenesis LC‑MS. The technique’s benefits are so clear, we want to make it as simple as possible for our users, so that more of you can benefit from it.


In the near future, we’re hoping to expand on Duncan’s techniques in a full application note. Keep watching the blog for news on this. In the meantime, I hope this has highlighted a simple technique you can use to increase proteome coverage in your own research.

If you’re not already using it, click here to download Progenesis LC-MS and try it out for yourself.


  1. Yutaka Yoshida
    Posted 20 July 2011 at 8:45 am | Permalink

    The approach to increase proteome coverage by GPF is very interesting. I want to know more experimental details about this method. Could you provide me with the principle and practical method employed?

    Yutaka Yoshida
    Niigata University, Japan

  2. Duncan Smith
    Posted 20 July 2011 at 10:39 am | Permalink

    Hello Yutaka

    The process is very simple in essence. If you acquire 5 replicate injections of the same sample, you limit the MS2 selectable window to a different window on each injection, thereby reducing competition for MS2 selection overall.

    The key is intelligently designing the ‘bins’ appropriate to the m/z distribution of your sample and the number of replicates you want to run.

    What instrument do you want to utilise this on?



  3. Jenny Hansson
    Posted 20 July 2011 at 9:05 am | Permalink

    I’ve had the exact same idea and tried it, however it did not improve the number of identified peptides with our Orbitrap-Velos. I think the approach is most suitable for the Orbitrap-XL which suffers more from long duty-cycle.

    Jenny Hansson,
    EMBL Heidelberg, Germany

  4. Duncan Smith
    Posted 20 July 2011 at 10:50 am | Permalink

    Hello Jenny

    I too have trialed this approach on an Orbi Velos and the improvements in peptide IDs were far less significant than on the XL as you suggested. This workflow only pays dividends if you are duty cycle limited and with the Velos, the duty cycle limits are not as severe as the XL.

    The thing to consider is why you didn’t get any more IDs with this workflow on the Velos. Was this really due to having already acquired MS2 on all potential precursors? For very complex samples, I doubt this is the case. More likely (and what I found) was that you acquired many more MS2 spectra even on the Velos with this approach, yet these did not result in as impressive an increase in IDs as observed on the XL. In reality, the majority of additional MS2 spectra were of poor quality, limited by intensity.

    This approach can be utilised usefully even on faster instruments, as you can significantly reduce the gradient time versus runs you would typically do without GPF, because you can now deal with greater complexity. In addition to saving lots and lots of time, it improves detection of low-abundance analytes, as your peaks get much sharper with faster gradients. For example, 5 GPF runs with a gradient space of 5 hours in total gives much, much better coverage than a single 5 hour gradient. LC separations perform much better over these time frames and you can let your GPF deal with the complexity.



  5. Jenny Hansson
    Posted 21 July 2011 at 12:32 pm | Permalink

    Hello Duncan,

    I was also surprised that I didn’t gain IDs. Although I only made a “two-fraction” approach with 2 GPF runs on 4 hour gradients and compared to 2 normal 4 hour runs, I actually identified far less with GPF on the Velos. I used a top15 method in both cases. The samples were complex (whole cell lysates), but I noticed that I did not reach saturation (15 MS2 per cycle) in the runs with GPF, resulting in a decrease in total number of MSMS. It would probably make sense to lower the number of MS2 scans. This would also increase the quality of the MS2 spectra as you suggest. The gradient time is of course also something to consider.

    I think it is a really nice approach and it totally makes sense, considering that you are otherwise “wasting” lots of instrument time on fragmenting the same peptides over and over again for all your replicate runs.


  6. Duncan Smith
    Posted 22 July 2011 at 10:05 am | Permalink

    Hello Jenny

    In your top 15 experiments on the Velos, were you doing LTQ CID? If so, were you running in parallel (i.e. were you using FT preview master scans to drive the DDA)?



  7. Jenny Hansson
    Posted 22 July 2011 at 10:50 am | Permalink

    Hello Duncan,

    Yes I did LTQ CID, without the preview mode for FT master scans. What did you use?


  8. Duncan Smith
    Posted 25 July 2011 at 11:17 am | Permalink

    Hello Jenny

    Yes I did LTQ CID with Preview mode disabled so no difference there. What is your typical duty cycle on a workflow like you describe (top 15)? How many FTMS datapoints do you get across your chromatographic peaks?

    In my experiments, I need to limit to a top 3 experiment in order to get a minimum of 16 FTMS datapoints across my peak, and this is a fundamental reason why I see a not far off linear increase in IDs when using GPF. My data strongly suggests you need a minimum of 16 datapoints across your chromatographic peak to maintain data quality and consistency. I have peaks around 25 seconds wide, so this means my maximum duty cycle is ~1.5 s. Even on a Velos, the gaps in your FTMS data must be 4 s plus, I guess?

    In the data I presented at this year’s PMF, I clearly showed that acquiring optimal qualitative data and optimal quantitative data (EIC label free) are mutually exclusive. I only acquire top 10, 15, 20 etc. in cases where FTMS peak definition is not important.

    I think this very much explains our differences, even allowing for the clear XL/Velos differences. Essentially, I used 5 bin GPF to ‘end up’ with a top 15 like penetrance but without the sacrifice of losing lots of FTMS data.

    Hope this helps.



  9. Mal Ross
    Posted 20 July 2011 at 9:33 am | Permalink


    Thank you for your comment – I’m glad you found the article interesting. At this time, we can’t provide much more detail about Dr Smith’s methods, as he has yet to publish his research, but there are other published papers that have already used this technique, albeit without the benefit of Progenesis. One such paper is publicly available:

    The zebrafish lens proteome during development and aging

    Also, as stated in the article here, we’re hoping to publish an application note soon. This will give further guidance on how you can apply GPF in your lab.

    Mal Ross

One Trackback

  1. […] NEW RELEASE for label-free LC-MS – Progenesis LC-MS  – this is the very first chance to see  the latest release which has an extended workflow for quantification and identification of proteins – including protein statistics. There is also a new feature to support Gas-Phase Fractionation (GPF), which helps increase proteome coverage, as shown in our previous post. […]
