Do you only check data quality when something has gone wrong?

Generating proteomics data from an LC-MS platform is by no means inexpensive: a great deal of time is invested in preparing samples, preparing the columns and optimizing the mass spec conditions to generate this complex and rich data. With so many parameters that can and do go wrong, can you really afford to throw your data into a “black box” and trust the results that come out of it?

I began writing this as I flew back from Berlin, having had some great conversations about the importance of data quality with scientists gathered for the Potsdam Proteomics Forum. Conversing with Progenesis customers demonstrated to me the great value that the variety of visualizations provides. These enable results that Progenesis users are confident about. One of our German customers told me about an experiment where everything seemed fine until protein identification was carried out and some of the runs showed very few identifications. This flagged a potential issue and, using Progenesis, he was able to look back at the QC metrics page (fig. 1) to find that some of the samples had high numbers of missed cleavages in the trypsin digestion, indicating that the digestion had stopped working well. Although this was a painful realization, there was a quick resolution to what could otherwise have been a very drawn-out procedure of looking back step by step through all of the things which could have gone wrong. He was therefore very pleased about the time he was able to save here.

QC metrics from Progenesis QI

Figure 1 – QC metrics in Progenesis QI for Proteomics
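The missed-cleavage check described above can be approximated outside the software too. The sketch below is only an illustration and not how Progenesis computes its QC metrics: it assumes the common trypsin rule (cleavage after K or R, except before P) and uses hypothetical function names to count internal uncleaved sites per peptide and the share of affected peptides in a run.

```python
import re

# Assumed trypsin rule: cleaves C-terminal to K or R, but not when followed by P.
# "[KR](?=[^P])" matches a K/R that has a following residue other than P,
# so a C-terminal K/R (the expected cleavage point) is never counted.
_MISSED_SITE = re.compile(r"[KR](?=[^P])")

def count_missed_cleavages(peptide: str) -> int:
    """Number of internal tryptic sites left uncleaved in one peptide."""
    return len(_MISSED_SITE.findall(peptide.upper()))

def missed_cleavage_rate(peptides) -> float:
    """Fraction of identified peptides in a run with at least one missed cleavage."""
    peptides = list(peptides)
    if not peptides:
        return 0.0
    affected = sum(1 for p in peptides if count_missed_cleavages(p) > 0)
    return affected / len(peptides)

# Hypothetical identifications from one run.
run_peptides = ["LVNELTEFAK", "AKLVNELTEFAK", "KKR", "AKPLR"]
rate = missed_cleavage_rate(run_peptides)
```

Comparing this rate between runs gives a rough indication of whether the digestion behaved consistently across the experiment.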

Speaking to a customer from the Otto von Guericke University of Magdeburg highlighted an issue that we all recognize: “I do a search and get two different accession numbers for the same protein!” I can’t say that we are able to solve that issue, as it pertains to the quality of the libraries and database redundancies; however, Progenesis does offer you more confidence in the assignment of peptides to proteins and therefore in the quantitative accuracy. Peptide correlation scores (see figure 2) can help you remove peptides that have been incorrectly assigned to a protein. Once you have refined your dataset to the proteins of interest (those that are significantly changing between conditions), you should expect in most cases that the peptides of a particular protein show the same direction of change, i.e. up or down regulation. If you see a peptide that is behaving differently, you can remove it from the protein to give you better, more confident quantitation.

(NOTE: watch out for the upcoming application note based on the analysis of an ABRF dataset that clearly highlights the benefit of peptide correlation scores.)

Peptide correlations to qualify correct assignment of peptides to proteins

Figure 2 – Protein review in Progenesis QI for Proteomics
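To make the idea behind peptide correlation concrete, here is a minimal sketch. It is not the Progenesis scoring formula, which is not described here; it simply assumes a leave-one-out Pearson correlation between each peptide's abundance profile and the mean profile of the other peptides assigned to the same protein. The function name and the example abundances are hypothetical.

```python
import numpy as np

def peptide_correlations(abundances):
    """Pearson r of each peptide's profile vs the mean profile of the others.

    abundances: array of shape (n_peptides, n_runs) for a single protein.
    """
    x = np.asarray(abundances, dtype=float)
    scores = []
    for i in range(x.shape[0]):
        # Mean profile of all other peptides of this protein (leave-one-out).
        others = np.delete(x, i, axis=0).mean(axis=0)
        scores.append(np.corrcoef(x[i], others)[0, 1])
    return np.array(scores)

# Hypothetical protein: runs 1-3 are control, runs 4-6 are treated.
# Three peptides go up in the treated runs, the fourth goes down.
protein = [
    [10.0, 11.0, 10.0, 20.0, 21.0, 19.0],
    [5.0, 5.5, 5.0, 10.0, 11.0, 9.5],
    [8.0, 8.0, 9.0, 17.0, 16.0, 18.0],
    [12.0, 13.0, 12.0, 6.0, 5.0, 6.0],
]
scores = peptide_correlations(protein)
suspect = np.where(scores < 0)[0]  # indices of peptides moving against the rest
```

In this toy example only the fourth peptide gets a negative score, which is the kind of peptide you would review before deciding whether to remove it from the protein.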

LC-MS data analysis inevitably comes with a variety of assumptions, and those assumptions don’t always stand up to the test. If your data analysis happens in a “black box”, it’s quite possible that the results are misleading you. This can result in spending valuable time researching false positives, or in neglecting the genuinely interesting results because of false negatives, both of which are very costly.

 


Progenesis QI software presents you with four crucial ways to QC your data before, during and after analysis.

1) 2D ion intensity maps (see fig. 3) can flag sample running problems. This quick view gives you the ability to:

a) pick up on any samples that may need re-running

b) adjust your chromatography to improve separation

Import Screen 2D ion intensity maps to QC your runs

Figure 3 – Ion intensity map shown at the Import Data step

If you do need to re-run problematic samples then Progenesis is flexible enough to enable you to add those samples into your experiment at a later point, maximizing your time and resources.

If you want to hear more from a real life example, Prof. Paul Langlais gives a very informative and entertaining account entitled ‘From the Dark to the Light: How Progenesis Added Years to my Life’, offering some great insights about how he was able to use visual QC in Progenesis to optimize the LC-MS set-up in his lab.

2) The review alignment screen (see fig. 4) allows quick visual assessment (and improvement, if needed) of alignment quality. Progenesis provides percentage alignment scores and color-coded views so you can easily assess the quality of alignment before you start drawing conclusions from the peak picking and co-detection.

Review alignment step - a quick way to see if you have good alignment

Figure 4 – Review alignment screen

3) The PCA plot in the statistics screen (see figure 5) will allow you to quickly gauge whether your conditions are the primary reason for the separation in your experiment or if there is a systematic reason (/error) for separation between samples, such as the running order. The PCA plot below shows an experiment in which the samples are not clustering according to the experimental design and there are other factors that need to be controlled in order to get good results.

Principal Component Analysis: quickly find outlier samples or check that your samples separate according to your experimental design

Figure 5 – PCA Plot showing poor separation of groups
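If you export your measurements, the same kind of check can be sketched in a few lines. The example below is an illustration under stated assumptions, not the PCA implementation used in Progenesis: it runs a plain SVD-based PCA on a synthetic runs-by-features matrix and compares how strongly the first component tracks the experimental condition versus a hypothetical run order.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project runs (rows) onto the first principal components via SVD."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)  # centre each feature across runs
    U, S, _ = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]

# Synthetic data: 6 runs x 10 features, conditions interleaved in the run order.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 0.1, size=(6, 10))
condition = np.array(["A", "B", "A", "B", "A", "B"])
run_order = np.arange(6)              # hypothetical acquisition order
X[condition == "B", :3] += 5.0        # condition B shifts three features

scores, explained = pca_scores(X)
is_b = (condition == "B").astype(float)
r_condition = abs(np.corrcoef(scores[:, 0], is_b)[0, 1])
r_run_order = abs(np.corrcoef(scores[:, 0], run_order)[0, 1])
```

When the first component correlates far more strongly with the condition than with the run order, as it does for this synthetic data, the separation is more likely to reflect the experimental design than a systematic effect.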

4) The QC Metrics screen (figure 1) offers many useful metrics to help you make sure your system is running optimally. If you spot something strange happening, this screen can also offer insights to help you find the cause of problems, such as the trypsin degradation that our friend from Bochum picked up on.

Quality data analysis now extended to MS1 labelled data

You can now confidently analyze your SILAC or di/tri-methyl labelled proteomics data with an export from Progenesis QI for proteomics into Proteolabels. You will benefit from the “no-missing-values” approach of Progenesis co-detection and gain a great advantage from Proteolabels’ ability to auto-detect and find pairs or triplets, even when only one member of the pair or triplet has been identified. This, together with the many visual QC displays, means that you can be confident of getting maximum information from your samples.

Figure 6 shows the benefit in sensitivity that you gain through Progenesis co-detection and Proteolabels.

Proteolabels slide showing benefits in terms of sensitivity gained by Progenesis co-detection

Figure 6 – Diagram to show benefits of peak co-detection

A couple of other Proteolabels features that will further increase confidence in your labelled data analysis are peptide scoring (figure 7) and the use of these scores in weighted averaging at the protein quantitation step (figure 8).

Images showing QC graphics from Proteolabels to help you qualify the accuracy of your quantitation with peptide scoring

Figure 7 – Peptide scoring

Proteolabels protein inference and peptide scoring

Figure 8 – Protein inference and weighting factors in peptide ratio

Proteolabels gives many visualizations which will help you to QC your data analysis before you draw conclusions. We have only shown a few here. For more information on Proteolabels please get in touch with us via email at the address demolicence@nonlinear.com.

Finally, while on the topic of data integrity, you can automate even more of your data handling using Symphony Data Pipeline, thus removing some of the manual steps where ‘things’ could go wrong.

To summarize, Progenesis QI for proteomics offers data quality and assurance along with data transparency (QC metrics, alignment scores, etc.), as does Proteolabels (peptide scoring and weighting). This also means the benefits of co-detection are extended to your labelled analysis. Symphony reduces human error of repetitive tasks, allowing you to support data quality and thereby giving you confidence and reliability in your results.

If you’re using a “black-box” solution and would also like to have more transparency and confidence in your data analysis, get in touch with us by email at the address demolicence@nonlinear.com.

You can now analyse your labelled data with Progenesis!

During an upcoming webinar on 28th March, you can hear about a new offering for quantitative proteomics. Progenesis QI for proteomics (QIP) now has the capability to analyse samples where stable isotope labels have been added, including SILAC and dimethyl labelling. These capabilities are added through a new module called Proteolabels, from Omic Analytics Ltd. Proteolabels supports data sets from any vendor and search engines that are supported by Progenesis QI for proteomics.

How does it work?

In a regular label-free experiment in Progenesis QI for proteomics, different LC-MS runs are aligned and then, using the co-detection method, the same (peptide) ions are quantified in every LC-MS map. In your workflows, if you introduce an in vivo label on an amino acid (e.g. in SILAC) or following digestion (e.g. in dimethyl labelling), different samples can be multiplexed within the same run, with a mass shift introduced per peptide. Proteolabels is able to detect pairs (duplex mode) or triples (triplex mode) and produce peptide and protein ratios as well as statistics for differential expression analysis. Labs that employ label-free and label-based methods can now use the same software for both types of analysis!

slide

Slide from: Discovery and Analysis of Peanut Allergens using Proteomic approaches with Ion Mobility and High Resolution Mass Spectrometry, by Waters Corporation Food Research.

How does it interface with Progenesis QI for proteomics?

In Progenesis QIP version 3 onwards, there is an option to launch Proteolabels for those customers that have purchased the add-on module. Keeping with the usual workflow, Progenesis QI for proteomics performs alignment, peak co-detection and database searching; thereafter, clicking “Export to Proteolabels” will open the data in the Proteolabels module.

Screenshots showing the connection between Progenesis QI for proteomics and Proteolabels

Figure 1: The connection between Progenesis QI for proteomics and Proteolabels. If you have the Proteolabels module installed, clicking “Export to Proteolabels” launches the new software and performs downstream quantitative processing.

Detecting high-quality peptide pairs

Proteolabels first reads the identified peptides, then uses its “Auto-Detect” function to work out the labelling strategy used and the best thresholds for detecting pairs (or triples in triplex mode). You can ask Proteolabels to quantify only peptides where both the light (unlabelled) and heavy (labelled) ions have been identified in the search. However, Proteolabels is able to increase sensitivity by profiling these pairs, and then looking for a quantified peptide in the expected location of the LC-MS map to form pairs (without needing both to have been identified). Due to the way in which Proteolabels finds and scores the quality of peptide pairs detected, if you select the setting requiring only that a confident ID has been made for one of the light or heavy peptides (Blind Pairing), there can be major increases in the number of proteins quantified, with no loss of precision, as shown in Figure 2:

Chart showing a 30% gain in the number of proteins quantified with no loss of precision

Figure 2: Analysis of one experimental data set deposited in the ProteomeXchange repository (PXD003284 – http://www.proteomexchange.org (1)) exploring the coefficient of variation (CV) across replicates for protein-level ratios. Enabling the feature in Proteolabels to quantify peptides without requiring both to be identified gives a 30% gain in the number of proteins quantified, with similarly high levels of precision.
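The idea of looking in the "expected location" of the LC-MS map can be illustrated with a small sketch. This is not the Proteolabels algorithm. It assumes a SILAC duplex with 13C6 15N2 lysine (+8.014199 Da) and 13C6 15N4 arginine (+10.008269 Da), and the feature layout, tolerances and function names are hypothetical.

```python
# Assumed SILAC heavy labels: mass shift in Da per labelled residue.
LABEL_SHIFTS = {"K": 8.014199, "R": 10.008269}

def expected_heavy_mz(sequence, light_mz, charge):
    """m/z at which the heavy partner of an identified light peptide should appear."""
    shift = sum(LABEL_SHIFTS.get(aa, 0.0) for aa in sequence)
    return light_mz + shift / charge

def find_partner(sequence, light_mz, charge, light_rt, features,
                 ppm=10.0, rt_tol=0.2):
    """Return the closest quantified feature at the expected m/z and retention time.

    features: list of dicts with keys "mz", "rt" (minutes) and "charge".
    ppm and rt_tol are illustrative tolerances, not recommended settings.
    """
    target = expected_heavy_mz(sequence, light_mz, charge)
    candidates = [
        f for f in features
        if f["charge"] == charge
        and abs(f["mz"] - target) / target * 1e6 <= ppm
        and abs(f["rt"] - light_rt) <= rt_tol
    ]
    return min(candidates, key=lambda f: abs(f["mz"] - target), default=None)

# Hypothetical example: a doubly charged light peptide with one lysine.
features = [
    {"mz": 586.3262, "rt": 35.1, "charge": 2},  # plausible heavy partner
    {"mz": 586.9000, "rt": 35.1, "charge": 2},  # unrelated feature
    {"mz": 586.3262, "rt": 41.0, "charge": 2},  # right m/z, wrong retention time
]
target = expected_heavy_mz("LVNELTEFAK", 582.3190, 2)
partner = find_partner("LVNELTEFAK", 582.3190, 2, 35.0, features)
```

Only one of the two peptides needs an identification here. The partner is found purely from its expected position, which is the principle behind pairing on a single confident ID.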

Since Proteolabels follows on from Progenesis QI for proteomics, there are also sensitivity gains from the co-detection method. This means there are no missing values when analysing multiple replicates or multiple sample conditions, as peptides can be quantified with identification evidence propagated across multiple runs. For example, we ran a simple test on data from one fraction of a public data set – PXD003284 (http://www.proteomexchange.org/) to see the difference between running only a single replicate, versus analysis of that same file alongside two further replicates. As shown in Figure 3, in this analysis, co-detection gains around 75% in the number of peptides quantified, and 39% in the number of proteins.

Chart showing the benefits of co-detection to the number of peptides and proteins quantified

Figure 3: Analysis of one fraction, one sample from PXD003284 versus the same sample, co-detected with two additional replicates. The co-detection feature in Progenesis QI for proteomics gives a 75% gain in the number of peptides quantified and a 39% gain in the number of proteins in that sample via a Proteolabels analysis.

Improving data quality with the Peptide Score

Proteolabels applies a “Peptide Score” to all pairs (or triples) based on profiling the chromatogram match and drift time (where ion mobility separation has been applied). Peptides with a low pair score get down-weighted when it comes to protein-level quantification. In the example in Figure 4, both peptide pairs have been confidently identified, and there is a good elution time and mass/charge match. Most software packages would accept this as a reliable quantification value. Proteolabels is able to detect that the elution profiles of the light and heavy peptides in the right panel do not match well and that this quantification is likely to be less reliable.

Screenshots showing improved data quality via Proteolabels' peptide score

Figure 4: Peptide scoring in Proteolabels detects poorly matched pairs or triples, up-weighting the most reliable quantitative values at the protein-level.

How is protein quantification performed?

For most proteins, there are multiple peptides reported that could contribute to the final protein quantification value. Proteolabels first performs grouping of proteins based on shared peptides, and then applies a novel weighted averaging of signals based on the Peptide Scores and the signal intensity of peptides to arrive at the protein-level ratio. In other label-based approaches, it is common for the protein ratio to be inferred from the median peptide ratio to remove outliers. In Proteolabels, weighted averaging is superior to the median peptide ratio, especially for proteins quantified by a small number of peptides, as it allows all peptides to have some contribution towards the final protein-level quantification value. The combination of co-detection, peptide profiling/scoring and intelligent protein quantification affords both high precision and high accuracy quantification (Figure 5).
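A small numerical sketch shows why weighting can help when only a few peptides are available. The exact weighting used by Proteolabels is not given here, so the code assumes, purely for illustration, that each peptide's weight is its Peptide Score multiplied by its signal intensity and that ratios are averaged on a log2 scale. The function name and values are hypothetical.

```python
import numpy as np

def weighted_protein_log2_ratio(ratios, scores, intensities):
    """Weighted mean of peptide log2 ratios (assumed weight = score x intensity)."""
    log_ratios = np.log2(np.asarray(ratios, dtype=float))
    weights = np.asarray(scores, dtype=float) * np.asarray(intensities, dtype=float)
    return float(np.sum(weights * log_ratios) / np.sum(weights))

# Hypothetical protein with two peptides: a well-matched pair (ratio 2.0)
# and a poorly matched pair (ratio 0.5) of equal intensity.
ratios = [2.0, 0.5]
scores = [0.9, 0.1]
intensities = [1.0e6, 1.0e6]

weighted = weighted_protein_log2_ratio(ratios, scores, intensities)
median = float(np.median(np.log2(ratios)))
```

With two peptides the median sits halfway between them (log2 ratio 0.0, i.e. no change), whereas the weighted average (0.8) stays close to the better-scored peptide while still letting the weaker one contribute.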

A volcano plot showing how the high precision of co-detection and pair finding gives the ability to detect differential expression with confidence

Figure 5: A volcano plot of data from one experiment of the PXD003284 data set, processed with Proteolabels. The high precision from co-detection and pair finding enables reliable detection of differential expression (FDR-corrected p-values < 0.05), down to modest fold change values.
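The caption above refers to FDR-corrected p-values without naming the correction used. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg procedure; the function name and p-values are hypothetical.

```python
import numpy as np

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank.
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted

# Hypothetical protein-level p-values.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = adjusted < 0.05
```

Proteins passing the adjusted threshold are the ones that would be highlighted as differentially expressed on a volcano plot, usually in combination with a fold change cut-off.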

How to check the quality of your data?

Proteolabels has a variety of intuitive QC metrics and plots for examining your data (Figure 6), and is interactive at each stage, enabling you to be confident that the data is high-quality, ready for downstream interpretation. Different plots can show you how well the instrument was calibrated, the distribution of identification and peptide scores, and any relationships between the abundance of peptides and the reliability of the quantification.

A selection of the plotting and data exploration features in Proteolabels

Figure 6: Proteolabels provides a range of plotting and data exploration features.

Do you want to hear more about Proteolabels?

If you would like to learn more, please register for the webinar and/or read the press release.  If you would like to try Proteolabels, please contact us.

Prof. Andy Jones

References

  1. Patella, F., Neilson, L. J., Athineos, D., Erami, Z., Anderson, K. I., Blyth, K., Ryan, K. M., and Zanivan, S. (2016) In-Depth Proteomics Identifies a Role for Autophagy in Controlling Reactive Oxygen Species Mediated Endothelial Permeability. J Proteome Res 15, 2187-2197

Report research with confidence – Who else wants to publish small molecule LC-MS analysis confidently?

Meaningful results

Over 800 groups worldwide are using Progenesis QI routinely to generate small molecule and proteomics results that really reflect the effect of the conditions in the experimental design, so they can have confidence in presenting these results to their peers, with minimal fear of false positives. How can we claim that only Progenesis gives this sort of confidence? The answer is in the unique and powerful co-detection approach Progenesis takes to peak picking of LC-MS data. Read on to find out about some of the interesting applications Progenesis technology is being applied to…

Research publications news from Asia-Pacific

We’ve seen a huge growth of Progenesis QI sales in Asia-Pacific over the past 3-4 years and it’s good to see that this is now being followed by a similarly impressive boost in publications citing Progenesis QI from the region, particularly from China where there were 16 publications citing Progenesis QI last year and 3 already this year.

Each year, more and more publications are citing the use of Progenesis

Figure 1: The number of publications per year (worldwide).

There have also been publications from Institutes in Japan, South Korea, Taiwan, Singapore and India in the same period. In view of this I thought it would be a good time to review the applications covered by these publications and to highlight a couple of particularly interesting ones in a little more detail.

Broad applications

The broad application fields covered by the publications include clinical and health science, food and nutrition, plant science, natural products and environmental research. Natural products research in China is dominated by research into Traditional Chinese Medicine (TCM) for which there are large departments in many universities and even entire universities specialising in this field. The research involves investigating the mechanism of action, active ingredients and safety of traditional remedies, definitely an interesting case of “old meeting new”. Traditional remedies which may have been used for hundreds of years are now being analysed with cutting edge technology including high resolution LC-MS systems and Progenesis QI software in the hope that isolating the active ingredients and understanding how they work could lead to development of new drugs. Food and nutrition is another field where modern omics research is being applied to a traditional industry for purposes such as improving food quality and taste as well as food safety and quality control. It’s the exceptional ability of Progenesis QI to “find the needle in the haystack” – detect subtle differences in the profiles of complex metabolite mixtures that enables success in these different fields of research and analysis.

Some markets which already use Progenesis QI software

Figure 2: Some markets which already use Progenesis QI software

I’d now like to highlight two very recent publications which illustrate the high level and quality of research being performed with the help of Progenesis QI in Asia. Firstly, a publication from China which shows that in addition to the large amount of plant and natural products research, there is also high level clinical research taking place.

Lipid Profiles in Maternal Plasma

A group at Chongqing Medical University have performed a lipidomics study on the mechanism of Gestational Diabetes Mellitus (GDM) by monitoring changes in lipid profiles in maternal plasma throughout pregnancy. The ultimate goal is to understand the mechanism and cause of a condition which can lead to serious long term consequences for both mother and foetus.

The study involved 61 participants, 34 controls and 27 diagnosed with GDM. Plasma lipid levels were monitored at different stages of pregnancy (at each of the trimesters) through separation and detection on a Waters Acquity UPLC I-class system coupled to a Waters Xevo G2 QTof MS. The data was processed in Progenesis QI including relative quantification via the unique and powerful co-detection method which generates no missing values leading to more reliable statistics and therefore better results, plus identification using Progenesis MetaScope and ChemSpider. As Progenesis QI can export data using a flexible *.csv format, sophisticated multivariate statistical analysis can easily be performed in external software to extract some meaningful results from the experiment despite the large biological variance always found in clinical studies. In this way trends in relative levels of a large number of lipid species throughout pregnancy were monitored in both control and GDM subjects and a number of polyunsaturated or chemically modified phospholipids were found to be present at significantly lower relative abundances in GDM compared to control subjects throughout pregnancy. These results will contribute to better understanding of the mechanism and causes of GDM.

Progenesis Alignment and co-detection Workflow infographic

Figure 3: How the Progenesis workflow enables 100% matching and no missing values, meaning reliable statistics which will lead you to confident biological discoveries.

Singapore Phenome Centre Publishes Research to Benefit the Environment

The second publication I’d like to highlight is significant as I believe it is the first to be produced by the Singapore Phenome Centre (SPC) based at the Lee Kong Chian School of Medicine, Nanyang Technological University (NTU). The SPC is a member of the International Phenome Centre Network (IPCN) and was set up in association with the National Phenome Centre at Imperial College, London and the Waters Corporation to conduct research in two main areas, clinical and environmental. This initial publication is in the environmental field and is concerned with profiling the metabolites and lipids that form the Soluble Microbial Product (SMP) contained in the effluent from bioreactors used in wastewater treatment. Since SMPs can lead to issues such as fouling of membranes in the bioreactors, it’s important to understand their composition, origin and the system parameters that can influence their production.

The experiment consisted of analysing filtered samples taken from a continuously stirred tank anaerobic bioreactor at intervals of 0, 4 and 48 hours after batch feeding it with a synthetic feed mixture. To ensure maximum metabolite coverage, liquid-liquid extraction was used to separate samples into polar metabolite and lipid fractions, which were both analysed in positive and negative mode (three replicate injections per condition). Analysis was performed using a Waters Acquity UPLC system with an HSS T3 column for polar and a CSH C18 column for lipid separations, coupled to a Xevo G2-XS Q-Tof MS. Again, the power of Progenesis QI co-detection was used to process the data and find the compounds that were significantly changing in relative abundance between the different time points, as well as for identification of compounds of interest. For further multivariate analysis including OPLS-DA, data was exported from Progenesis QI directly into EZinfo. Identification used both Progenesis MetaScope and ChemSpider, in both of which it is possible to search many databases for maximum coverage while filtering by isotope distribution and theoretical fragmentation analysis to improve discrimination of results.

Image of a Waters Xevo G2 XS QTOF machine


Due to good experimental technique and the alignment and co-detection capabilities of Progenesis QI, technical replication was excellent, enabling very clear distinctions between the conditions as shown with PCA. Using multivariate statistics such as OPLS-DA and the S-plot (in EZinfo) it was possible to extract the compounds that contributed most significantly to the pair-wise differences between the conditions. Of these compounds, those that increased in relative abundance from 0-4 hours (fermentation stage) included both polar metabolite and lipid species, while those that increased from 4-48 hours (methanogenesis stage) were all lipids, mainly phospholipids and cardiolipins (diphospholipids). As this study was the first to include both polar metabolite and lipid SMPs, it gave a more complete picture than was previously available of the metabolic processes occurring at different stages of wastewater treatment.

I’m sure we will see many more great publications citing Progenesis QI in the coming months so perhaps I’ll get the opportunity to give some further updates on them quite soon. In the meantime I invite you to download Progenesis QI and see how it can help you to generate high quality data for publication.

New Year, new release

With the new year comes a new release of Progenesis QI, version 2.3. We are making this available free of charge to all our users running v2.0, v2.1 and v2.2.

Here are 3 reasons you might want to upgrade.

1 Symphony support

This upgrade will allow you to use Progenesis QI in conjunction with Symphony data pipeline, so you can conduct the most time-consuming parts of the informatics workflow in parallel to acquisition. Here’s a representation of a timed comparison of the conventional workflow and the Symphony workflow.

Diagram showing the informatics workflow speed-up of almost twice using Symphony

In the words of one of our Progenesis QI and Symphony users:

Dr. Paul Skipp

“Symphony offers a solution to address many challenges, providing a platform with automated, flexible and adaptable workflows for high-throughput handling of proteomic data. Just the simple step of being able to seamlessly and automatically copy raw files to a remote file location whilst a column is conditioning, maximises the time we can use the instrument for analysis. Previously, the instrument could be idle for 1-2 hours whilst data is copied to a filestore in preparation for processing. With three Synapts generating data 24/7 in our laboratory, this alone is a major advance.”

“Symphony’s flexibility of being able to execute sample specific workflows directly from the Masslynx sample list will have a major impact on our productivity.”

Paul Skipp, Associate Professor in Proteomics and Centre Director, Centre for Proteomics Research, University of Southampton, UK.

2 SONAR support

Progenesis QI v2.3 has support for SONAR, an exciting new Waters technology that gives increased confidence and speed to busy mass spec/Omics research laboratories that need to get the answer right the first time. With efficient workflows, SONAR offers new possibilities, with an acquisition mode that collects MS/MS results from a Data Independent Acquisition (DIA) experiment.

3 Usability improvements

We have listened to our users and made some changes that make the Progenesis QI workflow smoother to use. With better zooming, improved flexibility using adducts and exporting known unknowns, Progenesis QI is looking in good shape for 2017!

Would you like to give it a try?  You can download v2.3 here or contact us if you would like to renew your maintenance plan.

Endings and beginnings

It’s been a busy year at Nonlinear, seeing new releases of both Progenesis QI and Progenesis QI for proteomics.  To all of our loyal users, we’d like to say:

Elements from the periodic table spelling out Thank you

We very much appreciate your continued support and partnership in the further development of Progenesis QI.

In addition, we have had some new starters; here’s a little information about them:

Janusz Debski PhD, Applications scientist

Photograph of new starter Dr Janusz Debski, Applications specialist

Janusz joins us with years of experience in mass spectrometry, having spent his career in the laboratory. Prior to joining Nonlinear, he was working as Core Facility Manager in the Mass Spectrometry Laboratory at the Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw.

Janusz has experience of great use to us here at Nonlinear, as he really understands the needs of our users, having been responsible for all the aspects of running a core facility.  He understands the pressures that senior scientists have from ensuring tight quality control, writing grant applications, training new staff, designing experiments, data analysis, managing collaborations, and more besides.  It’s really useful for us to have that perspective within our four walls.  Also, Janusz has used mass spectrometers from a number of manufacturers, which is again very helpful, given the multi-vendor nature of Progenesis QI.  We are so pleased to have Janusz on board. 🙂

Paul Henesey: a new addition to the test team

Photograph of new starter Paul Henesey, Test scientist

In Paul’s own words…
“I was born and bred in Wigan. I’ve been an Everton fan from birth, not being built for rugby! Since graduation from Newcastle University with a Genetics degree, I’ve worked as a Cytogeneticist in Leeds, Wellington (NZ) & Sydney (OZ). I subsequently worked for a company that supplied medical devices to the Cytogenetics and Pathology markets. Over the years there, I fulfilled many roles including service & support, QA and R&D test manager.

I’m married to Donna and we have 2 kids, Noah and Alice. Unfortunately for Noah, he also supports Everton! When I’m not taxiing kids to places, I enjoy a round of golf, watching football and I enjoy drinking beer. Since joining Nonlinear, I’ve rediscovered a forgotten love of playing tabletop games and pool!

It’s great to be at Nonlinear. It’s an exciting new challenge to be the latest addition to the development team, specifically as a tester in the test team. I hope to quickly learn about the products, the science and test methods used by the guys here, whilst also bringing my experience to the team to help continue delivering quality software.”

So, now you know about our new starters…

You will be pleased to learn that we have lots of exciting plans for 2017, so do keep reading here. If you have any suggestions for the blog, or would like to contribute, please don’t hesitate to submit them to us – we appreciate your feedback and contributions.

We are now nearly upon the holiday season and our offices will be running at reduced capacity from Friday 23rd December through to Tuesday 3rd January.  Finally, we would like to wish you peace and joy at the close of 2016 and every success in your research for 2017.

“…we can’t manage this amount of data with normal software…”

Here at Nonlinear, we love to learn about how researchers use Progenesis QI and how it helps them in their day-to-day lives. Below, Dr. Daniel Carrizo tells us in his own words about his use of Progenesis QI to assess exposure to persistent organic pollutants (POPs).

Daniel has two affiliations:

Astrobiology Centre (CSIC-INTA)
Dept of Planetology and Habitability
Torrejón de Ardoz 28850
Madrid
Spain

Institute for Global Food Security
Queen’s University Belfast
Belfast
UK

“I am working with human samples exposed to background levels of contamination. There are two conditions; high and low levels of exposure to organic pollutants. By comparing the lipidomic profile in human serum samples, I try to find any significant differences and ascertain whether they are related to different levels of exposure. Of course, the idea is to find a biomarker or metabolites for this exposure, in this case, high exposure to POPs.

Interior of the Astrobiology Centre (CSIC-INTA)

In this experiment, I used liquid chromatography-quadrupole time-of-flight mass spectrometry in ESI (− and +). At the beginning of the experiment, I used 10 pooled samples representative of the whole sample set. So if I had 100 samples, I took 10 µl of each, then homogenized the pooled sample and took an aliquot (approx. 300 µl). Then from the 10 target samples I ran 2 of these pooled samples, as a QA/QC routine. I ran 3 replicates of each target sample, which amounted to between 300 and 500 runs ready for analysis.

PASC chamber, for planetary atmospheres and surfaces simulations

Of course, the data generated is too complex to analyse without specific software like Progenesis. When you have 3000 or 4000 ions of interest and 300 samples, it is impossible to manage this amount of data with normal software. I have found Progenesis QI is robust and easy to use and the technical support is excellent.

Progenesis QI helps me to overcome problems with background peaks and experimental design, and I can easily search for potential compound identifications.

The most important aspect is the power of the analysis and robustness of the data generated, as well as the easy design for setting up experiments within the software. With Progenesis QI, you can do or redo the experimental setup on imported data as you need and explore the data generated over time. Progenesis QI has helped hugely in the identification of key compounds linked to lipid metabolism, which were responsible for the homeostasis of the metabolism.

Looking ahead to our use of Progenesis QI in the future, I think the key point is the robustness of the data. Firstly, we find important evidence of novel biomarkers related to our experimental conditions; that is, high vs. low exposure levels. Then, as we have this nice and sharp data, we will design and explore other types of samples and analysis/conditions.

I advise people to try Progenesis QI because of its robustness and easy-to-design experiments. Another important feature of this software, in my experience, is the simple process for finding real identifications of the possible metabolites or biomarkers you detect.”

So that was Daniel’s story; how about you?

  • Do you have a research story involving Progenesis QI that you’d like to share with us? Please contact us – we’d be happy to hear from you.
  • Having read this account of Daniel’s work, would you like to try Progenesis QI yourself? Download it here.
  • Would you like to read more stories about people using Progenesis QI? Here are 3 recent blog posts relating to researchers’ experiences with Progenesis QI:

The Good, the Better, and the Best of Progenesis QI
Kai P. Law & Ting-Li Han
China-Canada-New Zealand Joint Laboratory of Maternal and Fetal Medicine
Chongqing Medical University and Auckland University

Progenesis QI helps streamline data processing for lipidomics research
Jace W. Jones, PhD
Research Assistant Professor of Pharmaceutical Sciences

How Progenesis QI helps to rapidly quantify and effectively identify compounds in complex metabolomes such as Garcinia buchananii samples
Dr. Timo Stark
Food Chemistry & Molecular Sensory Science
Technische Universität München

For more user feedback, you can read reviews of both Progenesis QI and Progenesis QI for proteomics on the independent site, SelectScience.

It just remains for me to say a big “Thank you!” to Daniel for sharing his research with us. 🙂

How Progenesis QI resolves the problem of missing values

I’ve been at Nonlinear Dynamics for ten years now. In that time, we’ve seen the Progenesis range develop beyond just proteomics and, in 2013, we were acquired by Waters, although Progenesis QI will work on label-free data from any major MS vendor. I was originally brought into Nonlinear Dynamics to generate leads, so after 2 days of training, I started calling people to tell them about this unique technology. I loved my product and was really keen! However, sometimes people were so busy with their research and subsequent data analysis that they couldn’t stop to fully understand why Progenesis QI was so different. They had no time to save themselves some time! This can still be the case, even though independent reviews of Progenesis QI say things like this:

“Gold standard for label-free LCMS data analysis across all instrument platforms.”

Mark McComb, Boston University, US

So how can we get people to quickly understand why Progenesis QI is different? In order to do that, researchers need to understand the major problem in Omics data analysis: the holes in experimental data – known as missing values – that can be introduced by inefficient software. So, to get our point across in an easily-digestible, quick-to-read format, we produced this infographic to help you understand what switching to Progenesis QI means for your research. Please do have a look. If this piques your interest, at the end there is a 16-minute video in which Dr Paul Goulding describes in detail the scale of the missing values problem and how Progenesis QI uniquely resolves it.

Visual guide to the missing values problem and the unique Progenesis QI solution

Interested in learning more? Whatever instrument you use, why not download Progenesis QI or Progenesis QI for proteomics and analyse ALL of your data?

Would you like another clue? Look in the library.

Download the Waters Metabolic Profiling Collisional Cross Section (CCS) library

In metabolomics, we are detectives, gathering corroborative evidence from various parameters, such as accurate mass, retention time, etc. in order to draw a valid and correct conclusion.

Dr. Lee Gethings looking through a magnifying glass

“Noun: corroborative evidence – additional evidence or evidence of a different kind that supports a proof already offered previously”

Progenesis QI has just been given another piece of decisive corroborative evidence: you can now search using the new Waters Metabolic Profiling CCS library. The initial version of the library includes 956 metabolites and lipids, over 900 of which include CCS measurements and two thirds of which have MS/MS spectral information.

What is CCS?

Ion Mobility Separation (IMS) is a process that differentiates molecules as they tumble through a gas – their progress is related to their rotationally averaged collision cross section, or CCS. This is what makes IMS so powerful, because CCS is determined by unique molecular properties.

CCS is an important distinguishing characteristic of an ion, related to its:

  • chemical structure (mass, size)
  • 3-dimensional conformation (shape), where conformation can be influenced by a number of factors, including the number and location of charges

CCS is a robust and precise matrix-independent physicochemical property of an ion which can provide many powerful analytical advantages.

You can read more in this poster about the Design and application of a CCS and MS/MS Metabolic Profiling Library.

Diagrammatic representation of CCS

What are the advantages of CCS? Why would researchers be interested in this technology?

  • Ion Mobility Separation provides orthogonal separation for increased confidence in results; you can distinguish between co-eluting compounds of identical mass and elemental composition.
  • False positives can be removed and false negatives avoided using CCS values as a screening parameter
  • Mobility resolution facilitates spectral clean-up for both precursor and fragment spectra
  • CCS improves identification where retention time or mass shifts have been observed
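
As a rough illustration of using CCS as a screening parameter, here is a minimal Python sketch. The candidate names, CCS values and the 2% tolerance are all hypothetical and chosen only for the example; this is not how Progenesis QI or the Waters library implements the check.

```python
def ccs_deviation_percent(measured_ccs, library_ccs):
    """Signed deviation of a measured CCS from a library CCS, in percent."""
    return (measured_ccs - library_ccs) / library_ccs * 100.0


def screen_by_ccs(candidates, measured_ccs, tolerance_percent=2.0):
    """Keep only candidates whose library CCS lies within the tolerance."""
    return [
        c for c in candidates
        if abs(ccs_deviation_percent(measured_ccs, c["library_ccs"])) <= tolerance_percent
    ]


# Hypothetical co-eluting candidates with identical mass but different shape.
candidates = [
    {"name": "candidate A", "library_ccs": 180.0},
    {"name": "candidate B", "library_ccs": 191.5},
]
kept = screen_by_ccs(candidates, measured_ccs=181.2)
print([c["name"] for c in kept])
```

With these made-up numbers, only the candidate whose library CCS is close to the measured value survives, even though mass alone could not separate the two.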

Yesterday, I was talking to two Waters chemists who had been trying out the new CCS library for the first time with a customer. Here’s what they had to say:

“…during a customer demo, we used the new CCS library to screen clinical research samples in metabolomics experiments with special interest in steroid profiling. The aim of this project is the correct identification and relative quantification of the steroids, some of which have the same elemental composition, for a better understanding of diseases like hypertension caused by primary aldosteronism. This was my first time using the new Metabolic Profiling CCS Library with real samples. Over the last few months I’ve heard a lot of discussion regarding the comparability of CCS on different instruments, and some of these statements were misleading, as CCS is a physicochemical parameter and the CCS value must be independent of the acquisition platform. I was therefore very curious to see the outcome of the experiments, and it was a very positive surprise to see that the library entries and the measured CCS values gave fantastic matches, with deviations better than expected. This is even more amazing because it is regardless of lab, instrument, operator and continent. These results give my customers and me great confidence in the added value of CCS for correct identifications during non-targeted research projects, especially here with complex matrices and steroids having the same accurate mass…”

Gunnar Weibchen, PhD, Mass Spectrometry Sales Specialist in Germany

“I totally agree, it was excellent to see such good correlation in our results from a CCS database created in a different lab on different instruments by different users – this will be a major factor in helping us build customer confidence in the benefit of adding Ion Mobility measurements to our datasets and will help validate and standardise routine CCS analysis for our customers.”

Jonathan Fox, PhD, Principal Applications Chemist, European Applications Laboratory, Waters U.K. Limited

The determination of CCS values allows an extra measure of confidence for compound identification in Progenesis QI. Each ion’s CCS value can be compared against established values held in a supplementary database file – an additional properties file – as part of the identification process, and this increases the specificity of compound identification. As well as the new Waters Metabolic Profiling CCS Library, you can use an existing database of known CCS values or build one up based on empirical data from your own samples, for use in future experiments. Even if you don’t have ion mobility data, the MS/MS values in the library will be useful to you for identifications. We are looking to build libraries for people to share, so if you are interested in contributing please contact us.

So there you have it, another piece of corroborative evidence to help you identify your compounds with more confidence. The answer you are looking for could well be in the library 🙂

To find out more about the Progenesis QI software and how it can alleviate your identification issues please get in touch.

3 benefits for your compound data analysis you don’t want to miss!

Kai and Ting-Li have very kindly written an article about how they have benefited from an easy to use interface, were able to overcome challenges around correcting retention time shifts, and gained confidence in their ability to identify small molecules, correctly, by employing Progenesis QI to intelligently rank possible identifications. Moreover, they recount how Progenesis QI has empowered them with the ability to gain from the additional resolution of Data Independent Acquisition (DIA) data that Progenesis has deconvoluted “exceedingly well”. Here’s their account…

The Good, the Better, and the Best of Progenesis QI

Kai P. Law & Ting-Li Han
China-Canada-New Zealand Joint Laboratory of Maternal and Fetal Medicine
Chongqing Medical University and Auckland University
Email: kai.law1@virginmedia.com; morgan_han_0816@hotmail.com


Dr. Law (left), Dr. Han (right) and Waters’ engineer Mr. Da (middle)

Introduction

Metabolomics is a cross-disciplinary subject. Although 15 years have passed since it was first proposed, the field of metabolomics is still relatively young. Many challenges have presented themselves in the course of its development. One difficult challenge lies in the processing of highly complex, multi-dimensional datasets produced by mass spectrometry.

We are specialists in mass spectrometry and metabolomics. Our work ranges from clinical trials and molecular biology to method and technology development. Clinical studies and biological samples, collected by clinicians, are the most challenging. Not only do these clinical studies require a large number of samples (ranging from hundreds to thousands) to have adequate statistical power, but little control can be imposed over the patients. Sample type and quality vary considerably. Some of these are longitudinal cohort studies. To handle these challenges, innovations are imperative.

We choose Waters systems over other manufacturers because of their technological innovations, their high quality of service in the UK and China, and their easy to use informatics tools. One of the technological innovations of Waters’ Q‑ToF systems is data independent analysis (DIA). Waters called their approach MSE and this was introduced commercially in 2007. During MSE data acquisition, the energy of the collision cell is dynamically switched between low-energy and elevated-energy states. This produces alternating composite mass spectra of all intact molecular ions, followed by chimeric mass spectra of all product ions. Similar approaches were subsequently adopted by other manufacturers, such as Thermo AIF (all-ion fragmentation) and Agilent All-ion MS/MS in their Orbitrap and Q-ToF systems. DIA was developed to address the shortcomings of data dependent analysis (DDA) and has found applications both in metabolomics and in proteomics. However, processing of the DIA data comes with its own challenges. Progenesis QI is, in our view, one software package that processes MSE data efficiently.
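
To picture the alternating acquisition described above, here is a small Python sketch that splits an interleaved scan list into its low-energy and elevated-energy channels. It assumes strict alternation starting with a low-energy scan, and the scan labels are hypothetical; real data files record which channel each scan belongs to, so this is an illustration rather than a file reader.

```python
def split_alternating_scans(scans):
    """Split interleaved scans into (low_energy, elevated_energy) lists.

    Assumes strict alternation that starts with a low-energy scan.
    """
    return scans[0::2], scans[1::2]


def pair_precursor_and_fragment_scans(scans):
    """Pair each low-energy scan with the elevated-energy scan that follows it."""
    low_energy, elevated_energy = split_alternating_scans(scans)
    return list(zip(low_energy, elevated_energy))


# Hypothetical scan labels; a trailing unpaired low-energy scan is dropped by zip.
scans = ["low_1", "high_1", "low_2", "high_2", "low_3"]
pairs = pair_precursor_and_fragment_scans(scans)
print(pairs)
```

Each pair holds the intact-ion spectrum and the product-ion spectrum acquired at almost the same moment, which is the starting point for any later deconvolution.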

 

The Good (Easy to Use Interface)

The Progenesis QI graphical user interface (GUI) has been designed to streamline data processing, from data importing, chromatographic alignment, peak picking, deconvolution, data normalization and spectral feature annotation to data analysis. This is in contrast to other R-based or MATLAB-based pipelines or toolboxes, which normally use a command-line interface (CLI). Though flexible and extendable from developers’ and power users’ points of view, CLIs, with their steep learning curves, deter general users, whereas a nicely designed GUI empowers users at all levels. The Progenesis QI interface is not only easier to learn and use, but it allows users to fine-tune the chromatographic alignment and ion deconvolution. Commonly used statistical functions are available to assist data interrogation.

Screenshot from Progenesis QI in experiment design setup showing options for: (1) Between subject design and (2) Within subject design

Figure 1: Unlike other similar commercial or academic data processing software, Progenesis QI offers both a standard between-subject experimental design and a within-subject experimental design (repeated measurement of the same experimental subject). The latter design determines the p-value of a variable using a paired ANOVA that eliminates genetic, diet, and/or environmental effects between experimental subjects, thus allowing us to focus on the disease or condition we are investigating and not the natural variations among our patients.
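
The effect of a within-subject design can be sketched with a few lines of Python. With only two conditions, a repeated-measures ANOVA reduces to a paired t-test, so the sketch compares a paired t statistic with an unpaired (Welch) one on the same numbers. The abundances are hypothetical, and the functions are a plain standard-library illustration, not the statistics code used by Progenesis QI.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """t statistic for a within-subject (paired) comparison."""
    differences = [a - b for b, a in zip(before, after)]
    mean_diff = statistics.mean(differences)
    std_error = statistics.stdev(differences) / math.sqrt(len(differences))
    return mean_diff / std_error


def welch_t_statistic(group_a, group_b):
    """t statistic for a between-subject comparison (Welch's form)."""
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    std_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (statistics.mean(group_b) - statistics.mean(group_a)) / std_error


# Hypothetical abundances for five subjects measured under two conditions.
# Baselines differ widely between subjects, but each subject rises by about 10.
before = [100.0, 220.0, 340.0, 460.0, 580.0]
after = [110.0, 231.0, 349.0, 471.0, 589.0]

print(f"paired t = {paired_t_statistic(before, after):.2f}")
print(f"Welch t  = {welch_t_statistic(before, after):.2f}")
```

Because the paired statistic works on per-subject differences, the large spread between subjects drops out and the small, consistent change becomes visible; the unpaired statistic is swamped by that spread.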

The Better (Easy to Fine Tune)

Most popular data processing tools now align chromatograms very well. This was not the case several years ago. A challenge in chromatographic alignment is the non-linear shift in the retention times of metabolites. The retention time shift can be relatively large in large-scale studies, since the samples cannot be analyzed in one batch. MarkerLynx, introduced by Waters, aligned chromatographic data to an internal reference and assumed a linear retention time shift. This assumption rarely holds true for most metabolites from complex biological matrices. Consequently, the chromatographic binning window had to be set to a relatively large value and the results sometimes missed important information. XCMS was the first software to allow non-linear alignment. However, MarkerLynx could still perform better than XCMS, which can only align chromatographic peaks with a high degree of similarity.

Progenesis QI uses vectors to align chromatographic data. This greatly enhances the flexibility to modify chromatographic alignment, because users can drag and add (or remove) vectors to improve the alignment of an individual chromatogram to a chosen reference run. Indeed, no other popular data processing tool highlights the problem areas of the chromatograms and allows users to fine-tune the individual chromatographic alignment without changing the program parameters and re-running from the beginning.
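
One way to picture vector-based, non-linear alignment is as a piecewise-linear mapping between anchor points. The Python sketch below treats each alignment vector as a hypothetical (run retention time, reference retention time) pair and interpolates between neighbouring vectors. This is an assumption made for illustration; the actual alignment algorithm in Progenesis QI is not described in this article.

```python
from bisect import bisect_right


def warp_retention_time(rt, vectors):
    """Map a retention time from a run onto the reference run.

    `vectors` is a list of (run_rt, reference_rt) anchor pairs sorted by run_rt.
    Between anchors the mapping is linear; outside them the nearest
    anchor's offset is applied unchanged.
    """
    run_rts = [v[0] for v in vectors]
    if rt <= run_rts[0]:
        return rt + (vectors[0][1] - vectors[0][0])
    if rt >= run_rts[-1]:
        return rt + (vectors[-1][1] - vectors[-1][0])
    i = bisect_right(run_rts, rt)
    (x0, y0), (x1, y1) = vectors[i - 1], vectors[i]
    fraction = (rt - x0) / (x1 - x0)
    return y0 + fraction * (y1 - y0)


# Hypothetical anchors (minutes): the shift grows from 0.1 to 0.5 across the run.
vectors = [(1.0, 1.1), (5.0, 5.3), (10.0, 10.5)]
print(warp_retention_time(3.0, vectors))

# Adding one more vector refines the alignment locally without touching the rest.
vectors.insert(2, (7.5, 7.7))
print(warp_retention_time(8.0, vectors))
```

The second call shows why editing individual vectors is attractive: only the region around the new anchor changes, and nothing has to be re-run from the beginning.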

During ionization, a metabolite forms multiple ions, multiplying the complexity of the dataset. The data deconvolution algorithm in Progenesis QI performs ion deconvolution based on the user’s inputs. Reviewing ion deconvolution permits users to select (or deselect) additional adducts of a metabolite (see example below).

 

Screenshot from Progenesis data deconvolution screen showing adducts of the same compound that exhibit a difference in chromatographic profile but the same mass profile

Figure 2: Uric acid was detected as [M+H]+ and [2M+H]+ ions, but because the peak shapes were different, they were not grouped by deconvolution. However, these two ions both have the same retention time and so were assigned the same ID during compound identification. I was then able to go back to deconvolution, and make changes accordingly.
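
The two uric acid ions in Figure 2 can be related with simple arithmetic. The Python sketch below computes the expected m/z of the [M+H]+ and [2M+H]+ ions from the elemental composition of uric acid (C5H4N4O3), using standard monoisotopic masses rounded to six decimals. The function names are illustrative only.

```python
# Monoisotopic masses in Da (standard values, rounded to six decimals).
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074, "O": 15.994915}
PROTON = 1.007276  # mass of a proton (a hydrogen atom minus one electron)


def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def protonated_mz(mass, n_molecules=1):
    """m/z of a singly charged [nM+H]+ ion."""
    return n_molecules * mass + PROTON


uric_acid = {"C": 5, "H": 4, "N": 4, "O": 3}
mass = neutral_mass(uric_acid)
print(f"[M+H]+  m/z = {protonated_mz(mass, 1):.4f}")
print(f"[2M+H]+ m/z = {protonated_mz(mass, 2):.4f}")
```

Two features at these m/z values and the same retention time are therefore consistent with a single compound, which is the reasoning behind grouping them manually in the deconvolution step.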

The Best (Confidence in Identification)

Spectral feature annotation is probably the most difficult challenge in metabolomics (besides the biological questions being asked). This is because metabolites are chemically diverse and genomic information cannot be used as a constraint to improve identification confidence. Unlike in proteomics analysis, a false discovery rate cannot be determined. The MetaScope search tool in Progenesis QI is powerful and flexible enough to take advantage of DIA data.

Conventionally, fragmentation data are acquired by DDA. Here, a hybrid mass spectrometer first performs a survey scan, from which the ions with an intensity above a predefined threshold value are stochastically selected and fragmented. The DDA spectra are then matched against reference spectra in a database (e.g., MassBank, or NIST). Because DDA is biased toward the ions with the highest intensity, less abundant ions are not fragmented or identified. This is in contrast to DIA, where all ions are fragmented non-selectively.
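
The intensity bias of DDA can be shown with a short Python sketch. It uses a simplified top-N rule (take the N most intense ions above a threshold), which is an assumption made for the example and not a description of any particular instrument's selection logic; the survey scan values are hypothetical.

```python
def select_precursors_dda(survey_scan, intensity_threshold, top_n):
    """Pick the top_n most intense ions above the threshold for fragmentation.

    `survey_scan` is a list of (mz, intensity) tuples.
    """
    eligible = [ion for ion in survey_scan if ion[1] > intensity_threshold]
    eligible.sort(key=lambda ion: ion[1], reverse=True)
    return eligible[:top_n]


# Hypothetical survey scan as (m/z, intensity) pairs.
survey_scan = [
    (169.04, 9.0e5),
    (205.10, 4.0e5),
    (337.06, 2.0e4),
    (118.09, 8.0e3),
    (524.37, 6.5e5),
]

dda_selected = select_precursors_dda(survey_scan, intensity_threshold=1.0e4, top_n=3)
dia_selected = list(survey_scan)  # DIA fragments everything, with no selection step

missed_by_dda = [ion for ion in survey_scan if ion not in dda_selected]
print("fragmented by DDA:", [mz for mz, _ in dda_selected])
print("missed by DDA:    ", [mz for mz, _ in missed_by_dda])
```

The less abundant ions never receive a fragment spectrum under this rule, whereas the DIA list contains every ion from the survey scan.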

However, spectrum deconvolution of DIA data is very complicated, which has previously prevented effective use of DIA data. Progenesis QI performs DIA spectrum deconvolution exceedingly well. In addition to fragment ions, other physical properties such as accurate mass, isotopic pattern, retention time and collision cross-sectional area are used to filter the possible matches from a metabolite database. The structure of the selected metabolite is shown on the screen and an overall confidence score is calculated to help users select the most probable metabolite for identification. Further information is easily accessible via a link to the metadata of the selected databases. Users are thus able to make the most informed decision when manually assigning compounds. This approach significantly reduces the possibility of false positive identifications compared to other methods that are based only on accurate mass and then report a long list of all possible metabolites for a spectral feature. Finally, accepted metabolite IDs can be easily exported for pathway search.

 

Screenshot from Progenesis review compounds screen showing compound metadata, possible identifications list and corresponding compound structure

Figure 3: Compound identification is in my view the most difficult step in metabolomics. Progenesis QI has features to assist me in conducting the assignments. 10 possible matches were returned with less than or equal to 1.35 ppm variance; it would not have been possible to select the correct answer confidently based on this alone. When I considered the mass error, dipeptides appeared to be the most probable answers. However, by taking into account the isotope similarity, fragmentation score, and retention time, I could confidently assign the spectral feature as L-tryptophan. If I am uncertain about the assignment, or want to know more about a particular metabolite, a link therein directs me to the metabocard of the database.
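
The reasoning in Figure 3 can be sketched in Python. The mass error in ppm is a standard formula; the overall score below is a hypothetical unweighted mean of four evidence scores, used only to show why mass error alone cannot separate candidates. It is not the scoring function used by Progenesis QI, and every score value is made up.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def overall_score(candidate):
    """Hypothetical overall score: unweighted mean of evidence scores (0-100)."""
    keys = ("mass_score", "isotope_score", "fragmentation_score", "rt_score")
    return sum(candidate[k] for k in keys) / len(keys)


# Theoretical [M+H]+ of L-tryptophan (C11H12N2O2); the observed value is made up.
print(f"{ppm_error(205.0974, 205.0972):.2f} ppm")

# Two hypothetical candidates with near-identical mass scores.
candidates = [
    {"name": "hypothetical dipeptide", "mass_score": 99, "isotope_score": 80,
     "fragmentation_score": 10, "rt_score": 20},
    {"name": "L-tryptophan", "mass_score": 98, "isotope_score": 95,
     "fragmentation_score": 85, "rt_score": 90},
]
ranked = sorted(candidates, key=overall_score, reverse=True)
print([c["name"] for c in ranked])
```

On mass score alone the dipeptide would rank first; once isotope similarity, fragmentation and retention time are included, the ranking reverses.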

If you also want to benefit from an easy to use interface that empowers you to have confidence in your ability to identify small molecules, download Progenesis QI for a free trial today.

Finally, a big thank you to Kai P. Law & Ting-Li Han for their account. If you already use Progenesis QI and would like to share your experience of using Progenesis QI, please contact us.

Progenesis plugins: gotta catch ’em all!

Here at Nonlinear Dynamics, we’ve always strived to keep Progenesis QI and Progenesis QI for proteomics vendor agnostic.

This allows our users to utilise a single software package to analyse data from all of their instruments, and interface with a wide range of search methods and pathways tools.

We achieve this through our plugin architecture, which allows you to install and update your supported data formats, search methods, and pathways tools independently of Progenesis.

What are the advantages of the plugin system?

Distributing vendor specific functionality as plugins confers a number of advantages. Progenesis users can:

  • interface with multiple vendors using a single piece of software – a key distinguishing feature versus other analysis software.
  • remain up to date with new file formats and/or changes to existing file formats, without having to install a new version of Progenesis.
  • apply novel search methods and pathways tools to their existing data analyses, thus staying up to date with developments in the scientific community.

What plugins are available?

Data import plugins

Progenesis allows you to import raw data from a number of different vendors and machines. All imported data is converted to Progenesis’s unique internal peak models, so all types of data can be analysed using a consistent workflow. You can even combine data from different vendors in the same experiment (although this isn’t recommended as you may have trouble aligning the data).

Data file format | Plugin FAQs | Availability
Waters (.raw) | QI, QI for proteomics | Provided as standard
Thermo (.raw) | QI, QI for proteomics | Provided as standard
UNIFI Export Packages (.uep) | QI | Provided as standard (only available in QI)
AB SCIEX (.wiff) | QI, QI for proteomics | Provided as standard
Agilent (.d) | QI, QI for proteomics | Free download
Bruker Daltonics (.d) | QI, QI for proteomics | Free download
mzXML files | QI, QI for proteomics | Provided as standard
NetCDF files | QI, QI for proteomics | Provided as standard in QI for proteomics; free download for QI

Search plugins (QI)

These plugins allow you to search for small molecules or lipids in your data set, using a wide variety of data sources. Elemental composition even enables you to elucidate compound composition without the use of a dedicated compound database. Progenesis MetaScope allows you to search SDF and MSP files from any source you choose, e.g. HMDB or PubChem.

Search method | Availability
Progenesis MetaScope | Provided as standard
METLIN batch metabolite search | Provided as standard
LipidBlast | Provided as standard
Elemental composition | Provided as standard
ChemSpider | Provided as standard
NIST MS/MS Library | Contact us for access

Search plugins (QI for proteomics)

Progenesis QI for proteomics can perform peptide search and protein inference using a number of different plugins. These encompass both database search methods like Mascot, and de novo sequencing methods such as PEAKS Studio.

Search method | Alternative versions | Availability
Scaffold | v3.0 and v4.0 | Free download
Mascot | - | Provided as standard
Phenyx | - | Provided as standard
SEQUEST | dta and out files; dta and pepXml files; sqt and ms2 files | Dta plugins provided as standard; free download for sqt plugin
PLGS | v2.4 and v2.5; v2.3 and v3.0 | Free download
Proteome Discoverer | v1.3 (.xls); pepXml | Free download
ProteinPilot | - | Free download
Spectrum Mill | - | Free download
PEAKS Studio | pepXml import only | Free download
EasyProt | - | Free download
Byonic | - | Free download

Inclusion list plugins

Inclusion list plugins in both QI and QI for proteomics allow you to target your MS/MS data collection for greater MS/MS coverage. Importantly, you can import new LC/MS runs into an existing experiment without having to repeat peak picking and other analysis steps. This makes the use of an inclusion list workflow a powerful tool to increase MS/MS coverage in DDA experiments.
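
The idea behind an inclusion list can be sketched in Python: take the features that still lack MS/MS spectra and write their m/z values and retention time windows to a file for the next acquisition. The column names, the 0.5 minute window and the feature records below are all hypothetical; each vendor plugin listed in the table writes its own format, which this generic CSV does not try to reproduce.

```python
import csv
import io

FIELDS = ["mz", "rt_start_min", "rt_end_min"]


def build_inclusion_list(features, rt_window_min=0.5):
    """Return one row per feature that has no MS/MS spectrum yet."""
    rows = []
    for feature in features:
        if feature["has_msms"]:
            continue  # already fragmented, no need to target it again
        rows.append({
            "mz": f"{feature['mz']:.4f}",
            "rt_start_min": f"{feature['rt_min'] - rt_window_min:.2f}",
            "rt_end_min": f"{feature['rt_min'] + rt_window_min:.2f}",
        })
    return rows


def to_csv(rows):
    """Serialise the rows to CSV text (generic layout, not a vendor format)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical detected features.
features = [
    {"mz": 169.0356, "rt_min": 1.20, "has_msms": True},
    {"mz": 205.0972, "rt_min": 3.45, "has_msms": False},
    {"mz": 337.0640, "rt_min": 1.21, "has_msms": False},
]
rows = build_inclusion_list(features)
print(to_csv(rows))
```

Only the features without fragment spectra end up in the list, so instrument time in the follow-up run is spent where coverage is missing.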

Inclusion list format | Plugin FAQs | Availability
AB SCIEX | QI, QI for proteomics | Provided as standard in QI for proteomics; free download in QI
MassLynx | QI, QI for proteomics | Provided as standard
Thermo Finnigan | QI, QI for proteomics | Provided as standard
Thermo Finnigan (4 d.p.) | QI, QI for proteomics | Free download
Thermo Q Exactive | QI, QI for proteomics | Free download
Agilent preferred MSMS table | QI, QI for proteomics | Free download
Agilent targeted MSMS table | QI, QI for proteomics | Free download
Bruker Maxis | QI, QI for proteomics | Free download

Pathways plugins

Progenesis provides reliable quantitative information about the changes in your experimental conditions. A number of pathways tools exist to translate such quantitative results into biologically relevant conclusions. Progenesis supports the following pathways tools, including the widely used IPA, and the multi-omics approach of IMPaLA.

Pathways tool | Plugin FAQs | Availability
IMPaLA | QI, QI for proteomics | Provided as standard
PANTHER classification system | QI for proteomics | Provided as standard (only available in QI for proteomics)
IPA | QI, QI for proteomics | Provided as standard

Recent plugin updates

These are just a few examples of recent plugin releases we have made. As you can see, we regularly produce updates to Progenesis plugins, and develop new plugins when requested by customers.

  • In April 2016 we released an updated version of the mzML reader for Progenesis QI, introducing the ability to read indexed mzML files, as requested by our customers.
  • In January 2016 we released the IPA plugin for Progenesis QI for proteomics, giving users of Progenesis QI for proteomics v2.0 easy integration with this widely used pathway tool.
  • In November 2015 we released a new version of the Proteome Discoverer plugin, to support the newly released Proteome Discoverer v2.0 and v2.1.
  • In November 2015 we also released a brand new Thermo Q Exactive inclusion list plugin for both QI and QI for proteomics, since the Q Exactive machine uses a different inclusion list format to other Thermo machines.

Future plugins

Here at Nonlinear Dynamics we are committed to ensuring Progenesis remains vendor agnostic and supports the widest range of third party integrations possible.

As such, we’re always happy to hear from customers if they wish to use Progenesis with a third party piece of software for which a plugin does not exist. Please get in touch if you have any ideas for new plugins, or improvements to existing plugins.