Progenesis QI for proteomics speeds up biopharmaceutical purification!

 

Most recombinant protein biopharmaceuticals are produced in specially designed expression systems, typically using CHO (Chinese Hamster Ovary) cells. Many CHO proteins are expressed alongside high amounts of the desired biopharmaceutical and must be removed by multi-step purification processes. Residual host cell proteins (HCPs) are low-level (1-100 ppm) process-related impurities that may remain in protein biopharmaceuticals even after extensive purification. HCPs can produce unwanted immunogenic responses in patients, reduce the efficacy or stability of the drug, or cause drug degradation. For these reasons, the regulatory agencies require that all HCPs are identified and quantified prior to drug approval. The biopharmaceutical industry relies on ELISA (enzyme-linked immunosorbent assay) for measuring the total HCP concentration, expressed in ppm (ng HCP/mg biopharmaceutical). Mass spectrometry-based HCP analysis has emerged in recent years as a powerful alternative to ELISA [1-4] because it provides more extensive (proteome-wide) HCP coverage and can measure individual HCP levels.

Any LC-MS workflow for HCP analysis has three major goals: 1) identification of unknown HCPs; 2) reporting of the individual HCP quantification results, expressed in ng HCP/mg biopharmaceutical (ppm concentrations); 3) monitoring of the HCP levels across multiple biopharmaceutical preparations. To accomplish these goals, two different LC-MS assays are required, as illustrated by the workflow displayed in Figure 1.
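As a quick illustration of the units used for goal 2, here is a minimal sketch of the ng HCP/mg biopharmaceutical calculation. The function name and the example amounts are hypothetical, chosen only to show that ng/mg is numerically the same as ppm by mass.

```python
def hcp_ppm(hcp_ng: float, product_mg: float) -> float:
    """Return the HCP level in ppm, i.e. ng of HCP per mg of biopharmaceutical."""
    if product_mg <= 0:
        raise ValueError("product_mg must be positive")
    # 1 ng / 1 mg = 1e-9 g / 1e-3 g = 1e-6, which is one part per million.
    return hcp_ng / product_mg


# Hypothetical example: 0.5 ng of one HCP measured in 0.025 mg of mAb.
example_ppm = hcp_ppm(hcp_ng=0.5, product_mg=0.025)
print(f"{example_ppm:.1f} ppm")
```

With these made-up amounts the HCP sits at the low end of the 1-100 ppm range quoted above.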

Figure 1. Workflow of the HCP analysis.


The Discovery HCP assay is performed in SONAR™ mode to identify the unknown HCPs present in the purified biopharmaceutical, and Progenesis QI for proteomics (QIP) is used for a proteome-wide database search to reveal the identity of these HCPs. For example, in the case of the NIST mAb, four HCPs and three spiked proteins (ADH, PHO and BSA) were identified, as illustrated by the screenshot displayed in Figure 2:

Table showing identification of 4 HCPs

Figure 2. Four HCPs (highlighted by red arrows) were identified by Progenesis QIP in the highly-purified NIST mAb.

A different type of LC-MS assay is required when multiple samples, produced from the bioprocessing of the same protein biotherapeutic, need to be analyzed with increased sample throughput, for the purpose of investigating HCP clearance. In this situation, the information gained from the HCP Discovery assay can be used to speed up the HCP identification and quantification process.

Using Progenesis QIP, the MS/MS fragmentation spectra of HCP peptides identified by SONAR acquisition can be assembled into spectral libraries, containing peptide precursors, charge states, retention times and relevant fragment ions. A list of HCP peptides sequenced from the NIST mAb is presented in Figure 3.

Table showing HCP peptides identified with a combination of NIST and SONAR

Figure 3. HCP peptides identified in the NIST mAb using SONAR acquisition.

The MS/MS fragmentation spectra of these peptides were assembled into a spectral library using Progenesis QIP. Peptides are sorted in increasing order of their precursors. Two MS/MS spectra were recorded for each of the four highlighted peptides, following fragmentation of their doubly and triply charged precursors.

Higher-throughput HCP Monitoring assays, relying on 30 min peptide separations and MSE data acquisition, are used to screen biopharmaceutical samples taken at every step of the purification process. The entire LC/MSE dataset is then searched with Progenesis QIP against a spectral library for HCP identification, quantification, and monitoring.

To simulate an HCP monitoring assay, three protein digest standards (ADH, yeast alcohol dehydrogenase; BSA, bovine serum albumin; and PHO, rabbit phosphorylase b) were spiked at four different concentration levels into four NIST mAb digests, while one protein digest (CLP-B, E. coli chaperone protein) was spiked at the same concentration in all four samples. The LC/MSE data were searched in Progenesis QIP against a spectral library of 113 SONAR™ fragmentation spectra of MIX-4 peptides (ADH, BSA, CLP-B, and PHO). Spiked proteins were easily tracked down to the lowest spiked levels (~20 ppm) across all five samples (the four spiked samples plus the non-spiked blank; 20 LC/MSE runs), as exemplified by the graphs shown in Figure 4.

Graph showing measurement of spiked samples

Figure 4. (A) Example of protein level results obtained for the HCP Monitoring assay: the levels of spiked ADH were accurately measured in five NIST mAb samples; (B) Peptide level results of the HCP monitoring assay.

Eleven ADH peptides showed identical trend plots across all 20 runs. The four spiked samples, identified by letters A-D in this figure, contained different levels of the ADH, BSA and PHO protein digests, plus a constant level of the CLP-B digest, spiked into the NIST mAb digest. The sample labeled “Blk” corresponds to the non-spiked NIST mAb digest. Each sample was analyzed in four replicates.

Protein measurements were obtained from multiple peptides and excellent correlation was obtained between the spiked and measured fold changes with RSDs under 10% for all measurements.
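For readers who want to reproduce this kind of check on their own exported abundances, the two figures of merit mentioned above (RSD across replicates and measured fold change between samples) can be sketched as follows. The replicate values are invented for illustration and are not taken from the NIST mAb dataset.

```python
import statistics


def rsd_percent(replicates):
    """Relative standard deviation (coefficient of variation) in percent."""
    mean = statistics.mean(replicates)
    return 100.0 * statistics.stdev(replicates) / mean


def fold_change(level_a, level_b):
    """Ratio of two mean abundances."""
    return level_a / level_b


# Hypothetical abundances (arbitrary units) for one spiked protein,
# four replicate runs per sample.
sample_low = [980.0, 1020.0, 1000.0, 1000.0]
sample_high = [3900.0, 4100.0, 4050.0, 3950.0]

rsd_low = rsd_percent(sample_low)
rsd_high = rsd_percent(sample_high)
measured_fold = fold_change(statistics.mean(sample_high),
                            statistics.mean(sample_low))

print(f"RSD low: {rsd_low:.2f}%  RSD high: {rsd_high:.2f}%")
print(f"Measured fold change: {measured_fold:.2f}")
```

The measured fold change would then be compared with the fold change expected from the spiked amounts.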

Progenesis QIP greatly simplifies the user interaction with HCP datasets. Extracting mass chromatograms and calculating peak areas for a multitude of peptide precursors (like the 113 peptides from the MIX-4 spectral library) can be a tedious process. In addition, the data from the individual sample replicates need to be compared in order to obtain the peptide-level HCP trends. Finally, the peptide-level results need to be translated into HCP protein levels. All these steps are automated in Progenesis QIP and run rapidly without significant user intervention, which greatly reduces the time spent on data analysis.

The experiment with spiked protein digests described above can be easily performed as a QC test to demonstrate the capability of the entire LC/MS platform to provide reliable HCP clearance results in a timely fashion.

Our collaborators from EMD Millipore asked us to test this capability for “real” mAb samples: they wanted to know which one of their four SCX (strong cation exchange) purification protocols produced “cleaner” purifications, with lower HCP content. The results are shown in Figure 5 and one of their protocols indeed worked better than the other three.

Graph showing monitoring of HCP peptides

Figure 5. Peptide-level monitoring of three HCP peptides across five mAb preparations: one Protein A eluate and four SCX (strong cation exchange) chromatographic purifications using four different protocols (A-D). As illustrated here, Protocol D provided the best results.

Progenesis QIP allows purification laboratories to develop and test novel purification procedures in a relatively short time.

References:

  1. Doneanu CE, Anderson M, Williams BJ, Lauber MA, Chakraborty A, Chen W. Enhanced Detection of Low-Abundance Host-Cell Protein Impurities in High-Purity Monoclonal Antibodies Down to 1 ppm Using Ion Mobility Mass Spectrometry Coupled with Multidimensional Liquid Chromatography, Anal Chem, 2015, 87, 10283-10291.
  2. Huang L, Wang N, Mitchell CE, Brownlee T, Maple SR, De Felippis MR. A Novel Sample Preparation for Shotgun Proteomics Characterization of HCPs in Antibodies, Anal Chem, 2017, 89, 5436-5444.
  3. Weibin C, Doneanu CE, Lauber MA, Koza S, Prakash K, Stapels M, Fountain KJ. Improved Identification and Quantification of Host Cell Proteins (HCPs) in Biotherapeutics Using Liquid Chromatography-Mass Spectrometry, book chapter in Technologies for Therapeutic Monoclonal antibody characterization, Vol 3, ACS Symposium Series, 2015, 357-393.
  4. Doneanu C, Lennon S, Anderson M, Reah I, Ross M, Anderson S, Morns I, Yu YQ, Chakraborty A, Denbigh L, Chen W. A Comprehensive Approach for HCP Identification, Quantification and Monitoring Based on a Single Dimension (1D) LC Separation, Waters application note 720006262en, 2018.

Acknowledgments

Catalin Doneanu, Waters Corporation, Milford, MA, USA

What a ConFirenze! Explore. Dream. Discover.

When I heard that IMSC was in Florence I stuck my hand up to be on the exhibition booth.  It’s a city I’ve wanted to visit for years.

Ironically, we were so busy I didn’t get to look around!  My colleague, Mark Bennett, and I were attending for Nonlinear and, from the opening reception, we had a lot of interest in Progenesis.

This varied from researchers in nuclear physics to researchers analysing skin for anti-malarial research.  Progenesis really does cover a large breadth of scientific research, its strength being the ability to seek out differences without identification.

Quantify, then identify, is fundamental in the Progenesis workflow.

Mark Bennett demonstrating Progenesis to interested researchers

In addition to busy ‘booth traffic’ and demonstrations, we had a workshop on the Thursday lunchtime.

Three speakers gave interesting accounts of how Progenesis QI and Progenesis QI for proteomics have helped them in their research:

Progenesis³ - Three personal accounts showing the power of Progenesis QI

  • Untargeted metabolomics using Progenesis QI for small molecules: Developing ion-chromatography-mass spectrometry for the investigation of cancer metabolism –James S.O. McCullagh, University of Oxford, UK
  • Metabolomic profiling of reactive metabolites in toxicology by MSE and Progenesis –Emilien Jamin, Toxalim (Research Centre in Food Toxicology) Toulouse university, INRA, France
  • Novel strategies for discovery of cardiovascular biomarkers in human plasma – Donald JL Jones, Leicester Cancer Research Centre, RKCSB, University of Leicester, UK

All three talks were engaging and we had good attendance.  The speakers had questions for each other and were engaged in conversation long after the workshop had finished.

Don Jones, Emilien Jamin and James McCullagh in conversation post workshop

The workshop was so uplifting for me; it’s great when you hear customers picking out things that they really like in your product.

Don’t worry if you missed these presentations: we recorded the whole session and the recordings will be released soon!

If you’d like to be notified about these recordings, please email us and we’ll inform you when they become available.

So then it was on to the conference dinner, which completely exceeded expectations (and they were not low).

We gathered at the Villa Viviani, located on the hill of Settignano, with a perfect view of Florence, which sits nestled in a natural bowl, just as the sun was setting.

The villa was home to Mark Twain for a time and one of my favourite quotes of his, “Explore. Dream. Discover.” seemed particularly apt for the conference and the wonderful evening.

It was so beautiful to stand there, listening to the live band and watching the dusk progress, an unforgettable memory.

Later, we were participants in an Italian birthday song game that had us up and down out of seats non-stop.  The next day my legs were really aching!

The sun setting over Florence at the gala dinner

Finally, we all dispersed home, the Progenesis researchers we spoke to heading back to their labs all over the globe.

It was a great Confirenze!!  Now I have to go back to Florence and see it properly, as a tourist.

When is a Biomarker not a biomarker? (part 2)

In my last blog, I discussed interpretation of data from two model experiments using univariate statistical analysis (p-values and false discovery rates). It was concluded that the use of p-values alone can potentially lead to dramatic misinterpretation of results and many false discoveries, so false discovery rates (FDRs) from q-values are a vital tool to avoid this. In this blog I’ll use the same model experiments to discuss multivariate statistical analysis, specifically Orthogonal Projections to Latent Structures Discriminant Analysis (OPLS-DA), a method commonly used to extract biomarkers in discovery metabolomics analysis.

First, a brief re-cap of the details of our model experiments. Experiment 1 consists of 12 human urine samples in conditions B and C (Fig. 1, (i)) where C is normal patients and B is patients who’ve been given a high dosage of a mixture of analgesic drugs. In this case the PCA scores (samples, shown as coloured dots) show tight clustering within the conditions, indicating some highly significant differences between the conditions resulting from the presence of the drugs or their metabolites in condition B. Experiment 2 comprises the same data, but re-arranged into two “mixed” conditions called BC and CB (Fig.1, (ii)) for which the PCA scores show no condition-related clustering indicating (as we’d expect) that there are no differences between the conditions. After automatic processing of the data through Progenesis QI (data alignment, co-detection and adduct deconvolution) there were 5,333 compounds detected across all 12 samples with no missing values.

Experimental design and PCA bi-plot for model experiment 1 (i) and experiment 2 (ii)

Figure 1: Experimental design and PCA bi-plot for model experiment 1 (i) and experiment 2 (ii)

As mentioned in part 1 of this blog, PCA is an unsupervised (“non-discriminant”) type of analysis which takes no account of the conditions of the experiment and just arranges the samples (scores) and compounds (loadings) according to how similar (or different) their expression behaviour is. In the case of the scores, therefore, samples in which the compounds exhibit similar expression behaviour are clustered closer together, while those with less similar behaviour are further apart on the plot. The loadings are arranged similarly according to their expression behaviour. In addition, the clustering of scores and loadings is linked, in that compounds (loadings) which show significant up-regulation in a condition are clustered closest to the samples (scores) of that condition (see Figure 1). PCA is also useful for identifying outliers in the data.
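To make the link between scores and loadings concrete, here is a minimal pure-Python sketch that extracts only the first principal component by power iteration. It is an illustration under simplifying assumptions (tiny made-up data, mean-centring only, one component) and is not how Progenesis QI or EZinfo compute PCA; the function names and the toy abundances are hypothetical.

```python
import math


def center_columns(rows):
    """Subtract each compound's mean abundance (column mean) from its values."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def first_principal_component(rows, iterations=200):
    """Return (scores, loadings) for PC1 using power iteration on X^T X."""
    x = center_columns(rows)
    n_vars = len(x[0])
    loadings = [1.0 / math.sqrt(n_vars)] * n_vars  # arbitrary unit start vector
    for _ in range(iterations):
        # Project every sample onto the current direction (scores) ...
        scores = [sum(a * b for a, b in zip(row, loadings)) for row in x]
        # ... then update the direction as X^T t and renormalise it.
        new = [sum(row[j] * t for row, t in zip(x, scores)) for j in range(n_vars)]
        norm = math.sqrt(sum(v * v for v in new))
        loadings = [v / norm for v in new]
    scores = [sum(a * b for a, b in zip(row, loadings)) for row in x]
    return scores, loadings


# Toy data: 3 "condition B" samples then 3 "condition C" samples, 4 compounds.
# Compounds 0 and 1 are much higher in B; compounds 2 and 3 barely change.
toy = [
    [10.0, 8.0, 5.0, 3.0],
    [11.0, 9.0, 5.2, 2.9],
    [10.5, 8.5, 4.9, 3.1],
    [2.0, 1.0, 5.1, 3.0],
    [2.5, 1.2, 5.0, 3.2],
    [1.8, 0.9, 4.8, 2.8],
]
pc1_scores, pc1_loadings = first_principal_component(toy)
print("scores:  ", [round(s, 2) for s in pc1_scores])
print("loadings:", [round(v, 2) for v in pc1_loadings])
```

No condition labels are passed to the function, yet the B and C samples end up on opposite sides of PC1, and the two changing compounds carry the largest loadings on the B side.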

In contrast to PCA, OPLS-DA is a “discriminant” analysis which does take account of the conditions of the experiment and builds a model that best represents the differences between the conditions. The data can then be plotted in a way which represents how well each sample and compound fits the model. From Progenesis QI, our experiment 1 data can be automatically exported into the EZinfo statistical package, in which OPLS-DA can be performed before importing the results back into Progenesis for further review. In EZinfo we can easily create our OPLS-DA model and initially view a bi-plot which looks quite similar to PCA (Figure 2). However, instead of representing degrees of variance in the data, the axes now represent values related to the model of the difference between the conditions and how the scores (samples) and loadings (compounds) fit into the model. So, how does this type of analysis help us to extract good candidate biomarkers from our experiment?

OPLS-DA bi-plot for experiment 1

Figure 2: OPLS-DA bi-plot for experiment 1

If we change the data scaling from “unit variance” (where each compound abundance is divided by the compound standard deviation) to “Pareto” (where it’s divided by the square root of the standard deviation) we can create an “S-plot” of the compounds (loadings), which takes its name from the characteristic S-shape in which the “best” biomarkers are located towards the extremes of the plot. In the S-plot (Fig. 3), the vertical axis defines the p(corr) correlation to the model while the horizontal axis defines the p(1) contribution to the variance between the conditions. This means that compounds located towards the vertical extremes conform best to the B vs C difference model and are essentially the compounds where the difference between conditions B and C is most clear, while those located towards the horizontal extremes contribute most to the overall variance between the conditions, meaning they are highly abundant, have a large fold change, or both.

In the case of experiment 1, we know there to be many expression-changing compounds, mainly up-regulated in condition B, where the drugs were administered. The S-plot supports this in that there are many compounds located towards the lower left extreme of the plot, indicating they are up-regulated in condition B, while there are very few located towards the other extreme where the “up in C” compounds should be. We can see more clearly how the location of compounds on the S-plot relates to their expression behaviour by selecting groups of them and importing them back into Progenesis QI as “tagged groups”, which enables us to select them using filters and visualise their expression behaviour using the Progenesis QI tools. In this case, 4 groups of compounds have been selected, indicated as A, B, C and D in Figure 3.
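The difference between the two scalings, and why Pareto scaling lets abundance influence the horizontal S-plot axis, can be sketched in a few lines. This is an illustration only: it assumes the data are mean-centred before scaling, it uses the commonly given description of the S-plot axes (p(1) as the covariance and p(corr) as the correlation between a compound and the model's score vector), and the score vector and abundances are made up rather than taken from an OPLS-DA model.

```python
import math
import statistics


def uv_scale(values):
    """Unit-variance scaling: mean-centre, then divide by the standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def pareto_scale(values):
    """Pareto scaling: mean-centre, then divide by sqrt(standard deviation)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / math.sqrt(sd) for v in values]


def s_plot_point(scaled, scores):
    """Return (covariance, correlation) of one scaled compound with the scores."""
    n = len(scores)
    mx = statistics.mean(scaled)
    mt = statistics.mean(scores)
    cov = sum((x - mx) * (t - mt) for x, t in zip(scaled, scores)) / (n - 1)
    corr = cov / (statistics.stdev(scaled) * statistics.stdev(scores))
    return cov, corr


# Hypothetical score vector separating two conditions (3 samples each).
scores = [-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]
# Two made-up compounds with the same pattern but 100x different abundance.
low_abundance = [10.0, 12.0, 11.0, 20.0, 22.0, 21.0]
high_abundance = [v * 100 for v in low_abundance]

uv_low, uv_high = uv_scale(low_abundance), uv_scale(high_abundance)
cov_low, corr_low = s_plot_point(pareto_scale(low_abundance), scores)
cov_high, corr_high = s_plot_point(pareto_scale(high_abundance), scores)

print(f"low:  covariance={cov_low:.2f}  correlation={corr_low:.3f}")
print(f"high: covariance={cov_high:.2f}  correlation={corr_high:.3f}")
```

After unit-variance scaling the two compounds are indistinguishable, whereas after Pareto scaling they share the same correlation (vertical position) but the more abundant compound has the larger covariance (horizontal position).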

S-plot for experiment 1

Figure 3: S-plot for experiment 1

Back in Progenesis QI, we can view the expression profiles for all of the compounds imported as tagged groups from EZinfo, and in this way we can see how their location on the S-plot relates to their expression behaviour. Group A were the 3 compounds at the extreme bottom left of the plot and as such should have excellent correlation to the model along with a high contribution to the variance, making them the very best candidate biomarkers. Figure 4, A confirms this, since the clean step shape of the profiles shows very clear distinction between the conditions and the accompanying table shows a combination of very low p-values and CVs, with high abundance and fold changes. Group B were not so far out as group A horizontally but equally far out vertically, so they should have similar correlation to the model but less influence on the overall difference. The step-shaped profiles in Figure 4, Bi confirm high correlation with the model and the table shows that these compounds have lower abundance than those in group A. The generally higher fold changes of this group compared to group A indicate that compound abundance is more important than fold change in determining the overall influence of the compounds on the variance between the conditions. The expression profiles shown in Figure 4, A, Bi, C and D are “standardised” profiles in which the data is mean-centred and the variance normalised to 1. This results in the data being scaled to optimally display the shape of the profiles without taking account of the actual abundances. If we view group B as “normal” (unscaled) profiles (Figure 4, Bii) we see that the abundances of the compounds in the highest condition (B) actually vary from <1,000 to >12,000, which accounts for their relative positions on the S-plot.

Expression profiles and univariate statistical data for groups A, B, C and D from the S-plot

Figure 4: Expression profiles and univariate statistical data for groups A, B, C and D from the S-plot

Group C were in only moderately extreme positions both horizontally and vertically, so are likely to be weaker candidate biomarkers, and this is seen in Figure 4, C, in which the discrimination between the conditions is now minimal. Interestingly, the table shows that the group C compounds have much higher abundances than those of group A yet are much further from the horizontal extreme of the S-plot, showing the effect of the fold changes, which in this case are very low and therefore limit the influence of the compounds on the model. Finally, group D are towards the top right extreme of the S-plot, indicating they are up-regulated in condition C. However, the profiles show less clear distinction between the conditions than in groups A or B (though more than in group C), while the table shows moderately high abundances but low fold changes, as we might expect from our experiment.

We’ve established that OPLS-DA, and particularly the S-plot, can help us to extract the “best” candidate biomarkers from our experiment 1, in terms of compounds displaying a combination of good conformation to the difference model, high abundance and high fold changes. But how does OPLS-DA handle the data from experiment 2? Perhaps a little surprisingly, despite there being no real expression changes in this data according to univariate analysis and PCA (see part 1 of blog), we still initially see a bi-plot in which there appears to be clear separation between the conditions (Figure 5, A). However, this is not a result of any real differences between the conditions, but rather the OPLS-DA tool essentially “forcing” them into the best model which represents a difference between them. We also see an S-plot that approximates to the characteristic shape seen with the experiment 1 data, which is potentially misleading. So what kind of behaviour do the compounds towards the extremes of this S-plot have?

OPLS-DA bi-plot

S-plot

Figure 5: OPLS-DA bi-plot (A) and S-plot (B) for experiment 2

Groups A and B are both located towards (but not at) the vertical extremes of the plot so should have the best correlation to the model of any of the data. However, in both cases the expression profiles show a lot of variance within the conditions and not such clear distinction between the conditions (Figure 6). What’s more, the tables show that the p-values are only moderately low while the q values (and therefore the false discovery rates) are very high. Combined with low abundances and relatively low fold changes, none of these compounds could be good candidate biomarkers, as we’d expect from our previous knowledge of the data.

Expression profiles and uni-variate statistical data for groups A (A) and B (B) from S-plot of experiment 2

Figure 6: Expression profiles and uni-variate statistical data for groups A (A) and B (B) from S-plot of experiment 2

Groups C and D, which are further from the vertical but closer to the horizontal extremes, have profiles indicating even less difference between the conditions, particularly in group D (Figure 7). This is confirmed by the very high p and q-values plus very low fold changes shown in the tables. The reason for their location towards the horizontal extremes of the plot is their relatively high abundances, which mean they will have a relatively high influence on the data model.

Expression profiles and uni-variate statistical data for groups C (A) and D (B) from S-plot of experiment 2

Figure 7: Expression profiles and uni-variate statistical data for groups C (A) and D (B) from S-plot of experiment 2

From the evidence of our two model experiments, it’s clear that we need to be cautious when using OPLS-DA and the S-plot to select candidate biomarkers, since there is potential to select compounds which do not in fact have any of the characteristics we are looking for. It’s important to remember that OPLS-DA will always try to create the best model which represents the differences between the conditions in the experiment and that this may lead to bi-plots and S-plots which appear to show differences even when there are none there. The best way to check this is to view the selected compounds in Progenesis QI, or similar software that will display the compound expression profiles and univariate statistics such as p and q-values, since these together will tell you if the selected compounds really do have the characteristics we would associate with good candidate biomarkers.
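The “forcing” effect is not specific to OPLS-DA; any supervised projection fitted to far more variables than samples can show it. The sketch below is a deliberately simple stand-in, not OPLS-DA: it projects purely random data onto the direction of the difference between two arbitrary group means. The feature count, group names and seed are hypothetical choices for illustration.

```python
import random

random.seed(0)
N_FEATURES = 2000  # hypothetical number of compounds (kept small for speed)
GROUP_SIZE = 6


def random_sample():
    """One sample of pure noise: every 'compound' has the same distribution."""
    return [random.gauss(0.0, 1.0) for _ in range(N_FEATURES)]


def column_means(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def score(sample, midpoint, direction):
    """Project a sample onto the fitted direction, relative to the midpoint."""
    return sum((x - m) * d for x, m, d in zip(sample, midpoint, direction))


# Two arbitrary "conditions" drawn from the same distribution: no real difference.
group_bc = [random_sample() for _ in range(GROUP_SIZE)]
group_cb = [random_sample() for _ in range(GROUP_SIZE)]

# Supervised step: the direction is fitted using the condition labels.
mean_bc, mean_cb = column_means(group_bc), column_means(group_cb)
direction = [a - b for a, b in zip(mean_bc, mean_cb)]
midpoint = [(a + b) / 2 for a, b in zip(mean_bc, mean_cb)]

train_bc = [score(s, midpoint, direction) for s in group_bc]
train_cb = [score(s, midpoint, direction) for s in group_cb]
perfectly_separated = min(train_bc) > 0 > max(train_cb)

# Fresh noise samples with alternating arbitrary labels, scored by the same model.
n_new = 200
correct = 0
for i in range(n_new):
    label_is_bc = (i % 2 == 0)
    predicted_bc = score(random_sample(), midpoint, direction) > 0
    correct += predicted_bc == label_is_bc
held_out_accuracy = correct / n_new

print("Training samples perfectly separated:", perfectly_separated)
print(f"Accuracy on fresh samples: {held_out_accuracy:.2f}")
```

The samples used to fit the direction separate cleanly even though the data are noise, while fresh samples are classified at roughly chance level, which is why apparent separation in a supervised plot needs to be checked against the underlying profiles and univariate statistics.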

When is a Biomarker not a biomarker? (part 1)

Statistics have a longstanding reputation for being potentially misleading and unreliable. It was in the 19th century that British Prime Minister Benjamin Disraeli said “There are three kinds of lies: lies, damned lies and statistics” while in the mid-20th Century, Winston Churchill added “The only statistics you can trust are the ones you have falsified yourself”. Things haven’t improved much recently, as evidenced by a Google search for the term “Statistics are unreliable” which returns no fewer than 6.8 million results! Discovery omics analysis, and particularly p-values, which play a prominent role in the discovery of potential biomarkers, are no exception to this issue, with a Google search for “p-values are unreliable” producing about a quarter of a million results.

The huge complexity of discovery omics data, on the one hand, makes statistics vital in extracting results, but on the other makes interpretation of those statistics more problematic. In this article I’ll describe a simple “model” discovery omics experiment in the Progenesis QI software that highlights how misinterpretation of statistics can lead, not just to overstatement of success in an experiment, but potentially to conclusions that are the direct opposite of the reality. I’ll also discuss how you can avoid these misinterpretations and ensure that all your results are reliable. Please note that while the model experiment used here is metabolomics data, all the conclusions can equally be applied to proteomics or lipidomics analysis.  NB.  All the figures in this blog post are taken from the Progenesis QI software.

Figure 1. Original experimental design setup

Our “model” experiment uses a metabolomics data set of 12 human urine samples in two conditions, B and C, as shown in the experimental design (Fig. 1). Condition C samples are from normal individuals while condition B samples are from individuals who’ve been given a high dose of a mixture of analgesic drugs. The 6 samples in each condition are technical replicates, which enhances the relative differences between the conditions, but as we’re not interested in biological results, only what the statistics tell us, this is OK for our test. After automatic processing through Progenesis QI (data alignment, co-detection and adduct deconvolution) there were 5,333 compounds detected across all 12 samples with no missing values.

If we look at univariate statistics data (Fig. 2,a), we see that many compounds have extremely low p-values (some < 10^-16) which might lead us to conclude a real expression change exists in those compounds. In fact, there are more than 300 compounds with p-values of < 0.0001 in this analysis, indicating the presence of many significantly changing compounds (candidate biomarkers) between our two conditions. In many compounds, the fold change is also very high, including some “infinity” fold changes where the compound is detectable in condition B and not in condition C.

This situation is confirmed if we now look at the PCA, a type of non-discriminant cluster analysis in which all samples are treated the same with no prior knowledge of the conditions they belong to. The samples (scores) cluster in multi-dimensional space according to how similar they are. By colour coding them by condition (Fig. 2,b), we see that the samples have clustered within their conditions and with very clear separation between conditions along the horizontal axis of principal component (PC) 1, which accounts for >21% of the total variance in the data. These, then, are the kind of statistical results we expect to see where there are very distinct differences between our conditions.

Figure 2. From the original experiment:
a) Univariate statistical data table, including p and q values
b) PCA analysis plot

So far so good. Now, let’s look at an experiment in which there are no significant differences between the conditions and see how this affects the statistics. To do this, we’re going to use the same samples, but randomly mix them up and re-assign them to two arbitrary conditions which we’ll call BC and CB (Fig. 3,a). It’s now evident from the PCA clustering pattern (Fig. 3,b) that there are no significant differences between these new conditions. But do the other statistical results support this?

Figure 3. From the arbitrary experiment:
a) Experimental design setup
b) PCA analysis plot

If we again look at our univariate statistics (Fig. 4), we can see that although the p-values are generally much higher than before, there are still a number of compounds where p < 0.05, which is often used (incorrectly) as a threshold of significance in discovery omics experiments. In fact there are 197 compounds with p<0.05, 25 with p < 0.005, and 4 with p < 0.001! Are any of these compounds really changing expression in a statistically significant way? The answer is no and when we consider that our original conditions have been randomly mixed together, this is the answer we might expect. So, why do we still get such low p-values when there are no actual expression changes occurring? To answer this we need to consider the experiment as a whole and not just the individual compounds.

Figure 4. Univariate statistics data table, including p and q values from the arbitrary experiment

The misuse of p<0.05 as a suitable significance threshold in discovery omics is usually the result of an incorrect definition of p-values. They are often referred to as “the probability that there is no expression change occurring in the data” which, if true, would mean that p<0.05 would indicate a <5% probability of no expression change occurring (or 95% probability of one occurring) and would therefore be a very suitable threshold. However, the p-value is actually the probability of observing data at least as extreme as ours if no real difference existed (i.e., how likely such a result is to occur by random chance). How much weight a given p-value deserves therefore depends on the number of results in the experiment, which is referred to as the “multiple testing problem”.

In an experiment where only 10 compounds are detected and measured, p<0.05 may be a suitable threshold since we’d then expect only 0.5 compounds (10 x 0.05) to have p<0.05 by random chance, meaning any compounds with this p-value range are likely to be changing significantly and therefore to be potential biomarkers. In discovery omics analysis we typically detect and measure >1,000 compounds, so in this case we expect >50 (1,000 x 0.05) to have p<0.05 by random chance, and using it as a threshold would produce at least that many false discoveries.

In our experiment we detected and measured 5,333 compounds, so we’d actually expect around 267 compounds to have p<0.05, 27 to have p<0.005 and 5 to have p<0.001 by random chance. Compare this with the actual results (197, 25 and 4) and we can conclude that all of them are false discoveries, having come about by random chance alone.
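The arithmetic behind these expectations is simply the number of compounds multiplied by the p-value threshold. A short sketch, using the compound count and the observed counts quoted in this post (the variable names are ours):

```python
N_COMPOUNDS = 5333

# Observed numbers of compounds below each threshold in the "mixed" experiment.
observed = {0.05: 197, 0.005: 25, 0.001: 4}

# If no compound is really changing, p-values are spread evenly between 0 and 1,
# so the expected number below a threshold is N_COMPOUNDS * threshold.
expected = {alpha: N_COMPOUNDS * alpha for alpha in observed}

for alpha, count in observed.items():
    print(f"p < {alpha}: expected by chance ~{expected[alpha]:.1f}, observed {count}")
```

In every case the observed count is no larger than the count expected from chance alone.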

So how do we check our p-value thresholds to see if they’re suitable for our experiments? A systematic way of doing this is to use the q values calculated in Progenesis QI to calculate a false discovery rate (FDR). We do this by reading the highest q value (corresponding to the highest p-value) in the subset of features we extract using our p-value threshold. If we do this for our original experiment (Fig. 5, a), we see that using a threshold of 0.0001 gives us a q-value of 0.000942, or an FDR of just below 0.1%, meaning <1 false discovery from the subset of 300 discoveries. However, using a threshold of 0.05 gives us a q-value of 0.128, or 12.8% FDR, translating to as many as 147 false discoveries from a total of 1,151 discovered compounds. With our “mixed” data set, we get far too many false discoveries no matter what threshold we use, with a threshold of 0.05 giving us a >99.95% FDR and even a threshold of 0.001 giving an FDR of 90% for only 4 discoveries.
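For readers who want to experiment outside the software, the sketch below shows one widely used way of turning p-values into FDR-adjusted values, the Benjamini-Hochberg procedure, together with the “q-value times number of discoveries” arithmetic used above. This is an assumption for illustration: we are not claiming it is the method Progenesis QI uses to calculate its q values, and the toy p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def expected_false_discoveries(q_at_threshold, n_discoveries):
    """Highest q value in the selected subset times the size of that subset."""
    return q_at_threshold * n_discoveries


# Toy p-values, purely illustrative.
toy_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print([round(q, 3) for q in toy_q])

# Figures quoted in this post for the original experiment.
fd_strict = expected_false_discoveries(0.000942, 300)   # threshold p < 0.0001
fd_loose = expected_false_discoveries(0.128, 1151)      # threshold p < 0.05
print(f"p < 0.0001: ~{fd_strict:.2f} expected false discoveries")
print(f"p < 0.05:   ~{fd_loose:.0f} expected false discoveries")
```

The last two lines reproduce the “<1” and “as many as 147” false discoveries quoted for the two thresholds.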

Figure 5. Tables showing the difference in FDR between the two experiments
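For readers who like to see the procedure spelled out, here is a minimal Python sketch of the check described above: take the features passing a p-value threshold, read off the highest q-value among them, and multiply by the number of features to estimate the false discoveries. Progenesis QI reports its own q-values, and how it calculates them is not covered here, so the Benjamini-Hochberg adjustment below is only a stand-in we have assumed for illustration. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Used here only as an assumed stand-in for software-reported q-values.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotonic.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        q[i] = running_min
    return q


def expected_false_discoveries(n_selected: int, q_max: float) -> float:
    """Estimated false discoveries: highest q-value times number of discoveries."""
    return n_selected * q_max


def fdr_at_threshold(
    p_values: Sequence[float], q_values: Sequence[float], threshold: float
) -> Tuple[int, float, float]:
    """Return (number selected, highest q among them, estimated false discoveries)."""
    selected = [q for p, q in zip(p_values, q_values) if p < threshold]
    if not selected:
        return 0, 0.0, 0.0
    q_max = max(selected)
    return len(selected), q_max, expected_false_discoveries(len(selected), q_max)


if __name__ == "__main__":
    # Figures quoted in the text for the original experiment.
    print(expected_false_discoveries(300, 0.000942))  # well below 1
    print(expected_false_discoveries(1151, 0.128))    # roughly 147

    # Tiny made-up p-value list, purely to show the mechanics.
    toy_p = [0.01, 0.5, 0.03]
    toy_q = benjamini_hochberg(toy_p)
    print(fdr_at_threshold(toy_p, toy_q, 0.05))
```

Reading the highest q-value in the selected subset, rather than averaging, gives a deliberately conservative view of the threshold, which is what you want when deciding whether a list of candidate biomarkers is worth following up.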

In this study of model omics experiments we’ve seen examples of how misinterpretation of univariate statistics can lead to experimental features (in this case metabolomics compounds) being assigned as potential biomarkers when, in fact, they are nothing of the kind. However, we’ve also seen that by using appropriate safeguards (false discovery rates) these issues can be avoided, ensuring that all your results are of high confidence and reliability.

In the second part of this blog we’ll use the same data to look at the issues of interpreting multivariate statistics and how we can avoid making false discoveries using that approach.

If you would like to know more about Progenesis QI or the Progenesis QI for proteomics software then don’t hesitate to get in touch. More information can be found here.

Out now – Progenesis QI for proteomics v4.1

We are pleased to announce that a new version of Progenesis QI for proteomics has been developed which is now available to download.

What’s New in this v4.1 release?

Improved Spectral libraries: adding to database search flexibility

Improved spectral library searching generates huge time savings so you can quickly focus your efforts on new and interesting features.

In Progenesis QI for proteomics v4.1 you can:

  • Build up a library of verified identifications from DDA, MSE, HDMSE and SONAR data including:
    • Fragmentation patterns seen in your own experiments and with your own instrumentation
    • Retention time information
    • Highlight overlapping ions to show potential interferences
    • You can also collaborate internally and externally by sharing these libraries with collaborators. This avoids duplication of effort within the Progenesis community.
  • Search the library first in a new discovery experiment
    • Quickly find peptides that match something you’ve seen before
    • Hide strong spectral library matches
    • Submit the remaining unknowns for traditional search methods
    • Supports NIST and Mascot .msp files and SWATH Atlas .sptxt files
  • Append any new verified identifications to the library

MS/MS Spectral clean-up tools during library creation

A new tool in the “Resolve conflicts” section indicates the identified peptides that have co-eluting peptides in close proximity.  This gives the user the option to exclude the MS/MS spectra of these peptide ions from the spectral library export, thereby only exporting peptides which should give cleaner matching in future identifications.

Only identified fragments from the MS/MS spectra are included when you create a library, ensuring the MS/MS data used in the spectral library are as “clean” as possible.


A reminder of what’s new in v4.0…

Proteolabels

Support for SILAC: Progenesis QI for proteomics now seamlessly integrates with Proteolabels software, with workflows for SILAC and dimethyl labelling.

Symphony

Integration with Symphony: you can now create a Progenesis QI for proteomics experiment from the Symphony data pipeline.

SONAR

 

Support for SONAR data: SONAR is the latest DIA mode from Waters, providing additional specificity and clarity.

PRIDE

Export to PRIDE: mzIdentML exports can now be produced for upload to the PRIDE repository.

Other improvements

  • Improved workflow for Waters MSe data, with automatic peak detection thresholding to maximise number and quality of identifications, whilst improving software performance.

For more details, please download our “What’s New” document.

Where can I download it?

If you’re an existing customer with an up-to-date support plan, this upgrade is totally free of charge and very simple: click on the upgrade in the software panel. In addition, if your Progenesis PC is connected to the internet, there should be a message in the experiments list sidebar notifying you of this new version. If you click this, and your dongle is plugged in, you’ll be sent to the download page.

If you are out of your support period, please contact us and we can discuss getting you back in support.

If you’re thinking of trying Progenesis QI for proteomics for the first time, you can download the software from here.

How will I know how to get the most out of the new features?

We’ve expanded our FAQs to cover the new features, as well as updating any previously available FAQs to reflect new behaviour. We’ve also updated our user guide if you’re looking for a step-by-step guide from start to finish.

Proteomics: a Peptide’s journey to emergence

We are pleased to present a blog post from one of our users, Dr. Maarten Dhaenens.  Read on…

Dr. Maarten Dhaenens

Head of Proteomics Department
Lab of Pharmaceutical Biotechnology
Ghent University

 

In philosophy, systems theory, science, and art, emergence is a phenomenon whereby larger entities arise through interactions among smaller or simpler entities such that the larger entities exhibit properties the smaller/simpler entities do not exhibit. The most obvious example of emergence is life itself. Think about it: while anyone, or any algorithm, would still recognize you in a picture from 10 years ago (ok, not that one picture maybe), only a few molecules in your body are still the same as in that picture. It is as if you are the shape and matter is merely circulating through you, through time. Thus, explaining life itself by focusing only on the molecules we are built from and the laws of chemistry will be a dashing exploit. Yet, in proteomics, we currently have no other choice than to weigh molecular masses to fathom life. We need to be aware of this limitation and leverage knowledge with our mind, which actually is an emergent phenomenon in itself!

If I have learned anything from the dozens of collaborations at our lab, it is that the term “Proteomics” is actually very confusing to the outside world. We measure peptides. What we actually report on, the proteins, is merely inferred. In a time where productivity is key, people tend to focus only on trying to automate this inference. Yet, to date, only human intervention, i.e. the human mind, can assure that the most correct or least ambiguous outcome is reported. I would argue that proteomics is in itself an emergent – not “emerging” – field. Once you start to look at it like that, facilitating human inspection of the visualized data should be the primary focus in order to fill the gap between what is measured and what can be concluded in terms of potential biomarkers or biology.

To illustrate this point, we look at histones in this webinar. These proteins are often used to normalize entire proteomes because they are rightfully considered to be among the most robust housekeeping genes in eukaryotes. However, while these five low molecular weight proteins (10-25 kDa) have a very predictable expression profile, people tend to forget that they can get modified in ways that few other proteins can. Histones can theoretically generate roughly 7 × 10^17 different proteoforms when you consider all the histone posttranslational modifications (hPTM) that have been previously reported. This translates into 50 × 10^6 different peptide forms with ArgC-like specificity. Indeed, we do not measure the proteoforms in bottom-up proteomics and we do not consider each of these hPTM in the searches we do. This, in turn, implies that it is practically impossible to quantify these proteins accurately when you apply bottom-up proteomics.

The fact that studying histone modifications is an intrinsically peptide-centric approach, however, made us realize that inferring protein abundance is extremely hard and in some cases even impossible. In this webinar, we will follow one such peptide on its journey through the Progenesis QI for proteomics workflow, to emergence. Using peak reviewing, QC metrics, conflict resolving and spectral library matching, we will detect artifacts, verify experimental reproducibility, detect outlier samples, and more. For histones specifically, it is invaluable that we can do several sequential searches (each with another combination of hPTM and using different search engines) and combine them all into this single analysis. Equally essential, we curate ambiguity that arises through these sequential searches by applying in-house scripting to the result files and then generate lists of tags to re-import into Progenesis QIP for manual validation and resolving conflicts. Because isobaric peptides carrying the same hPTM combinations elute very closely, we also manually verify and adjust peak picking of histone peptides. Finally, we are fascinated by the power of ion mobility separation in HDMSE acquisition, and the webinar concludes with a daring effort to match DDA libraries to HDMSE data.

In conclusion, while this webinar follows very specific peptides, i.e. those derived from complexly modified histones, it mainly illustrates that no automation process to date is able to anticipate the complexity of protein abundance. I thus argue that the final list of e.g. potential biomarkers should always be manually inspected and visualized in order to save time and money in the downstream validation process.

Indeed, Progenesis QIP allows us to peek behind the curtain and catch a glimpse of emergence.

YPIC Challenge

Have you ever wondered whether you can express an English sentence in Escherichia coli? No? Well, you are probably not the only one… Yet, intrigued as they are by the complexity and limits of life, the Young Proteomics Investigators Club (YPIC) went ahead and tried this exciting idea, with the help of PolyQuant. PolyQuant transfected E. coli bacteria to see if “English can be expressed as a protein”. And guess what?! E. coli does speak English and has blessed this world with the first-ever three-dimensional grammar. Why did YPIC do this? Because they like challenges and they believe you do too. Therefore, they dare you to join them in studying this unique protein, assembled by this fascinating creature with the sole purpose of challenging you.

Welcome to the second YPIC Challenge. Everybody is welcome, no matter where you come from!

Why would you participate? Well, because

  1. You are one of the few on this planet who can actually crack this code, since advanced proteomics skills are of the utmost importance.
  2. Because you want to become the pride of your country in this worldwide challenge. Not to mention how proud your mom will be when you win this game!
  3. Because there is a separate competition for every discipline and expertise in proteomics:
    1. Three-dimensional Grammar (find out how this sentence folds)
    2. Bioinformazing (develop the coolest bioinformatics approach to decipher the sentence)
    3. Protein Punctuation (Look for the biological equivalent of punctuation: PTMs left behind by E. coli)
    4. #Bioreactivity (can you generate and describe bioreactivity in this Twitter-sized message?)
    5. Best manuscript, the main prize
  4. A manuscript you say? Indeed, following your experimental wizardry, you should consolidate your beautiful work by writing a manuscript. The deeper all contenders can drill into the mysteries of the biology of language, the higher the impact factor of the journal that dedicates an entire issue to our joint effort. So, an extra, official publication for just playing a game? That doesn’t sound too bad now, does it?

And why is Nonlinear Dynamics posting this? Well, because Progenesis QIP is a vendor-independent software platform, any team can use its functionality to tackle any of the challenges. Here are just a few suggestions that could get you going:

  1. Perform several digests with different enzymes to increase the coverage. Run them separately and run a mixture of the samples, as you would do when you run a QC sample. This sample would then serve as a template to align everything and merge the different searches you do into a single analysis.
  2. There are plugins for 13 different search engines. If you used a FASTA file of the Oxford dictionary, you could start merging searches from different engines! And why not use spectral libraries to link words together?
  3. All this flexibility allows you to make Progenesis QIP a part of a bioinformazing pipeline.
  4. Progenesis QIP is very well suited for PTM research, did you know?
  5. Is it the #bioreactivity challenge you are most interested in? Well, digest the sentence with different enzymes and spike the peptides into your cell culture. Progenesis QIP is built to find out what these peptides do to the cellular proteome!
  6. As a part of Waters, we know that ion mobility separation could help to do a structural analysis.

At the EuBIC Winter School 2019, in January, YPIC will award the prizes. They will try to broadcast this session live, so that every competitor can follow!

We can actually feel your index finger hovering over the mouse to register. Don’t hesitate; YPIC didn’t either, did they? And neither did PolyQuant, who took the risk of finding out that life not only speaks Latin or Greek (as one would expect?), but also likes English!

The only thing you need to do is become a YPIC member, for free (as is the challenge, obviously), and gather three chosen ones in a well-considered research team. Trust us; you will need them! In last year’s challenge, 19 teams enlisted and only 7 cracked the code. And that one seemed easy: just 19 synthetic peptides together forming a sentence from a book. Just ask last year’s winners Alexander Hogrebe and Rosa R. Jersie-Christensen (Jesper V. Olsen’s group) how easy that was*. And this year, they cranked it up a notch. That means that you will not find the sentence anywhere and there probably will be some digestion involved (unless you do top-down, Edman degradation or – hey, why not – nanopore sequencing?).

Getting suspicious about all this costless awesomeness? Don’t! Look at it from their perspective: they get to extend their membership and, in doing so, your network. That is only one of the reasons that EuPA founded YPIC at EuPA2016 in Istanbul. Actually, they want to represent all the extra-scientific aspects of being a young scientist. Just check out their survey and you will start to get the picture. A digital Proteomics job fair made by EuBIC is already in place (http://jobs.proteomics-academy.org), but there are countless other things they could do for you in the future.

Be sure to check out their Facebook page if you want regular updates on our activities as well.

See you at EuPA2018, with all the contenders.

The new generation of science has arrived! Power to the Proteomics people,

YPIC

* Read all last year’s manuscripts here. The winning manuscript was entitled: “Sweet Google O’ Mine – The Importance of Online Search Engines for MS-facilitated, Database-independent Identification of Peptide-encoded Book Prefaces”.

What value can Progenesis QI provide in the world of co-polymer characterization?

Polymers are critical to meeting key societal needs

The use of polymeric materials in our everyday lives is increasing rapidly driven by innovations in materials development and design. Examples of the scope of polymer uses include: structural materials for cars and airplanes, fabrics for clothing, packaging materials for food and medicines, medical devices like heart valves and joint replacements and as substrates for revolutionary 3D-printing applications. The latest innovations have delivered smart materials which can change their shape or properties based upon changes in their environment.  However, this wealth of new materials must be properly characterized in order to manufacture these polymers reproducibly and to achieve the required property characteristics, thus appropriate analytical technologies and comprehensive data are needed.

There are many advanced technologies available for polymer analysis. Today we will consider Pyrolysis-Gas Chromatography/Mass Spectrometry (Py-GC/MS) and how multivariate analysis of the data it produces provides novel insights into polymer structure.

Why use Py-GC/MS and what are the limitations?

Py-GC/MS is one of the major analytical techniques for chemical structural elucidation of polymers. It involves identifying the gaseous products generated when a polymer is degraded by heating to 600°C under inert gas, providing data from which the detailed chemical structure of the polymer can be estimated.

Typically, the GC/MS in these analyses uses a hard ionization technique, electron impact (EI). However, the data obtained by such ionization become increasingly complex, especially as the number of monomers in the co-polymer increases. Many pyrolysis products are formed and each of them generates many fragment ions upon ionization. This can prove a limitation of the approach.

A new approach to Py-GC/MS

The experimental data can be simplified using a soft ionization technique like Atmospheric Pressure GC (APGC) ionization in place of EI, as the high sensitivity and soft ionization allow observation of the molecular ion without fragmentation. (See this link for a White Paper about APGC.) Reduction in fragmentation enables the determination of larger fragments from the polymer backbone, enabling the connectivity of the monomer units to be inferred. Why does this matter? Well, different arrangements of the units in a polymer, such as a block copolymer vs. a random copolymer, would result in the final material having different physical properties, which can affect its end use. Therefore, an understanding of what type of substructure exists in the polymer is very important.

Combining a high resolution mass spectrometry instrument such as quadrupole-time-of-flight (QToF) mass spectrometer with an APGC source (see Figure 1) enables the MS and MS/MS spectrum of each peak to be simultaneously collected. This data provides the elemental composition and fragment ion information needed for elucidation of chemical structures (see Figure 2).

Figure 1: Py-GC/MS setup in one of the Waters laboratories using an EGA/PY-3030D pyrolysis unit attached to a GC equipped with an atmospheric pressure source for GC/MS (APGC source) and a Waters Xevo G2-XS QTof mass spectrometer.

Figure 2: Block co-polymer low and high energy spectra from MSE data acquisition. This is a data independent acquisition mode enabling simultaneous acquisition of low energy and high energy spectra. The low energy spectrum provides molecular ion related information from which elemental composition can be derived. The high energy spectrum contains fragments from the molecular ion which help to confirm structure.

How is Progenesis QI applied to Py-APGC/MS data?

Applying multivariate analysis to the Py-APGC-MS data enabled the characteristic pyrolysis products from the different co-polymer types to be automatically detected and identified as structural markers. The application of Progenesis QI software removes the need to manually sift through the vast array of spectral data generated from each sample trying to detect and identify structurally significant pyrolysis products.

The data for two acrylic acid – styrene copolymers, one block and one random, were processed using Progenesis QI. Following data alignment and peak picking, the samples were analysed using an OPLS-DA model to compare the two groups. We can see the two polymer types are easily distinguished in the scores plot (Figure 3).

Figure 3: Following replicate analysis of the two co-polymer samples, the data was aligned and peak picked using the workflow presented by Progenesis QI. The resulting data was analysed using an OPLS-DA model to compare the samples. The scores plot resulting from that analysis is shown here, where it can be seen that the two co-polymer types are clearly discriminated.

From this model the block co-polymer marker components were extracted from an S-plot and confirmed on a trend plot (Figure 4). The chemical structures of the marker components were determined from the MSE spectrum as described previously. Random co-polymer marker components were extracted and their chemical structures elucidated using the same procedure. Some of the structures determined are shown in Figure 5, where we can see how they are representative of block and random structures.

Figure 4: Plotting all the identified markers on an S-Plot allows extraction of those which we are most confident provide significant discrimination between the samples. The intensity of these individual markers can then be plotted against the sample identities in a Trend Plot which, in this figure, shows the abundance of markers of the block co-polymer components extracted from data.

Figure 5: Here we show examples of markers that were identified for the block and random samples of styrene – acrylic acid co-polymers using the elucidation workflow described in the main text. Below the structures are some of the monomer sequences that they correspond to, demonstrating how this approach can provide information about co-polymer backbone substructure.

Concluding thoughts

The analysis of Py-APGC-MS data by Progenesis QI enabled the discovery of markers which contributed towards the difference between co-polymers. These structural differences can be due to different polymerization methods used to produce the materials, or different monomer ratios used during production.

This study shows the utility of a pyrolyzer connected to a gas chromatograph and a mass spectrometer using soft, atmospheric pressure ionization for the characterisation of co-polymer structure. Analysing the information rich datasets using Progenesis QI software enabled markers to be identified that provide insight into the differences in monomer connectivity in block and random copolymers. Further details on this work can be found in the poster publication at the following link – Py_GCMS_Poster.

In addition to the use with pyrolysis GC/MS, multivariate analysis with Progenesis QI is also very useful in troubleshooting product failures like discoloration in a batch of polymeric material or mechanical or chemical failure of components. In some of the latest applications, polymer chemists have utilized this approach for marker analysis to understand different product performance of functional polymers such as photoresists and color-resists related to semiconductor and display manufacturing.

So, next time you look at your phone or TV, step into your car or take your seat on an airplane, remember the critical dependence you have on polymeric materials and that a lot of analytical testing has gone into the development process to provide you with such attractive, robust, safe and functional products!

Acknowledgements

Tim Jenkins, Waters, Wilmslow
Baiba Babovska, Waters, Milford
And our colleague in Japan, Tatsuya Ezaki of Nihon Waters K.K.

Recent Advances in Food Analysis (RAFA) 2017

Recently I had the pleasure of returning to Prague for the 8th International Symposium on Recent Advances in Food Analysis (RAFA) conference. The Clarion Conference hotel again provided a great venue to host the event and accommodate many of the delegates on site making logistics very convenient for all.

The conference brings together a wide range of scientists from across Europe working in the field of food analysis covering both regulatory and research applications. The conference has grown in reputation and size over the years and, with over 550 delegates and 3 parallel sessions this year, there was a wealth of information being presented. From my perspective, it was great to see the Progenesis QI software being successfully applied and cited in many talks and posters to help in the comparison and distinction of some very complex sample matrices.

Lots of interest and activity at the Waters booth at RAFA 2017 following Sara Stead’s lunchtime seminar.

I spent most of my time at the Waters booth discussing the Progenesis QI software with delegates and researchers who were interested in seeing how the Progenesis QI software could be implemented into their workflow. I did manage to visit one nice restaurant and see some of the old town, on one of the evenings.

Panorama of Prague.

Here’s what Dr. Sara Stead, Strategic Collaborations Manager at Waters had to say:

“The key themes at RAFA this week have been addressing the major challenges facing the food industry such as fraud, authenticity, quality and safety to secure the food supply chain and protect all consumers in an era of globally traded commodities.

The use of HRMS coupled with intelligent informatics systems is emerging as a key player in this space capable of holistic profiling, keeping pace with the industry demands.

Techniques such as QTof MS, ion mobility enabled MS and direct analysis in combination with comprehensive chemometric software packages such as Progenesis QI and LiveID are being established as the ‘go to’ solutions for pioneering researchers tackling these complex challenges.”

A recent application note details how the Progenesis QI software and the Waters Ion Mobility workflow were used in the food allergen research area. Food allergens are a hot topic for many scientists. The application note Identification and Quantitative Analysis of Egg Allergen Peptides Using Data Independent Ion Mobility Mass Spectrometry can be found here. It is most certainly an interesting read for those looking at LC-MS/MS quantitative analysis.

The Progenesis QI software is a versatile piece of software that can be used, not just in this food fraud area, but also in the fields of metabolomics, environmental research and chemical analysis to name but a few.

If you’re interested in looking further at the Progenesis QI software to see how it can help you in your research then you can download the software here or get in touch with us. We will be more than happy to answer any of your questions.

If you get the chance to attend this conference in the future I would definitely recommend it to anyone working in this field.

Don’t get stung by your Manuka Honey!

One of our Waters colleagues, Dr. Joanne Connolly, gave an interesting presentation at the 38th BMSS Annual Meeting 2017 in Manchester last month. The research involved the use of Progenesis QI in a non-targeted metabolomics approach to honey analysis and floral marker elucidation. Previous blog posts have discussed how Progenesis QI was used to detect food fraud across a wide diversity of foods in an untargeted approach.

The problem

There have been a number of food scandals in recent years, the more serious resulting in fatalities.

Why do people adulterate food?

The main reason is for financial gain, to make the key ingredient ‘go a bit further’, for substitutions, or to cut manufacturing costs.  Sometimes, it is worse with deliberate maliciousness such as reputation damage – trying to destroy a competitor’s reputation or even terrorism.

Either way, laboratories need to develop ways to test food products in an untargeted way, to explain the differences between genuine and fraudulent products.

An example: Manuka and commercial honey

Every honey is a unique, complex matrix made up of plant and bee secondary metabolites including flavonoids, phenolics and sugars.  Each honey has its own “fingerprint” which will differ depending on region, forage targets and biological properties.

A lot of honey found commercially has honey from several plant species and is known as polyfloral or multifloral honey. The term unifloral describes a honey that is derived from one plant species, and unifloral honey is becoming of more commercial interest as consumers appreciate the possibility to choose between different honey types.

Leptospermum scoparium

In addition to this, there have been studies discussing therapeutic or technological uses of certain honey varieties, which also contributes to the demand for a reliable determination of their botanical origin. Some of the unifloral honeys are sold at premium prices, and are therefore a target for food fraud (e.g. adulteration, mislabelling). Manuka honey falls into this group. Manuka honey is made from the nectar of Leptospermum scoparium, or Manuka bush, a shrub native to New Zealand and Southern Australia. It is reported to have biocidal activity and a unique antibacterial non-peroxide activity (NPA). Suppliers have to demonstrate this activity for labelling and to attract a premium price.

Current approaches

Understanding, deconvoluting and identifying the biochemical profile of a food sample of interest can help give manufacturers and regulators key information in the fight against fraud. Many different analytical techniques have been used to determine the floral origin of honey, including MS, NIR, FT-IR, and Raman spectroscopic fingerprinting, and NMR.

Another possible approach is untargeted metabolomics, as hinted at earlier.

Identification of a MS-derived biochemical “fingerprint” is an important tool for understanding the question of “What is normal?”

Where does Progenesis QI fit in?

In this recent application note, an LC-MS metabolomics approach was taken to chemically profile four different types of unifloral honey. Progenesis QI was used in untargeted analysis of LC-MS data (HDMSE in ESI+ and ESI- modes) to find candidate biomarkers for Manuka honey when compared to the other unifloral honey types (Buckwheat, Heather and Rape). Each sample type was run in triplicate, plus pooled QCs. After importing the data into Progenesis QI and processing it through the unique Progenesis alignment and co-detection workflow, principal component analysis (PCA) showed Manuka was clearly separated.

Principal component analysis (PCA) scores plot from EZinfo (ESI negative ion HDMSE data).

By automatically exporting data to EZinfo for discriminant analysis, it was then possible to extract the best candidate biomarkers from an S-plot of Manuka honey versus all other honey types. Tentative identifications were also generated for several of the extracted potential biomarkers. MRM was used to validate that the peak identified as Leptosperin really did differ in abundance between Manuka and non-Manuka honey.

Review of standardized abundance profiles and assignment of identity for three markers of Manuka honey as displayed in Progenesis QI software.

In summary

Food authenticity, adulteration and safety are major concerns across the globe. A non-targeted high resolution MS omics approach combined with multivariate data analysis can identify ‘normal’ profiles of foodstuffs, allowing detection of fraud during the investigative stages. It is possible to get biologically meaningful information by comparing multiple samples using an all-in-one high-throughput guided workflow in Progenesis QI. Confident structural assignment in Progenesis QI means markers can be annotated and identified using databases of the user’s choice. Combining ion mobility with MS gives ‘cleaner’ fragmentation data, allowing easier identification of markers. It is important to validate markers using a complementary technology for confirmation. New, innovative rapid evaporative techniques will open new doors into authentication techniques at the “point of entry”.

Are you doing untargeted LC-MS analysis? Would you like to see how the Progenesis QI software can help you get the results you are looking for?

You can download the software or contact us. A member of our friendly team can discuss the software further with you.