Proteomics: a Peptide’s journey to emergence

We are pleased to present a blog post from one of our users, Dr. Maarten Dhaenens.  Read on…

Dr. Maarten Dhaenens

Head of Proteomics Department
Lab of Pharmaceutical Biotechnology
Ghent University


In philosophy, systems theory, science, and art, emergence is a phenomenon whereby larger entities arise through interactions among smaller or simpler entities such that the larger entities exhibit properties the smaller or simpler entities do not exhibit. The most obvious example of emergence is life itself. Think about it: while anyone, or any algorithm, would still recognize you in a picture from 10 years ago (ok, maybe not that one picture), only a few molecules in your body are still the same as in that picture. It is as if you are the shape and matter is merely circulating through you, through time. Thus, explaining life by focusing only on the molecules we are built from and the laws of chemistry would be a daring exploit. Yet, in proteomics, we currently have no other choice than to weigh molecular masses to fathom life. We need to be aware of this limitation and leverage knowledge with our mind, which is actually an emergent phenomenon in itself!

If I have learned anything from the dozens of collaborations at our lab, it is that the term “Proteomics” is actually very confusing to the outside world. We measure peptides. What we actually report on, the proteins, is merely inferred. At a time when productivity is key, people tend to focus only on trying to automate this inference. Yet, to date, only human intervention, i.e. the human mind, can ensure that the most correct or least ambiguous outcome is reported. I would argue that proteomics is in itself an emergent (not “emerging”) field. Once you start to look at it like that, facilitating human inspection of the visualized data should be the primary focus in order to fill the gap between what is measured and what can be concluded in terms of potential biomarkers or biology.

To illustrate this point, we look at histones in this webinar. These proteins are often used to normalize entire proteomes because they are rightfully considered to be among the most robust housekeeping genes in Eukaryotes. However, while these five low molecular weight proteins (10-25 kDa) have a very predictable expression profile, people tend to forget that they can get modified in ways that few other proteins can. Histones can theoretically generate roughly 7 × 10^17 different proteoforms when you consider all the histone posttranslational modifications (hPTMs) that have been previously reported. This translates into 50 × 10^6 different peptide forms with ArgC-like specificity. Indeed, we do not measure the proteoforms in bottom-up proteomics and we do not consider each of these hPTMs in the searches we do. This, in turn, implies that it is practically impossible to quantify these proteins accurately when you apply bottom-up proteomics.
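To get a feel for why these numbers explode, here is a minimal Python sketch of the underlying arithmetic. It assumes every modifiable residue varies independently, so the number of forms is simply the product of the number of states per site. The function name `count_forms` and the site counts are made up for illustration; they are not the real histone numbers behind the figures above.

```python
from math import prod


def count_forms(states_per_site):
    """Count modification combinations, assuming sites vary independently.

    states_per_site: one integer per modifiable residue, giving the number
    of states that residue can take (unmodified counts as one state).
    """
    return prod(states_per_site)


# Toy peptide (hypothetical): 4 modifiable residues, each either unmodified
# or carrying one of 3 modifications, i.e. 4 states per site.
toy_peptide = [4, 4, 4, 4]
print(count_forms(toy_peptide))  # 256

# Toy protein (hypothetical): 30 such sites.
toy_protein = [4] * 30
print(f"{count_forms(toy_protein):.1e}")  # 1.2e+18
```

Even with these toy numbers, a single peptide already has hundreds of possible forms, and a whole protein has far more than any search space can enumerate.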

The fact that studying histone modifications is an intrinsically peptide-centric approach, however, made us realize that inferring protein abundance is extremely hard and in some cases even impossible. In this webinar, we will follow one such peptide on its journey through the Progenesis QI for proteomics (QIP) workflow, to emergence. Using peak reviewing, QC metrics, conflict resolution and spectral library matching, we will detect artifacts, verify experimental reproducibility, detect outlier samples, and more. For histones specifically, it is invaluable that we can do several sequential searches (each with a different combination of hPTMs and using different search engines) and combine them all into a single analysis. Equally essential, we curate the ambiguity that arises from these sequential searches by applying in-house scripting to the result files, and then generate lists of tags to re-import into Progenesis QIP for manual validation and conflict resolution. Because isobaric peptides carrying the same hPTM combinations elute very closely, we also manually verify and adjust peak picking of histone peptides. Finally, we are fascinated by the power of ion mobility separation in HDMSE acquisition, and the webinar concludes with a daring effort to match DDA libraries to HDMSE data.
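As a rough illustration of what curating ambiguity across sequential searches can look like, here is a small Python sketch. It is not our in-house script and it does not reflect the Progenesis QIP export format: the function `flag_conflicts`, the search names, the feature numbers and the peptide annotations are all hypothetical. The idea is only to group identifications per feature and tag the features for which different searches disagree, so that those can be inspected manually.

```python
from collections import defaultdict


def flag_conflicts(search_results):
    """Tag each feature as 'unique' or 'conflict' across several searches.

    search_results: dict mapping a (hypothetical) search name to a list of
    (feature_id, peptide_form) tuples.
    Returns a dict mapping feature_id to its tag.
    """
    forms_by_feature = defaultdict(set)
    for hits in search_results.values():
        for feature_id, peptide_form in hits:
            forms_by_feature[feature_id].add(peptide_form)
    # A feature is a conflict when more than one distinct form was assigned.
    return {
        feature_id: ("conflict" if len(forms) > 1 else "unique")
        for feature_id, forms in forms_by_feature.items()
    }


# Toy input: two searches with different hPTM settings (made-up data).
searches = {
    "search_ac_me": [(101, "KSTGGKAPR [K1 ac]"), (102, "KQLATKAAR [K1 me3]")],
    "search_ac_only": [(101, "KSTGGKAPR [K6 ac]")],
}
tags = flag_conflicts(searches)
print(tags)  # {101: 'conflict', 102: 'unique'}
```

In this toy case, feature 101 received two positional isomers from two searches, which is exactly the kind of assignment that still needs a human eye.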

In conclusion, while this webinar follows very specific peptides, i.e. those derived from complexly modified histones, it mainly illustrates that no automated process to date is able to anticipate the complexity of protein abundance. I thus argue that the final list of, e.g., potential biomarkers should always be manually inspected and visualized in order to save time and money in the downstream validation process.

Indeed, Progenesis QIP allows us to peek behind the curtain and catch a glimpse of emergence.