Why does false discovery rate control matter when using OpenProt for target validation?

A stringent false discovery rate (FDR) is needed because searching the full OpenProt database substantially enlarges the search space. As the protocol demonstrates, a stricter threshold maintains confidence in novel protein identifications without affecting the most confident hits.
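To illustrate what a stricter FDR threshold does to a list of identifications, here is a minimal sketch of target-decoy FDR filtering. The target-decoy approach is a common way to estimate FDR, but it is an assumption here: the text above does not state how the protocol computes it. The `PSM` class, the `filter_at_fdr` function, the scores, and the peptide names are all hypothetical and are not part of OpenProt or X!Tandem.

```python
from dataclasses import dataclass


@dataclass
class PSM:
    """One peptide-spectrum match (hypothetical, simplified record)."""
    peptide: str
    score: float    # assumption: higher score means a better match
    is_decoy: bool  # True if the match is to a decoy sequence


def filter_at_fdr(psms, max_fdr):
    """Keep target PSMs above the lowest score cutoff whose estimated
    FDR (decoys / targets) does not exceed max_fdr."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    best_cut = 0
    for i, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        estimated_fdr = decoys / max(targets, 1)
        if estimated_fdr <= max_fdr:
            best_cut = i  # the largest prefix that still satisfies the threshold
    return [p for p in ranked[:best_cut] if not p.is_decoy]


# Toy data, invented for illustration only.
example_psms = [
    PSM("PEPTIDEA", 9.0, False),
    PSM("PEPTIDEB", 8.0, False),
    PSM("DECOYONE", 7.0, True),
    PSM("PEPTIDEC", 6.0, False),
    PSM("PEPTIDED", 5.0, False),
]

strict = filter_at_fdr(example_psms, max_fdr=0.01)
lenient = filter_at_fdr(example_psms, max_fdr=0.25)
```

On this toy data the strict threshold keeps only the two top-scoring target matches, while the lenient one keeps all four targets: tightening the threshold removes lower-scoring identifications and leaves the most confident hits in place.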

How does changing the ORF annotation model improve target discovery in early discovery pipelines?

By enabling polycistronic annotation of eukaryotic genomes, OpenProt allows detection of proteins from non-canonical ORFs that are missed by traditional models, thereby expanding the search space for therapeutic targets and reducing false negatives in proteomic screens.

What quantitative measurements from mass spectrometry enable confidence in novel protein discoveries using OpenProt?

Confident peptide identifications, peptide spectrum matches, and protein quantification data derived from X!Tandem searches against OpenProt databases provide the quantitative basis for validating novel proteins, including those not previously annotated.

Why are replication requirements important for cross-functional collaboration when validating OpenProt-identified proteins?

Replication across datasets using OpenProt_2_pep or OpenProt_all databases showed that most proteins from the original paper were re-identified, supporting reproducibility and enabling shared confidence in novel protein discoveries across teams.

What statistical analysis capabilities are required before implementing OpenProt in proteomic workflows for lead identification?

The ability to apply false discovery rate filtering, run peptide-to-spectrum matching via engines like X!Tandem, and perform quality control on ID filter outputs (e.g., peptide and protein counts) is essential to ensure reliable protein identification and quantification from OpenProt-based searches.
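The quality-control step on peptide and protein counts can be sketched as a simple tally over filtered identifications. This assumes the filtered output has already been parsed into `(protein_accession, peptide_sequence)` pairs; the actual format of the ID filter output is not described here, and the function name and accessions below are made-up placeholders.

```python
from collections import defaultdict


def summarize_identifications(rows):
    """Count PSMs, unique peptides, and proteins from a list of
    (protein_accession, peptide_sequence) pairs (hypothetical format)."""
    peptides_by_protein = defaultdict(set)
    for protein, peptide in rows:
        peptides_by_protein[protein].add(peptide)
    unique_peptides = set().union(*peptides_by_protein.values())
    return {
        "n_psms": len(rows),
        "n_peptides": len(unique_peptides),
        "n_proteins": len(peptides_by_protein),
        "peptides_per_protein": {
            protein: len(peps) for protein, peps in peptides_by_protein.items()
        },
    }


# Placeholder accessions and sequences, invented for illustration only.
example_rows = [
    ("PROT_A", "AAAK"),
    ("PROT_A", "AAAK"),  # same peptide matched by a second spectrum
    ("PROT_A", "CCCR"),
    ("PROT_B", "DDDK"),
]

summary = summarize_identifications(example_rows)
```

Comparing these counts between runs, for example before and after tightening the FDR threshold, gives a quick check that the filter behaves as expected.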

Mass Spectrometry-Based Proteomics Analyses Using the OpenProt Database to Unveil Novel Proteins Translated from Non-Canonical Open Reading Frames

14.9K views

Cited by 15

07:38 min

April 11th, 2019

10.3791/59589-v

Marie A. Brunet¹,², Xavier Roucou¹,²

¹Department of Biochemistry, Université de Sherbrooke, ²PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering

OpenProt is a freely accessible database that enforces a polycistronic model of eukaryotic genomes. Here, we present a protocol for using OpenProt databases when searching mass spectrometry datasets. Using the OpenProt database to analyze proteomic experiments allows the discovery of novel and previously undetectable proteins.