We have an ongoing project to map genome coordinates into Ensembl, which we have been doing for a while. Internally, PRIDE moved to mzTab a long time ago, so our PSMs are in mzTab for every project. We have a tool that reads the mzTab and tries to map the PSMs onto reference genomes. However, we would also like to keep that information in the mzTab files, as we did in mzIdentML 1.2. This is really important to us because we want to annotate our datasets.
I was checking the current implementation of mzid 1.2, where this information is represented in the PeptideEvidence objects like:
<PeptideEvidence dBSequence_ref="dbseq_generic|A_ENSP00000471242.1|" peptide_ref="LALWEGR_" start="606" end="612" pre="R" post="S" isDecoy="false" id="LALWEGR_generic|A_ENSP00000471242.1|_606_612">
  <userParam name="psm_count" value="1"></userParam>
  <cvParam cvRef="PSI-MS" accession="MS:1002640" name="peptide end on chromosome" value="98424581"></cvParam>
  <cvParam cvRef="PSI-MS" accession="MS:1002641" name="peptide exon count" value="2"></cvParam>
  <cvParam cvRef="PSI-MS" accession="MS:1002642" name="peptide exon nucleotide sizes" value="11,10"></cvParam>
  <cvParam cvRef="PSI-MS" accession="MS:1002643" name="peptide start positions on chromosome" value="98412025,98424571"></cvParam>
</PeptideEvidence>
<PeptideEvidence dBSequence_ref="dbseq_generic|A_ENSP00000479861.1|" peptide_ref="GRLYPWGVVEVENPEHNDFLK_" start="290" end="310" pre="R" post="L" isDecoy="false" id="GRLYPWGVVEVENPEHNDFLK_generic|A_ENSP00000479861.1|_290_310">
  <userParam name="psm_count" value="2"></userParam>
  <cvParam cvRef="PSI-MS" accession="MS:1002640" name="peptide end on chromosome" value="241343880"></cvParam>
  <cvParam cvRef="PSI-MS" accession="MS:1002641" name="peptide exon count" value="1"></cvParam>
  <cvParam cvRef="PSI-MS" accession="MS:1002642" name="peptide exon nucleotide sizes" value="63"></cvParam>
  <cvParam cvRef="PSI-MS" accession="MS:1002643" name="peptide start positions on chromosome" value="241343819"></cvParam>
</PeptideEvidence>
I would like to reuse the cvParam style used in mzid, but mzTab has no PeptideEvidence concept. This annotation would therefore have to be added to the PSM section using optional CV parameter columns. With optional parameters, we don't need to change the mzTab schema. The problem is that, because they are PSMs, they can map to multiple genome coordinates. Suggestions?
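To make the optional-column idea concrete, here is a rough Python sketch of how the four CV terms above could become PSM column headers. It assumes the mzTab 1.0 naming convention for CV-based optional columns, `opt_{identifier}_cv_{accession}_{parameter name}` with spaces replaced by underscores. The helper names `opt_column` and `join_mappings` and the `|` separator for multiple mappings are hypothetical choices for illustration, not part of any specification.

```python
# The four genome-coordinate CV terms used in the PeptideEvidence example.
CV_PARAMS = [
    ("MS:1002640", "peptide end on chromosome"),
    ("MS:1002641", "peptide exon count"),
    ("MS:1002642", "peptide exon nucleotide sizes"),
    ("MS:1002643", "peptide start positions on chromosome"),
]


def opt_column(accession: str, name: str, identifier: str = "global") -> str:
    """Build an optional column header (assumed mzTab 1.0 convention)."""
    return f"opt_{identifier}_cv_{accession}_{name.replace(' ', '_')}"


def join_mappings(values: list[str], sep: str = "|") -> str:
    """Pack one value per genome mapping into a single cell.

    The separator is an assumption; it only has to differ from the comma
    already used inside the exon size and start position values.
    """
    return sep.join(values)


headers = [opt_column(acc, name) for acc, name in CV_PARAMS]

# A PSM with two genome mappings: every column carries two values, and the
# reader has to regroup them by position, which is the awkward part.
cells = [
    join_mappings(["98424581", "241343880"]),
    join_mappings(["2", "1"]),
    join_mappings(["11,10", "63"]),
    join_mappings(["98412025,98424571", "241343819"]),
]
row = dict(zip(headers, cells))
```

With this layout the i-th value in each column belongs to the i-th mapping, so all four columns must always hold the same number of entries.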
Hi @ypriverol, yes, I understand the issue. There is no easy way to compress the info into one CV param, even for the case of a single mapping, so you will need to use multiple CV params if you go down this route. You would then also have to perform some complicated grouping for the case of multiple mappings.
Given that the mzTab files are largely for internal consumption (and proBed is absolutely designed for this case anyway), you should either convert to proBed properly, or just come up with a hacky userParam to cover this with your own fixed format, e.g. "98424581:2:11,10:98412025,98424571;241343880:1:63:241343819" from the above example.
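For illustration, a minimal Python sketch of reading and writing that fixed format could look like the following. It assumes the field order implied by the example string (end on chromosome, exon count, exon nucleotide sizes, start positions), with `:` between fields, `,` inside lists and `;` between mappings. The names `GenomeMapping`, `encode_mappings` and `decode_mappings` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GenomeMapping:
    end: int                # peptide end on chromosome
    exon_count: int         # peptide exon count
    exon_sizes: list[int]   # peptide exon nucleotide sizes
    starts: list[int]       # peptide start positions on chromosome


def encode_mappings(mappings: list[GenomeMapping]) -> str:
    """Serialise mappings as 'end:count:sizes:starts', joined by ';'."""
    return ";".join(
        ":".join([
            str(m.end),
            str(m.exon_count),
            ",".join(str(s) for s in m.exon_sizes),
            ",".join(str(s) for s in m.starts),
        ])
        for m in mappings
    )


def decode_mappings(value: str) -> list[GenomeMapping]:
    """Parse the string back into one GenomeMapping per ';'-separated part."""
    mappings = []
    for part in value.split(";"):
        end, count, sizes, starts = part.split(":")
        mappings.append(GenomeMapping(
            end=int(end),
            exon_count=int(count),
            exon_sizes=[int(s) for s in sizes.split(",")],
            starts=[int(s) for s in starts.split(",")],
        ))
    return mappings


example = "98424581:2:11,10:98412025,98424571;241343880:1:63:241343819"
parsed = decode_mappings(example)
```

Because each mapping is self-contained, no positional regrouping across several columns is needed, which is the main attraction of the single userParam over multiple CV params.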
@andrewrobertjones @timosachsenberg @jgriss :