Mathias Walzer mwalzer

## lxml2df4mzq.py
from lxml import etree

# Placeholder location of the mzQuantML file
mzq = "file:///path/vis_fix.mzq"
doc = etree.parse(mzq)

# mzQuantML 1.0.0 namespace and the layer path shared by both XPath queries
ns = {'x': "http://psidev.info/psi/pi/mzQuantML/1.0.0"}
layer = '/x:MzQuantML/x:PeptideConsensusList/x:AssayQuantLayer'

# ColumnIndex holds the whitespace-separated column references of the data matrix
header = doc.xpath(layer + '/x:ColumnIndex', namespaces=ns)
col_names = ['object_ref'] + header[0].text.split()

# DataMatrix holds the quantitative values for those columns
dm = doc.xpath(layer + '/x:DataMatrix', namespaces=ns)
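
The snippet above stops after selecting the `DataMatrix` element. A minimal sketch of how its rows might be turned into records follows. It assumes each `Row` child carries an `object_ref` attribute and whitespace-separated numeric values, uses the standard library's `xml.etree.ElementTree` in place of lxml so it runs on its own, and works on a made-up sample document (`SAMPLE`, `matrix_rows` and all values are illustrative, not taken from the original file).

```python
import xml.etree.ElementTree as ET

NS = {"x": "http://psidev.info/psi/pi/mzQuantML/1.0.0"}

# Tiny made-up document that mimics the layout queried above (illustrative values only).
SAMPLE = """<MzQuantML xmlns="http://psidev.info/psi/pi/mzQuantML/1.0.0">
  <PeptideConsensusList>
    <AssayQuantLayer>
      <ColumnIndex>a1 a2</ColumnIndex>
      <DataMatrix>
        <Row object_ref="pep1">1.5 2.5</Row>
        <Row object_ref="pep2">3.0 4.0</Row>
      </DataMatrix>
    </AssayQuantLayer>
  </PeptideConsensusList>
</MzQuantML>"""


def matrix_rows(xml_text):
    """Return (column names, list of row dicts) for the first AssayQuantLayer."""
    root = ET.fromstring(xml_text)
    layer = root.find("x:PeptideConsensusList/x:AssayQuantLayer", NS)
    col_names = ["object_ref"] + layer.find("x:ColumnIndex", NS).text.split()
    rows = []
    for row in layer.findall("x:DataMatrix/x:Row", NS):
        # Assumption: every cell is numeric; real files may need extra handling.
        values = [row.get("object_ref")] + [float(v) for v in row.text.split()]
        rows.append(dict(zip(col_names, values)))
    return col_names, rows


col_names, rows = matrix_rows(SAMPLE)
print(col_names)
print(rows)
```

The resulting list of dicts could be handed to `pandas.DataFrame(rows, columns=col_names)` if a data frame is wanted.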

## PRIDE_contaminants.msp
Name: IQVR/2
Comment: Spec=Consensus Mods=0 Parent=258.049 Nreps=20 Naa=4 MaxRatio=0.750 PrecursorMzRange=0.0570 Protein=sp|TRYP_PIG|
Num peaks: 32
130.886 897.48
157.784 26.99
174.812 660.3
192.273 365.64
196.799 37.08
213.811 258.83
224.825 3465.14
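
As an illustration of the MSP layout shown above, here is a minimal sketch of a parser for a single entry. It assumes `Key: value` header lines followed by whitespace-separated `m/z intensity` peak lines; `parse_msp_entry` is a hypothetical helper, and the inline text repeats only the peaks listed in the excerpt, not all 32.

```python
# Excerpt of the entry above; only the peaks shown in the listing are included.
MSP_TEXT = """Name: IQVR/2
Comment: Spec=Consensus Mods=0 Parent=258.049 Nreps=20 Naa=4 MaxRatio=0.750 PrecursorMzRange=0.0570 Protein=sp|TRYP_PIG|
Num peaks: 32
130.886 897.48
157.784 26.99
174.812 660.3
192.273 365.64
196.799 37.08
213.811 258.83
224.825 3465.14
"""


def parse_msp_entry(text):
    """Parse one MSP entry into header fields and a list of (m/z, intensity) peaks."""
    entry = {"peaks": []}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        if ":" in line and not line[0].isdigit():
            # Header line such as "Name: IQVR/2"
            key, value = line.split(":", 1)
            entry[key.strip()] = value.strip()
        else:
            # Peak line: first two columns are m/z and intensity
            mz, intensity = line.split()[:2]
            entry["peaks"].append((float(mz), float(intensity)))
    return entry


entry = parse_msp_entry(MSP_TEXT)
# The Comment line is a series of key=value fields separated by spaces.
comment = dict(field.split("=", 1) for field in entry["Comment"].split())
print(entry["Name"], comment["Parent"], len(entry["peaks"]))
```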

## biognosis_irts.csv

          
iRT peptide,Precursor m/z,iRT
LGGNEQVTR,487.257,-24.92
GAGSSEPVTGLDAK,644.823,0.00
VEATFGVDESNAK,683.828,12.39
YILAGVENSK,547.298,19.79
TPVISGGPYEYR,669.838,28.71
TPVITGAPYEYR,683.854,33.38
DGLDAASYYAPVR,699.339,42.26
ADVTPADFSEWSK,726.836,54.62
GTFIIDPGGVIR,622.854,70.52
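
A minimal sketch of loading the iRT table with the standard `csv` module. The comma-separated layout and the helper name `load_irts` are assumptions; the values are the ones listed above.

```python
import csv
import io

# Inline copy of the table above; the comma delimiter is an assumption.
IRT_CSV = """iRT peptide,Precursor m/z,iRT
LGGNEQVTR,487.257,-24.92
GAGSSEPVTGLDAK,644.823,0.00
VEATFGVDESNAK,683.828,12.39
YILAGVENSK,547.298,19.79
TPVISGGPYEYR,669.838,28.71
TPVITGAPYEYR,683.854,33.38
DGLDAASYYAPVR,699.339,42.26
ADVTPADFSEWSK,726.836,54.62
GTFIIDPGGVIR,622.854,70.52
"""


def load_irts(text):
    """Map each peptide sequence to its (precursor m/z, iRT) pair."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["iRT peptide"]: (float(row["Precursor m/z"]), float(row["iRT"]))
        for row in reader
    }


irts = load_irts(IRT_CSV)
# Peptides ordered by increasing iRT value
ordered = sorted(irts, key=lambda pep: irts[pep][1])
print(ordered)
```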

## write_in_5_minutes.ipynb

      
Last active: June 23, 2022 09:41
File: read_in_5_minutes.ipynb

## qc_edges.html
<html>
<head>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/vis-network@latest/styles/vis-network.css" type="text/css" />
<script type="text/javascript" src="https://cdn.jsdelivr.net/npm/vis-network@latest/dist/vis-network.min.js"> </script>
<center>
<h1></h1>
</center>

<!-- <link rel="stylesheet" href="../node_modules/vis/dist/vis.min.css" type="text/css" />
<script type="text/javascript" src="../node_modules/vis/dist/vis.js"> </script>-->

## proteomicsdb_api_request.py
import json
import pprint
import time
import requests
import pandas as pd

api_target = "https://www.proteomicsdb.org/proteomicsdb/logic/api/proteinexpression.xsodata/InputParams(PROTEINFILTER='{prot_acc}',MS_LEVEL=1,TISSUE_ID_SELECTION='',TISSUE_CATEGORY_SELECTION='tissue;fluid',SCOPE_SELECTION=1,GROUP_BY_TISSUE=1,CALCULATION_METHOD=0,EXP_ID=-1)/Results?$select=UNIQUE_IDENTIFIER,TISSUE_ID,TISSUE_NAME,TISSUE_SAP_SYNONYM,SAMPLE_ID,SAMPLE_NAME,AFFINITY_PURIFICATION,EXPERIMENT_ID,EXPERIMENT_NAME,EXPERIMENT_SCOPE,EXPERIMENT_SCOPE_NAME,PROJECT_ID,PROJECT_NAME,PROJECT_STATUS,UNNORMALIZED_INTENSITY,NORMALIZED_INTENSITY,MIN_NORMALIZED_INTENSITY,MAX_NORMALIZED_INTENSITY,SAMPLES&$format=json"
results = list()
no_joy = list()
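
The excerpt ends before `results` and `no_joy` are used. One plausible way to fill them is sketched below, assuming the `{prot_acc}` placeholder is substituted with `str.format` and that failed requests are collected separately. `TEMPLATE` is a shortened stand-in for `api_target`, `collect` and `http_get_json` are hypothetical helpers, and `urllib` replaces `requests` so the sketch has no third-party dependencies.

```python
import json
import time
import urllib.request

# Shortened stand-in for api_target (placeholder host); only {prot_acc} matters here.
TEMPLATE = "https://example.org/api/InputParams(PROTEINFILTER='{prot_acc}')/Results?$format=json"


def http_get_json(url):
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def collect(accessions, template=TEMPLATE, fetch=http_get_json, pause=0.0):
    """Query one URL per accession; return (successes, failed accessions)."""
    results, no_joy = [], []
    for acc in accessions:
        url = template.format(prot_acc=acc)
        try:
            results.append((acc, fetch(url)))
        except Exception:
            # Any failure (network error, invalid JSON) puts the accession aside.
            no_joy.append(acc)
        time.sleep(pause)  # optional pause between requests
    return results, no_joy


# Offline demonstration with a fake fetch function instead of a real request.
def fake_fetch(url):
    if "BAD" in url:
        raise OSError("simulated failure")
    return {"url": url}


results, no_joy = collect(["P00533", "BAD"], fetch=fake_fetch)
print(results, no_joy)
```

Passing the fetch function as a parameter keeps the loop testable without network access; with the real `api_target`, `collect(accessions, template=api_target, pause=1.0)` would be the corresponding call.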

## metabo-batches.mzQC.md

      
Last active: December 11, 2020 18:18

Metabolomics batch runs example

Here, we describe details of a metabolomics mzQC JSON document used to describe a study's quality before and after batch correction methods are applied.
For a description of the general structure of mzQC, see the Single-Run Example of mzQC.
Find the complete file at the bottom of this document or in the example folder.
The mzQC file is made from the acquisitions of GC-ToF-MS polar metabolite data of an Arabidopsis nucleotype-plasmotype diallel study, as described in "Improved batch correction in untargeted MS-based metabolomics".
    "description": "This dataset is based on the analysis of polar extracts from a nucleotype-plasmotype combination study of Arabidopsis for 58 different genotypes. For details of the used plant material we refer to Flood (2015). Analysis of the polar, derivatized metabolites by GC-ToF-MS (Agilent 6890 GC coupled to a Leco Pegasus III MS) and processing of the data were done as described in Villaf


## QC2-sample-example.mzQC.md

      
Last active: December 11, 2020 17:43

QC Sample-Run Example of mzQC

Here, we describe details of an mzQC JSON document used for a QC sample mass spectrometry run.
For a description of the general structure of mzQC, see the Single-Run Example of mzQC.
Find the complete file at the bottom of this document or in the example folder.
The mzQC file is made from the acquisition of a QC2 sample, as described in "QCloud: A cloud-based quality control system for mass spectrometry-based proteomics laboratories".
Optional (detailed) descriptions about the file can be placed into mzQC next to the general information about the file.
    "description": "This is an example of an mzQC file produced from a proteomics QC2 sample. 20 ug dried Pierce HeLa protein digest standard from Thermo Fisher Scientific (Part number: 88329) are dissolved in 200 uL of 0.1% formic acid in water to a final concentration of 100 ng/uL. A total amount of 1 uL (100ng) is injected per analysis.",

The metrics describe simple values lik

  
## single-run.mzQC.md

      
Last active: December 11, 2020 17:22

Single-Run Example of mzQC

Here, we describe an mzQC JSON document used for QC of a single mass spectrometry run.
Find the complete file at the bottom of this document or in the example folder.
The document's main anchor is between the outer curly brackets:
{ "mzQC": {
...
}
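
A minimal sketch of reading such a document and reaching the `mzQC` anchor with Python's `json` module. The inline text is a cut-down stand-in whose field names follow the metabo-batches.mzQC excerpt in this collection; `load_mzqc` is a hypothetical helper, not part of any mzQC library.

```python
import json

# Cut-down stand-in document; field names follow the metabo-batches.mzQC excerpt.
MZQC_TEXT = """{
  "mzQC": {
    "creationDate": "2020-12-09T11:04:16",
    "contactName": "Mathias Walzer",
    "version": "1.0.0"
  }
}"""


def load_mzqc(text):
    """Parse the JSON text and return the object under the top-level 'mzQC' key."""
    doc = json.loads(text)
    if "mzQC" not in doc:
        raise ValueError("missing top-level 'mzQC' anchor")
    return doc["mzQC"]


mzqc = load_mzqc(MZQC_TEXT)
print(mzqc["version"], mzqc["creationDate"])
```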


## metabo-batches.mzQC
{
  "mzQC": {
    "creationDate": "2020-12-09T11:04:16",
    "contactName": "Mathias Walzer",
    "contactAddress":  "walzer@ebi.ac.uk",
    "version": "1.0.0",
    "description": "This dataset is based on the analysis of polar extracts from a nucleotype-plasmotype combination study of Arabidopsis for 58 different genotypes. For details of the used plant material we refer to Flood (2015). Analysis of the polar, derivatized metabolites by GC-ToF-MS (Agilent 6890 GC coupled to a Leco Pegasus III MS) and processing of the data were done as described in Villafort Carvalho et al. (2015). Here, the number of metabolites (75) is much lower than in the other two data sets, partly because the focus was on the primary rather than the secondary metabolites. The number of samples was 240, with a percentage of non-detects of 16 %; the maximum fraction of non-detects in individual metabolites is 92 %. All metabolites were retained in the analysis. Four batches of 31-89 samples were employed, containing 2-6 QCs per batch, 1