
@lindenb
Created March 21, 2018 17:39
Parsing DrugBank XML->TSV using XSLT. Keywords: java xslt jvarkit xsltstream drugbank

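Converting DrugBank to TSV using XSLT and xsltstream: http://lindenb.github.io/jvarkit/XsltStream.html

Example (output columns: name, groups, InChIKey, ChEMBL ID, PubChem Compound ID, PubChem Substance ID):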

java -jar dist/xsltstream.jar \
    -n '{http://www.drugbank.ca}drug' \
    -t drugbank2tsv.xsl \
    /path/to/full_database.xml
   
Lepirudin	approved		CHEMBL1201666		46507011
Cetuximab	approved		CHEMBL1201577		46507042
Dornase alfa	approved		CHEMBL1201431		46507792
Denileukin diftitox	approved->investigational		CHEMBL1201550		46506950
Etanercept	approved->investigational		CHEMBL1201572		46506732
Bivalirudin	approved->investigational	OIRCOABEOLEUMC-GEJPAHFPSA-N	CHEMBL2103749	16129704	46507415
Leuprolide	approved->investigational		CHEMBL1201199		46507635
Peginterferon alfa-2a	approved->investigational		CHEMBL1201560		46504860
Alteplase	approved		CHEMBL1201593		46507035
Sermorelin	approved->withdrawn				46507399
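
A quick sanity check on the output, sketched here as an assumption rather than as part of the original workflow: every row should have exactly six tab-separated columns, even when some are empty. The helper name `check_columns` is hypothetical, and the two sample rows are copied from the output above in place of the real file.

```shell
# Hypothetical helper: read TSV on stdin and print the number of rows
# whose tab-separated column count is not 6.
check_columns() {
    awk -F '\t' 'NF != 6 { bad++ } END { print bad + 0 }'
}

# Two rows from the sample output above, with empty columns preserved.
printf 'Lepirudin\tapproved\t\tCHEMBL1201666\t\t46507011\nSermorelin\tapproved->withdrawn\t\t\t\t46507399\n' | check_columns
```

On the real output, the same function could be fed the redirected result of the xsltstream command; a count other than 0 would suggest a missing separator in the stylesheet.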
<?xml version='1.0' encoding="UTF-8" ?>
<xsl:stylesheet xmlns:d="http://www.drugbank.ca" xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
<!-- One row per drug: name, groups, InChIKey, ChEMBL, PubChem Compound, PubChem Substance. -->
<!-- Separators are written inside xsl:text because whitespace-only text nodes are otherwise stripped from a stylesheet. -->
<xsl:output method="text"/>

<xsl:template match="d:drugbank">
  <xsl:apply-templates select="d:drug"/>
</xsl:template>

<xsl:template match="d:drug">
  <!-- column 1: drug name -->
  <xsl:value-of select="d:name/text()"/>
  <xsl:text>&#9;</xsl:text>
  <!-- column 2: groups, joined with an arrow -->
  <xsl:for-each select="d:groups/d:group">
    <xsl:if test='position()>1'>-&gt;</xsl:if>
    <xsl:value-of select="./text()"/>
  </xsl:for-each>
  <xsl:text>&#9;</xsl:text>
  <!-- column 3: InChIKey; multiple values are separated by a single space -->
  <xsl:for-each select="d:calculated-properties/d:property[d:kind/text()='InChIKey']/d:value">
    <xsl:if test='position()>1'><xsl:text> </xsl:text></xsl:if>
    <xsl:value-of select="./text()"/>
  </xsl:for-each>
  <xsl:text>&#9;</xsl:text>
  <!-- column 4: ChEMBL identifier -->
  <xsl:for-each select="d:external-identifiers/d:external-identifier[d:resource/text()='ChEMBL']/d:identifier">
    <xsl:if test='position()>1'><xsl:text> </xsl:text></xsl:if>
    <xsl:value-of select="./text()"/>
  </xsl:for-each>
  <xsl:text>&#9;</xsl:text>
  <!-- column 5: PubChem Compound identifier -->
  <xsl:for-each select="d:external-identifiers/d:external-identifier[d:resource/text()='PubChem Compound']/d:identifier">
    <xsl:if test='position()>1'><xsl:text> </xsl:text></xsl:if>
    <xsl:value-of select="./text()"/>
  </xsl:for-each>
  <xsl:text>&#9;</xsl:text>
  <!-- column 6: PubChem Substance identifier -->
  <xsl:for-each select="d:external-identifiers/d:external-identifier[d:resource/text()='PubChem Substance']/d:identifier">
    <xsl:if test='position()>1'><xsl:text> </xsl:text></xsl:if>
    <xsl:value-of select="./text()"/>
  </xsl:for-each>
  <!-- end of row -->
  <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>