@lindenb
Created September 10, 2020 12:23
Question: is there any way to filter NCBI datasets by sample type?
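<?xml version='1.0' encoding="ISO-8859-1"?>
<!-- Reads a GEO series record in MINiML (XML) form and prints one shell command per sample. -->
<xsl:stylesheet
	xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
	xmlns:x="http://www.ncbi.nlm.nih.gov/geo/info/MINiML"
	version='1.0'
	>
<!-- The output is a plain-text shell script, so &amp; is written out as a bare ampersand. -->
<xsl:output method="text"/>

<!-- Select the GEO accession (GSM...) of every sample in the series. -->
<xsl:template match="/">
<xsl:apply-templates select="x:MINiML/x:Sample/x:Accession[@database='GEO']"/>
</xsl:template>

<!-- For each accession, emit a command that fetches the sample record as text, keeps the Sample_source_name line and prefixes it with the accession. -->
<xsl:template match="x:Accession">
wget -q -O - "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=<xsl:value-of select="text()"/>&amp;targ=self&amp;form=text&amp;view=quick" | grep Sample_source_name | sed 's%^%<xsl:value-of select="text()"/> %'
</xsl:template>

</xsl:stylesheet>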
$ wget -q -O - "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE19826&targ=self&form=xml&view=quick" |\
  xsltproc biostar460606.xsl - | bash
 
GSM495051 !Sample_source_name_ch1 = noncancer tissue
GSM495052 !Sample_source_name_ch1 = gastric cancer tissue
GSM495053 !Sample_source_name_ch1 = noncancer tissue
GSM495054 !Sample_source_name_ch1 = gastric cancer tissue
GSM495055 !Sample_source_name_ch1 = noncancer tissue
GSM495056 !Sample_source_name_ch1 = gastric cancer tissue
GSM495057 !Sample_source_name_ch1 = noncancer tissue
GSM495058 !Sample_source_name_ch1 = gastric cancer tissue
GSM495059 !Sample_source_name_ch1 = noncancer tissue
GSM495060 !Sample_source_name_ch1 = gastric cancer tissue
GSM495061 !Sample_source_name_ch1 = noncancer tissue
GSM495062 !Sample_source_name_ch1 = gastric cancer tissue
GSM495063 !Sample_source_name_ch1 = noncancer tissue
GSM495064 !Sample_source_name_ch1 = gastric cancer tissue
GSM495065 !Sample_source_name_ch1 = noncancer tissue
GSM495066 !Sample_source_name_ch1 = gastric cancer tissue
GSM495067 !Sample_source_name_ch1 = noncancer tissue
GSM495068 !Sample_source_name_ch1 = gastric cancer tissue
GSM495069 !Sample_source_name_ch1 = noncancer tissue
GSM495070 !Sample_source_name_ch1 = gastric cancer tissue
GSM495071 !Sample_source_name_ch1 = noncancer tissue
GSM495072 !Sample_source_name_ch1 = gastric cancer tissue
GSM495073 !Sample_source_name_ch1 = noncancer tissue
GSM495074 !Sample_source_name_ch1 = gastric cancer tissue
GSM495075 !Sample_source_name_ch1 = normal gastric tissue
GSM495076 !Sample_source_name_ch1 = normal gastric tissue
GSM495077 !Sample_source_name_ch1 = normal gastric tissue
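
The listing above only reports the source name of each sample; to actually filter by sample type, the lines can be post-processed. The following is a minimal sketch, not part of the original gist: `filter_by_source` is a hypothetical helper name, and it assumes every line has the exact `ACCESSION !Sample_source_name_ch1 = VALUE` layout shown above, with no trailing whitespace. It is run here on a three-line offline sample rather than on live GEO data.

```shell
# Hypothetical helper (an assumption, not in the original gist):
# print the accessions whose source name is exactly equal to "$1".
filter_by_source() {
  # Split each line on " = ": $1 is "GSM... !Sample_source_name_ch1",
  # $2 is the source name. Keep only the first word of $1.
  awk -v want="$1" -F ' = ' '$2 == want { split($1, f, " "); print f[1] }'
}

# Offline sample in the same format as the output above.
sample_output='GSM495051 !Sample_source_name_ch1 = noncancer tissue
GSM495052 !Sample_source_name_ch1 = gastric cancer tissue
GSM495075 !Sample_source_name_ch1 = normal gastric tissue'

printf '%s\n' "$sample_output" | filter_by_source 'gastric cancer tissue'
```

An exact comparison is used instead of `grep 'cancer tissue'` because that pattern would also match `noncancer tissue`. With real data, the function would be appended to the pipeline after `| bash`.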