@seandavi
Last active May 10, 2023 21:22
Using GPT-4 to augment AnnotationHub resource metadata
I would like you to describe the UCSC genome browser table called "cytoBand". The first few lines of the table are here:
chrom chromStart chromEnd name gieStain
chr1 0 2300000 p36.33 gneg
chr1 2300000 5300000 p36.32 gpos25
chr1 5300000 7100000 p36.31 gneg
chr1 7100000 9100000 p36.23 gpos25
chr1 9100000 12500000 p36.22 gneg
chr1 12500000 15900000 p36.21 gpos50
chr1 15900000 20100000 p36.13 gneg
chr1 20100000 23600000 p36.12 gpos25
chr1 23600000 27600000 p36.11 gneg
chr1 27600000 29900000 p35.3 gpos25
Some additional metadata about the table are given here in json format.
{"title":"Chromosome Band","dataprovider":"UCSC","species":"Homo sapiens","taxonomyid":9606,"genome":"hg19","description":"GRanges object from UCSC track 'Chromosome Band'","coordinate_1_based":1,"maintainer":"Marc Carlson <mcarlson@fhcrc.org>","rdatadateadded":"2013-03-26","preparerclass":"UCSCFullTrackImportPreparer","tags":{"AH5012":["cytoBand","UCSC","track","Gene","Transcript","Annotation"]},"rdataclass":"GRanges","rdatapath":"goldenpath/hg19/database/cytoBand_0.0.1.RData","sourceurl":"rtracklayer://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/cytoBand","sourcetype":"UCSC track"}
Provide a general description of the table, the context of the UCSC genome browser, and a column description for each column. Output the results in json format and include the metadata already provided in addition to any other fields you'd like to add. If there are tags associated with the dataset, create a record for each tag that looks like {"tag": <tag_value>, "description": <free text description of tag that you supply>}.
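The prompt above follows a fixed pattern: a table name, a preview of the first rows, the existing metadata as JSON, and the instructions. A minimal sketch of assembling it programmatically is shown here; `build_prompt` and `INSTRUCTIONS` are hypothetical names introduced for illustration, and the instruction text is condensed rather than copied in full.

```python
import json

# Instruction text condensed from the prompt above; the wording is illustrative.
INSTRUCTIONS = (
    "Provide a general description of the table, the context of the UCSC "
    "genome browser, and a column description for each column. "
    "Output the results in json format and include the metadata already provided."
)


def build_prompt(table_name, header, rows, metadata):
    """Assemble the prompt text (hypothetical helper, not from the gist)."""
    # Render the header and the preview rows as space-separated lines.
    preview = "\n".join(" ".join(str(v) for v in line) for line in [header, *rows])
    return "\n".join([
        f'I would like you to describe the UCSC genome browser table called "{table_name}". '
        "The first few lines of the table are here:",
        preview,
        "Some additional metadata about the table are given here in json format.",
        json.dumps(metadata, separators=(",", ":")),
        INSTRUCTIONS,
    ])


header = ["chrom", "chromStart", "chromEnd", "name", "gieStain"]
rows = [
    ["chr1", 0, 2300000, "p36.33", "gneg"],
    ["chr1", 2300000, 5300000, "p36.32", "gpos25"],
]
metadata = {"title": "Chromosome Band", "genome": "hg19", "sourcetype": "UCSC track"}

prompt = build_prompt("cytoBand", header, rows, metadata)
print(prompt)
```

Keeping the table preview and metadata as arguments would let the same template be reused for other AnnotationHub records, assuming each record can supply a header, a few rows, and its metadata.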
{
"title": "Chromosome Band",
"dataprovider": "UCSC",
"species": "Homo sapiens",
"taxonomyid": 9606,
"genome": "hg19",
"description": "The cytoBand table in the UCSC Genome Browser provides information on the cytogenetic bands of human chromosomes. These bands are visible under a microscope when the chromosomes are stained using specific techniques. The table contains data on the location and staining pattern of each band, which can be useful for identifying chromosomal abnormalities and mapping genes to specific regions.",
"coordinate_1_based": 1,
"maintainer": "Marc Carlson <mcarlson@fhcrc.org>",
"rdatadateadded": "2013-03-26",
"preparerclass": "UCSCFullTrackImportPreparer",
"tags": {
"AH5012": [
{
"tag": "cytoBand",
"description": "The name of the table in the UCSC Genome Browser that contains information on the cytogenetic bands of human chromosomes."
},
{
"tag": "UCSC",
"description": "The University of California, Santa Cruz, which hosts the Genome Browser, a widely used web-based tool for visualizing and analyzing genomic data."
},
{
"tag": "track",
"description": "A visual representation of a specific type of genomic data in the UCSC Genome Browser, allowing users to view and explore the data in the context of other genomic features."
},
{
"tag": "Gene",
"description": "A segment of DNA that contains the information necessary to produce a functional product, usually a protein or RNA molecule."
},
{
"tag": "Transcript",
"description": "An RNA molecule that is produced from a gene and serves as a template for protein synthesis or performs other functions in the cell."
},
{
"tag": "Annotation",
"description": "Information added to genomic data to provide context and facilitate interpretation, such as gene names, functional descriptions, and sequence features."
}
]
},
"rdataclass": "GRanges",
"rdatapath": "goldenpath/hg19/database/cytoBand_0.0.1.RData",
"sourceurl": "rtracklayer://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/cytoBand",
"sourcetype": "UCSC track",
"column_descriptions": {
"chrom": "The chromosome on which the cytogenetic band is located.",
"chromStart": "The starting base pair position of the cytogenetic band on the chromosome (0-based).",
"chromEnd": "The ending base pair position of the cytogenetic band on the chromosome (non-inclusive).",
"name": "The name of the cytogenetic band, which includes the chromosome arm (p or q) and the band number.",
"gieStain": "The Giemsa staining pattern of the cytogenetic band: gneg (negative), gpos25, gpos50, gpos75, or gpos100 (positive, with increasing stain intensity), or one of the special values acen (centromere), gvar (variable region), and stalk (stalk of an acrocentric short arm)."
}
}
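Because the reply is free-form model output, it may be worth checking it against the original record before storing it. The sketch below is one possible check, not part of the gist: `check_response` is a hypothetical helper that assumes the reply is a JSON object whose `tags` value is keyed by resource ID, as in the reply above. The example data is made up to trigger each kind of problem.

```python
import json


def check_response(original, response_text, columns):
    """Return a list of problems found in the model's JSON reply (hypothetical helper)."""
    response = json.loads(response_text)
    problems = []

    # Every original field except "tags" should come back unchanged.
    for key, value in original.items():
        if key == "tags":
            continue  # tags are deliberately reshaped into records
        if response.get(key) != value:
            problems.append(f"field changed or missing: {key}")

    # Every table column should have a description.
    described = response.get("column_descriptions", {})
    for col in columns:
        if col not in described:
            problems.append(f"no description for column: {col}")

    # Every original tag should appear as a {"tag": ..., "description": ...} record.
    for ah_id, tags in original.get("tags", {}).items():
        records = response.get("tags", {}).get(ah_id, [])
        seen = {r.get("tag") for r in records if isinstance(r, dict)}
        for tag in tags:
            if tag not in seen:
                problems.append(f"no record for tag: {tag}")

    return problems


# Illustrative data only: a reply that alters one field and omits a column and a tag.
original = {
    "title": "Chromosome Band",
    "maintainer": "Marc Carlson <mcarlson@fhcrc.org>",
    "tags": {"AH5012": ["cytoBand", "UCSC"]},
}
reply = json.dumps({
    "title": "Chromosome Band",
    "maintainer": "Marc Carlson mcarlson@fhcrc.org",
    "tags": {"AH5012": [{"tag": "cytoBand", "description": "Table name."}]},
    "column_descriptions": {"chrom": "Chromosome name."},
})

problems = check_response(original, reply, ["chrom", "chromStart"])
for p in problems:
    print(p)
```

A strict equality check on the original fields is a deliberate choice here: it flags even small alterations, such as punctuation dropped from a field value, which are easy to miss by eye.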