Last active
May 10, 2023 21:22
Using GPT-4 to augment AnnotationHub resource metadata
I would like you to describe the UCSC genome browser table called "cytoBand". The first few lines of the table are here:
chrom chromStart chromEnd name gieStain
chr1 0 2300000 p36.33 gneg
chr1 2300000 5300000 p36.32 gpos25
chr1 5300000 7100000 p36.31 gneg
chr1 7100000 9100000 p36.23 gpos25
chr1 9100000 12500000 p36.22 gneg
chr1 12500000 15900000 p36.21 gpos50
chr1 15900000 20100000 p36.13 gneg
chr1 20100000 23600000 p36.12 gpos25
chr1 23600000 27600000 p36.11 gneg
chr1 27600000 29900000 p35.3 gpos25
Some additional metadata about the table are given here in json format.
{"title":"Chromosome Band","dataprovider":"UCSC","species":"Homo sapiens","taxonomyid":9606,"genome":"hg19","description":"GRanges object from UCSC track 'Chromosome Band'","coordinate_1_based":1,"maintainer":"Marc Carlson <mcarlson@fhcrc.org>","rdatadateadded":"2013-03-26","preparerclass":"UCSCFullTrackImportPreparer","tags":{"AH5012":["cytoBand","UCSC","track","Gene","Transcript","Annotation"]},"rdataclass":"GRanges","rdatapath":"goldenpath/hg19/database/cytoBand_0.0.1.RData","sourceurl":"rtracklayer://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/cytoBand","sourcetype":"UCSC track"}
Provide a general description of the table, the context of the UCSC genome browser, and a column description for each column. Output the results in json format and include the metadata already provided in addition to any other fields you'd like to add. If there are tags associated with the dataset, create a record for each tag that looks like {"tag": <tag_value>, "description": <free text description of tag that you supply>}.
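
The prompt above has a regular shape: a sentence naming the table, a preview of its first rows, the existing metadata as compact JSON, and a closing instruction. A minimal sketch of assembling such a prompt for an arbitrary table follows. The function name `build_prompt` and its parameters are hypothetical, chosen for illustration only; nothing here is taken from the gist author's actual code, and no model API is called.

```python
import json


def build_prompt(table_name, columns, rows, metadata, instructions):
    """Assemble a table-description prompt (hypothetical helper, not from the gist)."""
    # Preview block: header line followed by one line per row.
    preview = [" ".join(columns)]
    preview += [" ".join(str(value) for value in row) for row in rows]
    parts = [
        f'I would like you to describe the UCSC genome browser table called "{table_name}". '
        "The first few lines of the table are here:",
        *preview,
        "Some additional metadata about the table are given here in json format.",
        # Compact separators keep the metadata on a single line, as in the prompt above.
        json.dumps(metadata, separators=(",", ":")),
        instructions,
    ]
    return "\n".join(parts)


metadata = {"title": "Chromosome Band", "dataprovider": "UCSC", "genome": "hg19"}
prompt = build_prompt(
    "cytoBand",
    ["chrom", "chromStart", "chromEnd", "name", "gieStain"],
    [("chr1", 0, 2300000, "p36.33", "gneg"),
     ("chr1", 2300000, 5300000, "p36.32", "gpos25")],
    metadata,
    "Provide a general description of the table and a column description for each column.",
)
print(prompt)
```

Keeping the metadata on one line makes it straightforward to check later that the model echoed every provided field back unchanged.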
{
  "title": "Chromosome Band",
  "dataprovider": "UCSC",
  "species": "Homo sapiens",
  "taxonomyid": 9606,
  "genome": "hg19",
  "description": "The cytoBand table in the UCSC Genome Browser provides information on the cytogenetic bands of human chromosomes. These bands are visible under a microscope when the chromosomes are stained using specific techniques. The table contains data on the location and staining pattern of each band, which can be useful for identifying chromosomal abnormalities and mapping genes to specific regions.",
  "coordinate_1_based": 1,
  "maintainer": "Marc Carlson <mcarlson@fhcrc.org>",
  "rdatadateadded": "2013-03-26",
  "preparerclass": "UCSCFullTrackImportPreparer",
  "tags": {
    "AH5012": [
      {
        "tag": "cytoBand",
        "description": "The name of the table in the UCSC Genome Browser that contains information on the cytogenetic bands of human chromosomes."
      },
      {
        "tag": "UCSC",
        "description": "The University of California, Santa Cruz, which hosts the Genome Browser, a widely used web-based tool for visualizing and analyzing genomic data."
      },
      {
        "tag": "track",
        "description": "A visual representation of a specific type of genomic data in the UCSC Genome Browser, allowing users to view and explore the data in the context of other genomic features."
      },
      {
        "tag": "Gene",
        "description": "A segment of DNA that contains the information necessary to produce a functional product, usually a protein or RNA molecule."
      },
      {
        "tag": "Transcript",
        "description": "An RNA molecule that is produced from a gene and serves as a template for protein synthesis or performs other functions in the cell."
      },
      {
        "tag": "Annotation",
        "description": "Information added to genomic data to provide context and facilitate interpretation, such as gene names, functional descriptions, and sequence features."
      }
    ]
  },
  "rdataclass": "GRanges",
  "rdatapath": "goldenpath/hg19/database/cytoBand_0.0.1.RData",
  "sourceurl": "rtracklayer://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/cytoBand",
  "sourcetype": "UCSC track",
  "column_descriptions": {
    "chrom": "The chromosome on which the cytogenetic band is located.",
    "chromStart": "The starting base pair position of the cytogenetic band on the chromosome (0-based).",
    "chromEnd": "The ending base pair position of the cytogenetic band on the chromosome (non-inclusive).",
    "name": "The name of the cytogenetic band, which includes the chromosome arm (p or q) and the band number.",
    "gieStain": "The staining pattern of the cytogenetic band, which can be gneg (negative), gpos25 (lightly positive), gpos50 (moderately positive), gpos75 (strongly positive), or gpos100 (very strongly positive)."
  }
}
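
The column descriptions above describe chromStart as 0-based and chromEnd as non-inclusive, while the metadata sets coordinate_1_based to 1 for the stored GRanges object, which uses 1-based, closed intervals. The two conventions differ only in the start coordinate. A minimal sketch of the conversion follows, using rows from the table preview in the prompt; the function name `ucsc_to_one_based` is hypothetical and not part of any Bioconductor or UCSC API.

```python
def ucsc_to_one_based(chrom_start, chrom_end):
    """Convert a 0-based, half-open interval to a 1-based, closed interval.

    Hypothetical helper for illustration: only the start shifts by one,
    because the exclusive 0-based end equals the inclusive 1-based end.
    """
    return chrom_start + 1, chrom_end


# First rows of the cytoBand preview: (chrom, chromStart, chromEnd, name).
bands = [
    ("chr1", 0, 2300000, "p36.33"),
    ("chr1", 2300000, 5300000, "p36.32"),
    ("chr1", 5300000, 7100000, "p36.31"),
]

converted = []
for chrom, start, end, name in bands:
    one_start, one_end = ucsc_to_one_based(start, end)
    # The band width is the same under both conventions.
    assert one_end - one_start + 1 == end - start
    converted.append((chrom, one_start, one_end, name))
    print(chrom, one_start, one_end, name)
```

After conversion, adjacent bands no longer share a boundary coordinate: one band ends at position N and the next begins at N + 1.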