Skip to content

Instantly share code, notes, and snippets.

@rmzelle
Last active August 29, 2015 14:13
Show Gist options
  • Save rmzelle/22fc9c537ddf8f40de4b to your computer and use it in GitHub Desktop.
Save rmzelle/22fc9c537ddf8f40de4b to your computer and use it in GitHub Desktop.
##gff-version 3
scaffold_28 prediction gene 1 402 0 + . ID=545184;Name=545184
scaffold_28 prediction gene 805 981 0 - . ID=93782;Name=93782
scaffold_28 prediction gene 2030 2721 0 + . ID=545205;Name=545205
scaffold_28 prediction gene 3273 3545 0 - . Name=YOL159C-A;Synteny=no_synteny;SystematicGeneName=YOL159C-A;ID=38792
scaffold_28 prediction gene 5318 5833 0 - . Name=YOL159C;Synteny=no_synteny;SystematicGeneName=YOL159C;ID=38793
scaffold_28 prediction gene 6780 8600 0 - . Name=ENB1;Synteny=no_synteny;SystematicGeneName=YOL158C;StandardGeneName=ENB1;ID=38794
scaffold_28 prediction gene 9698 11467 0 - . Name=IMA4;Synteny=no_synteny;SystematicGeneName=YJL221C;StandardGeneName=IMA4;ID=38795
import gffutils
import os
def generatorShiftFeatures(features, shift):
for feature in features:
print("original position: " + str(feature.start) + " " + str(feature.end))
feature.start += shift
feature.end += shift
print("new position: " + str(feature.start) + " " + str(feature.end))
yield feature
def updateGenome():
genomeGFF="genome-minimal-fail.gff"
genomeGFFdb = genomeGFF.replace(".gff", ".db")
#create database for genome
db = gffutils.create_db(genomeGFF, dbfn=genomeGFFdb, force=True, merge_strategy="merge")
#shift coordinates of features that follow the original integration site by deletion size
downstreamGenomeFeatures = list(db.region('scaffold_28:6001-11467', completely_within=True))
db.delete(db.region('scaffold_28:6001-11467', completely_within=True))
db.update(generatorShiftFeatures(downstreamGenomeFeatures,10000), merge_strategy="replace")
allFeatures = db.all_features()
for feature in allFeatures:
print(feature.start)
return
updateGenome()
@rmzelle
Copy link
Author

rmzelle commented Jan 22, 2015

Edit: this comment relates to the initial version of the gist at https://gist.github.com/rmzelle/22fc9c537ddf8f40de4b/fb3c0e885cbdaac4189155893836d0046300ef46

Gives output:

original position: 6780 8600
new position: 16780 18600
original position: 9698 11467
new position: 19698 21467
1
805
2030
3273
5318
6780
9698

The last two values should be "16790" and "19698", but they're still at their original values.

@rmzelle
Copy link
Author

rmzelle commented Jan 23, 2015

Edit: this comment relates to the second version of the gist at https://gist.github.com/rmzelle/22fc9c537ddf8f40de4b/0813fd6c8594647b5e934c84621475a4f7166998
(see the diff at https://gist.github.com/rmzelle/22fc9c537ddf8f40de4b/revisions)

@daler, I was able to fix this by creating a list with all the features I wanted to shift, and then deleting the original features from the database. See https://gist.github.com/rmzelle/22fc9c537ddf8f40de4b/revisions

This now gives the desired output:

original position: 6780 8600
new position: 16780 18600
original position: 9698 11467
new position: 19698 21467
1
805
2030
3273
5318
16780
19698

It seems to me that this shouldn't be necessary, though. It looks like db.update() with merge_strategy="replace" doesn't update the feature coordinates when they change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment