@cschin
Last active December 21, 2015 19:59
Some details on making a good visualization of the overlap data from Celera Assembler
(1) If we use Celera Assembler's bogart unitigger, it generates a file called "bests.edges" in "4-unitigger", one of its output directories.
(2) I wrote a simple script that converts the edge list to a GML file. The script can be downloaded from https://github.com/PacificBiosciences/HBAR-DTK/blob/master/src/CA_best_edge_to_GML.py
(3) Load the graph into Gephi (https://gephi.org/).
(4) I typically apply the following sequence of layout algorithms in Gephi to get a good layout:
1) "YifanHu's Multilevel" to get a rough layout. The output usually captures the large-scale structure well and untangles the graph reasonably well, so I can start to see features of the assembly overlap graph (or the string graph).
2) "ForceAtlas 2" to smooth the paths in the graph. It is a physics-based layout algorithm. If you tune the "Gravity" and "Repulsion" parameters right, you can get the space-filling-curve-like layout that I showed you yesterday.
3) The "ForceAtlas 2" layout algorithm has a tendency to collapse the bubbles, so I then use "Yifan Hu Proportional" to "open them up".
A lot of this requires some trial and error. Some knowledge from my physics education and a rough understanding of the layout algorithms help.
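
For readers who want to see what the conversion in step (2) roughly involves, here is a minimal Python sketch. It is not the CA_best_edge_to_GML.py script linked above. The column layout is an assumption made for illustration: whitespace-separated rows with the read ID in column 0 and the two best-overlap partner IDs in columns 2 and 4, with "0" meaning "no best overlap". The function name best_edges_to_gml and its parameters are hypothetical, so check the columns against your own file before relying on it.

```python
def best_edges_to_gml(lines, id_col=0, target_cols=(2, 4), no_edge="0"):
    """Turn whitespace-separated best-edge rows into a GML string.

    Assumed (hypothetical) layout: column `id_col` holds the read ID and
    each column in `target_cols` holds the ID of a best-overlap partner;
    the value `no_edge` marks a missing overlap. Adjust to your file.
    """
    nodes = {}   # read ID (string) -> integer GML node id
    edges = []   # list of (source node id, target node id)

    def node_id(name):
        # Assign consecutive integer ids in order of first appearance.
        if name not in nodes:
            nodes[name] = len(nodes)
        return nodes[name]

    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        cols = line.split()
        src = cols[id_col]
        for c in target_cols:
            tgt = cols[c]
            if tgt == no_edge:
                continue
            edges.append((node_id(src), node_id(tgt)))

    out = ["graph [", "  directed 1"]
    for name, i in nodes.items():
        out.append(f'  node [ id {i} label "{name}" ]')
    for s, t in edges:
        out.append(f"  edge [ source {s} target {t} ]")
    out.append("]")
    return "\n".join(out) + "\n"


# Tiny made-up example; real input would come from the unitigger output.
sample = [
    "#fragId libId best5 end best3 end",
    "1 1 2 3' 3 5'",
    "2 1 0 5' 1 3'",
]
gml = best_edges_to_gml(sample)
```

Writing the returned string to a file ending in .gml gives something Gephi can open directly for step (3). The GML is written by hand here only to keep the sketch free of dependencies; a graph library would work just as well.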