Wayne's Bioinformatics Code Portal fomightez

## latexy.ipynb

      
fomightez / latexy.ipynb
Last active January 3, 2016 06:58 (1 file, 0 forks, 0 comments, 0 stars)

plotly IPython Notebook (originally from https://github.com/plotly/IPython-plotly/blob/master/Plotly%20gets%20LaTeXy.ipynb). I want it so I can paste Wikipedia formulas and have them rendered beautifully.
              
          
      
    
## test.ipynb

      
fomightez / test.ipynb
Last active January 3, 2016 06:59 (1 file, 0 forks, 0 comments, 0 stars)

test
              
          
      
    
## beautiful_soup_to_mine_example_links_from_concepts_page.py
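from bs4 import BeautifulSoup

# Local copy of the concepts page to mine for example links
file_name = "concepts.html"
# Prefix shared by the example URLs of interest
start_of_example_urls = "http://www.codeskulptor.org/#exampl"

# Name the parser explicitly and close the file once it has been read
with open(file_name) as html_file:
    soup = BeautifulSoup(html_file, "html.parser")

#print(soup.prettify())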

## get accession numbers regex.md

      
fomightez / get accession numbers regex.md
Last active August 29, 2015 14:07 (1 file, 0 forks, 0 comments, 0 stars)

Get a list of just the accession numbers from a list of FASTA sequence entries using regular expressions.
              
          
Step 1: eliminate all but the description line of FASTA entries

First, reduce the file to just the lines beginning with a greater-than sign (>), i.e., leave only the description lines. The pattern below matches every line that contains no > character. (From http://stackoverflow.com/questions/7310598/remove-all-lines-without-an-character-in-notepad)
FIND:
^[^>]*$

REPLACE:
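
The find/replace above, with an empty replacement, can be sketched in Python as follows. This is a minimal illustration rather than part of the original note: the function name and the sample records are made up, and the final filtering step is an added convenience because the substitution leaves empty lines behind.

```python
import re

# The FIND pattern from above; re.MULTILINE makes ^ and $ anchor at line boundaries.
NON_DESCRIPTION = re.compile(r"^[^>]*$", flags=re.MULTILINE)


def keep_description_lines(fasta_text):
    """Blank out text on lines lacking '>' and return the lines that remain."""
    blanked = NON_DESCRIPTION.sub("", fasta_text)
    # The substitution leaves empty lines behind, so drop them here.
    return [line for line in blanked.splitlines() if line]


# Made-up sample records, for illustration only
sample = ">seq1 example protein\nMKVL\nAAGT\n>seq2 another protein\nMTTP\n"
print(keep_description_lines(sample))
```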

  
## remove blank lines regex.md

      
fomightez / remove blank lines regex.md
Last active February 22, 2024 09:49 (1 file, 17 forks, 23 comments, 105 stars)

Remove all blank lines using regular expressions.
              
          
    REGEX remove blank lines:
FROM: http://www.ultraedit.com/support/tutorials_power_tips/ultraedit/remove_blank_lines.html

FIND:
^(?:[\t ]*(?:\r?\n|\r))+
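
A minimal sketch of the same pattern applied with Python's `re` module. It assumes the replacement is the empty string (the REPLACE value is not shown here), and the function name and sample text are invented for illustration.

```python
import re

# The FIND pattern from above: one or more consecutive lines holding only tabs/spaces.
BLANK_LINES = re.compile(r"^(?:[\t ]*(?:\r?\n|\r))+", flags=re.MULTILINE)


def remove_blank_lines(text):
    """Delete runs of empty or whitespace-only lines (assumed empty replacement)."""
    return BLANK_LINES.sub("", text)


# Made-up sample with both empty and whitespace-only lines
sample = "first\n\n\nsecond\n \t\nthird\n"
print(repr(remove_blank_lines(sample)))
```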


## genus_species_fasta regex.md

      
fomightez / genus_species_fasta regex.md
Last active June 5, 2018 17:28 (1 file, 1 fork, 0 comments, 0 stars)

Regex to make genus species as one word in fasta entries.
              
          
REGEX to make genus species as one word in modified fasta entries.

NOTE: These FASTA entries were first run through my namerv.1.py Python program, which puts the scientific name at the start in place of the many codes returned by BATCH ENTREZ.
FIND:
(>\w)\w+ (\w+)

REPLACE:
\1.\2
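
As a quick check of what this find/replace does, here is a small Python sketch. The helper name is hypothetical, and the header is one of the modified FASTA description lines quoted elsewhere on this page.

```python
import re

# FIND and REPLACE from above: keep '>' plus the genus initial, drop the rest
# of the genus, and join the species name with a period.
GENUS_SPECIES = re.compile(r"(>\w)\w+ (\w+)")


def abbreviate_genus(header):
    """Turn '>Genus species ...' into '>G.species ...'."""
    return GENUS_SPECIES.sub(r"\1.\2", header)


header = ">Pichia kudriavzevii |gi|695112010|gb|KGK38559.1|"
print(abbreviate_genus(header))
```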


## unique_id_for_fasta regex.md

      
fomightez / unique_id_for_fasta regex.md
Last active June 5, 2018 17:29 (1 file, 0 forks, 0 comments, 0 stars)

Regex to make unique id with genus species in modified fasta entries.
              
          
REGEX to make unique id with genus species in modified fasta entries.

NOTE: These FASTA entries were first run through my namerv.1.py Python program, which puts the scientific name at the start in place of the many codes returned by BATCH ENTREZ.
WAIT!!!! This didn't quite work. For example, it failed on every entry like '>Pichia kudriavzevii |gi|695112010|gb|KGK38559.1|' and '>Colletotrichum higginsianum |gi|380481846|emb|CCF41606.1|', where the text after the species name starts with '|', which \w does not match. NEEDS PERFECTING
FIND:
(>\w)\w+ (\w+) (\w+)
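
The failure noted above can be reproduced in Python. The second pattern below is only a hypothetical variant, not the author's fix: it assumes the unique id should come from the number following `|gi|`, which is a guess about the intended output.

```python
import re

# The FIND pattern from above: expects a third run of word characters after the species.
ORIGINAL = re.compile(r"(>\w)\w+ (\w+) (\w+)")

headers = [
    ">Pichia kudriavzevii |gi|695112010|gb|KGK38559.1|",
    ">Colletotrichum higginsianum |gi|380481846|emb|CCF41606.1|",
]

# '|' is not a word character, so the third group cannot match these headers.
original_hits = [ORIGINAL.search(h) for h in headers]
print(original_hits)

# HYPOTHETICAL variant (an assumption, not from the source): capture the gi number instead.
HYPOTHETICAL = re.compile(r"(>\w)\w+ (\w+) \|gi\|(\d+)")
hypothetical_groups = [HYPOTHETICAL.search(h).groups() for h in headers]
print(hypothetical_groups)
```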


## test_table.md

      
fomightez / test_table.md
Last active August 29, 2015 14:08 (1 file, 0 forks, 0 comments, 0 stars)

test_table
              
          
| Left-Aligned | Center Aligned | Right Aligned |
| :--- | :---: | ---: |
| col 3 is | some wordy text | $1600 |
| col 2 is | centered | $12 |
| zebra stripes | are neat | $1 |
| col 3 is | some wordy text | $1600 |
| col 2 is | centered | $12 |
| zebra stripes | are neat | $1 |
| col 3 is | some wordy text | $1600 |
| col 2 is | centered | $12 |

## using sed to do find replace in a file.md

      
fomightez / using sed to do find replace in a file.md
Last active December 7, 2017 20:33 (1 file, 0 forks, 0 comments, 0 stars)

Using sed to do find replace in a file.
              
          
adapted from here
I had a huge file I wanted to run a find-and-replace on, and it seemed to be making Sublime Text unresponsive, so I tried
sed 's/transmit 0.780000/transmit 0.97/g' <test_78.txt >test_97.txt

and it worked great.
The INPUT file is test_78.txt and OUTPUT file is test_97.txt.
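
For comparison, a rough Python sketch of the same streaming idea. The function name is hypothetical and the demo runs on a throwaway file. One difference from the sed command: `str.replace` treats the search text literally, whereas the `.` in the sed pattern matches any character.

```python
import os
import tempfile


def replace_in_file(src_path, dst_path, old, new):
    """Copy src_path to dst_path line by line, replacing each literal `old` with `new`."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:  # one line at a time, so a huge file never sits in memory
            dst.write(line.replace(old, new))


# Demo on a small temporary file with made-up contents
with tempfile.TemporaryDirectory() as tmp:
    input_path = os.path.join(tmp, "test_78.txt")
    output_path = os.path.join(tmp, "test_97.txt")
    with open(input_path, "w") as handle:
        handle.write("transmit 0.780000\nother line\n")
    replace_in_file(input_path, output_path, "transmit 0.780000", "transmit 0.97")
    with open(output_path) as handle:
        result = handle.read()
print(result)
```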

  
## ELink_keeping_order.py
from Bio import Entrez

print("\n\n\n\n\nSetting up .... \n")
Entrez.email = "A.N.Other@example.com"     # Always tell NCBI who you are
protein_gi_numbers = ["148908191", "297793721", "48525513", "507118461"]
print("protein_gi_numbers to get are " + str(protein_gi_numbers))
taxonomy_uids = []

# ELink step: look up the taxonomy records linked to each protein GI number
print("performing ELink step....\n")
handle = Entrez.elink(dbfrom="protein", db="taxonomy", id=protein_gi_numbers)