Bryan Smith BSCowboy

## regex.txt
Perl and PHP Regular Expressions

PHP's regex functions are built on PCRE (Perl-Compatible Regular Expressions), so a regex that works in one should generally work in the other, or in any other language that uses the PCRE format. Here are some commonly needed regular expressions for both PHP and Perl. Each regex is given as a string and includes its delimiters.

All Major Credit Cards

This regular expression validates the prefix and length of numbers from all major credit cards: American Express (Amex), Discover, Mastercard, and Visa.

    //All major credit cards regex
    '/^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|6011[0-9]{12}|622((12[6-9]|1[3-9][0-9])|([2-8][0-9][0-9])|(9(([0-1][0-9])|(2[0-5]))))[0-9]{10}|64[4-9][0-9]{13}|65[0-9]{14}|3(?:0[0-5]|[68][0-9])[0-9]{11}|3[47][0-9]{13})$/'
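
As a rough illustration, the same pattern can be tried out in Python, whose `re` module accepts this syntax. This is only a sketch under two assumptions: the PHP `/` delimiters are stripped, and the group is matched exactly once (no trailing `*`), so that an empty string or several concatenated numbers are rejected. The sample numbers are the widely published test card numbers, and the name `CARD_PATTERN` is made up for this example. A match only means the prefix and length look right; it says nothing about the checksum or whether the card exists.

```python
import re

# The pattern from above, without the PHP delimiters and without the
# trailing "*" (assumption: exactly one card number per string).
CARD_PATTERN = re.compile(
    r"^(?:4[0-9]{12}(?:[0-9]{3})?"
    r"|5[1-5][0-9]{14}"
    r"|6011[0-9]{12}"
    r"|622((12[6-9]|1[3-9][0-9])|([2-8][0-9][0-9])|(9(([0-1][0-9])|(2[0-5]))))[0-9]{10}"
    r"|64[4-9][0-9]{13}"
    r"|65[0-9]{14}"
    r"|3(?:0[0-5]|[68][0-9])[0-9]{11}"
    r"|3[47][0-9]{13})$"
)


def looks_like_card_number(text):
    """Return True if text has the prefix and length of a supported card."""
    return CARD_PATTERN.match(text) is not None


samples = [
    "4111111111111111",  # Visa-style test number
    "5555555555554444",  # Mastercard-style test number
    "378282246310005",   # Amex-style test number
    "6011111111111117",  # Discover-style test number
    "1234567890123456",  # unknown prefix
    "",                  # empty input
]
for sample in samples:
    print(repr(sample), looks_like_card_number(sample))
```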

## jinja2_file_less.py
#!/usr/bin/env python
#
# More of a reference for using jinja2 without actual template files.
# This is great for a simple output transformation to standard out.
#
# Of course you will need to "sudo pip install jinja2" first!
#
# I like to refer to the following to remember how to use jinja2 :)
# http://jinja.pocoo.org/docs/templates/
#
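
The rest of this script is not shown here, so the following is only a minimal sketch of the idea described in the comments: build a template from a string with `jinja2.Template` and print the rendered result to standard out, with no template file involved. The template text and the `hosts` data are invented for the example.

```python
from jinja2 import Template

# The template lives in a plain Python string instead of a file on disk.
HOST_TEMPLATE = Template(
    "{% for host in hosts %}{{ host.name }} -> {{ host.ip }}\n{% endfor %}"
)

# Hypothetical input data, only here to have something to render.
hosts = [
    {"name": "web1", "ip": "10.0.0.1"},
    {"name": "web2", "ip": "10.0.0.2"},
]

output = HOST_TEMPLATE.render(hosts=hosts)
print(output, end="")
```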

## tree.md

      
hrldcpr / tree.md (last active July 31, 2024)

One-line Tree in Python

Using Python's built-in defaultdict we can easily define a tree data structure:

    from collections import defaultdict

    def tree(): return defaultdict(tree)

That's it!
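
To see why this works, here is a small sketch: every missing key is filled in with another `tree()`, so nested levels appear as soon as they are touched. The `taxonomy` data and the helper `to_plain_dict` are made up for this illustration.

```python
from collections import defaultdict
import json


def tree():
    """Return a dict whose missing keys are created as nested trees."""
    return defaultdict(tree)


taxonomy = tree()
# Intermediate levels are created automatically on assignment.
taxonomy["Animalia"]["Chordata"]["Mammalia"]["Carnivora"] = "cats and dogs"
# Merely reading a key also creates an (empty) subtree.
taxonomy["Plantae"]["Solanales"]


def to_plain_dict(node):
    """Recursively convert a tree into regular dicts, e.g. for printing."""
    if isinstance(node, defaultdict):
        return {key: to_plain_dict(value) for key, value in node.items()}
    return node


print(json.dumps(to_plain_dict(taxonomy), indent=2))
```

One thing to keep in mind with this design is that lookups have side effects: checking `taxonomy["Fungi"]` adds a `"Fungi"` key, so use `"Fungi" in taxonomy` when you only want to test for membership.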

  
## GBT_CaliforniaHousing.py
# =============
# Introduction
# =============
# I've been doing some data mining lately and especially looking into `Gradient
# Boosting Trees <http://en.wikipedia.org/wiki/Gradient_boosting>`_ since it is
# claimed that this is one of the techniques with best performance out of the
# box.  In order to have a better understanding of the technique I've reproduced
# the example of section *10.14.1 California Housing* in the book `The Elements of Statistical Learning <http://www-stat.stanford.edu/~tibs/ElemStatLearn/>`_.
# Each point of this dataset represents the house value of a property with some
# attributes of that house. You can get the data and the description of those

## xsections.ipynb

      
pmarshwx / xsections.ipynb (created March 7, 2013)

Create cross sections in Python

## useful_pandas_snippets.md

      
bsweger / useful_pandas_snippets.md (last active April 19, 2024)

Useful Pandas Snippets

A personal diary of DataFrame munging over the years.

Data Types and Conversion

Convert Series datatype to numeric (will error if column has non-numeric values)

(h/t @makmanalp)
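
The snippet that belongs to this entry is not shown here. As a sketch, assuming `pd.to_numeric` is the conversion meant, the behavior described above looks like this. The sample Series are invented, and the `errors="coerce"` variant is an extra shown for contrast, not something this entry claims.

```python
import pandas as pd

# A column of numeric strings converts cleanly.
prices = pd.Series(["1", "2.5", "3"])
as_numbers = pd.to_numeric(prices)
print(as_numbers.dtype)

# A non-numeric value makes the default conversion raise an error.
messy = pd.Series(["1", "n/a", "3"])
conversion_failed = False
try:
    pd.to_numeric(messy)
except ValueError as exc:
    conversion_failed = True
    print("conversion failed:", exc)

# Extra, for contrast: turn unparseable values into NaN instead of raising.
coerced = pd.to_numeric(messy, errors="coerce")
print(coerced)
```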

  
## search.py
#!/usr/bin/python

import sys
import re
import slate
import pickle
import nltk
import glob
import os

## distcorr.py
from scipy.spatial.distance import pdist, squareform
import numpy as np
import copy


def distcorr(Xval, Yval, pval=True, nruns=500):
    """ Compute the distance correlation function, returning the p-value.
    Based on Satra/distcorr.py (gist aa3d19a12b74e9ab7941)

    >>> a = [1,2,3,4,5]
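
The function body is cut off here, so what follows is not the gist's code. It is a minimal sketch of the standard sample distance correlation (double-centered pairwise distance matrices), plus a permutation test as one plausible way to obtain a p-value. The names `distance_correlation` and `permutation_pvalue`, the `seed` parameter, and the choice of a permutation test are assumptions made for illustration.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def _centered_distances(values):
    """Pairwise Euclidean distance matrix with row and column means removed."""
    points = np.asarray(values, dtype=float)
    if points.ndim == 1:
        points = points[:, None]  # treat a flat list as n one-dimensional points
    d = squareform(pdist(points))
    return d - d.mean(axis=0)[None, :] - d.mean(axis=1)[:, None] + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation between two equally long samples."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    if a.shape != b.shape:
        raise ValueError("x and y must contain the same number of samples")
    dcov2_xy = (a * b).mean()
    dcov2_xx = (a * a).mean()
    dcov2_yy = (b * b).mean()
    denom = np.sqrt(dcov2_xx * dcov2_yy)
    if denom == 0:
        return 0.0  # at least one sample is constant
    return float(np.sqrt(max(dcov2_xy, 0.0) / denom))


def permutation_pvalue(x, y, n_runs=500, seed=0):
    """Share of shuffled pairings whose statistic reaches the observed one."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    observed = distance_correlation(x, y)
    exceed = sum(
        distance_correlation(x, rng.permutation(y)) >= observed
        for _ in range(n_runs)
    )
    return observed, exceed / n_runs


a = [1, 2, 3, 4, 5]
b = [2, 4, 6, 8, 10]
statistic, p_value = permutation_pvalue(a, b, n_runs=200)
print(statistic, p_value)
```

For an exactly linear relationship such as `b = 2 * a`, the statistic equals 1 up to floating-point error.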

## separator.py
def splitDataFrameList(df, target_column, separator):
    ''' df = dataframe to split,
    target_column = the column containing the values to split
    separator = the symbol used to perform the split

    returns: a dataframe with each entry for the target column separated,
    with each element moved into a new row. The values in the other columns
    are duplicated across the newly divided rows.
    '''
    def splitListToRows(row, row_accumulator, target_column, separator):
        split_row = row[target_column].split(separator)
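
The rest of the function is cut off here. For comparison, the behavior described in the docstring can also be sketched with a different technique: pandas' built-in `Series.str.split` followed by `DataFrame.explode`, rather than the row accumulator this gist uses. The function name `split_rows` and the sample data are invented for this example.

```python
import pandas as pd


def split_rows(df, target_column, separator):
    """Give each separated value in target_column its own row.

    Values in the other columns are repeated on every new row.
    """
    split_values = df[target_column].str.split(separator)
    return (
        df.assign(**{target_column: split_values})
        .explode(target_column)
        .reset_index(drop=True)
    )


people = pd.DataFrame({"name": ["Ann", "Bob"], "langs": ["py,r", "go"]})
long_form = split_rows(people, "langs", ",")
print(long_form)
```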

## metutils.py
#!/usr/bin/env python
"""
    Tropical Cyclone Risk Model (TCRM) - Version 1.0 (beta release)
    Copyright (C) 2011  Geoscience Australia

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.