REGEX remove blank lines:
FROM: http://www.ultraedit.com/support/tutorials_power_tips/ultraedit/remove_blank_lines.html
FIND:
^(?:[\t ]*(?:\r?\n|\r))+
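The same pattern can be applied outside the editor. The following is a minimal sketch in Python, assuming the whole text is already in memory as one string; the helper name `remove_blank_lines` and the sample text are made up for illustration. `re.MULTILINE` is needed so that `^` anchors at the start of every line rather than only at the start of the string.

```python
import re

# Same pattern as in FIND above. With re.MULTILINE, ^ matches at the start
# of the string and right after every "\n".
BLANK_LINES = re.compile(r"^(?:[\t ]*(?:\r?\n|\r))+", re.MULTILINE)


def remove_blank_lines(text):
    """Return text with empty and whitespace-only lines deleted."""
    return BLANK_LINES.sub("", text)


# Hypothetical sample mixing Unix and Windows line endings.
sample = "first\n\n  \t\nsecond\r\n\r\nthird\n"
cleaned = remove_blank_lines(sample)
print(repr(cleaned))
```

Lines that hold real content keep their original line endings; only the runs of blank lines between them are dropped.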
# List unique values in a DataFrame column
df['Column Name'].unique()
# To extract a specific column (subset the dataframe), you can use [ ] (brackets) or attribute notation.
df.height
df['height']
# These two are the same thing (from http://www.stephaniehicks.com/learnPython/pages/pandas.html
# -or-
# http://www.datacarpentry.org/python-ecology-lesson/02-index-slice-subset/)
# Attribute notation only works when the column name is a valid Python identifier, so 'Column Name' needs brackets.
# Useful examples for converting command-line commands from Jupyter/IPython back to pure Python.
# This is partly for when you need to speed up a running `.ipy` script. It will run much faster as `.py` than as `.ipy` if there
# are a lot of calls to command-line / shell commands, because it saves time by not spawning a new shell instance for
# each. (The `.ipy` version is great for quicker development and prototyping, but `.py` is MUCH FASTER for running.)
# The Python versions also have the advantage that you can use them inside functions (I think), because they don't have the problem
# seen with `!cp fn unsanitized_{fn}` or `%store`, which actually run in the global namespace and so cannot see a Python variable `fn`
# local to the function.
# RELATED NOTE: You can use the IPython `history` (via the "hist command (with -n to remove line numbers)") to
# help convert `.ipy` code or Jupyter code with exclamation marks and shell commands BACK TO PYTHON, see
# https://stackoverflow.com/a/1040640/8508004 (especially also see the comment by Mic
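As an illustration of the conversion described above, here is a hedged sketch of how the `!cp fn unsanitized_{fn}` line might look in pure Python, plus a generic replacement for a `!command` call. The function names `make_unsanitized_copy` and `run_command`, and the demo file, are hypothetical and not from the original notes. It assumes Python 3.5 or later for `subprocess.run`.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def make_unsanitized_copy(fn):
    """Pure-Python stand-in for the IPython line `!cp {fn} unsanitized_{fn}`.

    `fn` is an ordinary local variable here, so this works inside functions.
    """
    directory, name = os.path.split(fn)
    target = os.path.join(directory, "unsanitized_" + name)
    shutil.copy(fn, target)  # copies in-process; no shell is started
    return target


def run_command(args):
    """Stand-in for `out = !some_command`: run a program without a shell
    and return its stdout as text."""
    result = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


# Demo in a throwaway directory (file name is made up for illustration).
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "example.txt")
with open(source, "w") as handle:
    handle.write("demo\n")

copied = make_unsanitized_copy(source)
echoed = run_command([sys.executable, "-c", "print('hello')"])
print(copied, echoed.strip())
```

Where a standard-library call such as `shutil.copy` exists, it avoids starting any external process at all; `subprocess.run` with an argument list still starts the program but skips the intermediate shell.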
ORDERNUMBER,Quantity Ordered,Price Each,ORDERLINENUMBER,SALES,"ORDERDATE","STATUS",QTR_ID,Month,YEAR_ID,"Product",MSRP,"PRODUCTCODE","CUSTOMERNAME","PHONE","ADDRESSLINE1","ADDRESSLINE2","City","STATE","POSTALCODE","COUNTRY","TERRITORY","CONTACTLASTNAME","CONTACTFIRSTNAME"
10107,30,95.7,2,2871,"2/24/2003 0:00","Shipped",1,2,2003,"Motorcycles",95,"S10_1678","Land of Toys Inc.","2125557818","897 Long Airport Avenue",,"NYC","NY","10022","United States","NA","Yu","Kwai"
10121,34,81.35,5,2765.9,"5/7/2003 0:00","Shipped",2,5,2003,"Motorcycles",95,"S10_1678","Reims Collectables","26.47.1555","59 rue de l'Abbaye",,"Reims",,"51100","France","EMEA","Henriot","Paul"
10134,41,94.74,2,3884.34,"7/1/2003 0:00","Shipped",3,7,2003,"Motorcycles",95,"S10_1678","Lyon Souveniers","+33 1 46 62 7555","27 rue du Colonel Pierre Avia",,"Paris",,"75508","France","EMEA","Da Cunha","Daniel"
10145,45,83.26,6,3746.7,"8/25/2003 0:00","Shipped",3,8,2003,"Motorcycles",95,"S10_1678","Toys4GrownUps.com","6265557265","78934 Hillside Dr.",,"Pasade
# These are meant to work in both Python 2 and 3, except where noted.
# See my useful_pandas_snippets.py for those related to dataframes (such as pickling/`df.to_pickle(save_as)`)
# https://gist.github.com/fomightez/ef57387b5d23106fabd4e02dab6819b4
# also see https://gist.github.com/fomightez/324b7446dc08e56c83fa2d7af2b89a33 for examples of my
# frequently used Python functions and slight variations for more expanded, modular structures.
# argparse
# good snippet collection at https://mkaz.tech/code/python-argparse-cookbook/
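For quick reference alongside that collection, a minimal argparse sketch follows. The script description, argument names and the sample argument list are all hypothetical. It uses only long-standing `argparse` calls, so it should run under both Python 2.7 and 3.

```python
import argparse


def build_parser():
    """Build a small parser with one positional argument and two options."""
    parser = argparse.ArgumentParser(description="Hypothetical example script.")
    parser.add_argument("input_file", help="path to the file to process")
    parser.add_argument("-o", "--output", default="out.txt",
                        help="where to write results (default: %(default)s)")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print progress messages")
    return parser


# Passing a list makes the example reproducible; a real script would call
# parse_args() with no arguments so that sys.argv is used instead.
args = build_parser().parse_args(["data.csv", "--verbose"])
print(args.input_file, args.output, args.verbose)
```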
# Use `%%capture` to hush 'noisy' stdout and stderr streams, while still getting the `%%time` report afterwards
%%capture out_stream
%%time
---rest of a cell that does something with LOTS of output--
# In the next cell, put the following to show the time it took to run the cell above:
for x in out_stream.stdout.split("\n")[-3:]:
    print(x)
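Outside of Jupyter, roughly the same effect can be had in plain Python. This is only a sketch under stated assumptions: it requires Python 3.4+ for `contextlib.redirect_stdout`, it captures stdout but not stderr, and `run_quietly` and `noisy_task` are made-up names for illustration. It reports wall-clock time only, not the CPU times that `%%time` also prints.

```python
import contextlib
import io
import time


def run_quietly(func, *args, **kwargs):
    """Call func with stdout captured; return (result, captured_text, seconds)."""
    buffer = io.StringIO()
    start = time.perf_counter()
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, buffer.getvalue(), elapsed


def noisy_task(n):
    """Hypothetical stand-in for code that prints LOTS of output."""
    for i in range(n):
        print("line", i)
    return n


result, captured, elapsed = run_quietly(noisy_task, 1000)
print("finished in {:.4f} s with {} captured lines".format(
    elapsed, len(captured.splitlines())))
```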