
@jobel-code
jobel-code / Light data with Google Science Journal.ipynb
Created February 25, 2019 13:18 — forked from fperez/Light data with Google Science Journal.ipynb
Real-world data in a few minutes with Google Science Journal
@jobel-code
jobel-code / ProgrammaticNotebook.ipynb
Created February 21, 2019 11:09 — forked from fperez/ProgrammaticNotebook.ipynb
Creating an IPython Notebook programmatically
@jobel-code
jobel-code / README.md
Created February 21, 2019 11:09 — forked from fperez/README.md
Polyglot Data Science with IPython

Polyglot Data Science with IPython & friends

Author: Fernando Pérez.

A demonstration of how to use Python, Julia, Fortran and R cooperatively to analyze data, in the same process.

This is supported by the IPython kernel and a few extensions that take advantage of IPython's magic system to provide low-level integration between Python and other languages.

See the companion notebook for data preparation and setup.

@jobel-code
jobel-code / safe_unicode_to_ascii.py
Created January 30, 2019 12:38
Replaces Unicode characters with their ASCII equivalents (dropping those that have none). Useful for safe filenames and for dictionary keys that will be serialized as JSON
import unicodedata

def safe_unicode_to_ascii(s: str) -> str:
    """Replace Unicode characters with ASCII equivalents; drop those without one.
    :param s: the string to clean
    :return: an ASCII-only string with separators and punctuation replaced
    """
    # source: https://stackoverflow.com/questions/1207457/convert-a-unicode-string-to-a-string-in-python-containing-extra-symbols
    # NFKD splits accented characters into base letter + combining mark; the marks are then dropped by 'ignore'
    normalized = unicodedata.normalize('NFKD', s.strip().replace('/', '-')).encode('ascii', 'ignore')
    return normalized.decode('ascii').replace(' ', '_').replace(':', '__').replace('?', '').replace('!', '').replace('&', '_')
@jobel-code
jobel-code / gist_regex_column_names_text_match.py
Created January 24, 2019 12:48
Use a regex to find all columns in a pandas DataFrame whose names contain matching text
import re
r = re.compile("depth", re.IGNORECASE)
# Find all the column names that contain "depth" anywhere in the string (case insensitive).
# subset_df is an existing pandas DataFrame.
depth_cols = sorted(filter(r.search, subset_df.columns))
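The snippet above depends on an existing `subset_df`. A minimal self-contained sketch of the same idea, using a plain list of column names in place of `DataFrame.columns` (the helper name `columns_matching` and the sample column names are made up for illustration):

```python
import re


def columns_matching(columns, pattern):
    """Return the sorted column names that contain `pattern`, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    # str() guards against non-string column labels such as integers
    return sorted(col for col in columns if regex.search(str(col)))


# Hypothetical column names; with pandas you would pass subset_df.columns instead
columns = ["Station", "Depth_m", "max_depth", "Temperature", "DEPTH_QC"]
depth_cols = columns_matching(columns, "depth")
```

Note that `sorted` orders uppercase letters before lowercase ones, so the result is not case-insensitively sorted.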
@jobel-code
jobel-code / gist_aggregate_multiple_tsv_files_into_one.bash
Created January 23, 2019 09:10
Aggregate text files with same header into single file
# QUICK AGGREGATION (DEBUG ONLY): THIS KEEPS THE HEADER LINE OF EVERY FILE IN THE OUTPUT
# cat *.tsv > aggregated_files_with_headers.csv
# FOR THE FINAL AGGREGATION, RUN IN THE CONSOLE
# SOURCE: https://unix.stackexchange.com/questions/60577/concatenate-multiple-files-with-same-header
# The first line of the awk script matches the first line of a file (FNR==1)
# except if it's also the first line across all files (NR==1).
# When these conditions are met, the expression while (/^<header>/) getline; is executed,
# which causes awk to keep reading another line (skipping the current one) as long as
# the current one matches the regexp ^<header>.
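For comparison, the same header-once aggregation can be sketched in Python. This is not the awk command the comments describe, only a rough equivalent under stated assumptions: the function name `concat_with_single_header` is hypothetical, the files are assumed to be UTF-8 text, and a file whose first line differs from the first header raises an error instead of being silently merged.

```python
def concat_with_single_header(paths, out_path):
    """Concatenate text files sharing one header line; write the header once.

    Returns the number of data lines written.
    """
    header = None
    rows = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for path in paths:
            with open(path, encoding="utf-8") as src:
                first = src.readline().rstrip("\n")
                if not first:
                    continue  # skip empty files
                if header is None:
                    # The first non-empty file defines the header
                    header = first
                    out.write(header + "\n")
                elif first != header:
                    raise ValueError(f"{path} has a different header")
                for line in src:
                    # Normalise the final line in case it lacks a newline
                    out.write(line.rstrip("\n") + "\n")
                    rows += 1
    return rows
```

A call such as `concat_with_single_header(sorted(glob.glob("*.tsv")), "aggregated.tsv")` would mirror the `cat *.tsv` usage above, provided the output file is not itself matched by the glob.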
@jobel-code
jobel-code / gist_how2_install_qgis.md
Last active January 15, 2019 15:07
How to install QGIS on Ubuntu 18.04

Install gdal

sudo add-apt-repository -y ppa:ubuntugis/ubuntugis-unstable
sudo apt update 
sudo apt upgrade # if you already have gdal 1.11 installed 
sudo apt install gdal-bin python-gdal python3-gdal

Using vim open the sources.list file.

sudo vim /etc/apt/sources.list

@jobel-code
jobel-code / gist_find_duplicates.py
Created January 14, 2019 12:13
Find duplicates in a list
# Find duplicates in list
import collections

def find_duplicates(my_list: list) -> list:
    # Counter maps each item to its number of occurrences; keep those seen more than once
    return [item for item, count in collections.Counter(my_list).items() if count > 1]
@jobel-code
jobel-code / gist_split_large_files_with_header.bash
Created December 12, 2018 08:52
Splits large files into smaller files of 1000 lines, keeping the header on each small file
%%bash
# echo $filepath
in_file=$in_filepath
DIR=$(dirname "$in_filepath")
filename=$(basename -- "$in_filepath")
extension="${filename##*.}"
#filename="${filepath##*/}" # This one will keep the extension
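The script above is cut off before the split itself. As a rough Python sketch of what the description states (chunks of 1000 lines, header repeated in each chunk), and not a reconstruction of the missing bash: the function name `split_with_header` and the `_partNNN` output naming are assumptions, and "1000 lines" is taken to mean 1000 data lines plus the header.

```python
from itertools import islice
from pathlib import Path


def split_with_header(in_path, lines_per_file=1000):
    """Split a text file into parts of `lines_per_file` data lines each,
    repeating the first line (the header) at the top of every part.

    Returns the list of part paths, written next to the input file.
    """
    in_path = Path(in_path)
    parts = []
    with open(in_path, encoding="utf-8") as src:
        header = src.readline()
        if header and not header.endswith("\n"):
            header += "\n"
        while True:
            # Read the next block of data lines without loading the whole file
            chunk = list(islice(src, lines_per_file))
            if not chunk:
                break
            # Assumed naming scheme: data.tsv -> data_part001.tsv, data_part002.tsv, ...
            part = in_path.with_name(
                f"{in_path.stem}_part{len(parts) + 1:03d}{in_path.suffix}"
            )
            with open(part, "w", encoding="utf-8") as out:
                out.write(header)
                out.writelines(chunk)
            parts.append(part)
    return parts
```

A file that contains only a header produces no parts.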
@jobel-code
jobel-code / gist_find_all_files.bash
Last active December 10, 2018 09:31
Bash collection for finding files that contain a given text
# List all the files that contain `Text to find` in the `dirpath`
grep -l "Text to find" ~/dirpath/*
# Search all the yaml files, case insensitive (-i), for "text to find" and save the matches to a text file.
# --include takes one glob per option, so repeat it for each extension.
# add -r for recursion.
grep -i --include="*.yaml" --include="*.yml" "text to find" dirpathToSearch/* > ~/saveResultsHere.txt
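Where grep is not available, a similar search can be sketched in Python. The function name `find_files_containing` and its parameters are hypothetical. Unlike the second grep command above, it returns the matching file paths (as `grep -l` does), not the matching lines.

```python
from pathlib import Path


def find_files_containing(root, text, suffixes=(".yaml", ".yml"), recursive=False):
    """Return sorted paths under `root` whose suffix is in `suffixes` and whose
    content contains `text`, ignoring case."""
    needle = text.lower()
    root = Path(root)
    # rglob descends into subdirectories, mirroring grep -r
    candidates = root.rglob("*") if recursive else root.glob("*")
    hits = []
    for path in candidates:
        if path.is_file() and path.suffix in suffixes:
            # errors="ignore" keeps the scan going on files that are not valid UTF-8
            content = path.read_text(encoding="utf-8", errors="ignore")
            if needle in content.lower():
                hits.append(path)
    return sorted(hits)
```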