🤩
Last argopy pre-release v0.1.14 is out !

Guillaume Maze gmaze

name: obidam
channels: !!python/tuple
- !!python/unicode
'anaconda-fusion'
- !!python/unicode
'conda-forge'
- !!python/unicode
'defaults'
dependencies:
- _license=1.1=py27_1
@gmaze
gmaze / multiprocessing_eg_02datarmor.py
Last active November 23, 2017 13:08
Simplest single-core multi-cpu parallel run on Datarmor
#!/usr/bin/env python
#
# This example shows how to launch multiple processes in parallel on a single machine with multiple cpus
# There is no communication between processes and no data are gathered in the end.
# Each process executes the same function but with different arguments.
# The script waits for all sub-processes to finish, then executes another task.
#
# How to run on your computer:
# python multiprocessing_eg_02.py
#
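The gist preview stops before the code itself. The pattern described in the header comments could look roughly like the sketch below, which assumes the standard-library `multiprocessing` module. The names `task` and `run_all` and the argument values are hypothetical and are not taken from the gist.

```python
import multiprocessing as mp
import os


def task(n):
    """Hypothetical worker: each process runs this with a different argument."""
    total = sum(i * i for i in range(n))
    print(f"pid {os.getpid()}: sum of squares below {n} is {total}")
    # The return value is ignored by mp.Process (no data are gathered);
    # it is kept only so the function can also be called directly.
    return total


def run_all(arguments):
    """Start one process per argument and wait for all of them to finish."""
    processes = [mp.Process(target=task, args=(n,)) for n in arguments]
    for p in processes:
        p.start()
    for p in processes:
        p.join()  # blocks until this sub-process is done
    return [p.exitcode for p in processes]


if __name__ == "__main__":
    exit_codes = run_all([10, 100, 1000])
    # Reached only after every sub-process has finished.
    print("All sub-processes done, exit codes:", exit_codes)
```

The `if __name__ == "__main__":` guard keeps the launch code from re-running when a child process imports the module, which matters on platforms that spawn rather than fork.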
@gmaze
gmaze / numpy_arange_vs_linspace_trunc.py
Created December 11, 2017 12:29
Demonstrate the different behaviours of numpy arange and linspace with regard to truncation error
import numpy as np
eps = np.finfo(np.float64).eps
print("This is epsilon:", eps)
print("\nnumpy.arange")
dX = 0.1
X = np.arange(34., 38., dX)
udX = np.unique(np.diff(X))
print("Unique dX:", udX)
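The preview is cut off before the `linspace` half of the comparison. A minimal Python 3 sketch of how the two constructions might be compared side by side is given below. The helper `step_report` and the use of `endpoint=False` are assumptions made here so that both arrays cover the same half-open interval; they are not taken from the gist.

```python
import numpy as np

eps = np.finfo(np.float64).eps
start, stop, step = 34.0, 38.0, 0.1
n = int(round((stop - start) / step))  # number of points on [start, stop)

x_arange = np.arange(start, stop, step)
# endpoint=False mimics the half-open interval used by arange.
x_linspace = np.linspace(start, stop, n, endpoint=False)


def step_report(x):
    """Return (number of distinct step sizes, spread between largest and smallest step)."""
    d = np.diff(x)
    return np.unique(d).size, d.max() - d.min()


print("epsilon:", eps)
print("arange   (distinct steps, spread):", step_report(x_arange))
print("linspace (distinct steps, spread):", step_report(x_linspace))
```

More than one distinct step size means the grid spacing is not exactly uniform in floating point, and the spread can be read against `eps` scaled by the magnitude of the values.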
@gmaze
gmaze / obidam_storage.md
Last active May 17, 2018 09:27
OBIDAM: dataset fast access issue

To run data mining algorithms on large ocean datasets, we need to optimise access to datasets with possibly up to 6 dimensions.

A generalised 6-dimensional dataset is [X,Y,Z,T,V,E] where:

  • X,Y,Z,T are the space/time dimensions,
  • V is the variable dimension (eg: temperature, salinity, zonal velocity) and,
  • E is the ensemble dimension (list of realisations or members).

Running data mining algorithms on this dataset mostly means re-arranging the 6 dimensions into 2-dimensional arrays with, following the statistics vocabulary, a "sampling" dimension and a "features" dimension. The sampling dimension runs along rows and the features along columns. A large dataset can have billions of rows and hundreds of columns.
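As a rough illustration of this re-arrangement, the NumPy sketch below flattens a small synthetic [X,Y,Z,T,V,E] array into a 2-dimensional table. The dimension sizes are made up, and the choice of (X,Y,T,E) as the sampling dimensions and (Z,V) as the feature dimensions is only one possible assumption; other splits are equally valid.

```python
import numpy as np

# Hypothetical, deliberately tiny sizes for each dimension.
nx, ny, nz, nt, nv, ne = 4, 3, 5, 2, 3, 2
data = np.arange(nx * ny * nz * nt * nv * ne, dtype=float)
data = data.reshape(nx, ny, nz, nt, nv, ne)  # axes: X, Y, Z, T, V, E

# Assumed split: each (X, Y, T, E) point is one sample (row),
# each (Z, V) pair is one feature (column).
reordered = np.transpose(data, (0, 1, 3, 5, 2, 4))  # X, Y, T, E, Z, V
table = reordered.reshape(nx * ny * nt * ne, nz * nv)

print(table.shape)  # (48, 15)
```

With this layout a row holds the vertical profiles of all variables at one location, time and ensemble member. For a real dataset the transpose generally forces a copy in memory, which is part of why access patterns matter at this scale.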

Eg:

#!/usr/bin/env python
#
# Useful functions for xarray time series analysis
# (c) G. Maze, Ifremer
#
import numpy as np
import xarray as xr
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
@gmaze
gmaze / passing_options.py
Last active March 19, 2020 15:28
Passing/merging options from function to subfunctions
def base_fct(**kwargs):
    defaults = {'sharey':'row', 'dpi':80, 'facecolor':'w', 'edgecolor':'k'}
    options = {**defaults, **kwargs}
    return options

def fct(**kwargs):
    defaults = {'sharey':'cols'}
    return base_fct(**{**defaults, **kwargs})

print("Default base options:\n", base_fct())
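To make the precedence of the merge explicit, the short sketch below repeats the two functions from the gist and adds calls that are not in the visible preview. The override values `'none'` and `120` are arbitrary and chosen only for illustration.

```python
def base_fct(**kwargs):
    defaults = {'sharey': 'row', 'dpi': 80, 'facecolor': 'w', 'edgecolor': 'k'}
    # Later entries win: caller options override the base defaults.
    return {**defaults, **kwargs}


def fct(**kwargs):
    defaults = {'sharey': 'cols'}
    # fct's own defaults override base_fct's, and the caller overrides both.
    return base_fct(**{**defaults, **kwargs})


print(fct())
# {'sharey': 'cols', 'dpi': 80, 'facecolor': 'w', 'edgecolor': 'k'}

print(fct(sharey='none', dpi=120))
# {'sharey': 'none', 'dpi': 120, 'facecolor': 'w', 'edgecolor': 'k'}
```

The resulting order of precedence is caller arguments first, then the defaults of `fct`, then the defaults of `base_fct`.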
@gmaze
gmaze / GenerateMovie.sh
Last active March 20, 2020 11:03
Create a movie from a collection of image files
#!/usr/bin/env bash
#
# Generate mp4 videos from a collection of image files
#
# Video files are saved into ./videos
#
# Folder with image files:
src="/home/datawork-lops-oh/somovar/WP1/data/dashboard/img/monthly" # This is an example
@gmaze
gmaze / Parallel_images.py
Created March 20, 2020 12:44
Parallel figure generation in python
#!/usr/bin/env python
# coding: utf-8
#
# $ time ./Parallel_images.py
# Use 8 processes
# 107.249u 2.444s 0:17.10 641.4% 0+0k 0+0io 1056pf+0w
#
import os
import numpy as np
@gmaze
gmaze / Compare_time_response_erddap.py
Last active April 8, 2020 09:41
Compare_time_response_erddap
#!/bin/env python
# -*- coding: UTF-8 -*-
import requests
import time
# Request full data:
t0 = time.time()
url = 'http://www.ifremer.fr/erddap/tabledap/ArgoFloats.csv?data_mode,latitude,longitude,position_qc,time,time_qc,direction,platform_number,cycle_number,pres,temp,psal,pres_qc,temp_qc,psal_qc,pres_adjusted,temp_adjusted,psal_adjusted,pres_adjusted_qc,temp_adjusted_qc,psal_adjusted_qc,pres_adjusted_error,temp_adjusted_error,psal_adjusted_error&platform_number=~"5900446"&distinct()&orderBy("time,pres")'
requests.get(url)
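The preview ends before the elapsed time is computed or a second request is made. One way the comparison might be organised is sketched below with a small timing helper. The names `time_call` and `compare` are hypothetical, and the `time.sleep` calls are stand-ins for the real requests so that the sketch runs without network access.

```python
import time


def time_call(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed seconds)."""
    t0 = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - t0
    return result, elapsed


def compare(calls):
    """Time each zero-argument callable in `calls` (label -> callable)."""
    return {label: time_call(fn)[1] for label, fn in calls.items()}


# Stand-in workloads; replace with real requests to compare server responses.
timings = compare({
    "short": lambda: time.sleep(0.01),
    "long": lambda: time.sleep(0.05),
})
for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f} s")
```

With the `requests` import and the `url` defined in the script, a call such as `time_call(requests.get, url)` would return the response together with the time it took. `time.perf_counter()` is used here instead of `time.time()` because it is a monotonic clock intended for measuring intervals.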