Jonathan Chambers mangecoeur

## tessa_data_vizualisation.ipynb

      
mangecoeur / tessa_data_vizualisation.ipynb (1 file, 0 forks, 0 comments, 0 stars). Last active March 18, 2024 13:09.
## jhub-the-hard-way.md

      
mangecoeur / jhub-the-hard-way.md (1 file, 0 forks, 0 comments, 4 stars). Created November 5, 2019 16:19.

Install Jupyterhub and Jupyterlab The Hard Way

The combination of Jupyterhub and Jupyterlab
is a great way to make shared computing resources available to a group.
These instructions are a guide for a manual, 'bare metal' install of Jupyterhub
and Jupyterlab. This is ideal for running on a single server: build a beast
of a machine and share it within your lab, or use a virtual machine from any VPS or cloud provider.
This guide has similar goals to that of The Littlest JupyterHub setup

  
## config.py
"""
Background
----------

One of the simplest configuration approaches in python is to just use python files,
giving you the full power of python - the least hassle approach in a trusted environment.
However, importing config modules can be problematic in interactive environments.

For example, when using jupyter notebooks organised into sub-folders,
we want to access a common config file in the overall project root.
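
One way to picture the problem described above is a helper that walks up from the notebook's folder until it finds the shared config file, then loads it by path. This is only a sketch under stated assumptions, not the gist's own implementation: `find_config`, `load_config`, and the module name `project_config` are hypothetical names chosen for illustration.

```python
import importlib.util
from pathlib import Path


def find_config(start, filename="config.py"):
    """Walk up from `start` until a folder containing `filename` is found."""
    start = Path(start).resolve()
    for folder in (start, *start.parents):
        candidate = folder / filename
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"{filename} not found in {start} or any parent folder")


def load_config(start=".", filename="config.py"):
    """Load the nearest config file as a module without modifying sys.path."""
    path = find_config(start, filename)
    # "project_config" is an arbitrary (hypothetical) module name for the loaded file.
    spec = importlib.util.spec_from_file_location("project_config", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

From a notebook in any sub-folder, `cfg = load_config()` would then expose the values defined in the project-root file as attributes of `cfg`.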

## slim_silhouette.py
from sklearn.utils import check_X_y
from sklearn.preprocessing import LabelEncoder
# Note: scikit-learn 0.22 and later moved this module to sklearn.metrics.cluster._unsupervised
from sklearn.metrics.cluster.unsupervised import check_number_of_labels

from numba import jit

@jit(nogil=True, parallel=True)
def euclidean_distances_numba(X, Y=None, Y_norm_squared=None):
    # disable checks
    XX_ = (X * X).sum(axis=1)

## init.coffee
# # we need a reference to the snippets package
# snippetsPackage = require(atom.packages.getLoadedPackage('autocomplete-snippets').path)
#
# # we need a reference to the original method we'll monkey patch
# __oldGetSnippets = snippetsPackage.getSnippets
#
# snippetsPackage.getSnippets = (editor) ->
#   snippets = __oldGetSnippets.call(this, editor)
#
#   # we're only concerned by ruby files

## archive_to_nc.py
import datetime
import shutil
import tempfile
import tarfile

from collections import namedtuple
from pathlib import Path
from enum import IntEnum

import numpy as np

## description.md

      
mangecoeur / description.md (2 files, 5 forks, 10 comments, 19 stars). Last active March 30, 2021 21:34.

Pandas PostgreSQL support for loading to DB using fast COPY FROM method
This small subclass of the Pandas SQLAlchemy-based SQL support for reading and storing tables uses the Postgres-specific "COPY FROM" method to insert large amounts of data into the database. It is much faster than using INSERT. To achieve this, the table is created in the normal way using SQLAlchemy, but no data is inserted. Instead, the data is saved to a temporary CSV file (using Pandas' mature CSV support) and then read back into Postgres using psycopg2's support for COPY FROM STDIN.
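
The COPY step described above can be sketched roughly as follows. This is not the gist's subclass: it swaps the pandas-written temporary file for an in-memory buffer built with the standard `csv` module so the example has no dependencies, and `copy_rows` is a hypothetical helper name. It assumes `cursor` is a psycopg2 cursor (which provides `copy_expert`) and that the target table already exists.

```python
import csv
import io


def copy_rows(cursor, table, columns, rows):
    """Bulk-load `rows` into an existing table with COPY ... FROM STDIN."""
    # Serialise the rows to CSV in memory (assumption: the data fits in RAM;
    # the gist itself writes a temporary CSV file with pandas instead).
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    buffer.seek(0)

    # Table and column names are interpolated directly, so they must come
    # from trusted code, never from user input.
    column_list = ", ".join(columns)
    statement = f"COPY {table} ({column_list}) FROM STDIN WITH (FORMAT csv)"

    # psycopg2 streams the file-like object to the server in one COPY command.
    cursor.copy_expert(statement, buffer)
```

With a real connection, this would be called as `copy_rows(conn.cursor(), "readings", ["id", "value"], rows)` followed by `conn.commit()`; the table name `readings` is made up for the example.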

  
## concurrent.futures-intro.md

      
mangecoeur / concurrent.futures-intro.md (1 file, 14 forks, 8 comments, 130 stars). Last active July 20, 2024 10:30.

Easy parallel python with concurrent.futures

As of version 3.2, Python includes the very promising concurrent.futures module, with elegant context managers for running tasks concurrently. Thanks to its simple and consistent interface, you can use both threads and processes with minimal effort.
For most CPU-bound tasks (anything that is heavy number crunching) you want your program to use all the CPUs in your PC. The simplest way to get a CPU-bound task to run in parallel is to use the ProcessPoolExecutor, which will create enough sub-processes to keep all your CPUs busy.
We use the context manager like this:
with concurrent.futures.ProcessPoolExecutor() as executor:
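
As a self-contained illustration of the pattern introduced above, here is a minimal sketch that assumes the work is distributed with `executor.map`. The names `square` and `run_parallel` are hypothetical, and `square` merely stands in for a real CPU-bound function.

```python
from concurrent.futures import ProcessPoolExecutor


def square(n):
    """Stand-in for a CPU-bound task (hypothetical example function)."""
    return n * n


def run_parallel(func, items, executor_cls=ProcessPoolExecutor):
    """Apply `func` to every item on a pool of workers, keeping input order."""
    with executor_cls() as executor:
        # map() schedules the calls on the pool and yields results in input order;
        # leaving the `with` block waits for the workers and shuts the pool down.
        return list(executor.map(func, items))
```

When using processes, call `run_parallel(square, range(10))` from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Because both executors share one interface, passing `executor_cls=ThreadPoolExecutor` switches to threads without any other change.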

  
## pandas-sqlalchemy-read.py
"""
Collection of query wrappers / abstractions to both facilitate data
retrieval and to reduce dependency on DB-specific API.
"""
from pandas.core.api import DataFrame


def _safe_fetch(cur):
    try:
        result = cur.fetchall()

## a-conda-workon-tool.md

      
mangecoeur / a-conda-workon-tool.md (2 files, 3 forks, 7 comments, 23 stars). Last active February 9, 2021 14:53.

A "virtualenv activate" for Anaconda environments

I've been using the Anaconda python package from continuum.io recently and found it to be a good way to get all the complex compiled libs you need for a scientific python environment. Even better, their conda tool lets you create environments much like virtualenv, but without having to re-compile stuff like numpy, which gets old very very quickly with virtualenv and can be a nightmare to get correctly set up on OSX.
The only thing missing was an easy way to switch environments. Their docs suggest running python executables from the install folder, which I find a bit of a pain. Coincidentally I came across this article, Virtualenv's bin/activate is Doing It Wrong, which describes a simple way to launch a sub-shell with certain environment variables set. Now simple was the key word for me since my bash-fu isn't very strong, but I managed to come up with the script below. Put this in a text file called conda-work
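
The gist's own script is a shell script and is not reproduced in this listing. Purely to illustrate the sub-shell idea, here is a rough Python sketch under stated assumptions: an environment is taken to be a folder with a `bin` sub-folder, and `build_env`, `workon`, and the `CONDA_WORKON_ENV` variable are all hypothetical names invented for this example.

```python
import os
import subprocess


def build_env(env_dir, base_env=None):
    """Return a copy of the environment with `env_dir`/bin first on PATH."""
    env = dict(os.environ if base_env is None else base_env)
    bin_dir = os.path.join(env_dir, "bin")
    # Prepend the environment's executables so they shadow the system ones.
    env["PATH"] = os.pathsep.join(p for p in (bin_dir, env.get("PATH", "")) if p)
    # Made-up marker variable, e.g. for showing the active environment in a prompt.
    env["CONDA_WORKON_ENV"] = os.path.basename(os.path.normpath(env_dir))
    return env


def workon(env_dir):
    """Start an interactive sub-shell; exiting it returns to the parent shell."""
    shell = os.environ.get("SHELL", "/bin/sh")
    return subprocess.call([shell], env=build_env(env_dir))
```

Because the variables are set only in the child process, leaving the sub-shell with `exit` restores the original environment and no separate deactivate step is needed.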