Jacques Fize jacquesfize

  • Gap, FR
@jacquesfize
jacquesfize / paginate.sh
Last active October 9, 2018 07:14
Bash script that paginates a PDF file
#! /bin/bash
# If you use macOS, download the latest version of pdftk from here:
# https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/pdftk_server-2.02-mac_osx-10.11-setup.pkg
exists() {
    # "$1" is the name of the command to look up
    local cmd="$1"
    if ! [ -x "$(command -v "$cmd")" ]; then
        echo "Error: $cmd is not installed." >&2
        exit 1
    fi
}
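For readers who work in Python rather than Bash, the same dependency check can be sketched with the standard library. This is an illustration only and not part of the gist; `require_command` is a hypothetical name.

```python
import shutil
import sys


def require_command(name: str) -> str:
    """Return the full path of an executable, or exit with status 1 if it is missing."""
    path = shutil.which(name)  # looks the name up on PATH, like `command -v`
    if path is None:
        print(f"Error: {name} is not installed.", file=sys.stderr)
        sys.exit(1)
    return path
```

Calling `require_command("pdftk")` before any processing would stop the script early when the tool is absent.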
jacquesfize / terminology_matcher.py
Last active February 21, 2019 08:29
A Python class to match elements (from a terminology) in text using the spaCy module.
# coding= utf-8
import warnings
import re
import importlib
import glob
import copy
import pandas as pd
import numpy as np
from tqdm import tqdm
jacquesfize / remove_token.py
Last active June 11, 2022 14:59
A function to delete tokens from a spaCy Doc object without losing associated information (part of speech, dependency, lemma, ...)
from spacy.attrs import LOWER, POS, ENT_TYPE, IS_ALPHA, DEP, LEMMA, IS_PUNCT, IS_DIGIT, IS_SPACE, IS_STOP

def remove_tokens(doc, index_to_del, list_attr=[LOWER, POS, ENT_TYPE, IS_ALPHA, DEP, LEMMA, IS_PUNCT, IS_DIGIT, IS_SPACE, IS_STOP]):
    """
    Remove tokens from a spaCy *Doc* object without losing
    associated information (part of speech, dependency, lemma, extensions, ...)

    Parameters
    ----------
    doc : spacy.tokens.doc.Doc
        spaCy representation of the text
    index_to_del : list of int
# -*- coding: utf-8 -*-
# Natural Language Toolkit: Interface to the TreeTagger POS-tagger
#
# Copyright (C) Mirko Otto
# Author: Mirko Otto <dropsy@gmail.com>
# Modified by: Jacques Fize
"""
A Python module for interfacing with the Treetagger by Helmut Schmid.
jacquesfize / argparse_configreader.py
Created January 17, 2020 14:00
Initialize argparse with a JSON config file
import argparse
import os
import json
class ArgParseConfigurationReader(object):
"""
Example of config JSON :
{
"description": "Program",
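The idea of driving argparse from a JSON file can be sketched as follows. Only the `"description"` key appears in the preview above; the `"args"` layout, the `TYPES` table and the `parser_from_config` function are hypothetical choices made for this illustration and are not the gist's actual schema or API.

```python
import argparse
import json

# Only "description" comes from the gist preview; the "args" entries
# are a hypothetical schema invented for this sketch.
CONFIG_TEXT = """
{
  "description": "Program",
  "args": [
    {"flags": ["input"], "help": "input file"},
    {"flags": ["-e", "--epochs"], "type": "int", "default": 10, "help": "number of epochs"}
  ]
}
"""

# JSON cannot store Python types, so type names are mapped to callables here.
TYPES = {"int": int, "float": float, "str": str}


def parser_from_config(config: dict) -> argparse.ArgumentParser:
    """Build an ArgumentParser from a dict loaded from JSON."""
    parser = argparse.ArgumentParser(description=config.get("description"))
    for spec in config.get("args", []):
        # Everything except "flags" is passed to add_argument as keyword options.
        options = {k: v for k, v in spec.items() if k != "flags"}
        if "type" in options:
            options["type"] = TYPES[options["type"]]
        parser.add_argument(*spec["flags"], **options)
    return parser


parser = parser_from_config(json.loads(CONFIG_TEXT))
args = parser.parse_args(["data.csv", "--epochs", "3"])
print(args.input, args.epochs)
```

Keeping the argument definitions in a data file means the command-line interface can change without touching the code that builds the parser.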
jacquesfize / healpix_diameter_km.py
Created June 18, 2020 09:18
Compute the diameter (km) of a HEALPix cell
# install dependencies : pip install geopandas pandas numpy matplotlib descartes healpy
import pandas as pd
import numpy as np
import geopandas as gpd
from shapely.geometry import Polygon,MultiPolygon,Point,MultiLineString,LineString
import healpy
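As a rough cross-check for such a computation, the linear size of a HEALPix cell can be approximated from `nside` alone, using the fact that the sphere is divided into 12 × nside² equal-area cells. The sketch below is not the gist's method (which is not shown in the preview); `healpix_cell_size_km` is a hypothetical name, and the mean Earth radius of 6371 km is an assumption.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def healpix_cell_size_km(nside: int, radius_km: float = EARTH_RADIUS_KM) -> float:
    """Approximate linear size of a HEALPix cell as the square root of its area."""
    n_pixels = 12 * nside ** 2        # number of equal-area cells on the sphere
    area_sr = 4 * math.pi / n_pixels  # solid angle of one cell, in steradians
    return math.sqrt(area_sr) * radius_km


for nside in (1, 16, 256):
    print(nside, round(healpix_cell_size_km(nside), 2))
```

The angular part of this estimate (square root of the pixel area, in radians) is the same quantity that `healpy.nside2resol` reports; a diameter measured on the actual cell polygon may differ because cells are not squares.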
jacquesfize / bert_classification.py
Last active December 19, 2022 17:01
Script to use BERT for text classification
# REQUIREMENTS : pandas keras torch numpy transformers
"""
Heavily based on the article: https://mccormickml.com/2019/07/22/BERT-fine-tuning/
by Chris McCormick
"""
import os
import time
import random
jacquesfize / grid.py
Created September 10, 2021 11:48
Build a hexagon grid over a shapely geometry
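import numpy as np
import geopandas as gpd
from shapely.geometry import shape

CB = None  # replace with a shapely geometry object
r = 0.2  # radius of the hexagon
h = (r * np.sqrt(3)) / 2  # r * sqrt(3) / 2 (the apothem if r is the circumradius)
j = r / 2  # half the radius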
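The quantities `r`, `h` and `j` are enough to lay out a flat-topped hexagon grid over a bounding box. The sketch below uses only the standard library and is one possible way to do it, not the gist's actual code (which is truncated in the preview); `hexagon_vertices` and `hexagon_centers` are hypothetical names, and `r` is assumed to be the center-to-vertex distance.

```python
import math


def hexagon_vertices(cx, cy, r):
    """Vertices of a flat-topped hexagon centered on (cx, cy)."""
    h = r * math.sqrt(3) / 2  # center-to-edge distance
    j = r / 2
    return [(cx - r, cy), (cx - j, cy + h), (cx + j, cy + h),
            (cx + r, cy), (cx + j, cy - h), (cx - j, cy - h)]


def hexagon_centers(minx, miny, maxx, maxy, r):
    """Centers of hexagons covering the bounding box (minx, miny, maxx, maxy)."""
    h = r * math.sqrt(3) / 2
    centers = []
    col = 0
    x = minx
    while x <= maxx + r:
        # Odd columns are shifted up by h so that neighbouring hexagons interlock.
        y = miny + (h if col % 2 else 0.0)
        while y <= maxy + h:
            centers.append((x, y))
            y += 2 * h       # vertical spacing between centers in a column
        x += 1.5 * r         # horizontal spacing between columns
        col += 1
    return centers


centers = hexagon_centers(0.0, 0.0, 1.0, 1.0, 0.2)
hexagons = [hexagon_vertices(cx, cy, 0.2) for cx, cy in centers]
print(len(hexagons), "hexagons")
```

Each vertex list can then be passed to `shapely.geometry.Polygon`, and the polygons filtered with `intersects` against the target geometry (`CB` above) to keep only the cells that overlap it.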