Louis Guitton louisguitton

@louisguitton
louisguitton / miserables_graph.gexf
Created February 8, 2024 15:54
miserables_graph
<?xml version="1.0" encoding="UTF-8"?>
<gexf version="1.2" xmlns="http://www.gexf.net/1.2draft" xmlns:viz="http:///www.gexf.net/1.1draft/viz">
<meta>
<title>Les Miserables.gexf</title>
<authors>Gephi 0.9.3</authors>
</meta>
<graph defaultedgetype="undirected">
<attributes class="node">
<attribute id="modularity_class" title="modularity_class" type="integer"/>
</attributes>
@louisguitton
louisguitton / poc.ipynb
Last active January 4, 2023 07:20
Proof of concept with tagspace and starspace
@louisguitton
louisguitton / get_air_traffic_data.py
Last active February 6, 2022 21:38
Collect air traffic data of french airports from aviation-civile.gouv.fr
import unicodedata
import html
import datetime
from dateutil.relativedelta import relativedelta
import csv
# how to make a GET request in python? -> learn how to use the 'requests' library
import requests
# how to use an external library to create a progress bar?
@louisguitton
louisguitton / git_cleanup.py
Last active June 3, 2021 10:05
Clean a local git repo that uses the rebase flow. Use with care!
# pip install tqdm GitPython
from git import Repo
from git.exc import GitCommandError
from tqdm import tqdm
DEFAULT_BRANCH = "master"
IGNORED_BRANCHES = [] # develop, ...
repo = Repo(".")
branches = repo.branches
@louisguitton
louisguitton / sql_queries_with_between_clause.json
Created February 4, 2021 18:55
output data of the sql_queries_with_between_clause fixture
{
"models/staging/localytics/stg_localytics__users_retained.sql_1": {
"date_column": "\n birth_date ",
"left_bound": " dateadd('day',-370,'{{ env_var(\"EXECUTION_DATE\") }}'::date)\n ",
"right_bound": "'{{ env_var(\"EXECUTION_DATE\") }}'::date"
},
"models/staging/localytics/dim_localytics__users.sql_1": {
"date_column": " occurred_at ",
"left_bound": " dateadd('day', -2, '{{ env_var(\"EXECUTION_DATE\") }}') ",
"right_bound": "dateadd('ms', -1, '{{ env_var(\"EXECUTION_DATE\") }}' + 1)\n"
| classroom | team |
| --- | --- |
| student | candidate |
| teacher | hiring manager |
| teaching assistant | team member |
| assignment | take-home code challenge |
| grading | review |
| teaching process | hiring process and employee growth |
@louisguitton
louisguitton / tokenizations_post.md
Created July 10, 2020 15:35 — forked from tamuhey/tokenizations_post.md
How to calculate the alignment between BERT and spaCy tokens effectively and robustly

site: https://tamuhey.github.io/tokenizations/

Natural Language Processing (NLP) has made great progress in recent years thanks to neural networks, which allow us to solve various tasks with end-to-end architectures. However, many NLP systems still require language-specific pre- and post-processing, especially for tokenization. In this article, I describe an algorithm that simplifies one such step: calculating the correspondence between tokens produced by different tokenizers (e.g. BERT vs. spaCy). I also introduce Python and Rust libraries that implement this algorithm.
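
To make the idea of token correspondence concrete, here is a minimal sketch using only the Python standard library. It is not the implementation from the libraries mentioned above: the function name `align_tokens` and its helpers are hypothetical, and `difflib.SequenceMatcher` stands in for whatever diff algorithm the real libraries use. The sketch assumes both tokenizations cover the same text, normalizes each token (lowercasing, stripping accents), and then matches the two token sequences character by character.

```python
import unicodedata
from difflib import SequenceMatcher


def _normalize(token):
    # Lowercase and strip accents so "Café" and "cafe" compare equal.
    decomposed = unicodedata.normalize("NFKD", token.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def _char_owners(tokens):
    # Flatten tokens into one string, remembering which token owns each char.
    chars, owners = [], []
    for index, token in enumerate(tokens):
        for ch in _normalize(token):
            chars.append(ch)
            owners.append(index)
    return "".join(chars), owners


def align_tokens(tokens_a, tokens_b):
    """Hypothetical helper: return (a2b, b2a) lists of matching token indices."""
    text_a, owners_a = _char_owners(tokens_a)
    text_b, owners_b = _char_owners(tokens_b)
    a2b = [[] for _ in tokens_a]
    b2a = [[] for _ in tokens_b]
    # difflib is an assumption here, chosen only because it ships with Python.
    matcher = SequenceMatcher(None, text_a, text_b, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            i = owners_a[block.a + offset]
            j = owners_b[block.b + offset]
            # Two tokens correspond if they share at least one matched char.
            if j not in a2b[i]:
                a2b[i].append(j)
            if i not in b2a[j]:
                b2a[j].append(i)
    return a2b, b2a


spacy_like = ["New", "York", "playing"]
bert_like = ["new", "york", "play", "##ing"]
a2b, b2a = align_tokens(spacy_like, bert_like)
print(a2b)  # [[0], [1], [2, 3]]
print(b2a)  # [[0], [1], [2], [2]]
```

Characters that appear in only one tokenization, such as the `##` prefix of a word piece, are simply left unmatched, so `playing` still maps to both `play` and `##ing`.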

## Hi, here are the commands that will set you up for the git and GitHub training.
## Open a terminal and copy and paste these commands group by group.
## Read the output to check that each command really worked.
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install git
curl https://raw.githubusercontent.com/git/git/master/contrib/completion/git-completion.bash > ~/.git-completion.bash