


Tiago Chiavegatti tchiavegatti


Keybase proof

I hereby claim:

  • I am tchiavegatti on github.
  • I am tchiavegatti (https://keybase.io/tchiavegatti) on keybase.
  • I have a public key ASASpRfR_QhFOzvmRIl-YIgHbm-kvRAE4ZL3umgDJL7oego

To claim this, I am signing this object:

Gitea to mirror GitHub

If you want to migrate all of your GitHub repositories to Gitea, you can import them all at once instead of manually importing each repo into Gitea and setting it up to mirror. When I add a repository to Gitea and specify that it should be mirrored, Gitea takes charge of periodically querying the source repository and pulling in its changes. I’ve mentioned Gitea previously, and I find it’s improving as it matures. I’ve been doing this with version 1.7.5.

After setting up Gitea and creating a user, I create an API token in Gitea with which I can create repositories programmatically. The following program obtains a list of all my GitHub repositories, skips those I’ve forked from elsewhere, and then creates each remaining repository in Gitea.

#!/usr/bin/env python -B
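Only the first line of the program appears in this listing. As a rough sketch of the flow described above (not the author's actual script), the two core steps can be written as plain functions: drop forked repositories from a GitHub repository list, then build the request body for a Gitea mirror. The names `own_repos`, `migrate_payload`, `create_mirror`, the sample data, and the user id are hypothetical. The field names (`clone_addr`, `uid`, `repo_name`, `mirror`) and the `/api/v1/repos/migrate` path reflect the Gitea API as I understand it, so check them against your own instance's API documentation.

```python
import json
import urllib.request


def own_repos(repos):
    """Keep only repositories that are not forks (GitHub marks forks with "fork": true)."""
    return [repo for repo in repos if not repo.get("fork")]


def migrate_payload(repo, gitea_uid):
    """Build the JSON body for a mirror migration (field names are an assumption)."""
    return {
        "clone_addr": repo["clone_url"],
        "uid": gitea_uid,               # numeric id of the Gitea user who will own the mirror
        "repo_name": repo["name"],
        "mirror": True,                 # ask Gitea to keep pulling from the source
        "private": repo.get("private", False),
        "description": repo.get("description") or "",
    }


def create_mirror(payload, gitea_url, gitea_token):
    """POST the payload to Gitea. Not called in the demo below, which stays offline."""
    request = urllib.request.Request(
        f"{gitea_url}/api/v1/repos/migrate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"token {gitea_token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Hypothetical sample shaped like entries from GitHub's repository list.
sample = [
    {"name": "notes", "clone_url": "https://github.com/example/notes.git",
     "fork": False, "private": False, "description": "Personal notes"},
    {"name": "some-fork", "clone_url": "https://github.com/example/some-fork.git",
     "fork": True, "private": False, "description": None},
]

payloads = [migrate_payload(repo, gitea_uid=1) for repo in own_repos(sample)]
print(json.dumps(payloads, indent=2))
```

Keeping the filtering and payload construction separate from the network call makes it easy to inspect what would be sent before creating anything in Gitea. Private source repositories would also need credentials in the request, which this sketch leaves out.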
tchiavegatti / install-python
Created July 28, 2022 18:32 — forked from SerhatTeker/install-python
Install Python (with desired version) on an Ubuntu Host
#!/usr/bin/env bash
# Bash safeties: exit on error, no unset variables, pipelines can't hide errors
set -o errexit
set -o nounset
set -o pipefail
# INFO
# --------------------------------------------------------------------------------------
# A shell script that downloads and installs Python on an Ubuntu host
tchiavegatti / dtype_convertion_error.py
Created June 7, 2021 15:51
[dtype conversion error] Find and locate invalid values #pandas #python
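import pandas as pd
df = pd.read_csv('data/data_8.csv')
# Values that cannot be parsed as numbers become NaN when errors='coerce'
is_error = pd.to_numeric(df['Grade'], errors='coerce').isna()
# Show the rows whose 'Grade' value failed to convert
df[is_error]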
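To see the same technique without the CSV file, here is a small self-contained sketch. The DataFrame contents are invented for illustration and stand in for `data/data_8.csv`.

```python
import pandas as pd

# Hypothetical data: two grades are not valid numbers.
df = pd.DataFrame({
    "Name": ["Ana", "Bo", "Cy", "Di"],
    "Grade": ["9.5", "7", "n/a", "8,5"],
})

# Unparseable entries are coerced to NaN, so isna() marks the offending rows.
is_error = pd.to_numeric(df["Grade"], errors="coerce").isna()

bad_rows = df[is_error]
print(bad_rows)
```

Note that values which were already missing in the source are flagged too. If those should be excluded, combining the mask with `df["Grade"].notna()` is one option.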
### Keybase proof
I hereby claim:
* I am tchiavegatti on github.
* I am tchiavegatti (https://keybase.io/tchiavegatti) on keybase.
* I have a public key ASDOmKRHWfA_a2J6AF_VQ5Az-fXH0QpbkzWCHJEx-TMNtwo
To claim this, I am signing this object:
tchiavegatti / ExampleJoinByFussyLookup.py
Last active September 25, 2020 17:41
Join by string match #pandas
#https://towardsdatascience.com/joining-dataframes-by-substring-match-with-python-pandas-8fcde5b03933
import pandas as pd
df1 = pd.DataFrame(
    [
        ['ABC', 'P1'],
        ['BCD', 'P2'],
        ['CDE', 'P3'],
    ],
    columns=['task_name', 'pipeline_name'],
)
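The rest of the gist is not shown in this listing. One common way to join by substring match, sketched here under assumptions, is to build every pairing of rows (a cross join) and keep only the pairs where one string contains the other. The second frame `df2`, its columns `job_name` and `runtime_min`, and the helper column `_key` are hypothetical, and this may differ from the approach in the linked article.

```python
import pandas as pd

df1 = pd.DataFrame(
    [["ABC", "P1"], ["BCD", "P2"], ["CDE", "P3"]],
    columns=["task_name", "pipeline_name"],
)

# Hypothetical second frame whose job_name may contain a task_name.
df2 = pd.DataFrame(
    [["job_ABC_daily", 10], ["job_CDE_weekly", 20], ["job_XYZ", 30]],
    columns=["job_name", "runtime_min"],
)

# Cross join through a constant helper key, then drop the key again.
cross = (
    df1.assign(_key=1)
    .merge(df2.assign(_key=1), on="_key")
    .drop(columns="_key")
)

# Keep only the pairs where task_name occurs inside job_name (case-sensitive).
mask = [task in job for task, job in zip(cross["task_name"], cross["job_name"])]
joined = cross[mask].reset_index(drop=True)
print(joined)
```

The cross join produces `len(df1) * len(df2)` rows before filtering, so this is only practical for small to medium frames.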
tchiavegatti / counter.py
Created September 25, 2020 17:08
[Counter] Count values in a list
from collections import Counter
lst = [4, 3, 3, 2, 4, 3]
print(Counter(lst))
# Output: Counter({3: 3, 4: 2, 2: 1})
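Beyond printing it, a `Counter` can be queried directly. A short sketch using the same list:

```python
from collections import Counter

lst = [4, 3, 3, 2, 4, 3]
counts = Counter(lst)

# The two most frequent values, as (value, count) pairs.
top_two = counts.most_common(2)
print(top_two)

# Look up a single count; values that never occurred return 0 instead of raising.
print(counts[3])
print(counts[7])
```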
tchiavegatti / fix_column_names.py
Created February 11, 2020 21:08
[Fix column names] Remove whitespace so you can use the df.ColumnName notation #pandas
# From https://medium.com/@chaimgluck1/working-with-pandas-fixing-messy-column-names-42a54a6659cd
# Sometimes you load in that DataFrame from a csv or excel file that some unlucky excel user created and you just wish everyone used Python. Why do they have to make the column names uppercase, with spaces, and whitespace all around? Do they like doing this to you? They probably hate you, that’s it. They did this:
# The nerve!
# Now you can’t reference the columns with the convenient .{column name here} notation. You’ll have to do the [''] thing. It’s not the most horrible thing that ever happened, you’ve been through a lot. You can take it. But you’re used to the other way. Plus, what’s with those parentheses? You can’t have that.
# Luckily, pandas has a convenient .str method that you can use on text data. Since the column names are an ‘index’ type, you can use .str on them too. You can fix all these lapses of judgement by chaining together a bunch of these .str functions. Like so:
df.columns = df.columns.str.s
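The line above is cut off in this listing. As a sketch of the chaining idea the comments describe (not necessarily the exact chain used in the gist), the following strips surrounding whitespace, lowercases, replaces spaces with underscores, and removes parentheses. The column names are invented for illustration.

```python
import pandas as pd

# Hypothetical messy column names: padding, uppercase, spaces, parentheses.
df = pd.DataFrame(
    [[1, 2.5, "x"]],
    columns=[" Order ID ", "Unit Price (USD)", "  Notes"],
)

df.columns = (
    df.columns
    .str.strip()                         # drop leading and trailing whitespace
    .str.lower()                         # normalise case
    .str.replace(" ", "_", regex=False)  # spaces become underscores
    .str.replace("(", "", regex=False)   # remove opening parentheses
    .str.replace(")", "", regex=False)   # remove closing parentheses
)

print(list(df.columns))

# Attribute-style access now works.
print(df.order_id)
```

Passing `regex=False` treats the parentheses as literal characters, which avoids having to escape them.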
tchiavegatti / combine_xlsx.py
Created February 4, 2020 17:55
[Combine Excel files] #pandas
#!/usr/bin/env python3
from pathlib import Path
import pandas as pd
import xlsxwriter
# Set filename tag
tag = 'client'
# Set filepaths
tchiavegatti / func_filepath.py
Created January 31, 2020 19:25
[Storage filepath] Function to set a filepath to a dataframe storage in feather, HDF5, CSV or Excel #pandas
def filepath(filetype, destination, version=None, tag=tag):
    """
    Returns a filepath for file export.
    Attributes:
        filetype: `str`, 'feather', 'hdf', 'csv', 'excel'
            Type of the output file. Implemented to date are feather, hdf5 (h5), csv and Excel (xlsx).
        destination: `str`, 'data' or 'output'.
            Whether the file should be exported to the data folder or to the output folder. \
            HDF5 and feather files storing temporary data can go to the data folder. \