dixhom Dixhom

Dixhom / pdf_to_img.py
Created November 12, 2024 08:28
convert a pdf file to images
import os
from pathlib import Path
from pdf2image import convert_from_path
# For Windows users: download poppler Release-24.08.0-0.zip from https://github.com/oschwartz10612/poppler-windows/releases,
# unzip it, and add the poppler bin directory to the PATH environment variable.
poppler_dir = "<path-to-poppler>/poppler-24.08.0/Library/bin"
os.environ["PATH"] += os.pathsep + str(poppler_dir)
pdf_path = Path("<path-to-pdf>/file.pdf")
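The preview stops before the conversion itself. As a minimal sketch of how the remaining step might look, assuming pdf2image's `convert_from_path` with its `dpi` parameter and poppler already on `PATH`: the function names `page_image_name` and `pdf_to_images`, the output naming scheme, and the default of 200 dpi are illustrative assumptions, not taken from the gist.

```python
from pathlib import Path


def page_image_name(stem, page_number):
    """Build an output name such as 'file_page_001.png' (naming scheme is an assumption)."""
    return f"{stem}_page_{page_number:03d}.png"


def pdf_to_images(pdf_path, out_dir, dpi=200):
    """Render each page of a PDF to a PNG file and return the saved paths."""
    # Imported lazily so the helper above works even without pdf2image installed.
    from pdf2image import convert_from_path

    pdf_path = Path(pdf_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    saved = []
    # convert_from_path returns a list of PIL images, one per page.
    pages = convert_from_path(str(pdf_path), dpi=dpi)
    for number, image in enumerate(pages, start=1):
        target = out_dir / page_image_name(pdf_path.stem, number)
        image.save(target, "PNG")
        saved.append(target)
    return saved
```

Calling `pdf_to_images(pdf_path, "<path-to-output>")` would then write one PNG per page into the chosen directory.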
Dixhom / unique_dicts.py
Created July 26, 2024 11:03
an example of getting a list of unique dicts using named tuple
'''an example of getting a list of unique dicts using named tuple'''
from collections import namedtuple
dict_list = [
    dict(a=1, b=2, c=3),
    dict(a=10, b=20, c=30),
    dict(a=100, b=200, c=300),
    dict(a=1, b=2, c=3),
    dict(a=10, b=20, c=30),
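The rest of this gist is cut off in the preview. One way the idea in its description could work, as a sketch only: convert each dict to a hashable namedtuple, deduplicate, and convert back. The helper name `unique_dicts` and the `Row` type are hypothetical, and the sketch assumes every dict has the same keys, that the keys are valid Python identifiers, and that the values are hashable.

```python
from collections import namedtuple


def unique_dicts(dict_list):
    """Return the distinct dicts in first-seen order (assumes identical keys in every dict)."""
    if not dict_list:
        return []
    # A namedtuple is hashable, unlike a dict, so it can be used as a dict key.
    Row = namedtuple("Row", sorted(dict_list[0]))
    # dict.fromkeys drops duplicates while preserving insertion order.
    seen = dict.fromkeys(Row(**d) for d in dict_list)
    return [dict(row._asdict()) for row in seen]


example = [
    dict(a=1, b=2, c=3),
    dict(a=10, b=20, c=30),
    dict(a=100, b=200, c=300),
    dict(a=1, b=2, c=3),
    dict(a=10, b=20, c=30),
]
print(unique_dicts(example))
```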
Dixhom / gist:bfb9630e5ffbf2347e76ac4bf24e8f82
Last active July 13, 2024 22:40
Numeric string converter
def parse_price(price):
    '''Convert a numeric string that mixes Arabic and kanji numerals to Arabic numerals.
    For example, price="4,256円" or price="百三十二円".'''
    price = price.strip()
    # characters to delete
    delete_chars = [',', '、', '円', '¥', '\\']
    for char in delete_chars:
        price = price.replace(char, '')
    # kanji numerals, full-width digits, and half-width digits
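The conversion step itself is not visible in the preview. The following is a hedged sketch of one possible approach, not the author's code: it reuses the same delete list and then accumulates digits and kanji units in a single pass. The name `parse_price_sketch`, the three lookup tables, and the choice to return an `int` are all assumptions.

```python
KANJI_DIGITS = {'〇': 0, '一': 1, '二': 2, '三': 3, '四': 4,
                '五': 5, '六': 6, '七': 7, '八': 8, '九': 9}
SMALL_UNITS = {'十': 10, '百': 100, '千': 1000}
LARGE_UNITS = {'万': 10**4, '億': 10**8}


def parse_price_sketch(price):
    """Convert e.g. "4,256円" or "百三十二円" to an int (illustrative only)."""
    price = price.strip()
    for char in [',', '、', '円', '¥', '\\']:
        price = price.replace(char, '')

    total = 0    # sum of completed 万/億 groups
    section = 0  # value of the current group below 10,000
    number = 0   # digits read since the last unit character
    for ch in price:
        if ch in KANJI_DIGITS:
            number = number * 10 + KANJI_DIGITS[ch]
        elif ch.isdecimal():
            # int() accepts both half-width and full-width decimal digits
            number = number * 10 + int(ch)
        elif ch in SMALL_UNITS:
            # a bare unit such as "百" means 1 x 100
            section += (number or 1) * SMALL_UNITS[ch]
            number = 0
        elif ch in LARGE_UNITS:
            total += ((section + number) or 1) * LARGE_UNITS[ch]
            section = 0
            number = 0
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    return total + section + number
```

Raising `ValueError` on unknown characters is a design choice of this sketch; the original may handle them differently.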
Dixhom / custom_tqdm.py
Last active November 12, 2023 05:42
custom tqdm
"""
Advantages over the standard tqdm:
1. Progress is shown in finer detail.
   The progress bar ends with a digit, e.g. "########6",
   which means the last segment is 60% complete.
2. The total number of iterations can be changed:
   `obj.total_count = new_count`
"""
@Dixhom
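The two advantages listed in the custom tqdm docstring above can be illustrated with a small sketch. Only the `total_count` attribute name comes from the gist; `render_bar`, the `Progress` class, and the bar width of 10 are hypothetical. Integer arithmetic is used so the trailing digit is not affected by floating-point rounding.

```python
def render_bar(done, total, width=10):
    """Return '#' for each full segment plus one digit for the partial segment."""
    done = max(0, min(done, total))
    # Progress measured in tenths of a segment, computed with integers only.
    tenths = done * width * 10 // total
    full, digit = divmod(tenths, 10)
    # When the bar is full there is no partial segment left to describe.
    return "#" * full + (str(digit) if full < width else "")


class Progress:
    """Tiny progress tracker whose total can be reassigned at any time."""

    def __init__(self, total_count, width=10):
        self.total_count = total_count
        self.width = width
        self.done = 0

    def update(self, n=1):
        self.done += n
        return render_bar(self.done, self.total_count, self.width)


progress = Progress(100)
print(progress.update(43))   # bar for 43 of 100
progress.total_count = 50    # the total can be changed afterwards
print(progress.update(0))    # bar for 43 of 50
```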
Dixhom / day_of_the_week.c
Created July 7, 2023 13:51
Yet another code to get the day of the week from (year, month, day)
int base_year = 1970;
// cumulative days of the year for each month
int daysCumsum[] = {0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365};
char* dayOfTheWeek[] = {"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"};
int getLeapYear(int year) {
    // count the leap years from year 1 through `year`
    int div4 = year / 4;
    int div100 = year / 100;
Dixhom / day_of_the_week.py
Created July 7, 2023 13:48
Yet another code to get the day of the week from (year, month, day)
def getLeapYear(year):
    # count the leap years from year 1 through `year`
    div4 = year // 4
    div100 = year // 100
    div400 = year // 400
    # number of leap years until `year`
    return div4 - div100 + div400
def isLeapYear(year):
    # whether `year` is a leap year
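The preview ends before the day-of-the-week calculation. Here is a minimal sketch of how the leap-year count can be turned into a weekday, assuming the proleptic Gregorian calendar, in which 0001-01-01 falls on a Monday. It counts days from year 1 rather than from a base year of 1970, and the snake_case names are illustrative rather than the author's.

```python
# Days elapsed before the start of each month in a non-leap year.
DAYS_CUMSUM = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]
DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def count_leap_years(year):
    """Number of leap years from year 1 through `year`."""
    return year // 4 - year // 100 + year // 400


def is_leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def day_number(year, month, day):
    """Ordinal day count, where 0001-01-01 is day 1."""
    previous = year - 1
    days = previous * 365 + count_leap_years(previous)
    days += DAYS_CUMSUM[month - 1]
    # After February, a leap year contributes one extra day.
    if month > 2 and is_leap_year(year):
        days += 1
    return days + day


def day_of_week(year, month, day):
    # Day 1 is a Monday, so shift by one before taking the remainder.
    return DAY_NAMES[(day_number(year, month, day) - 1) % 7]


print(day_of_week(1970, 1, 1))
```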
Dixhom / scrape.py
Created June 10, 2023 11:30
scraping-snippet
import requests
from bs4 import BeautifulSoup
url = "https://mansion-market.com/mansions/categories/brands"
r = requests.get(url)
soup = BeautifulSoup(r.content, "html.parser")
# be careful, select() returns a list.
[a.text for a in soup.select('div.article_list')[0].select('li')]
Dixhom / req_create.py
Last active May 9, 2023 05:10
This code reads import statements from Python scripts and creates a requirements.txt.
# About:
# - This code reads import statements from Python scripts and creates a requirements.txt.
# Caveats:
# - Library names sometimes differ between pip and import statements, e.g. `pip install scikit-learn` vs. `import sklearn`. Users need to fix such names themselves.
# Todo:
# - Prepare a conversion table between pip names and import names so that requirements.txt is generated correctly.
import pkg_resources
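The conversion table mentioned in the Todo could look roughly like the sketch below. It is not the gist's implementation: it parses imports with the standard-library `ast` module instead of `pkg_resources`, and the `IMPORT_TO_PIP` table (deliberately tiny and incomplete) and both function names are assumptions. Standard-library modules are filtered out only on Python 3.10+, where `sys.stdlib_module_names` exists.

```python
import ast
import sys

# Hypothetical, incomplete mapping from import name to pip package name.
IMPORT_TO_PIP = {
    "sklearn": "scikit-learn",
    "cv2": "opencv-python",
    "bs4": "beautifulsoup4",
    "PIL": "Pillow",
    "yaml": "PyYAML",
}


def top_level_imports(source):
    """Collect the top-level module names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 marks a relative import, which is never a pip package.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def requirement_names(source):
    """Translate import names to pip names, skipping the standard library when possible."""
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    found = top_level_imports(source) - set(stdlib)
    return sorted(IMPORT_TO_PIP.get(name, name) for name in found)


sample = "import sklearn.linear_model\nfrom bs4 import BeautifulSoup\nimport requests\n"
print("\n".join(requirement_names(sample)))
```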
Dixhom / n_digit_power.py
Created January 15, 2023 06:46
Which n-digit numbers keep the same last n digits when squared?
n = 8
div = 10**n
for i in range(10**(n-1), 10**n):
    if (i * i) % div == i:
        print(i)
# 2: 25, 76
# 3: 376, 625
# 4: 9376
# 5: 90625
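The loop above tests every n-digit number, which takes 9 × 10^(n-1) iterations. As an alternative sketch (a different technique from the gist's brute force): any solution of i² ≡ i (mod 10ⁿ) other than 0 and 1 must be 0 modulo one of 2ⁿ, 5ⁿ and 1 modulo the other, and 5^(2ⁿ) mod 10ⁿ yields the solution ending in 5. The function name is hypothetical, and the trivial solutions 0 and 1 are left out, so the sketch is meant for n ≥ 2.

```python
def automorphic_numbers(n):
    """Return the n-digit numbers i with i*i % 10**n == i (excluding 0 and 1)."""
    mod = 10 ** n
    # 5**(2**n) is divisible by 5**n and congruent to 1 modulo 2**n.
    a = pow(5, 2 ** n, mod)
    # The partner solution is 1 modulo 5**n and 0 modulo 2**n.
    b = (mod + 1 - a) % mod
    # Keep only values with exactly n digits, re-checking the defining property.
    return sorted(x for x in {a, b} if x >= 10 ** (n - 1) and x * x % mod == x)


for digits in range(2, 6):
    print(digits, automorphic_numbers(digits))
```

A candidate with a leading zero (for example 0625 for n = 4) is dropped by the digit-count filter, which matches the single entries listed for n = 4 and n = 5 above.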
Dixhom / rational.py
Created November 1, 2022 11:49
Rational number class
class Rational:
    """
    a class to handle rational numbers
    """
    def __init__(self, num, denom=1):
        self.num = num  # numerator
        self.denom = denom  # denominator
    def __gcd(self, u, v):
        """