def merge_sort(array):
    if len(array) >= 2:
        mid = len(array) // 2
        left = merge_sort(array[:mid])
        right = merge_sort(array[mid:])
        return merge(left, right)
    else:
        return array

def merge(left, right):
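The preview is cut off at the start of `merge`, so its body is not shown. As a minimal sketch, assuming the usual two-pointer merge of two already sorted lists (this is not necessarily the author's version), the whole routine could look like this:

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list (assumed implementation)."""
    merged = []
    i = j = 0
    # Repeatedly take the smaller head element; <= keeps the sort stable.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(array):
    """Sort a list by splitting it in half and merging the sorted halves."""
    if len(array) >= 2:
        mid = len(array) // 2
        left = merge_sort(array[:mid])
        right = merge_sort(array[mid:])
        return merge(left, right)
    return array


sorted_values = merge_sort([5, 2, 4, 7, 1, 3, 2, 6])
```

The function returns a new list rather than sorting in place, which matches the slicing style used in the preview.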
@Weizhang2017
Weizhang2017 / scraping_dynamic.md
Last active December 7, 2023 10:12
Scraping dynamic HTML in Python with Selenium

When a web page is opened in a browser, the browser automatically executes JavaScript and generates dynamic HTML content. A common way to retrieve web pages is to make HTTP requests. However, if the page is generated dynamically by JavaScript, an HTTP request returns only the page's original source, without the content that JavaScript would add. Many websites use Ajax to send information to and retrieve data from the server without reloading the page. To scrape Ajax-enabled pages without losing data, one solution is to execute the JavaScript from Python and scrape the page once it has fully loaded. Selenium is a powerful tool that automates browsers and can load web pages with their JavaScript executed.
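As a rough sketch of the idea described here, the function below loads a page in a browser through Selenium and returns the HTML after the browser has run its JavaScript. The function name `fetch_rendered_html` and the `driver_factory` parameter are hypothetical, introduced only for illustration; it assumes Selenium and a Firefox WebDriver are installed when the default is used.

```python
def fetch_rendered_html(url, driver_factory=None):
    """Open url in a browser and return the HTML after JavaScript has run.

    driver_factory is a hypothetical hook: any callable returning an object
    with get(), page_source and quit(). It defaults to Selenium's Firefox.
    """
    if driver_factory is None:
        # Imported lazily so this module loads even without Selenium installed.
        from selenium import webdriver
        driver_factory = webdriver.Firefox

    driver = driver_factory()
    try:
        driver.get(url)            # the browser loads the page and runs its scripts
        return driver.page_source  # HTML as currently rendered by the browser
    finally:
        driver.quit()              # always close the browser, even on errors
```

Content fetched by Ajax after the initial load may not be present yet when `page_source` is read, so a real scraper would probably need an explicit wait before reading it.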

1. Start Selenium with a WebDriver

Selenium does not contain a web browser. It calls an API on a WebDriver which opens a browser. Both Firefox and Chrome have their own WebDrivers that int

Weizhang2017 / tunnel.conf
Last active October 26, 2022 18:55
Supervisor configuration for ssh tunnel
[program:tunnel]
command=ssh -N -L 5432:localhost:5432 remotehost
directory=/home/john
user=wei
autostart=true
autorestart=true
stdout_logfile=/var/log/supervisor/tunnel.log
redirect_stderr=true
numprocs=1
Weizhang2017 / EmailValidationFlask.conf
Created March 19, 2020 07:39
Supervisor configuration for flask app
[program:EmailValidationFlask]
command=gunicorn -c gunicorn.conf.py wsgi:app
directory=/root/EmailValidationFlask/
user=root
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/root/EmailValidationFlask/gunicorn_supervisor.log
stdout_logfile_maxbytes=100MB
Weizhang2017 / property_methods.py
Created December 6, 2019 01:53
A function to avoid repetitive property method
def typed_property(name, expected_type):
    storage_name = '_' + name

    @property
    def prop(self):
        return getattr(self, storage_name)

    @prop.setter
    def prop(self, value):
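The preview stops inside the setter. A minimal sketch of how such a helper is commonly completed, assuming the setter is meant to type-check the value before storing it (the `Person` class is a hypothetical usage example, not part of the gist):

```python
def typed_property(name, expected_type):
    storage_name = '_' + name

    @property
    def prop(self):
        return getattr(self, storage_name)

    @prop.setter
    def prop(self, value):
        # Assumed behaviour: reject values of the wrong type.
        if not isinstance(value, expected_type):
            raise TypeError(f'{name} must be a {expected_type.__name__}')
        setattr(self, storage_name, value)

    return prop


class Person:
    # One line per attribute instead of a getter/setter pair for each.
    name = typed_property('name', str)
    age = typed_property('age', int)

    def __init__(self, name, age):
        self.name = name  # goes through the type-checked setter
        self.age = age
```

Each call to `typed_property` closes over its own `storage_name` and `expected_type`, which is what removes the repeated property boilerplate.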
Weizhang2017 / java.yml
Created November 17, 2019 04:07
Install java with Ansible
- hosts: remote
  handlers:
    - name: update package list
      apt:
        update_cache: yes
  tasks:
    - name: download Oracle java
      get_url:
        url: "{{ java_donwload_url }}"
Weizhang2017 / retry.py
Created November 8, 2019 09:55
a decorator to retry upon Exception
import time
from functools import wraps
import logging

logger = logging.getLogger(__name__)

def retry(retries=3, delay=10, backoff=10, logger=None):
    """Retry calling the decorated function using an exponential backoff.
    :param retries: number of times to try (not retry) before giving up
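The preview ends inside the docstring, so the decorator body is not shown. The following is a sketch of one way to write it, assuming `retries` is the total number of attempts, `delay` is the initial wait in seconds, and `backoff` multiplies the wait after each failure; these semantics are inferred from the parameter names and the visible docstring line.

```python
import logging
import time
from functools import wraps


def retry(retries=3, delay=10, backoff=10, logger=None):
    """Retry the decorated function with a growing delay (assumed semantics)."""
    log = logger or logging.getLogger(__name__)

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            # All attempts except the last one swallow the exception and wait.
            for attempt in range(1, retries):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    log.warning('%s failed (attempt %d of %d): %s; retrying in %s s',
                                func.__name__, attempt, retries, exc, wait)
                    time.sleep(wait)
                    wait *= backoff
            # Final attempt: any exception propagates to the caller.
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

A function decorated with `@retry(retries=3, delay=1, backoff=2)` would be tried up to three times, waiting 1 second and then 2 seconds between attempts.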
Weizhang2017 / context_manager.py
Last active December 4, 2019 06:30
context manager in python
from pymongo import MongoClient
import time

class MongoConnect:
    '''connect to mongodb with context manager'''
    def __init__(self, host='127.0.0.1', port=27017, callbackOnExit=None, waitingTime=0):
        self.host = host
        self.port = port
        self.con = None
        self.callbackOnExit = callbackOnExit
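The preview ends inside `__init__`, so the `__enter__` and `__exit__` methods are not shown. The sketch below shows one plausible shape for them. It assumes that `waitingTime` is a pause in seconds before cleanup and that `callbackOnExit` is called with no arguments on exit; both are guesses from the parameter names. The `client_factory` parameter is hypothetical, added only so the class can be exercised without a running MongoDB.

```python
import time


class MongoConnect:
    '''connect to mongodb with context manager'''

    def __init__(self, host='127.0.0.1', port=27017, callbackOnExit=None,
                 waitingTime=0, client_factory=None):
        self.host = host
        self.port = port
        self.con = None
        self.callbackOnExit = callbackOnExit
        self.waitingTime = waitingTime
        # Hypothetical hook: a callable taking (host, port) and returning a client.
        self.client_factory = client_factory

    def __enter__(self):
        factory = self.client_factory
        if factory is None:
            # Imported lazily so the class can be defined without pymongo installed.
            from pymongo import MongoClient
            factory = MongoClient
        self.con = factory(self.host, self.port)
        return self.con

    def __exit__(self, exc_type, exc_value, traceback):
        try:
            if self.waitingTime:
                time.sleep(self.waitingTime)  # assumed: pause before cleanup
            if self.callbackOnExit is not None:
                self.callbackOnExit()         # assumed: zero-argument callback
        finally:
            self.con.close()                  # the connection is always closed
            self.con = None
        return False                          # do not suppress exceptions
```

With this shape, `with MongoConnect() as client:` yields the client and guarantees it is closed when the block ends, whether or not an exception was raised.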
Weizhang2017 / database.py
Created October 30, 2019 02:03
Python interacting with database
# At a low level, interacting with a database through SQL statements is straightforward
stocks = [
    ('GOOG', 100, 490.1),
    ('AAPL', 50, 545.75),
    ('FB', 150, 7.45),
    ('HPQ', 75, 33.2)
]

import sqlite3
db = sqlite3.connect('database.db')
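The preview stops right after the connection is opened. A minimal sketch of the typical next steps with the standard `sqlite3` module follows. It uses an in-memory database instead of `database.db` so it leaves no file behind, and the table name `portfolio` and its column names are assumptions.

```python
import sqlite3

stocks = [
    ('GOOG', 100, 490.1),
    ('AAPL', 50, 545.75),
    ('FB', 150, 7.45),
    ('HPQ', 75, 33.2),
]

# ':memory:' keeps the example self-contained; the gist uses 'database.db'.
db = sqlite3.connect(':memory:')
c = db.cursor()

# Table and column names are assumed for illustration.
c.execute('create table portfolio (symbol text, shares integer, price real)')

# Insert all rows with placeholders rather than string formatting.
c.executemany('insert into portfolio values (?, ?, ?)', stocks)
db.commit()

# Parameterized query: holdings priced at or above a minimum.
min_price = 100
rows = c.execute(
    'select symbol, shares from portfolio where price >= ? order by symbol',
    (min_price,),
).fetchall()

db.close()
```

Passing values through `?` placeholders lets the driver handle quoting, which avoids SQL injection when the values come from user input.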
Weizhang2017 / binary_gap.py
Created October 11, 2019 02:13
Calculate the largest number of zeros between ones in a binary number
def conver_to_binary(N):
    # Collect binary digits from least to most significant.
    remainder = list()
    quotient = N // 2
    remainder.append(N % 2)  # the lowest bit comes from N itself, not the quotient
    while quotient > 0:
        remainder.append(quotient % 2)
        quotient = quotient // 2
    return remainder[::-1]

binary = conver_to_binary(N)
zeros = 0
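The preview ends before the gap is actually counted. As a sketch of the calculation named in the description, the function below finds the longest run of zeros enclosed by ones on both sides. It works on the bits directly instead of building a digit list, and the name `binary_gap` is taken from the file name rather than from the code shown.

```python
def binary_gap(n):
    """Return the longest run of zeros enclosed by ones in n's binary form."""
    longest = 0
    current = None  # None until the first 1 bit has been seen
    while n > 0:
        if n & 1:
            # A 1 closes the current run of zeros (if any) and starts a new one.
            if current is not None:
                longest = max(longest, current)
            current = 0
        elif current is not None:
            current += 1  # count zeros only once a 1 has been seen
        n >>= 1
    return longest


gap_of_529 = binary_gap(529)  # 529 is 1000010001 in binary
```

Trailing zeros, as in 32 (100000), are not enclosed by a second 1 and therefore do not count as a gap.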