@dpapathanasiou
dpapathanasiou / text_grabber.py
Created October 27, 2012 15:18
How to extract just the text from html page articles
"""
A series of functions to extract just the text from html page articles
"""
from lxml import etree
default_encoding = "utf-8"
def newyorker_fp (html_text, page_encoding=default_encoding):
"""For the articles found on the 'Financial Page' section of the New Yorker's website
@dpapathanasiou
dpapathanasiou / mergesort.py
Created March 25, 2019 01:01
Merge Sort in Python
#!/usr/bin/env python
"""
An implementation in Python, inspired by the Haskell version via
Cormen
(https://github.com/dpapathanasiou/algorithms-unlocked-haskell/blob/master/algorithms-for-sorting-and-searching/MergeSort.hs)
"""
def merge (a, b, c=[]):
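The preview stops at the `merge` signature, so the body is not shown here. As a rough illustration of the technique only (not the gist's actual code), a merge sort could be sketched as below. The names `merge_sorted` and `merge_sort` are hypothetical, and this version builds a fresh result list rather than relying on a default `c=[]` accumulator, which is only safe if the default list is never mutated in place.

```python
def merge_sorted(a, b):
    """Merge two already-sorted lists into one new sorted list."""
    merged = []
    i = j = 0
    while i < len(a) and j < len(b):
        # <= keeps the sort stable: ties are taken from the left list first
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # At most one of these two slices is non-empty
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged


def merge_sort(items):
    """Return a new sorted list, leaving `items` untouched."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge_sorted(merge_sort(items[:mid]), merge_sort(items[mid:]))
```

Calling `merge_sort` on a list returns a sorted copy; the input is never modified.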
@dpapathanasiou
dpapathanasiou / quicksort.py
Created March 25, 2019 01:02
Quicksort in Python
#!/usr/bin/env python
"""
An implementation in Python, inspired by the version in "Learn You
a Haskell for Great Good!"
(http://learnyouahaskell.com/recursion#quick-sort)
"""
def quicksort (a):
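Only the signature is visible in this preview. A minimal sketch in the list-comprehension style of the Haskell version referenced above might look like the following; `quicksort_sketch` is a hypothetical name, and the choice of the first element as pivot is an assumption made for brevity.

```python
def quicksort_sketch(items):
    """Return a new sorted list using the first element as the pivot."""
    if not items:
        return []
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x <= pivot]  # everything at or below the pivot
    larger = [x for x in rest if x > pivot]    # everything strictly above it
    return quicksort_sketch(smaller) + [pivot] + quicksort_sketch(larger)
```

This form favors clarity over speed: it allocates new lists at every level and degrades to quadratic time on input that is already sorted.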
@dpapathanasiou
dpapathanasiou / binary_search_tree.py
Last active March 31, 2019 21:40
Binary Search Tree in Python
#!/usr/bin/env python
"""
A binary search tree implementation, from:
"Python Algorithms: Mastering Basic Algorithms in the Python Language"
by Magnus Lie Hetland
ISBN: 9781484200551
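The preview ends before any code appears. For orientation, a binary search tree supporting insert, lookup, and in-order traversal could be sketched as below. All names here (`Node`, `insert`, `search`, `in_order`) are assumptions for illustration and are not taken from the gist or the book.

```python
class Node:
    """One tree node holding a key/value pair and two child links."""
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.left = None
        self.right = None


def insert(node, key, value):
    """Insert or update `key` in the subtree rooted at `node`; return the subtree root."""
    if node is None:
        return Node(key, value)
    if key == node.key:
        node.value = value
    elif key < node.key:
        node.left = insert(node.left, key, value)
    else:
        node.right = insert(node.right, key, value)
    return node


def search(node, key):
    """Return the value stored under `key`, or raise KeyError if it is absent."""
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    raise KeyError(key)


def in_order(node):
    """Yield keys in ascending order."""
    if node is not None:
        yield from in_order(node.left)
        yield node.key
        yield from in_order(node.right)
```

Start from `root = None` and reassign `root = insert(root, key, value)` on each insertion. The tree is not self-balancing, so inserting keys in sorted order produces a chain with linear lookup time.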
@dpapathanasiou
dpapathanasiou / .bash_aliases
Created March 15, 2020 14:27
Scripting the pdftk-java jar to work like the pdftk binary
alias pdftk='$HOME/.pdftk.sh'
#!/usr/bin/env python
"""
An implementation of the "sliding window" technique to find shortest matching subarrays, inspired by:
https://leetcode.com/problems/find-all-anagrams-in-a-string/discuss/92007/sliding-window-algorithm-template-to-solve-all-the-leetcode-substring-search-problem
"""
from sys import maxsize as maxint  # sys.maxint was removed in Python 3; maxsize is the closest equivalent
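The rest of this script is not shown in the preview. As a hedged sketch of the sliding-window idea it describes, the function below finds the shortest contiguous run of `items` that contains every element of `targets`, counting duplicates. The name `shortest_window` and its return convention (a half-open `(start, end)` pair, or `None`) are assumptions for illustration.

```python
from collections import Counter


def shortest_window(items, targets):
    """Return (start, end) of the shortest slice items[start:end] that
    contains every element of `targets` (with multiplicity), or None."""
    if not targets:
        return (0, 0)
    need = Counter(targets)   # how many of each element are still required
    missing = len(targets)    # total number of required elements not yet covered
    best = None
    left = 0
    for right, item in enumerate(items):
        if need[item] > 0:
            missing -= 1
        need[item] -= 1       # negative counts mean surplus inside the window
        while missing == 0:
            # The window items[left:right + 1] is valid; record it if shorter
            if best is None or right + 1 - left < best[1] - best[0]:
                best = (left, right + 1)
            # Shrink from the left and check whether the window is still valid
            need[items[left]] += 1
            if need[items[left]] > 0:
                missing += 1
            left += 1
    return best
```

Each element enters and leaves the window at most once, so the scan is linear in the length of `items`.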
@dpapathanasiou
dpapathanasiou / dst.py
Created August 16, 2014 15:42
How to tell if Daylight Saving Time is in effect using Python
from datetime import datetime
import pytz
def is_dst ():
    """Determine whether or not Daylight Saving Time (DST)
    is currently in effect"""
    eastern = pytz.timezone('US/Eastern')
    # localize() attaches the zone correctly; passing a pytz zone as tzinfo= to datetime() picks up the zone's LMT offset
    x = eastern.localize(datetime(datetime.now().year, 1, 1, 0, 0, 0)) # Jan 1 of this year
    y = datetime.now(eastern)
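The preview cuts off before the function returns. As an alternative sketch (a swapped-in technique, not the gist's method), the same question can be answered with the standard-library `zoneinfo` module, available since Python 3.9, by asking the aware datetime for its DST offset directly. The names `is_dst_at` and `is_dst_now` are hypothetical, and `is_dst_now` assumes the system has a time zone database installed.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # standard library since Python 3.9


def is_dst_at(moment):
    """Return True if the timezone-aware datetime `moment` falls within DST."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    offset = moment.dst()  # None when the tzinfo carries no DST information
    return offset is not None and offset != timedelta(0)


def is_dst_now(zone_name="US/Eastern"):
    """Return True if DST is currently in effect in the named zone."""
    return is_dst_at(datetime.now(ZoneInfo(zone_name)))
```

Splitting the check from the clock lookup makes it possible to test `is_dst_at` with fixed dates instead of depending on the current time.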
@dpapathanasiou
dpapathanasiou / trie.py
Last active April 16, 2022 09:47
Ternary Search Tree in Python
#!/usr/bin/env python
"""
A ternary search tree implementation, inspired by:
http://www.drdobbs.com/database/ternary-search-trees/184410528 and
https://lukaszwrobel.pl/blog/ternary-search-tree/
https://github.com/djtrack16/tst/blob/master/ternarysearchtree.py
"""
@dpapathanasiou
dpapathanasiou / reformat_csv.py
Last active April 21, 2022 22:48
A simple Python script to reduce or reformat a CSV file, producing a CSV with a specific subset of columns, stripping out any undesired characters from the individual row values
'''
A simple Python script to reduce or reformat a CSV file, producing a
CSV with a specific subset of columns, stripping out any undesired
characters from the individual row values.
'''
import csv
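The script body is not shown beyond the import. A minimal sketch of the described behavior, assuming the input has a header row, might look like this. The function name `reformat_rows`, its parameters, and the default set of unwanted characters are all hypothetical.

```python
import csv
import io


def reformat_rows(source, columns, unwanted="$,"):
    """Read CSV from the file-like `source` and return CSV text containing
    only `columns`, with every character in `unwanted` removed from values."""
    strip_table = str.maketrans("", "", unwanted)  # maps each unwanted char to None
    reader = csv.DictReader(source)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow({name: row[name].translate(strip_table) for name in columns})
    return out.getvalue()
```

Working on file-like objects rather than paths keeps the function easy to try with `io.StringIO`; for real files, open them with `newline=""` as the `csv` module documentation recommends.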
#!/usr/bin/env python
"""
A breadth-first search implementation from:
"MIT Open CourseWare: Introduction to Algorithms"
Lecture 13: Breadth-First Search
https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/lecture-videos/lecture-13-breadth-first-search-bfs/
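The preview ends at the lecture link. As a sketch of breadth-first search in the frontier-by-frontier style, under the assumption that the graph is given as a dictionary mapping each vertex to a list of neighbors, the function below records each reachable vertex's distance from the start and its parent on a shortest path. The name `bfs_levels` and the return shape are assumptions for illustration.

```python
def bfs_levels(adjacency, start):
    """Return (level, parent) dictionaries for every vertex reachable from `start`."""
    level = {start: 0}       # number of edges on a shortest path from start
    parent = {start: None}   # predecessor on that shortest path
    frontier = [start]
    depth = 1
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adjacency.get(u, ()):
                if v not in level:   # first visit is via a shortest path
                    level[v] = depth
                    parent[v] = u
                    next_frontier.append(v)
        frontier = next_frontier
        depth += 1
    return level, parent
```

Following `parent` links back from any vertex to `start` reconstructs one shortest path; vertices absent from `level` are unreachable.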